The pointer_button_count check added in dc36e2165 uses the client side
view of the implicit grab, which is stale when the button release is
still in flight: glfwStartDrag passes the check, but by the time the
compositor processes wl_data_device.start_drag the implicit grab is
gone and the request is silently dropped. The data source then never
receives any event, so the tab drag state leaks forever, hijacking
mouse handling again, and the already created xdg_toplevel_drag
toplevel gets mapped as a stray undecorated window showing the drag
thumbnail that nothing ever destroys.
Detect this deterministically using protocol ordering: issue a
wl_display.sync right after start_drag. An accepted start_drag
synchronously produces events (wl_pointer.leave from the DND grab
taking over, wl_data_device.enter, wl_data_source events) that are
ordered before the sync callback. If the callback fires with none of
them seen, the request was dropped: cancel the drag, which notifies
the application (clearing the tab drag state) and destroys the drag
toplevel and data source.
Also defer mapping the drag toplevel until the session is confirmed,
so the stray window can never appear, not even for one frame.
Verified against headless GNOME Shell 50.2 (mutter) with injected
pointer timing scans: the unpatched build deadlocks with an orphaned
drag toplevel within ~13 fast flicks, the patched build catches every
race hit and cancels cleanly, with no deadlock, no stray window, and
normal slow drag/reorder/detach behaviour preserved.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
A quick click-and-flick on a tab could leave all of kitty with mouse
input permanently redirected to the tab bar, making every window
unclickable and text selection impossible.
Starting a tab drag is asynchronous: the drag thumbnail is rendered on
the next frame before glfwStartDrag is called. If the button is
released in that window, wl_data_device_start_drag is sent with a stale
serial that no longer matches an active pointer implicit grab, so the
compositor silently ignores it. The wl_data_source then never receives
any event, on_drag_source_finished never runs, and the
tab_being_dragged state is stuck forever, hijacking all mouse events.
Fix in layers:
- glfw/Wayland: track the implicit grab (serial of the first button
press and pressed-button count), use that serial for start_drag and
refuse with EAGAIN when there is no active implicit grab instead of
letting the compositor silently drop the request
- mouse.c: a left button release arriving while a tab drag is marked
started but no system DND is active means the drag never launched
(an active DND consumes the release on all platforms), so clear the
drag state instead of waiting for DND events that will never come
- tabs.py: handle OSError from start_drag_with_data for tab drags the
same way window drags already do; clear the potential-drag state when
the release lands on the new-tab button or empty tab bar area
- tabs.py/boss.py: clear drag state on drag finish/drop even when the
dragged tab has already been closed
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
As is typical with Wayland, the protocol is poorly designed and
implemented even worse. Hyprland 0.53 has completely broken color
management.
https://github.com/hyprwm/Hyprland/discussions/12788
In addition it and mangowc crash when using color management with nouveau drivers.
https://github.com/kovidgoyal/kitty/issues/9030
KDE kwin does not support the sRGB transfer function. And the geniuses
at Wayland are any way planning to deprecate sRGB as a transfer function.
Only GNOME mutter seems to get it right.
Then there are people that are likely going to shoehorn this into EGL
instead of leaving it under application control via the protocol anyway.
https://github.com/KhronosGroup/EGL-Registry/issues/197
Sigh. Wayland.
This does not match X11/macOS behavior. And I see no logical reason why
it should be so. The wheel_scroll_multiplier should be used to adjust
this by end users.
On compositors that support compositor key repeat events, use those, for
complete robustness. Sadly no actual compositor implements these yet.
Otherwise use a timer fd/pipe to queue the repeat events and only
dispatch them after events from the compositor are handled. This means
release events from the compositor will prevent spurious repeat events.
One can, in the worst case lose some repeat events if there is a very
large interval between the start of the timer and the next poll, but
that is unavoidable and is why repeat events should come from the compositor
in the first place.
Fixes#9224
Far more robust. Sadly no actual compositors yet support this. Fifteen
years it takes Wayland developers to correct their most basic mistakes.
See #9224