Skip to content

core/screen: retain removed screens to avoid use-after-free#879

Open
sunshowers wants to merge 1 commit into
quickshell-mirror:masterfrom
sunshowers:retain-removed-screens
Open

core/screen: retain removed screens to avoid use-after-free#879
sunshowers wants to merge 1 commit into
quickshell-mirror:masterfrom
sunshowers:retain-removed-screens

Conversation

@sunshowers

@sunshowers sunshowers commented Jun 29, 2026

Copy link
Copy Markdown

Previously, when a screen was disconnected, QuickshellTracked::updateScreens deleted its QuickshellScreenInfo wrapper. But a copy of that QObject* may have been put into a QVariant as a bare, unguarded pointer. Deleting the wrapper leaves that pointer dangling, and the next time the engine accesses it, the process segfaults with a use-after-free error.

This was observed as intermittent crashes on monitor wakeup under niri with DankMaterialShell, confirmed from two core dumps to be a freed QuickshellScreenInfo inside a Toplevel.screens list retained by a model.

Both core dumps had the following in them:

screens QVariant @ 0x7fecc79e4eb8  metatype='QList<QuickshellScreenInfo*>'
  inline QList {d=0x7fecf1921320, ptr=0x7fecf1921330, size=1}
    screens[0] = 0x7fed0313b6a0  <-- freed (vtable=0x1 d_ptr=0x7)

Fix this by not actually deleting lost screens and instead just putting them in an internal vector. Note that this is a memory leak, in the sense that the list will keep growing in an unbounded fashion. But over the lifetime of a quickshell process there aren't that many screens connected and disconnected, so it seems like a relatively straightforward way to handle this.

Note that deleteLater doesn't quite work here, because these raw pointers can be held across event loop iterations. A more structural fix is likely some kind of generational arena + weak index pattern, but that's outside the scope of this PR.

TODO:

  • Is this the right fix? Any ideas on how to avoid or bound the memory leak?

Previously, when a screen was disconnected, `QuickshellTracked::updateScreens` deleted its `QuickshellScreenInfo` wrapper. But a copy of that `QObject*` may have been boxed into a `QVariant` and left a bare, unguarded pointer. Deleting the wrapper then dangles that pointer, and the next time the engine re-wraps it (the process segfaults).

This was observed as intermittent crashes on monitor wakeup under niri with DankMaterialShell, confirmed from two core dumps to be a freed `QuickshellScreenInfo` inside a `Toplevel.screens` list retained by a model.

Both core dumps had the following in them:

```
screens QVariant @ 0x7fecc79e4eb8  metatype='QList<QuickshellScreenInfo*>'
  inline QList {d=0x7fecf1921320, ptr=0x7fecf1921330, size=1}
    screens[0] = 0x7fed0313b6a0  <-- freed (vtable=0x1 d_ptr=0x7)
```

Fix this by not actually deleting lost screens and instead just putting them in an internal vector. Note that this is a memory leak, in the sense that the list will keep growing in an unbounded fashion. But screens are pretty bounded, so it seems like a relatively straightforward way to handle a somewhat rare occurrence.

Note that `deleteLater` doesn't quite work here, because these raw pointers can be held across event loop iterations. A more structural fix is likely some kind of generational arena + weak index pattern, but that's outside the scope of this PR.
@sunshowers

sunshowers commented Jun 30, 2026

Copy link
Copy Markdown
Author

(I've patched this locally, and am going to run with it for a while to monitor if I see any other crashes.)

@outfoxxed

outfoxxed commented Jun 30, 2026

Copy link
Copy Markdown
Member

I've never managed to get the QML engine to choke on a freed pointer in any prior testing I've done to trigger this class of bug, but I went and did an extra round of testing and didn't find anything for similar cases.

There's really only one point we could reasonably return an invalid QList<QuickshellScreenInfo*> from in the backend which is WindowsetProjection::screens. I just fixed that in 1661ef1, but dms doesn't seem to use it so it may or may not help.

@sunshowers

sunshowers commented Jun 30, 2026

Copy link
Copy Markdown
Author

Thanks. One thing to note here is that my monitor is a Samsung G9 57 which I run at 7680x2160 240Hz with an AMD GPU. I've seen that cause some issues on monitor wakeup, with link training failing for a bit then succeeding.

Both crashes I saw have this stack trace: https://gist.github.com/sunshowers/e439bd41d78fb5f189088471fc8d3e36

@sunshowers

Copy link
Copy Markdown
Author

There's really only one point we could reasonably return an invalid QList<QuickshellScreenInfo*> from in the backend

From what I can tell it's not Quickshell returning a QList<QuickshellScreenInfo*> that is invalid at the time it is returned. Rather, Quickshell returns a valid pointer that DMS caches internally. Then the pointer becomes invalid, and DMS attempts to access that.

@sunshowers

Copy link
Copy Markdown
Author

From my journalctl, both the crashes I saw have the following pattern:

21:48:47.980  kernel  amdgpu [drm] enabling link 0 failed: 15        <- DP link training fails
21:48:48.010  kernel  DM_MST: stopping TM on aconnector [id: 377]
21:48:48.047  niri    disconnecting connector: "DP-5"                <- output removed, screen wrapper freed
21:48:50.587  kernel  DM_MST: starting TM on aconnector [id: 377]
21:48:50.592  kernel  DP14, 4-lane link detected                     <- link training succeeds (~2.5s later)
21:48:51.218  kernel  Downstream port present 0, type 0
21:48:51.280  niri    new connector: DP-5 "Samsung Odyssey G95NC"    <- output re-added, screensChanged
21:48:51.280  niri    connecting connector: DP-5
21:48:51.281  niri    putting output DP-5 at x=0 y=0
21:48:51      ──────  crash (coredumpctl timestamp)                  <- re-wrap of the stale snapshot

So yeah, it seems like this is more likely to happen if a screen is added, removed, then rapidly added again afterwards. This might be why if you're not on a monitor that sometimes fails to do link training (such as a large DisplayPort MST display) you might not see the crash.

I'm not sure what facilities Quickshell or QML has to inject monitor add/remove events, but if there is such a mechanism, reproducing this might be more feasible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants