GC: implement a grow-vs-collect heuristic.#12942

Open
cfallin wants to merge 8 commits into bytecodealliance:main from cfallin:gc-growth-heuristic

Conversation

@cfallin
Member

@cfallin cfallin commented Apr 2, 2026

This implements the heuristic discussed in #12860: it replaces the existing behavior where Wasmtime's GC, when allocating, will continue growing the GC heap up to its size limit before initiating a collection. That behavior optimizes for allocation performance but at the cost of resident memory size -- it is at one extreme end of that tradeoff spectrum.

There are a number of use-cases where there may be heavy allocation traffic but a relatively small live-heap size compared to the total volume of allocations. For example, lots of temporary "garbage" may be allocated by many workloads. Or, more pertinently to #12860, a C/C++ workload that uses the underlying GC heap only for exceptions, and uses those exceptions in a way that only one exnref is live at a time (no exn objects are stashed away and used later), will also generate a lot of "garbage" during normal execution. These kinds of workloads benefit significantly from more frequent collection to keep the resident-set size small. This also may benefit performance, even accounting for the cost of the collection itself, because it keeps the footprint of touched memory within higher cache-hierarchy levels.

In order to accommodate that kind of workload while also presenting reasonable behavior to large-working-set-size benchmarks, it is desirable to implement an adaptive policy. To that end, this PR implements a scheme similar to our OwnedRooted allocation/collection algorithm (and specified explicitly here by fitzgen): we use the last live-heap size (post-collection) compared to current capacity to decide whether to grow or collect. When the current capacity is more than twice the last live-heap size, we collect first; if we still can't allocate, then we grow. Otherwise, we just grow.
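As a minimal sketch of the decision rule described above (hypothetical names, not Wasmtime's actual internals):

```rust
/// Grow-vs-collect decision, as sketched from the description above:
/// when current capacity exceeds twice the live-heap size measured at
/// the end of the last collection, collect first; otherwise just grow.
/// Names here are illustrative, not Wasmtime's real API.
#[derive(Debug, PartialEq)]
enum AllocAction {
    CollectThenMaybeGrow,
    Grow,
}

fn on_alloc_failure(capacity: usize, last_live_size: usize) -> AllocAction {
    if capacity > 2 * last_live_size {
        AllocAction::CollectThenMaybeGrow
    } else {
        AllocAction::Grow
    }
}

fn main() {
    // Mostly-garbage heap: 1 MiB capacity but only 64 KiB was live at
    // the last collection, so collect before growing.
    assert_eq!(
        on_alloc_failure(1 << 20, 64 << 10),
        AllocAction::CollectThenMaybeGrow
    );
    // Mostly-live heap: collecting would reclaim little, so just grow.
    assert_eq!(on_alloc_failure(1 << 20, 600 << 10), AllocAction::Grow);
}
```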

The idea is that (when combined with an exponential heap-growth rule) the continuous-allocation case will collect once at each power-of-two, then grow; this is "amortized constant time" overhead. A case with a stable working-set size but with some ups and downs will never hit a "threshold-thrashing" problem: the heap capacity will tend toward twice the live-heap size, in the steady state (see proof here for the analogous algorithm for OwnedRooted). Thus we have a nice, deterministic bound no matter what, with no bad (quadratic or worse) cases.
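The steady-state claim can be checked with a tiny simulation under assumed simplifications (unit-sized allocations, a doubling growth rule, a constant live set; this is not the PR's actual code):

```rust
/// Simulate continuous allocation of short-lived garbage on top of a
/// constant live set, applying the grow-vs-collect rule with doubling
/// growth. Returns the final capacity and the number of collections.
fn simulate(live: usize, total_allocs: usize, mut capacity: usize) -> (usize, usize) {
    let mut used = live;
    let mut last_live = live;
    let mut collections = 0;
    for _ in 0..total_allocs {
        if used + 1 > capacity {
            if capacity > 2 * last_live {
                // Collect: everything but the live set is garbage.
                used = live;
                last_live = live;
                collections += 1;
                // If we still can't allocate, grow.
                if used + 1 > capacity {
                    capacity *= 2;
                }
            } else {
                capacity *= 2;
            }
        }
        used += 1; // allocate one unit
    }
    (capacity, collections)
}

fn main() {
    let (capacity, collections) = simulate(1000, 1_000_000, 1024);
    // Capacity settles at roughly twice the live size (2048 here)
    // rather than tracking the total allocation volume.
    assert_eq!(capacity, 2048);
    // Collections occur periodically, about once per ~1048 allocations.
    assert!(collections > 900);
}
```

In the steady state the simulated heap collects once per "capacity minus live" allocations and never grows again, matching the bounded-overhead argument above.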

This PR adds a test that creates a bunch of almost-immediately-dead garbage (allocates a GC struct that is only live for one iteration of a loop) and checks the heap size at each iteration. To allow this check, it also adds a method to Store to get the current GC heap capacity, which seems like a generally useful kind of observability as well.

@cfallin cfallin requested a review from fitzgen April 2, 2026 19:19
@cfallin cfallin requested a review from a team as a code owner April 2, 2026 19:19
@github-actions github-actions bot added labels: wasmtime:api (Related to the API of the `wasmtime` crate itself), wasmtime:ref-types (Issues related to reference types and GC in Wasmtime) Apr 2, 2026
@github-actions

github-actions bot commented Apr 2, 2026

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@cfallin cfallin force-pushed the gc-growth-heuristic branch from af844e1 to 02d471e on April 3, 2026 at 22:20
@cfallin
Member Author

cfallin commented Apr 3, 2026

Updated; thanks!

Member

@fitzgen fitzgen left a comment


Thanks!

@cfallin
Member Author

cfallin commented Apr 8, 2026

@fitzgen I looked into the test failure here and it seems to be a fundamental mismatch between the null collector's inline path and Store::gc's contract (here), under which allocation may not succeed even after GC.

The DRC collector's allocation path goes through retry_after_gc_async (via the gc_alloc_raw libcall), which implements this PR's new two-level retry behavior and is therefore (given the algorithm) guaranteed to grow if more space is needed. However, the null collector's inline path (NullCompiler::emit_inline_alloc) only calls the grow_gc_heap libcall (which calls Store::gc with bytes_needed) and then retries allocation once. That single retry is not sufficient to cause a growth to occur under the new algorithm, as long as the heuristic requested here is present.
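For reference, the two-level retry described here can be sketched as follows (hypothetical helper shapes, not the actual retry_after_gc_async code): on failure, optionally collect per the heuristic and retry, and only then grow and retry again.

```rust
/// Two-level retry on allocation failure: collect (if the heuristic
/// says to) and retry, then grow and retry. Illustrative only; the
/// real logic lives in Wasmtime's allocation libcalls.
fn alloc_with_retry(
    mut try_alloc: impl FnMut() -> bool,
    mut collect: impl FnMut(),
    mut grow: impl FnMut(),
    capacity: usize,
    last_live: usize,
) -> bool {
    if try_alloc() {
        return true;
    }
    if capacity > 2 * last_live {
        collect();
        if try_alloc() {
            return true;
        }
    }
    grow();
    try_alloc()
}

fn main() {
    use std::cell::Cell;
    let free = Cell::new(0usize); // free bytes in a toy heap
    let ok = alloc_with_retry(
        || {
            if free.get() >= 8 {
                free.set(free.get() - 8);
                true
            } else {
                false
            }
        },
        || free.set(16),              // "collection" reclaims garbage
        || free.set(free.get() + 64), // "growth" adds capacity
        1 << 20,  // capacity far exceeds twice the last live size...
        64 << 10, // ...so the collect-first level is taken.
    );
    assert!(ok);
    assert_eq!(free.get(), 8); // collection freed 16, allocation took 8
}
```

A single grow-then-retry, as in the null collector's inline path, corresponds to skipping the first (collect-and-retry) level entirely.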

I think the actual bug is the contract mismatch between the inline alloc path and Store::gc, but it also seems like we shouldn't be duplicating the growth heuristic or retry logic in multiple places. I'd prefer restoring the old Store::gc behavior to keep the null collector working, but if you prefer I could either (i) refactor the null collector to do a double retry inline, or (ii) remove the inline allocation path altogether and make the same libcall as the DRC collector.

@fitzgen
Member

fitzgen commented Apr 8, 2026

However, the null collector's inline path (NullCompiler::emit_inline_alloc) only calls the grow_gc_heap libcall (which calls Store::gc with bytes_needed) and then retries allocation once. That single retry is not sufficient to cause a growth to occur under the new algorithm, as long as the heuristic requested here is present.

I think the grow_gc_heap libcall should just grow the GC heap directly, not try to do it indirectly through store.gc(Some(bytes_needed)) or store.grow_or_collect_gc_heap. We should just make StoreOpaque::grow_gc_heap in crates/wasmtime/src/runtime/store/gc.rs be pub(crate) and call it from the grow_gc_heap libcall.

How does that sound?

@cfallin
Member Author

cfallin commented Apr 8, 2026

Ah, right, that's way simpler -- done!

@cfallin cfallin enabled auto-merge April 8, 2026 23:37
Member

@fitzgen fitzgen left a comment


Thanks!

@cfallin cfallin disabled auto-merge April 9, 2026 02:16
@cfallin
Member Author

cfallin commented Apr 9, 2026

There's a weird failure with the GC oom test that wasn't there before -- will try to bottom this out later when I have some more time.

