interp: avoid repeated object clones on partial stores#5368
Open
jakebailey wants to merge 3 commits into
Open
Conversation
memoryView.store cloned the entire destination object on every partial
store. Compiling a program that pulls in golang.org/x/text/collate hit
this hard: its generated tables are initialized one element at a time,
and each element store copied the whole global, making interp quadratic
in the table size.
Switch to copy-on-first-write per memory view: clone the object the
first time this view writes to it, then mutate that private copy in
place on later partial stores. Rollback is unaffected because revert
still discards the view's objects wholesale.
Two extra safety tweaks fall out of this:
- Full overwrites now clone the incoming value, so the stored buffer
can never alias state held by the caller.
- Partial in-place stores snapshot the source buffer, so a store
whose source is a load from the same object (overlapping or not)
still sees the pre-store bytes.
Add an interp golden test exercising both the overlapping load/store
case and a cross-object aliased load/store case.
Member
Author
|
I mistakenly benchmarked on non-wasip1 but the effect is about the same in absolute time (Wasm just has extra steps I have optimized in other places but not on this branch; other PRs coming) |
Member
Member
Author
|
This also fixes #2083!
|
Member
|
This also fixes an internal build we were having trouble with \o/ |
Member
|
Test corpus failure (passes on dev): |
Member
Author
|
Bummer, I will look into this when I have time |
Adds a third init function to the store testdata: an initial partial store puts the buffer into the current memory view, then a partial load is followed by another partial store at the same offset, and finally the loaded value is written to a separate global. The expected output captures the current (incorrect) behavior of the in-place partial store optimization: the loaded value gets corrupted by the subsequent in-place store. The next commit fixes this.
The optimization to skip cloning the destination object on partial stores when the buffer is already owned by the current memory view mutated obj.buffer.buf in place. The previous v.buf clone only handled the case where the store value itself aliased that buffer; it did not cover the more common case where an earlier partial load had returned a slice into the same buffer. The in-place store would then corrupt the previously loaded value, breaking code such as cloudflare/bn256 where the gfP arithmetic loads, computes, and stores within the same global during package initialization. Fix this in load instead: when the source object is owned by the current memory view, copy the loaded slice so the returned value is independent of the live buffer. The v.buf clone in store is no longer needed and is removed. Updates the regression test added in the previous commit to reflect the corrected behavior.
Member
Author
|
"when I have time" is apparently now |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
memoryView.storecloned the entire destination object on every partial store.tsgoimportsgolang.org/x/text/collateand hits this badly. The generated tables are initialized one element at a time, and each element store copied the whole global, makinginterpquadratic in the table size.By avoiding this copy until needed, we can greatly improve the perf.
devhttps://jakebailey.dev/how-much-faster/#old=369&new=232 1.59x faster 😄
Fixes #2083