Add array eval copy API#822
Draft
He-Pin wants to merge 2 commits intodatabricks:masterfrom
Draft
Conversation
Motivation: Avoid copying large array slices and remove/removeAt intermediates after the lazy-array work. This follows jrsonnet's indexed slice-view idea while keeping JVM retention under control for small sub-slices. Modifications: - add Val.Arr.sliced and SliceArr for large or compact-source slices - route array slicing and std.remove/removeAt through slice/concat views - let large concat decisions use total length, with overflow protection - add correctness coverage and a slice/remove benchmark resource Results: - ./mill --no-server 'sjsonnet.jvm[3.3.7].compile' - ./mill --no-server 'sjsonnet.jvm[2.13.18].compile' - ./mill --no-server 'sjsonnet.jvm[2.12.21].compile' - ./mill --no-server 'sjsonnet.jvm[3.3.7].test.testOnly' sjsonnet.ValArrayViewTests sjsonnet.Std0150FunctionsTests - ./mill --no-server 'sjsonnet.jvm[3.3.7].test' - ./mill --no-server 'sjsonnet.jvm[3.3.7].checkFormat' - ./mill --no-server bench.checkFormat - JMH runRegressions: lazy_array_slice_remove 5.890 -> 1.089 ms/op - hyperfine macro slice/remove: 498.6 ms -> 335.5 ms
Motivation: Several stdlib consumers fully copy array elements after the lazy-array work. Centralizing that path avoids repeated directBackingArray/range/view branches and lets concat, repeat, slice, range, and byte arrays expose cheap bulk Eval copies without forcing Val values. Modifications: - add Arr.copyEvalTo overloads for ArrayBuilder and preallocated Array[Eval] - teach concat materialization/eager concat to copy through the new API - add specialized copy implementations for repeat, slice, reversed lazy views, range, and byte arrays - route std.flattenArrays, array flatMap, and array-separator std.join through the API - add correctness coverage and an array_copy_views regression benchmark Results: - ./mill --no-server 'sjsonnet.jvm[3.3.7].compile' - ./mill --no-server 'sjsonnet.jvm[2.13.18].compile' - ./mill --no-server 'sjsonnet.jvm[2.12.21].compile' - ./mill --no-server 'sjsonnet.jvm[3.3.7].test.testOnly' sjsonnet.ValArrayViewTests sjsonnet.Std0150FunctionsTests - ./mill --no-server 'sjsonnet.jvm[3.3.7].test' - ./mill --no-server 'sjsonnet.jvm[3.3.7].checkFormat' - ./mill --no-server bench.checkFormat - ./mill --no-server 'sjsonnet.native[3.3.7].nativeLink' - JMH runRegressions vs slice baseline: array_copy_views 16.871 -> 13.937 ms/op - Scala Native hyperfine vs slice baseline: array_copy_views 26.1 ms -> 10.9 ms, 2.39x faster
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
#821 adds slice views, but stdlib consumers still need one narrow way to bulk-copy arrays without each call site reimplementing
directBackingArray/ range / view branches.This PR adds a shared
Evalcopy API that preserves laziness and gives fully-consumed array paths a simple JVM/Scala Native friendly implementation.Constraints:
Modification
Stacked on #821.
Add a narrow
Arr.copyEvalToAPI for bulk copyingEvalreferences without forcingValvalues.This PR routes these consumers through the API:
std.flattenArraysstd.flatMapstd.joinSpecialized copy paths cover concat, repeat, slice, reversed lazy views, range, and byte arrays.
Result
Verification passed:
./mill --no-server 'sjsonnet.jvm[3.3.7].compile'./mill --no-server 'sjsonnet.jvm[2.13.18].compile'./mill --no-server 'sjsonnet.jvm[2.12.21].compile'./mill --no-server 'sjsonnet.jvm[3.3.7].test.testOnly' sjsonnet.ValArrayViewTests sjsonnet.Std0150FunctionsTests./mill --no-server 'sjsonnet.jvm[3.3.7].test'./mill --no-server 'sjsonnet.jvm[3.3.7].checkFormat'./mill --no-server bench.checkFormat./mill --no-server 'sjsonnet.native[3.3.7].nativeLink'JMH, JVM harness, compared with #821 slice baseline:
array_copy_viewsrealistic2lazy_array_slice_removelarge_string_joinScala Native hyperfine, compared with #821 slice baseline, using Scala Native binaries, not JVM jars:
array_copy_viewscopy-apiran 2.39 +/- 0.08 times faster thanslice-baseline.External performance diff, against jrsonnet built from source at
80cd36awithcargo build --release -p jrsonnet(jrsonnet 0.5.0-pre98):array_copy_viewsJIT / GC review:
Eval, notVal, so laziness and callback/error retry behavior stay intact.copyEvalTo(Array[Eval], offset)lets hot consumers write into one preallocated array with indexed while-loops.Rollback boundary:
References