perf(crypto): stream field-element bytes into hashers and transcript #773

Open
Oppen wants to merge 2 commits into perf/rkyv-serialization from perf/fixed-fe-bytes
Oppen commented Jul 3, 2026

Collaborator

Summary

  • Eliminates the per-field-element heap allocation in Merkle leaf hashing and Fiat-Shamir transcript appends. AsBytes gains a stream_bytes sink method with zero-alloc overrides for the Goldilocks base field and its degree-3 extension; it replaces as_bytes()/to_bytes_be(), which returned a fresh Vec<u8> per element per hash.
  • Adds FieldElementVectorBackend::hash_data_parts + verify_merkle_path_from_hash so row-pair openings (verify_opening_pair, verify_composition_poly_opening, verify_fri_layer_openings) hash two slices directly instead of concatenating them into a throwaway Vec first; verify_composition_poly_opening and hash_data now delegate to their row-pair/parts siblings instead of duplicating the hashing loop.
  • Follow-up: the degree-3 extension's stream_bytes was calling the base-field override three times (one per limb), landing as three separate Digest::update calls on the guest instead of one. Disassembly confirmed the dyn FnMut sink itself was fully devirtualized (no indirect-call cost), so the actual waste was the call count. It now reuses the existing zero-alloc write_bytes_be override to fill a stack buffer and sinks once.
  • Recursion guest: single-query 89.7M → 71.1M cycles, multi-query 2.21B → 1.78B cycles. The handoff's baselines were 89.7M / 2.21B and the multi-query target was ~1.85–1.95B; the target was beaten by a further margin after the follow-up.
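
For readers unfamiliar with the sink pattern described above, here is a minimal sketch of what a streaming method next to an allocating one could look like. The trait shape, the `Fe` type, and the `stream_all` helper are assumptions for illustration only; the actual AsBytes signature in this PR may differ. Only the `dyn FnMut` sink and the big-endian layout are taken from the description.

```rust
/// Hypothetical stand-in for the PR's AsBytes trait (signature assumed).
pub trait AsBytes {
    /// Allocating path: returns a fresh Vec<u8> on every call.
    fn as_bytes(&self) -> Vec<u8>;

    /// Streaming path. The default falls back to the allocating path,
    /// so types without an override keep working.
    fn stream_bytes(&self, sink: &mut dyn FnMut(&[u8])) {
        sink(&self.as_bytes());
    }
}

/// Toy Goldilocks-sized element: one u64, no modular reduction modeled.
pub struct Fe(pub u64);

impl AsBytes for Fe {
    fn as_bytes(&self) -> Vec<u8> {
        self.0.to_be_bytes().to_vec()
    }

    /// Zero-alloc override: the 8 big-endian bytes live on the stack.
    #[inline(always)]
    fn stream_bytes(&self, sink: &mut dyn FnMut(&[u8])) {
        sink(&self.0.to_be_bytes());
    }
}

/// Hypothetical helper: feeds every element of a row into one sink,
/// as a leaf hasher or transcript append would.
pub fn stream_all<T: AsBytes>(elems: &[T], sink: &mut dyn FnMut(&[u8])) {
    for e in elems {
        e.stream_bytes(&mut *sink);
    }
}
```

A caller would pass a closure that forwards to the hasher, for example `stream_all(&row, &mut |b: &[u8]| hasher.update(b))`, where `hasher` is whatever digest the backend uses.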

Test plan

  • cargo test --workspace --exclude math-cuda (492 prover + 137 stark tests, all green) after each commit
  • make test-ethrex
  • cargo test -p lambda-vm-prover --lib test_recursion_execute_1query -- --ignored --nocapture (in-VM verify accepts, guest commits vk_digest ‖ output, byte-identical across every change)
  • make test-profile-recursion-single / -multi cycle counts confirmed against baseline after each commit
  • Guest ELF disassembly (llvm-objdump) confirming stream_bytes devirtualizes and the extension-field batching removes redundant BlockBuffer::digest_blocks/memcpy calls
  • Multiple rounds of independent adversarial review (correctness, performance incl. a dedicated missed-optimization hunt, implementation simplicity) against the diff

Oppen added 2 commits July 3, 2026 15:41
Eliminates the per-element heap allocation in Merkle leaf hashing and
Fiat-Shamir transcript appends. AsBytes gains a stream_bytes sink method,
overridden zero-alloc for Goldilocks base and degree-3 extension; the
Merkle backends and DefaultTranscript use it instead of as_bytes()/
to_bytes_be(). Adds hash_data_parts + verify_merkle_path_from_hash so
row-pair openings hash two slices directly instead of concatenating them
into a throwaway Vec first; verify_composition_poly_opening and
FieldElementVectorBackend::hash_data now delegate to their row-pair/parts
siblings instead of duplicating the same hashing loop.

Recursion guest: single-query 89.7M -> 73.7M cycles, multi-query
2.21B -> 1.82B cycles.
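
The row-pair change relies on a streaming hasher producing the same digest whether it is fed one concatenated buffer or the two slices in sequence. A rough sketch of that idea follows, under stated assumptions: `Fnv1a` is a toy stand-in for the real digest, and `hash_parts` / `hash_concat` are hypothetical names that do not reproduce the signature of hash_data_parts.

```rust
/// Toy streaming hasher (64-bit FNV-1a). Stand-in for the real digest;
/// not cryptographic, chosen only because it is byte-streaming.
pub struct Fnv1a(u64);

impl Fnv1a {
    pub fn new() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }

    pub fn update(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 ^= b as u64;
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV prime
        }
    }

    pub fn finalize(self) -> u64 {
        self.0
    }
}

/// Hashes several slices in order without building a joined buffer.
pub fn hash_parts(parts: &[&[u64]]) -> u64 {
    let mut h = Fnv1a::new();
    for part in parts {
        for x in part.iter() {
            h.update(&x.to_be_bytes());
        }
    }
    h.finalize()
}

/// The pattern being replaced: concatenate into a throwaway Vec, then hash.
pub fn hash_concat(a: &[u64], b: &[u64]) -> u64 {
    let mut row = Vec::with_capacity(a.len() + b.len());
    row.extend_from_slice(a);
    row.extend_from_slice(b);
    hash_parts(&[row.as_slice()])
}
```

Both functions feed the hasher the same byte sequence, so they return the same digest; the parts variant simply skips the intermediate allocation and copy.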

The degree-3 extension's stream_bytes was calling the base-field override
three times, one per limb, each landing as its own Digest::update on the
guest: three BlockBuffer::digest_blocks + memcpy calls instead of one.
Disassembly of the guest ELF showed dyn dispatch itself was fully
devirtualized by the #[inline(always)] chain, so the actual cost was the
call count, not indirection. Reuse the existing write_bytes_be override
(already zero-alloc, byte-identical layout) into a stack buffer and sink
once.

Recursion guest: single-query 73.9M -> 71.1M cycles, multi-query
1.82B -> 1.78B cycles.
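
The difference between the two extension-field strategies can be sketched as follows. This is an illustrative toy, not the PR's code: `Fp3` and both method names are hypothetical, and the limb packing is inlined here rather than delegated to write_bytes_be.

```rust
/// Toy degree-3 extension element: three u64 limbs (arithmetic omitted).
pub struct Fp3(pub [u64; 3]);

impl Fp3 {
    /// Per-limb streaming: the sink is invoked three times,
    /// once per 8-byte limb.
    pub fn stream_bytes_per_limb(&self, sink: &mut dyn FnMut(&[u8])) {
        for limb in &self.0 {
            sink(&limb.to_be_bytes());
        }
    }

    /// Batched streaming: pack all limbs into a 24-byte stack buffer,
    /// then invoke the sink once. Same bytes, same order, no heap use.
    pub fn stream_bytes_batched(&self, sink: &mut dyn FnMut(&[u8])) {
        let mut buf = [0u8; 24];
        for (chunk, limb) in buf.chunks_exact_mut(8).zip(&self.0) {
            chunk.copy_from_slice(&limb.to_be_bytes());
        }
        sink(&buf);
    }
}
```

When the sink forwards to a hasher's update, the batched variant turns three small updates into one 24-byte update while leaving the hashed byte stream unchanged.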