feat(storage): diff-layer state storage with bounded pruning#444
Draft
MegaRedHand wants to merge 8 commits into
Force-pushed from d9436b6 to 7e76c36.
Store every non-genesis state as a parent-linked diff (StateDiffs, never pruned) plus full-state snapshots (States) at anchors and hot states, so the full state history stays reconstructable cheaply and aggressive state pruning is no longer needed.
- StateDiff stores slot, checkpoints, and the justification fields in full, plus the appended historical_block_hashes tail; config/validators come from the nearest snapshot and latest_block_header from BlockHeaders.
- get_state reconstructs by walking parent diffs to the nearest snapshot; 1024-slot anchors (StateAnchors) bound the walk.
- Snapshot eviction (prune_old_states) keeps the last 300 slots + anchors + finalized/justified/head; evicted snapshots leave their diff behind.
- Block signatures are pruned only for old finalized blocks, keeping a recent SIGNATURE_PRUNING_RANGE window and all non-finalized signatures; block headers and bodies are kept forever.
Claude-Session: https://claude.ai/code/session_01RnSujepExeyvKWRsSdZxFN
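A minimal sketch of the reconstruction walk this commit describes, to make the "walk parent diffs to the nearest snapshot, then replay forward" step concrete. The `Snapshot` and `Diff` types, the `Root` alias, the `appended_hashes` field name, and the in-memory `HashMap` tables are simplified stand-ins assumed for illustration; the real store reads SSZ-encoded entries from `States` and `StateDiffs` and also restores the checkpoints, justification fields, and `latest_block_header`.

```rust
use std::collections::HashMap;

/// 32-byte block root (hypothetical alias for this sketch).
type Root = [u8; 32];

/// Simplified stand-in for a full-state snapshot.
#[derive(Clone, Debug, PartialEq)]
struct Snapshot {
    slot: u64,
    historical_block_hashes: Vec<Root>,
}

/// Simplified stand-in for a stored diff: a parent link, a field kept in
/// full (`slot`), and the tail appended to `historical_block_hashes`.
#[derive(Clone, Debug)]
struct Diff {
    base_root: Root,
    slot: u64,
    appended_hashes: Vec<Root>,
}

/// Walk parent links from `root` until a snapshot is found, then replay the
/// collected diffs oldest-first. Returns `None` if the chain is broken.
fn reconstruct(
    root: Root,
    snapshots: &HashMap<Root, Snapshot>,
    diffs: &HashMap<Root, Diff>,
) -> Option<Snapshot> {
    let mut pending: Vec<&Diff> = Vec::new();
    let mut cursor = root;
    let mut state = loop {
        if let Some(snapshot) = snapshots.get(&cursor) {
            break snapshot.clone();
        }
        let diff = diffs.get(&cursor)?;
        pending.push(diff);
        // A valid chain visits each diff at most once; bail out on a cycle.
        if pending.len() > diffs.len() {
            return None;
        }
        cursor = diff.base_root;
    };
    // Diffs were collected child-first, so apply them in reverse.
    for diff in pending.into_iter().rev() {
        state.slot = diff.slot;
        state
            .historical_block_hashes
            .extend_from_slice(&diff.appended_hashes);
    }
    Some(state)
}
```

In this sketch the walk length is limited only by the distance to the nearest snapshot, which is why the commit keeps permanent anchors: with an anchor every 1024 slots, the walk never has to go further back than the most recent anchor.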
Force-pushed from 7e76c36 to b97129f.
- Bundle the parent base of a diff into a `DiffBase { root, hbh_len, slot }`
struct, shrinking `insert_state_with_diff` from five positional args to three.
- Register the anchor bootstrap snapshot in `StateAnchors` from `init_store`, so
genesis / checkpoint-sync (the base of every diff chain) is never evicted.
Previously it could be pruned once finality advanced past the hot window,
making the first 1024-slot window unreconstructable.
- Enforce the invariant that a `States` snapshot is never written alone: it is
always paired with a `StateDiffs` (parented states) or `StateAnchors`
(bootstrap) entry. Plain `insert_state` now writes `States` alone only in
tests, so it is gated `#[cfg(test)]`.
Claude-Session: https://claude.ai/code/session_01RnSujepExeyvKWRsSdZxFN
…ktree-feat+state-diff-layers
Replace the verbose field-by-field DiffBase literal at the block-import call site with `DiffBase::from_state(parent_root, &parent_state)`: the caller passes the already-known parent root and the constructor reads hbh_len/slot from the state.
Claude-Session: https://claude.ai/code/session_01RnSujepExeyvKWRsSdZxFN
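As a rough illustration of the constructor described above, the sketch below shows one way `DiffBase::from_state` could read its fields from the parent state. The field names `root`, `hbh_len`, and `slot` come from the commit messages; the field types, the `Root` alias, and the two-field `State` stand-in are assumptions made for this example.

```rust
/// 32-byte block root (hypothetical alias for this sketch).
type Root = [u8; 32];

/// Minimal stand-in for the consensus state; only the fields the
/// constructor reads are modelled here.
struct State {
    slot: u64,
    historical_block_hashes: Vec<Root>,
}

/// Parent base of a diff. Field types are assumed, not taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DiffBase {
    root: Root,
    hbh_len: u64,
    slot: u64,
}

impl DiffBase {
    /// The caller supplies the already-known parent root; the remaining
    /// fields are derived from the parent state.
    fn from_state(parent_root: Root, parent_state: &State) -> Self {
        Self {
            root: parent_root,
            hbh_len: parent_state.historical_block_hashes.len() as u64,
            slot: parent_state.slot,
        }
    }
}
```

Deriving `hbh_len` and `slot` in one place keeps the call site from passing values that disagree with the parent state.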
StateDiff and DiffBase are storage-persistence details, not shared consensus types, so move them from `ethlambda-types` into a new `state_diff` module in the storage crate (adds the `libssz-types` dep). `DiffBase` fields are now crate-internal (`pub(crate)`); external callers construct via `DiffBase::from_state`.
Claude-Session: https://claude.ai/code/session_01RnSujepExeyvKWRsSdZxFN
… to state_diff
- Drop the `#[cfg(test)]` `Store::insert_state` method (no production caller); tests seed a base snapshot via a plain `insert_snapshot` test helper.
- Move the snapshot + diffs -> State assembly out of `Store::reconstruct_state` into `state_diff::reconstruct`; the store keeps only the diff-walk and header fetch.
- Inline the one-line `get_state_snapshot` / `get_state_diff` wrappers into their call sites via `get_ssz`.
Claude-Session: https://claude.ai/code/session_01RnSujepExeyvKWRsSdZxFN
Summary
Replaces aggressive state pruning with a diff-layer storage model so the full state history stays available cheaply, and relaxes block pruning to keep headers/bodies forever while only dropping old finalized signatures.
State storage
- Every non-genesis state is stored as a parent-linked `StateDiff` (`StateDiffs` table, never pruned) plus a full-state snapshot (`States`) at anchors and hot states.
- `StateDiff` stores `slot`, both checkpoints, and the justification fields in full, plus the appended `historical_block_hashes` tail. `config`/`validators` come from the nearest snapshot (they never change); `latest_block_header` is read back from `BlockHeaders` (the stored state caches the real `state_root` there, so it matches byte-for-byte).
- `get_state` returns a snapshot directly, else reconstructs by walking `base_root` to the nearest ancestor snapshot and replaying appended tails forward.
- 1024-slot anchors (`StateAnchors`, permanent) bound the reconstruction walk.
- Snapshot eviction (`prune_old_states`) keeps the last `SNAPSHOT_HOT_WINDOW = 300` slots + anchors + finalized/justified/head; evicted snapshots leave their diff behind.
Block pruning
- `prune_old_block_signatures(finalized_slot, tip_slot)`: with `cutoff = tip_slot - SIGNATURE_PRUNING_RANGE`, prune signatures for `slot < cutoff` only when `cutoff <= finalized_slot` (healthy finality); during deep non-finality (non-finalized range > window) prune nothing.
- `BlockHeaders` and `BlockBodies` are kept forever; all non-finalized signatures are always retained.
- `get_signed_block` returns `None` for a pruned finalized block (deep historical signed-block serving via BlocksByRoot is lost; peers use checkpoint sync).
Tests
- `StateDiff` build/SSZ round-trip; state reconstruction (single + multi-diff after eviction); anchor recording; snapshot eviction (window/protected/anchors); signature pruning (healthy window / deep non-finality / early chain).
- `-D warnings` clean.
Status / follow-ups
- `prune_old_data` runs on the node's finalization path (blockchain/src/lib.rs), so snapshot eviction + reconstruction are exercised after ~300 slots.
- `BlockBodies` now grows unbounded; pruning bodies on a longer window is a possible follow-up.
https://claude.ai/code/session_01RnSujepExeyvKWRsSdZxFN
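The two pruning rules in the description can be sketched as pure decision functions. This is an illustration under stated assumptions, not the PR's code: `SIGNATURE_PRUNING_RANGE = 1_000` is a made-up value (the description does not state it), the helper names `signature_prune_cutoff` and `keep_snapshot` are hypothetical, and measuring the hot window back from the newest slot is an assumed reading of "the last 300 slots".

```rust
/// Stated in the PR description.
const SNAPSHOT_HOT_WINDOW: u64 = 300;
/// Hypothetical value; the description names the constant but not its size.
const SIGNATURE_PRUNING_RANGE: u64 = 1_000;

/// Returns the exclusive upper bound `cutoff` such that signatures for
/// `slot < cutoff` may be pruned, or `None` when nothing should be pruned.
fn signature_prune_cutoff(finalized_slot: u64, tip_slot: u64) -> Option<u64> {
    // Early chain: the tip has not yet moved past the retention window.
    let cutoff = tip_slot.checked_sub(SIGNATURE_PRUNING_RANGE)?;
    // Healthy finality: everything below the cutoff is already finalized.
    // During deep non-finality the cutoff exceeds the finalized slot and
    // nothing is pruned, so non-finalized signatures are always retained.
    (cutoff <= finalized_slot).then_some(cutoff)
}

/// Decides whether a snapshot survives eviction. `is_protected` stands for
/// the finalized/justified/head states; `newest_slot` is the assumed
/// reference point for the hot window.
fn keep_snapshot(slot: u64, newest_slot: u64, is_anchor: bool, is_protected: bool) -> bool {
    is_anchor || is_protected || slot.saturating_add(SNAPSHOT_HOT_WINDOW) > newest_slot
}
```

With these assumed constants, a tip at slot 2000 and finality at slot 1990 give a cutoff of 1000, while finality stuck at slot 500 yields no pruning at all. An evicted snapshot is still reconstructable because its diff stays in `StateDiffs`.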