diff --git a/docs/design/benchmarking.md b/docs/design/benchmarking.md new file mode 100644 index 0000000..9ca2aba --- /dev/null +++ b/docs/design/benchmarking.md @@ -0,0 +1,222 @@ +# Benchmarking + +> Status: design rationale for the benchmark suite under [`benches/`](../../benches) +> and shared benchmark support under [`bench-support/`](../../bench-support). +> Companion to [`design.md`](design.md) §10 and the benchmark reference docs. + +cachekit benchmarks are designed to answer cache questions, not just produce +fast-looking numbers. A cache policy can be excellent on uniform keys and weak +under scans, or fast on micro-operations and poor at preserving hit rate. The +benchmark suite therefore separates micro-operation cost, policy effectiveness, +trace-shaped workloads, reporting, and machine-readable artifacts. + +## Goals + +- Compare policies under workload shapes that resemble real cache traffic. +- Keep measured loops free of allocator noise and dynamic dispatch. +- Produce both human-readable reports and stable JSON artifacts. +- Preserve enough metadata to reproduce a run: git commit, branch, dirty bit, + rustc version, host triple, CPU model, capacity, universe, operations, seed. +- Make adding a policy or workload a registry edit, not a benchmark rewrite. + +## Benchmark Layers + +The benchmark suite has four layers: + +| Layer | Files | Purpose | +|---|---|---| +| Criterion measurements | `benches/workloads.rs`, `benches/ops.rs`, `benches/comparison.rs`, `benches/policy/*.rs` | statistically sampled latency and throughput | +| Console reports | `benches/reports.rs` | fast, readable tables without Criterion overhead | +| JSON artifact runner | `benches/runner.rs` | structured output for docs, charts, CI, historical comparison | +| Shared support crate | `bench-support/` | policy registry, workloads, metrics, JSON schema, doc renderer | + +This split is deliberate. 
Criterion is good for micro-benchmark statistics; the +artifact runner is good for automation; console reports are good while tuning a +policy locally. No single binary is forced to serve every audience. + +## Monomorphic Policy Registry + +Benchmarks iterate policies through `for_each_policy!` in +[`bench-support/src/registry.rs`](../../bench-support/src/registry.rs): + +```rust,ignore +for_each_policy! { + with |policy_id, display_name, make_cache| { + let mut cache = make_cache(CAPACITY); + // measured workload... + } +} +``` + +The macro expands to one block per concrete policy type. This avoids dynamic +dispatch in the measured loop while keeping policy iteration centralized. +`POLICIES` in the same module provides presentation metadata (stable id, +display name, chart color) for renderers and reports. + +The trade-off is that adding a policy touches the macro and metadata table. A +test (`policies_metadata_matches_macro`) keeps the two from drifting. This is +the same explicit-boilerplate-over-magic choice as `DynCache`: more arms in +source, fewer surprises in hot code. + +## Workload Registry + +Workload definitions live in `bench-support/src/registry.rs`; generators live in +[`bench-support/src/workload.rs`](../../bench-support/src/workload.rs). The +current standard workloads cover: + +- Uniform random keys for raw overhead baselines. +- Hot-set access for explicit skew. +- Sequential scan for scan-pollution stress. +- Zipfian and scrambled Zipfian for power-law access. +- Latest / recency-biased access. +- Shifting hotspots and flash crowds for adaptation. +- Composite scan-resistance mixes. + +[`docs/benchmarks/workloads.md`](../benchmarks/workloads.md) is the catalog. It +also contains a large roadmap of workloads that should not be confused with +implemented cases. New workloads should land first in the support crate, then in +the docs, then in reports. 
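As a rough illustration of what a seeded generator for the hot-set workload could look like, here is a minimal sketch. The function name `hot_set_keys`, its parameters, and the use of SplitMix64 are assumptions made for this example; they are not taken from `bench-support/src/workload.rs`.

```rust
/// SplitMix64: a tiny deterministic PRNG, used here only so the sketch
/// needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical hot-set generator: `hot_percent`% of accesses land on the
/// first `hot_keys` keys, the rest spread over the remaining universe.
fn hot_set_keys(
    seed: u64,
    universe: u64,
    hot_keys: u64,
    hot_percent: u64,
    ops: usize,
) -> Vec<u64> {
    assert!(hot_keys > 0 && hot_keys < universe, "need 0 < hot_keys < universe");
    assert!(hot_percent <= 100, "hot_percent is a percentage");
    let mut rng = SplitMix64(seed);
    (0..ops)
        .map(|_| {
            // Modulo bias is ignored; acceptable for a sketch.
            if rng.next_u64() % 100 < hot_percent {
                rng.next_u64() % hot_keys
            } else {
                hot_keys + rng.next_u64() % (universe - hot_keys)
            }
        })
        .collect()
}
```

Because the only source of randomness is the seed, two calls with the same arguments yield the same key sequence, which is the property a reproducible benchmark run depends on.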
+ +## Value Construction Discipline + +`benches/runner.rs` pre-allocates one `Arc` per key in the universe and +passes a closure that returns `Arc::clone`: + +```rust,ignore +fn preallocate_values() -> Vec> { + (0..UNIVERSE).map(Arc::new).collect() +} +``` + +The rule is: **do not allocate values inside the measured operation loop**. +Allocating on every miss makes the benchmark measure the allocator and value +constructor, not the policy. A cheap `Arc::clone` isolates hit/miss behaviour, +eviction order, and policy metadata overhead. + +This is especially important because policies store values differently: +`FastLru` stores `V` directly, while LRU / LFU / Heap-LFU use `Arc` in some +paths. Pre-allocation keeps those representation differences from dominating +the benchmark. + +## Artifact Schema + +`bench-support/src/json_results.rs` defines the stable JSON schema for results: + +- `SCHEMA_VERSION` follows semantic schema rules. +- Major bumps remove or rename required fields. +- Minor bumps add optional fields. +- Renderers accept any artifact with a matching major. + +Each `BenchmarkArtifact` contains: + +- `metadata`: timestamp, git commit, branch, dirty bit, rustc, host, CPU, + benchmark config. +- `results`: rows keyed by policy, workload, and `case_id`. +- `metrics`: optional typed sections for hit rate, throughput, latency, + eviction, scan resistance, adaptation speed. + +The schema is presentation-neutral. Markdown tables and charts are rendered +later by `bench-support/src/bin/render_docs.rs`, so measurement and presentation +can evolve independently. + +## Case IDs + +Use `case_id::*` constants from `json_results.rs` instead of string literals: + +- `hit_rate` +- `comprehensive` +- `scan_resistance` +- `adaptation` + +This catches typos at compile time and prevents a result section from silently +disappearing from rendered docs. Adding a new case means adding a constant, +teaching the runner to populate it, and teaching the renderer how to display it. 
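The case-id discipline above can be sketched as follows. The constant names, the `ALL` array, the `ResultRow` shape, and the policy and workload ids are assumptions for illustration; only the four id strings come from the list above.

```rust
/// Hypothetical mirror of the `case_id` module; constant names are assumed.
mod case_id {
    pub const HIT_RATE: &str = "hit_rate";
    pub const COMPREHENSIVE: &str = "comprehensive";
    pub const SCAN_RESISTANCE: &str = "scan_resistance";
    pub const ADAPTATION: &str = "adaptation";

    /// Every case a renderer is expected to know about.
    pub const ALL: [&str; 4] = [HIT_RATE, COMPREHENSIVE, SCAN_RESISTANCE, ADAPTATION];
}

/// A result row keyed the way the artifact describes: policy, workload, case.
struct ResultRow {
    policy: &'static str,
    workload: &'static str,
    case_id: &'static str,
}

/// Select the rows that belong to one rendered section.
fn rows_for_case<'a>(rows: &'a [ResultRow], case: &str) -> Vec<&'a ResultRow> {
    rows.iter().filter(|r| r.case_id == case).collect()
}
```

A misspelled constant such as `case_id::HIT_RTE` fails to compile, whereas a misspelled string literal simply matches no rows and the section renders empty.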
+ +## What Each Benchmark Answers + +| Benchmark | Question | +|---|---| +| `ops.rs` | What is the raw cost of `get` / `insert` / policy-specific operations? | +| `workloads.rs` | Which policies preserve hit rate under standard workloads? | +| `comparison.rs` | How does cachekit compare with external crates (`lru`, `quick_cache`)? | +| `policy/*.rs` | What is the cost of each policy's unique operations? | +| `reports.rs` | What should a human inspect while tuning? | +| `runner.rs` | What should CI and docs consume? | + +Do not overload one benchmark to answer all questions. If you need policy +micro-cost, use `ops.rs`; if you need hit rate under scans, use `workloads.rs` +or `runner.rs`. + +## Reproducibility Rules + +- Seed every workload. Default seed is 42 unless a benchmark is explicitly + sweeping seeds. +- Record the git dirty bit. Dirty runs are useful locally but should not be + published as release baselines without a note. +- Keep capacity, universe, and operation count visible in the artifact. +- Prefer `ScrambledZipfian` over raw `Zipfian` for cross-policy comparison when + hardware prefetch could bias hot-key locality. +- Do not compare results across machines without CPU metadata. Tail latency and + pointer-heavy policy cost are machine-sensitive. + +## CI and Documentation Flow + +The docs pipeline runs the benchmark suite, writes +`target/benchmarks//results.json`, and renders +`docs/benchmarks/latest/` plus charts. Release-tag snapshots live under +`docs/benchmarks/vX.Y.Z/`. + +Manual workflow: + +```bash +cargo bench --bench runner +./scripts/update_benchmark_docs.sh +``` + +The script is the high-level path for refreshing published benchmark docs. Use +individual benches (`cargo bench --bench ops`, `cargo bench --bench reports -- scan`) +while developing a policy. + +## Adding a Policy to Benchmarks + +1. Add the policy to `for_each_policy!` with a concrete constructor. +2. Add matching `PolicyMeta` in `POLICIES`. +3. Run the registry drift test. 
+4. Run `cargo bench --bench reports -- hit_rate` for a quick sanity check. +5. Run `cargo bench --bench runner` before publishing docs. + +Keep constructors comparable. If one policy needs `Arc` and another stores +`u64`, choose the value shape that preserves fairness and document the exception +in the registry comment. + +## Adding a Workload + +1. Implement the generator in `bench-support/src/workload.rs`. +2. Add a `WorkloadCase` in the registry with stable id and display name. +3. Add docs in [`docs/benchmarks/workloads.md`](../benchmarks/workloads.md). +4. Add renderer support if the workload needs a custom section. +5. Run at least one policy family expected to behave differently (for example, + LRU vs S3-FIFO for scan-heavy workloads). + +Do not add a workload just because it is mathematically interesting. It should +answer a policy-selection question. + +## Non-goals + +- Benchmarks are not formal proofs of policy optimality. +- Benchmarks are not stable ABI. The JSON schema is versioned, but Criterion + names and report formatting can change. +- Benchmarks do not hide hardware effects. They record enough metadata for the + reader to judge them. +- Benchmarks do not replace fuzzing or invariant tests; they measure behaviour + under selected workloads. 
+ +## See Also + +- [Design overview](design.md) - §10 frames benchmarking at the principles level +- [Metrics](metrics.md) - recorder / snapshot / exporter split +- [Benchmark docs](../benchmarks/README.md) +- [Workload catalog](../benchmarks/workloads.md) +- [`bench-support/src/registry.rs`](../../bench-support/src/registry.rs) +- [`bench-support/src/json_results.rs`](../../bench-support/src/json_results.rs) +- [`benches/runner.rs`](../../benches/runner.rs) diff --git a/docs/design/builder-and-dyn-dispatch.md b/docs/design/builder-and-dyn-dispatch.md new file mode 100644 index 0000000..c7d8032 --- /dev/null +++ b/docs/design/builder-and-dyn-dispatch.md @@ -0,0 +1,454 @@ +# Builder and Runtime Dispatch + +> Status: design rationale for [`CacheBuilder`](../../src/builder.rs), +> [`CachePolicy`](../../src/builder.rs), and [`DynCache`](../../src/builder.rs). +> Companion to [`design.md`](design.md) §13, [`trait-hierarchy.md`](trait-hierarchy.md), +> and [`concurrency.md`](concurrency.md). + +cachekit ships 18 implemented eviction policies. The runtime dispatcher +currently wires 17 of them; CAR exists as a concrete policy but is not yet a +`CachePolicy` / `DynCache` variant. Most application code wants to pick a +policy — possibly at runtime, based on configuration — without writing one +monomorphized call site per policy. This document explains why that runtime +choice is delivered through an enum dispatcher rather than a `Box<dyn Cache>`, +what the user-visible cost is, and how to extend the surface when a new policy +lands. + +## The problem + +A user with a `policy: String` configuration value wants to write: + +```rust,ignore +let mut cache = build_cache_from_config(config); +cache.insert(key, value); +cache.get(&key); +``` + +without enumerating every builder-wired policy at each call site. The cache +type must therefore be **uniform across policies** — the concrete type the +caller holds cannot depend on which policy was chosen. 
+ +Two Rust mechanisms give a uniform type: + +1. **Trait objects** — `Box<dyn Cache<K, V>>`, with dispatch through a + vtable per method call. +2. **Enum dispatch** — a closed sum of every policy, with dispatch + through a `match` per method call. + +cachekit picks mechanism 2. The rest of this document explains why and +what it costs. + +## Enum dispatch vs `Box<dyn Cache>` + +`Cache` is deliberately object-safe (see +[`trait-hierarchy.md`](trait-hierarchy.md#object-safety)) precisely so +`Box<dyn Cache<K, V>>` *can* be used; cachekit consumers can still take +that route in their own code. But the **library-provided** runtime +dispatcher is an enum. Five of the six properties below favour it; the +sixth, closed extension, is the cost: + +| Property | `Box<dyn Cache<K, V>>` | `DynCache` (enum) | +|---|---|---| +| Dispatch cost per call | Indirect call via vtable | Branch-predicted `match` | +| Devirtualization | No (opaque) | Yes (compiler sees the arm) | +| Inlining of policy body | No | Yes when the arm is statically reachable | +| Heap allocation per cache | One `Box` per cache | None (enum lives inline) | +| Closed vs open extension | Open (any `impl Cache`) | Closed (`#[non_exhaustive]` enum) | +| API stability for new policies | Adding a method is a breaking change | Adding a variant is a non-breaking change with `#[non_exhaustive]` | + +The dominant terms are dispatch cost and devirtualization. A `match` on +an enum tag is a single branch that predicts well in tight loops; the +optimizer often hoists it out entirely when the enum tag is invariant +across a benchmark inner loop. A vtable call cannot be devirtualized +without inlining context and forces the policy body to live behind an +opaque indirection. + +The cost is in extensibility. `Box<dyn Cache>` accepts any +out-of-tree policy that implements `Cache`; `DynCache` does not. +Users with their own policy implementations still use them directly — +`MyCache::new(…)` returns a concrete `MyCache` and works with any +code generic over `Cache`. The enum is only the **library-provided +dispatcher**, not a general substrate. 
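To make the two mechanisms concrete, the following is a minimal sketch with two toy policies behind both a closed enum and a trait object. `MiniCache`, `FifoToy`, `LifoToy`, `DynToy`, `build_enum`, and `build_boxed` are all hypothetical names invented for this example; none of them is part of cachekit.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy stand-in for the crate's `Cache` trait (hypothetical and much smaller).
trait MiniCache {
    fn insert(&mut self, key: u64, value: u64);
    fn get(&mut self, key: &u64) -> Option<u64>;
}

/// Evicts the oldest inserted key. Assumes `cap > 0`.
struct FifoToy { cap: usize, map: HashMap<u64, u64>, order: VecDeque<u64> }

/// Evicts the newest previously inserted key. Assumes `cap > 0`.
struct LifoToy { cap: usize, map: HashMap<u64, u64>, order: Vec<u64> }

impl MiniCache for FifoToy {
    fn insert(&mut self, key: u64, value: u64) {
        if self.map.insert(key, value).is_some() {
            return; // existing key: value updated, order unchanged
        }
        if self.map.len() > self.cap {
            if let Some(old) = self.order.pop_front() {
                self.map.remove(&old);
            }
        }
        self.order.push_back(key);
    }
    fn get(&mut self, key: &u64) -> Option<u64> {
        self.map.get(key).copied()
    }
}

impl MiniCache for LifoToy {
    fn insert(&mut self, key: u64, value: u64) {
        if self.map.insert(key, value).is_some() {
            return; // existing key: value updated, order unchanged
        }
        if self.map.len() > self.cap {
            if let Some(old) = self.order.pop() {
                self.map.remove(&old);
            }
        }
        self.order.push(key);
    }
    fn get(&mut self, key: &u64) -> Option<u64> {
        self.map.get(key).copied()
    }
}

/// Closed sum of the toy policies: dispatch is a `match`, storage is inline.
enum DynToy { Fifo(FifoToy), Lifo(LifoToy) }

impl DynToy {
    fn insert(&mut self, key: u64, value: u64) {
        match self {
            DynToy::Fifo(c) => c.insert(key, value),
            DynToy::Lifo(c) => c.insert(key, value),
        }
    }
    fn get(&mut self, key: &u64) -> Option<u64> {
        match self {
            DynToy::Fifo(c) => c.get(key),
            DynToy::Lifo(c) => c.get(key),
        }
    }
}

/// Runtime choice via the enum.
fn build_enum(policy: &str, cap: usize) -> Option<DynToy> {
    match policy {
        "fifo" => Some(DynToy::Fifo(FifoToy { cap, map: HashMap::new(), order: VecDeque::new() })),
        "lifo" => Some(DynToy::Lifo(LifoToy { cap, map: HashMap::new(), order: Vec::new() })),
        _ => None,
    }
}

/// Runtime choice via a trait object: one heap allocation, vtable dispatch.
fn build_boxed(policy: &str, cap: usize) -> Option<Box<dyn MiniCache>> {
    let boxed: Box<dyn MiniCache> = match build_enum(policy, cap)? {
        DynToy::Fifo(c) => Box::new(c),
        DynToy::Lifo(c) => Box::new(c),
    };
    Some(boxed)
}
```

Both builders give the caller one uniform type. The enum version needs an arm per policy in every method, which is the boilerplate the rest of this document accounts for; the boxed version accepts any `impl MiniCache` but hides the policy body behind an indirect call.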
+ +## `CachePolicy` — config-carrying tag + +`CachePolicy` ([`src/builder.rs`](../../src/builder.rs)) is the +user-facing enum that selects a policy. It is **separate** from the +internal `CacheInner` enum, and it carries per-policy configuration: + +```rust,ignore +#[non_exhaustive] +#[derive(Debug, Clone, Copy, PartialEq)] +pub enum CachePolicy { + Fifo, + Lru, + FastLru, + LruK { k: usize }, + Lfu { bucket_hint: Option }, + HeapLfu, + TwoQ { probation_frac: f64 }, + S3Fifo { small_ratio: f64, ghost_ratio: f64 }, + Arc, + Lifo, + Mfu, + Mru, + Random, + Slru { probationary_frac: f64 }, + Clock, + ClockPro, + Nru, +} +``` + +Three design decisions are worth naming: + +- **`#[non_exhaustive]`.** Adding a new variant (e.g. when LIRS lands + off the roadmap) is a **minor** version bump rather than a major + one. Downstream `match` statements over `CachePolicy` must include a + `_ =>` arm, which is the standard `non_exhaustive` discipline. +- **Config carried inline.** `LruK { k }` rather than `LruK` + separate + `set_k`. The variant is the place where the parameter is + type-checked, and `CachePolicy` stays `Copy` because every payload + is `Copy`. This makes `let policy: CachePolicy = config.into();` + trivial and lets callers pass `CachePolicy` by value without + ceremony. +- **Tag separated from implementation.** `CachePolicy::Lru` is a + user-facing intent; `CacheInner::Lru(LruCore)` is the + internal storage. Keeping them separate means the internal type can + change (e.g. swap `LruCore` for a new implementation) without + touching the public enum. 
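One consequence of carrying configuration inline is that a caller can inspect the payload simply by matching on the tag. The sketch below uses a toy three-variant enum; `PolicyTag`, `check_policy`, `parse_policy`, the config strings, and the default values `k: 2` and `0.25` are assumptions for illustration, not cachekit's API.

```rust
/// Toy mirror of a config-carrying policy tag; a hypothetical subset of the
/// real `CachePolicy`. All payloads are `Copy`, so the enum is too.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PolicyTag {
    Lru,
    LruK { k: usize },
    TwoQ { probation_frac: f64 },
}

/// Hypothetical pre-check for untrusted input: returns an error instead of
/// panicking.
fn check_policy(policy: PolicyTag) -> Result<(), String> {
    match policy {
        PolicyTag::LruK { k } if k == 0 => {
            Err("LruK: k must be greater than 0".to_string())
        }
        PolicyTag::TwoQ { probation_frac } if !(0.0..=1.0).contains(&probation_frac) => {
            Err(format!("TwoQ: probation_frac {probation_frac} is outside 0..=1"))
        }
        // Wildcard arm: the shape downstream code needs for a
        // `#[non_exhaustive]` enum, where new variants may appear later.
        _ => Ok(()),
    }
}

/// Map a configuration string to a tag (names and defaults are illustrative).
fn parse_policy(name: &str) -> Option<PolicyTag> {
    match name {
        "lru" => Some(PolicyTag::Lru),
        "lru-k" => Some(PolicyTag::LruK { k: 2 }),
        "2q" => Some(PolicyTag::TwoQ { probation_frac: 0.25 }),
        _ => None,
    }
}
```

Since the tag is `Copy`, it can be passed to the check and then handed on by value without cloning or borrowing.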
+ +## `DynCache` — uniform runtime type + +The public dispatcher: + +```rust,ignore +pub struct DynCache +where + K: Copy + Eq + Hash + Ord, + V: Clone + Debug, +{ + inner: CacheInner, +} + +enum CacheInner /* same bounds */ { + #[cfg(feature = "policy-fifo")] Fifo(FifoCache), + #[cfg(feature = "policy-lru")] Lru(LruCore), + #[cfg(feature = "policy-fast-lru")] FastLru(FastLru), + #[cfg(feature = "policy-lru-k")] LruK(LrukCache), + #[cfg(feature = "policy-lfu")] Lfu(LfuCache), + #[cfg(feature = "policy-heap-lfu")] HeapLfu(HeapLfuCache), + #[cfg(feature = "policy-two-q")] TwoQ(TwoQCore), + #[cfg(feature = "policy-s3-fifo")] S3Fifo(S3FifoCache), + #[cfg(feature = "policy-arc")] Arc(ArcCore), + #[cfg(feature = "policy-lifo")] Lifo(LifoCore), + #[cfg(feature = "policy-mfu")] Mfu(MfuCore), + #[cfg(feature = "policy-mru")] Mru(MruCore), + #[cfg(feature = "policy-random")] Random(RandomCore), + #[cfg(feature = "policy-slru")] Slru(SlruCore), + #[cfg(feature = "policy-clock")] Clock(ClockCache), + #[cfg(feature = "policy-clock-pro")] ClockPro(ClockProCache), + #[cfg(feature = "policy-nru")] Nru(NruCache), +} +``` + +`CacheInner` is **private**. Users only see `DynCache`. Two consequences: + +- Internal policy structs (`LruCore`, `S3FifoCache`, …) do not leak + into the public type system through the dispatcher. They can be + refactored without breaking SemVer. +- Pattern-matching on the variant from outside the crate is + impossible, which forces feature requests through method additions + rather than match-arm proliferation in user code. + +### CAR builder gap + +CAR is implemented as a concrete policy (`src/policy/car.rs`) and has a +`policy-car` feature flag, but this branch does **not** currently expose it +through `CachePolicy` / `DynCache`. Users who want CAR instantiate the concrete +`CarCore` type directly. 
Closing the gap means adding a +`CachePolicy::Car` variant, a `CacheInner::Car(CarCore)` variant, and the +usual method / builder / test arms listed in [Adding a new policy](#adding-a-new-policy). + +Until that lands, read "implemented policies" and "`DynCache` variants" as two +different sets: + +- **Implemented concrete policies:** 18. +- **Runtime-dispatch variants:** 17. + +## Type bounds: heavier than `Cache` + +`Cache` requires only what each individual policy implementation +needs (typically `K: Eq + Hash`, sometimes `K: Copy`). `DynCache` +requires the **union** of all policies' bounds: + +```rust,ignore +K: Copy + Eq + Hash + Ord +V: Clone + Debug +``` + +Each bound exists because at least one variant needs it: + +- `K: Copy` — many policies rely on cheap key copies in eviction paths. +- `K: Eq + Hash` — every hashmap-backed lookup. +- `K: Ord` — `HeapLfuCache` orders keys in a min-heap. +- `V: Clone` — variants that store `Arc` internally (LRU, LFU, + HeapLFU) fall back to `(*arc).clone()` when `Arc::try_unwrap` fails + on `insert` / `remove` (see below). +- `V: Debug` — `DynCache: Debug` delegates to the variant's `Debug`. + +This is the **library-provided dispatcher tax**. Users who do not want +to pay `K: Ord` can call `LruCore::new(…)` directly and bypass +`DynCache`; the tax only applies when crossing the runtime-dispatch +boundary. The tax is documented at the `DynCache` doc comment so +users picking the dispatcher route know what to expect. + +If a future policy adds a heavier bound (e.g. `K: Serialize` for a +persistent-cache policy), it forces every `DynCache` user to satisfy +that bound. The mitigation, when that happens, is a separate +dispatcher type (`DynPersistentCache`) rather than tightening +the existing `DynCache` bounds — preserving SemVer for users who +don't need persistence. + +## The `Arc` round-trip + +Three policies — `LruCore`, `LfuCache`, `HeapLfuCache` — internally +store `Arc` rather than `V`. 
The rationale lives in those modules +(zero-copy sharing between `peek` and `get`, predictable eviction-time +move, alignment with the concurrent wrappers' `Arc` returns). At +the `DynCache` boundary this creates a small impedance: + +```rust,ignore +CacheInner::Lru(lru) => { + let arc_value = Arc::new(value); + lru.insert(key, arc_value) + .map(|arc| Arc::try_unwrap(arc).unwrap_or_else(|arc| (*arc).clone())) +}, +``` + +`insert` wraps the value in `Arc` for the policy and tries to unwrap +the returned `Arc` on the way out. `try_unwrap` is O(1) when the +refcount is 1 (the common case for sequential `DynCache`); it falls +back to `(*arc).clone()` only when another reference outlived the +caller's, which happens on iteration paths where the policy held a +secondary reference. The fallback is the reason `V: Clone` is required +on `DynCache`. + +The cost is one `Arc::new` per insert and one branch (`try_unwrap`) per +return on Arc-storing variants. It does not affect FIFO, LIFO, MFU, +MRU, 2Q, S3-FIFO, ARC, Clock, Clock-PRO, NRU, Random, SLRU, LRU-K, +or FastLru, which store `V` directly. Users sensitive to this round +trip should pick a `V`-storing policy or use the concrete type +directly. + +## Feature gating discipline + +Every `CachePolicy` variant, every `CacheInner` variant, every match +arm in every `DynCache` method, every `CacheBuilder::build` arm, and +every `validate_policy` arm is gated by `#[cfg(feature = "policy-X")]`. +The discipline: + +- A user building with `default-features = false, features = ["policy-lru"]` + gets a `CachePolicy` enum with **one variant** and a `DynCache` enum + with **one inner variant**. Match exhaustiveness still holds because + every arm vanishes with its variant. +- The internal `match` in each `DynCache` method is **always + exhaustive** at the active feature set, because every arm and every + variant share the same set of `cfg` predicates. 
+- `policy-all` is a convenience feature that turns on every + `policy-*` feature at once. The default is a curated subset + (`policy-s3-fifo`, `policy-lru`, `policy-fast-lru`, `policy-lru-k`, + `policy-clock`) chosen to cover the most-recommended workloads from + [`docs/policies/README.md`](../policies/README.md). + +The cost is that adding a new policy involves edits in *six* +synchronized locations (see [Adding a new policy](#adding-a-new-policy)). +The benefit is that a "policy-lru-only" build is genuinely small — +none of the other 16 policies appear in the resulting binary. + +## Validation: panic vs `Result` + +`CacheBuilder::build` panics on invalid configuration: + +```rust,ignore +assert!(self.capacity > 0, "cache capacity must be greater than 0"); +// … +match policy { + CachePolicy::LruK { k } => assert!(*k > 0, "LruK: k must be greater than 0"), + CachePolicy::TwoQ { probation_frac } => + check_frac("TwoQ: probation_frac", *probation_frac), + // … +} +``` + +This is consistent with cachekit's broader error model +([`src/error.rs`](../../src/error.rs)): panics for **programming +errors** (programmer hands the builder a `k = 0`, which has no sensible +behavior), `Result<_, ConfigError>` reserved for **user-supplied +configuration** that arrives through deserialization or external +input. + +Callers that need to validate untrusted configuration before calling +`build` should branch on the `CachePolicy` variant and inspect the +payload themselves, or use the per-policy fallible constructors +(`S3FifoCache::try_with_ratios`, future `LrukCache::try_with_k`) +directly. The builder deliberately does not provide a `try_build` — +adding one would split the API surface for marginal gain when the +panic path already catches the bug at the call site. + +## `Send + Sync` is conditional + +`DynCache: Send + Sync` is **not** unconditional. The +`FastLru` policy uses `NonNull` for single-threaded performance +and is therefore `!Send + !Sync`. 
The test in +[`src/builder.rs`](../../src/builder.rs) encodes this: + +```rust,ignore +#[cfg(all(feature = "policy-lru", not(feature = "policy-fast-lru")))] +const _: () = { + fn assert_send() {} + fn check() { assert_send::>(); } +}; +``` + +In words: `DynCache` is `Send + Sync` whenever no +`!Send`-or-`!Sync` policy variant is enabled. With the default feature +set (which includes `policy-fast-lru`), `DynCache` is **not** +`Send + Sync`. Users who want a sendable `DynCache` should disable +`policy-fast-lru` and use `policy-lru` for the LRU path. + +This is a known sharp edge. The alternative — making `FastLru: Send` +via an unsafe impl — would invalidate `FastLru`'s entire design +premise (raw-pointer recency list without atomics). The current +trade prioritises `FastLru`'s single-threaded speed over `DynCache`'s +universal sendability, on the grounds that callers wanting concurrent +access should use a `Concurrent*` wrapper directly (see +[`concurrency.md`](concurrency.md)), not `DynCache`. + +## Maintenance cost + +The dispatcher's runtime cost is small. The **maintenance** cost is +real: + +- **17 inner variants** × **~10 `DynCache` methods** = **~170 match + arms** that must stay in sync today. CAR will make this 18 variants + once it is wired into the dispatcher. +- A `Debug` impl, a `default()` (where applicable), and a + `validate_policy` arm per variant. +- A `Cargo.toml` feature flag per variant. +- A documentation entry per variant in `docs/policies/`. + +The mitigations in place: + +1. **A single regression test** (`test_all_policies_basic_ops` in + [`src/builder.rs`](../../src/builder.rs)) loops over every enabled + policy and exercises `insert` / `get` / `contains` / `len` / + update / `clear`. Adding a variant immediately surfaces if any arm + was missed. +2. **Compile-time exhaustiveness** in the inner `match`. Forgetting an + arm is a build error, not a runtime bug. +3. 
**`#[non_exhaustive]` on `CachePolicy`** keeps downstream code + from depending on the full set of variants. + +Even with those, the line count of `src/builder.rs` (~1300) is +disproportionate to its semantic content. A `macro_rules!` to generate +per-method dispatchers has been considered and rejected — the +explicit `match` is grep-friendly, readable in source review, and +each arm sometimes diverges from the boilerplate (the `Arc` round +trip is the visible case; future TTL integration is another). Macros +would compress the file but obscure the points where the dispatcher +intervenes. + +## Adding a new policy + +Checklist for landing a new policy, ordered to minimise compile-time +churn: + +1. Implement the policy core: `MyPolicyCache` with a `Cache` + impl. Add `MyPolicyCache::new(capacity: usize)` and any config + constructors. Land this with its own tests. +2. Add a `policy-my-policy` feature in [`Cargo.toml`](../../Cargo.toml). + Add it to `policy-all`. Decide whether it joins `default = […]`. +3. Add the `CachePolicy::MyPolicy { … }` variant, gated by the new + feature. Include any config fields as inline payload. +4. Add the `CacheInner::MyPolicy(MyPolicyCache)` variant under + the same `cfg`. +5. Add a match arm in every `DynCache` method (`insert`, `get`, `peek`, + `contains`, `len`, `capacity`, `remove`, `clear`, `Debug` impl). +6. Add a `CachePolicy::MyPolicy { … } => CacheInner::MyPolicy(…)` arm + in `CacheBuilder::build`. Add validation in `validate_policy` if + the variant has constraints (frac in 0..=1, non-zero K, etc.). +7. Add the variant to `all_enabled_policies()` in the test module so + the regression sweep covers it. +8. Document the policy in `docs/policies/my-policy.md`; if it's a + roadmap policy graduating to implementation, move the doc from + `docs/policies/roadmap/` per the rule in + [`docs/policies/roadmap/README.md`](../policies/roadmap/README.md). +9. 
Update [`docs/policies/README.md`](../policies/README.md) and + [`docs/guides/choosing-a-policy.md`](../guides/choosing-a-policy.md). + +The work is mechanical. A CR template that lists these nine steps as +checkboxes would reduce the chance of missed updates further. + +## Future: `DynExpiringCache` + +When the `ttl` feature lands ([`ttl.md`](ttl.md) §4(c)), TTL **does +not** modify `DynCache`. Instead, `with_default_ttl` on the builder +returns a sibling type: + +```rust,ignore +let mut cache = CacheBuilder::new(1024) + .with_default_ttl(Duration::from_secs(60)) + .build::(CachePolicy::Lru); +// `cache: DynExpiringCache`, not DynCache. +``` + +`DynExpiringCache` mirrors `DynCache`'s match-arm boilerplate +one level out: each method threads the expiry check through the +inner policy's `Cache` call. The key design choice — argued in detail +in [`ttl.md`](ttl.md) §1, §4(c) — is that `DynExpiringCache` is a +**distinct type**, not `impl Cache for DynCache` plus a wrapper. +Distinctness makes `Expiring>` structurally +unrepresentable, which prevents the "two clocks, two indexes" +double-wrapping bug at the type level. + +The duplication is real: a parallel ~170 arms today, rising with the +dispatcher variant count. It is bounded (one type per cross-cutting +capability) and the trade favours type-level safety over deduplication. + +## When not to use `DynCache` + +`DynCache` is the right tool when: + +- The policy is chosen at runtime from configuration. +- The caller wants a single concrete type that can hold any policy. +- The dispatch cost is amortised over enough work that the `match` + doesn't dominate. + +It is the wrong tool when: + +- The policy is known at compile time. Use the concrete type + (`LruCache::new(…)`, `S3FifoCache::new(…)`) and let monomorphization + do its work. +- The hottest inner loop is `get`-bound and devirtualization matters + beyond what enum dispatch provides. 
Concrete types still win for + raw throughput on benchmarks (see + [`benches/comparison.rs`](../../benches/comparison.rs)). +- The caller needs `Send + Sync` and the build includes + `policy-fast-lru`. See [`Send + Sync`](#send--sync-is-conditional) + above; use the relevant `Concurrent*` wrapper instead. +- A user wants to plug in their own policy. `DynCache` is closed; + generic code over `Cache` is open. + +## See also + +- [Design overview](design.md) — §13 frames compile-time and runtime + composition at the principles level +- [Cache trait hierarchy](trait-hierarchy.md) — kernel trait and + capability traits +- [Concurrency](concurrency.md) — `Send + Sync` interaction, why + `Concurrent*` is a separate path +- [TTL design](ttl.md) — `DynExpiringCache` as a worked extension of + the dispatcher pattern +- [Error model](../../src/error.rs) — `ConfigError` vs panic discipline +- [`src/builder.rs`](../../src/builder.rs) — the canonical + implementation diff --git a/docs/design/concurrency.md b/docs/design/concurrency.md new file mode 100644 index 0000000..e418976 --- /dev/null +++ b/docs/design/concurrency.md @@ -0,0 +1,366 @@ +# Concurrency + +> Status: design rationale for the concurrent surface that ships today +> behind the `concurrency` feature flag. Companion to the cross-cutting +> principles in [`docs/design/design.md`](design.md) §3 and the trait +> rationale in [`docs/design/trait-hierarchy.md`](trait-hierarchy.md). + +cachekit's default surface is single-threaded. Concurrency is opt-in, +delivered through a parallel set of types and traits gated by the +`concurrency` Cargo feature. This document explains why the concurrent +surface looks the way it does, what invariants the wrappers promise, +and where the gaps are. + +## Non-goals + +- **`no_std`.** Concurrency relies on `parking_lot`, `std::sync::Arc`, + and `std::sync::atomic`. No `loom`/`no_std` support is planned. 
+- **Lock-free policies.** Mostly-lock-free or strictly lock-free + policies are out of scope today; see [Future directions](#future-directions). +- **Async-native traits.** `AsyncCacheFuture` is a Phase 2 placeholder + ([`src/traits.rs`](../../src/traits.rs)); no policy implements it + meaningfully yet. + +## The dominant pattern: sequential core, concurrent wrapper + +cachekit's concurrent types all keep the sequential core unaware of locking, +but they do **not** all have the same struct shape. There are three families. + +### Cloneable policy handles + +Policy-level wrappers are shared handles around a locked policy core: + +```text +ConcurrentPolicy<K, V> { inner: Arc<RwLock<Policy<K, V>>> } +``` + +This shape is used by: + +- `ConcurrentLruCache` — [`src/policy/lru.rs`](../../src/policy/lru.rs) +- `ConcurrentFifoCache` — [`src/policy/fifo.rs`](../../src/policy/fifo.rs) +- `ConcurrentS3FifoCache` — [`src/policy/s3_fifo.rs`](../../src/policy/s3_fifo.rs) + +These types implement `Clone` via `Arc::clone`, so callers can hand cheap +handles to threads. They expose owned / `Arc<V>` returns instead of borrowed +`&V` because no reference can safely outlive the lock guard it came from. + +### Owning store and data-structure wrappers + +Store and data-structure wrappers usually own the lock directly: + +```text +ConcurrentX<K, V> { inner: RwLock<X<K, V>>, ... 
} +``` + +Examples: + +- `ConcurrentHashMapStore`, `ShardedHashMapStore` — [`src/store/hashmap.rs`](../../src/store/hashmap.rs) +- `ConcurrentSlabStore` — [`src/store/slab.rs`](../../src/store/slab.rs) +- `ConcurrentWeightStore` — [`src/store/weight.rs`](../../src/store/weight.rs) +- `ConcurrentHandleStore` — [`src/store/handle.rs`](../../src/store/handle.rs) +- `ConcurrentSlotArena` — [`src/ds/slot_arena.rs`](../../src/ds/slot_arena.rs) +- `ConcurrentIntrusiveList` — [`src/ds/intrusive_list.rs`](../../src/ds/intrusive_list.rs) +- `ConcurrentClockRing` — [`src/ds/clock_ring.rs`](../../src/ds/clock_ring.rs) + +These wrappers are not necessarily cloneable handles. If a caller wants shared +ownership, they can wrap the whole type in `Arc<_>`. Keeping the `Arc` out of +the struct avoids imposing an unnecessary refcount on users who only need a single owner. + +### Sharded primitives + +Sharded types own multiple independently locked shards: + +```text +ShardedX<K, V> { + shards: Vec<RwLock<X<K, V>>>, + selector: ShardSelector, +} +``` + +Examples: + +- `ShardedSlotArena` — [`src/ds/slot_arena.rs`](../../src/ds/slot_arena.rs) +- `ShardedFrequencyBuckets` — [`src/ds/frequency_buckets.rs`](../../src/ds/frequency_buckets.rs) +- `ShardedHashMapStore` — [`src/store/hashmap.rs`](../../src/store/hashmap.rs) + +The common design is not "`Arc<RwLock<_>>` everywhere"; it is **lock at the +wrapper boundary and keep the sequential core free of locks**. The exact ownership +shape depends on whether the type is intended to be a cloneable cache handle, +an owning concurrent store, or a sharded primitive. + +## Why `Concurrent*` does not implement `Cache` + +`Cache` is the sequential trait. 
Its method signatures encode +sequential ownership: + +```rust +fn peek(&self, key: &K) -> Option<&V>; +fn get(&mut self, key: &K) -> Option<&V>; +fn insert(&mut self, key: K, value: V) -> Option; +``` + +Three of these are unimplementable on `Arc>`: + +- **`peek` and `get` return `&V`.** A borrowed reference cannot + outlive the `RwLockReadGuard`/`RwLockWriteGuard` it was extracted + from. There is no safe lifetime that ties `&V` to `&self` rather + than to the (anonymous) guard. Returning `&V` would force the + caller to hold the lock across the borrow, which serializes readers + and defeats `RwLock`. +- **`get` takes `&mut self`.** With shared ownership through + `Arc>` the wrapper only ever holds `&self`. Forcing + `&mut self` would require `Arc::make_mut` or external locking, + defeating the point of the inner lock. + +The concurrent wrappers therefore expose their own concrete API: + +```rust +pub fn get(&self, key: &K) -> Option>; +pub fn peek(&self, key: &K) -> Option>; +pub fn insert(&self, key: K, value: V) -> Option>; +pub fn insert_arc(&self, key: K, value: Arc) -> Option>; +pub fn remove(&self, key: &K) -> Option>; +``` + +Returning `Arc` is the contract. It costs one atomic refcount bump +on hit, which is cheap relative to the lock acquisition itself, and it +lets callers hold the value past lock release, send it across threads, +or stash it in another structure without lifetime gymnastics. + +For uniformity across the store layer there is a parallel trait family +that **does** model the `&self` + `Arc` shape: + +| Sequential ([`src/store/traits.rs`](../../src/store/traits.rs)) | Concurrent ([`src/store/traits.rs`](../../src/store/traits.rs)) | +|---|---| +| `StoreRead` (`&mut self`, `&V`) | `ConcurrentStoreRead` (`&self`, `Arc`) | +| `StoreMut` (`&mut self`) | `ConcurrentStore` (`&self`) | +| `StoreFactory` | `ConcurrentStoreFactory` | + +The policy layer does not yet have a counterpart family — see +[Future directions](#future-directions). 
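The `&self` + `Arc<V>` contract described above can be sketched as follows. This is a minimal illustration under stated assumptions, not cachekit's implementation: `ToyCore` and `ToyConcurrentCache` are hypothetical names, the core has no eviction policy (a hit counter stands in for policy state), and `std::sync::RwLock` is swapped in for `parking_lot::RwLock` so the block has no external dependencies (hence the `unwrap()` on the poisonable std guards).

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, RwLock};

/// Toy sequential core (hypothetical, not a cachekit type): a map plus a
/// hit counter standing in for recency/frequency state. It knows nothing
/// about locks.
struct ToyCore<K, V> {
    map: HashMap<K, Arc<V>>,
    hits: u64,
}

/// Hypothetical cloneable handle mirroring the `&self` + `Arc<V>` API shape.
pub struct ToyConcurrentCache<K, V> {
    inner: Arc<RwLock<ToyCore<K, V>>>,
}

impl<K, V> Clone for ToyConcurrentCache<K, V> {
    /// Cloning bumps the refcount; both handles share the same core.
    fn clone(&self) -> Self {
        Self { inner: Arc::clone(&self.inner) }
    }
}

impl<K: Eq + Hash, V> ToyConcurrentCache<K, V> {
    pub fn new() -> Self {
        let core = ToyCore { map: HashMap::new(), hits: 0 };
        Self { inner: Arc::new(RwLock::new(core)) }
    }

    /// Updates policy-like state, so it takes the write lock.
    pub fn get(&self, key: &K) -> Option<Arc<V>> {
        let mut core = self.inner.write().unwrap();
        let hit = core.map.get(key).cloned(); // clones the Arc, not V
        if hit.is_some() {
            core.hits += 1;
        }
        hit
    }

    /// Side-effect-free lookup under the read lock.
    pub fn peek(&self, key: &K) -> Option<Arc<V>> {
        self.inner.read().unwrap().map.get(key).cloned()
    }

    pub fn insert(&self, key: K, value: V) -> Option<Arc<V>> {
        self.insert_arc(key, Arc::new(value))
    }

    /// Replace-and-return happens inside one write-lock acquisition.
    pub fn insert_arc(&self, key: K, value: Arc<V>) -> Option<Arc<V>> {
        self.inner.write().unwrap().map.insert(key, value)
    }

    pub fn remove(&self, key: &K) -> Option<Arc<V>> {
        self.inner.write().unwrap().map.remove(key)
    }

    /// Number of `get` hits recorded so far.
    pub fn hits(&self) -> u64 {
        self.inner.read().unwrap().hits
    }
}
```

Because every return value is an owned `Arc<V>`, a caller can keep using a value after the guard is dropped, and even after the entry has been removed by another handle; no borrow ties the value to the lock.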
+ +## Lock primitive choice + +Every concurrent wrapper uses **`parking_lot::RwLock`**. Two things +drove this: + +- **Reader / writer split matches the access pattern.** `peek` / + `contains` / `len` only need shared access. `get` (which mutates + recency or frequency state) and `insert` / `remove` need exclusive + access. `Mutex` would serialize all of these. +- **Fairness and uncontended speed.** `parking_lot::RwLock` is small + (one `AtomicUsize` on 64-bit), uncontended-fast, and tunable via + fairness traits. The `RwLock>>` and + `RwLock>` shapes throughout the codebase rely on this. + +`Mutex` is intentionally absent from the wrappers. The few `Mutex` +references in the source tree are in doctests and rustdoc prose +describing how a user would wrap a non-concurrent cache themselves — +they are not on any hot path. + +The `parking_lot` choice is **not** absolute. On Rust 1.85+ the +futex-based `std::sync::Mutex` is competitive for the uncontended +single-writer case on Linux/macOS, and revisiting this is reasonable +if `parking_lot` ever becomes a build burden. The `RwLock` advantage +is more durable: `std::sync::RwLock` still has writer-starvation +hazards on some platforms that `parking_lot` avoids by default. + +## The `get` / `peek` lock-level asymmetry + +`peek` and `get` both look up by key, but they differ in what they +mutate: + +- **`peek`** is side-effect-free. The wrapper takes a **read lock** + and clones the `Arc`. Multiple readers proceed in parallel. +- **`get`** updates policy state (LRU recency, LFU frequency, Clock + reference bit, …). The wrapper takes a **write lock**. Only one + thread proceeds. + +This asymmetry is the single most important reason `peek` and `get` +are distinct methods at all (see +[`trait-hierarchy.md`](trait-hierarchy.md) for the rationale at the +trait level). Without `peek`, every read would serialize through the +write lock. 
With `peek`, read-heavy workloads (buffer pools, immutable +metadata caches) can proceed in parallel across cores instead of +serializing behind one writer. + +The cost is that callers must choose, and choosing `get` on a +read-heavy workload silently kills scalability. The rustdoc on each +wrapper's `peek` and `get` says so explicitly; benchmarks under +[`benches/`](../../benches) compare the two. + +## Atomic check-and-act + +Compound operations must stay inside a single lock acquisition. The +rule is **check, decide, mutate, release**, all under the same write +lock. Splitting the steps across two acquisitions allows a concurrent +writer to invalidate the decision between them. + +The pattern shows up in three places worth naming: + +- **Insert-on-full.** Capacity check + eviction + insert must be one + critical section. `WeightStore::try_insert` and the policy `insert` + methods both follow this. +- **Replace-and-return.** `insert` returns the previous value if one + existed. The "did this key exist?" check and the replace must + happen under the same write lock; otherwise two concurrent inserts + can both observe "key absent" and both return `None`. +- **Future: expiry + remove.** TTL (see + [`docs/design/ttl.md`](ttl.md) §4(e)) requires the expiry check and + the removal to be one atomic operation under a write lock. A + read-locked fast path that observes `expires_at <= now` and + escalates to a write lock is safe **only** if the write-locked + path re-checks the deadline before acting, because a concurrent + `set_ttl` may have renewed the entry in between. + +The atomicity rule is a wrapper-level discipline, not a trait-level +one. The single-threaded core can't enforce it because it doesn't +know about locks. + +## Cloning the wrapper + +Every policy-level `Concurrent*` cache handle (`ConcurrentLruCache`, +`ConcurrentFifoCache`, `ConcurrentS3FifoCache`) implements `Clone` via +`Arc::clone(&self.inner)`; the owning store and data-structure wrappers +keep the `Arc` out of the struct and are not necessarily cloneable. +Cloning a handle is cheap (one atomic increment) and produces a +second handle to the **same** underlying cache. 
This is the intended +way to share a cache across threads: + +```rust,ignore +let cache = ConcurrentLruCache::::new(1_000); +let cache2 = cache.clone(); +std::thread::spawn(move || { + cache2.insert(1, "hello".into()); +}); +cache.get(&1); +``` + +There is no separate `Arc>` wrapping needed; +the inner `Arc` is the sharing primitive. Callers who want +`Arc` for type erasure are still free to wrap, but +in practice the concrete clone is what's used in the codebase. + +## `ConcurrentCache`: marker trait, not capability trait + +`ConcurrentCache` lives in [`src/traits.rs`](../../src/traits.rs) and +is declared `unsafe trait ConcurrentCache: Send + Sync {}`. It has +no methods. Its job is to **promise**, at the type system level, +that "this type is safe to share across threads in the cache sense" — +specifically that its `Cache`-like operations (whatever those happen +to be — concrete `Concurrent*` types do not implement `Cache`) +take care of internal synchronization. + +The `unsafe` is load-bearing. Implementing `ConcurrentCache` +incorrectly cannot be caught by the type system; it's an +implementer-side soundness claim, which is why only the wrappers +implement it (`ConcurrentFifoCache`, `ConcurrentS3FifoCache` today; +`ConcurrentLruCache` is a candidate but does not yet have the impl). + +Users writing generic code that requires a thread-safe cache should +bound on `ConcurrentCache + Send + Sync`. They should **not** bound on +`Cache + Send + Sync` and expect that to suffice — that bound +is satisfied by single-threaded caches whose user is responsible for +external locking. + +## Sharded primitives + +For data structures where a single `RwLock` becomes the bottleneck, +cachekit ships sharded variants: + +- **`ShardedHashMapStore`** — N independent shards, each its + own `RwLock>`. Shard selected by hashing the key with + the store's `BuildHasher`. +- **`ShardedSlotArena`** — N independent arenas with sharded + `SlotId`s. Same shape, applied to slab-style storage. 
+- **`ShardedFrequencyBuckets`** — N independent frequency-bucket + shards for LFU-family policies that want concurrent frequency + updates. + +Sharding lives at the **data-structure** layer, not the policy layer, +because the shard count, hash function, and shard-aware key type +(`ShardedSlotId`) all need to be visible to the policy that uses the +primitive. A `ShardedLruCache` does not yet exist as a single type; +it would be built by composing a `ShardedHashMapStore` with sharded +recency lists, and that composition is roadmap. + +When sharding is **not** what you want: + +- A single concurrent wrapper is simpler and faster for caches that + fit on one or two cores' worth of contention. +- Sharding multiplies the working-set fragmentation across shards. + A 1 M-entry cache split 16 ways has 16 caches of ~62 K each, and + evictions on one shard cannot rescue items on another. +- Per-shard eviction is correct for capacity bookkeeping (each shard + tracks its own capacity) but **not** globally optimal — a single- + shard LRU strictly dominates a sharded LRU on hit rate. + +## Concurrent policy coverage + +Of the 18 implemented policies, **3 ship with a `Concurrent*` wrapper +today**: LRU, FIFO, S3-FIFO. The remaining 15 require external locking +by the caller — typically `Arc>`. The +relevant rustdoc on those policies (e.g. `LfuCache`, `HeapLfuCache`, +`MfuCache`) calls this out. + +This is a coverage gap rather than a design choice. The pattern is +mechanical: wrap the sequential core in `Arc>`, expose the +`&self` API with `Arc` returns, decide read-lock vs. write-lock per +method, implement `Clone` via `Arc::clone`, and implement +`unsafe impl ConcurrentCache`. The work is bounded; what's missing is +the discipline to do it consistently across all 18 policies. + +## Failure modes + +Three failure modes worth naming: + +- **Poisoning.** `parking_lot` does **not** poison locks on panic. 
+ A panic inside a critical section unwinds, releases the lock, and + leaves the inner core in whatever state the panic interrupted. + The single-threaded cores are designed to be panic-safe for + `Cache::insert` / `get` / `remove` — invariants are restored + before any potentially-panicking operation (allocation, user + hashing). This is a property of each core, not of the wrapper. +- **Deadlock.** Cachekit never holds two locks at once in the + current code. Sharded primitives acquire exactly one shard lock + per operation. Any future work that composes locks (e.g. a sharded + LRU that touches a shared recency list) must document its locking + order. +- **Starvation.** `parking_lot::RwLock` defaults to writer-friendly + fairness; readers do not starve writers. Heavy `get`-dominated + workloads still serialize through the write lock, which is the + underlying constraint, not a fairness bug. + +## Future directions + +Tracked roughly in priority order: + +1. **Coverage parity.** `Concurrent*` wrappers for the remaining 14 + policies (LFU, Heap-LFU, MFU, LRU-K, 2Q, ARC, CAR, Clock, + Clock-PRO, NRU, SLRU, MRU, LIFO, Random). Mechanical work; the + pattern is fixed. +2. **`ConcurrentExpiring`.** TTL's concurrent wrapper, per + [`docs/design/ttl.md`](ttl.md) §4(e). Distinct from `Concurrent*` + policies because the expiry-check + remove must be atomic across + *both* the inner cache and the expiration index. +3. **Sharded `Cache` wrappers.** A generic `Sharded>` + that hashes keys to N independent inner caches. The design + question is how to model capacity: per-shard capacity (simple, + imperfect global behaviour) vs. global capacity with cross-shard + victim selection (correct, requires inter-shard locking). +4. **Lock-free reads.** `peek` and `contains` paths that avoid the + `RwLock` entirely — `arc-swap` or seqlock-style techniques — + for caches whose recency state can tolerate eventual consistency. 
+ Out of scope until benchmarks show the read lock is the bottleneck. +5. **Loom testing.** Once concurrent coverage stabilises, model-check + the wrapper invariants under `loom`. Particularly valuable for the + atomic check-and-act sequences in TTL and sharded composition. + +## See also + +- [Design overview](design.md) — §3 frames concurrency at the + principles level +- [TTL design](ttl.md) — applied case for `ConcurrentExpiring` +- [Cache trait hierarchy](trait-hierarchy.md) — read/mutate split and + object-safety rationale +- [Stores](../stores/README.md) — `ConcurrentStoreRead` / + `ConcurrentStore` trait family +- [`src/store/traits.rs`](../../src/store/traits.rs) — concurrent + store traits +- [`src/traits.rs`](../../src/traits.rs) — `ConcurrentCache` marker diff --git a/docs/design/design.md b/docs/design/design.md index 92f84d6..dcc01e7 100644 --- a/docs/design/design.md +++ b/docs/design/design.md @@ -1,14 +1,23 @@ -Designing high-performance caches in Rust is a multi-disciplinary problem: data structures, memory layout, concurrency, workload modeling, and systems-level performance all matter. The points below reflect what moves the needle in practice across systems, services, and libraries. +# Design Overview -For interface and API decisions, the [Rust API Guidelines checklist](https://rust-lang.github.io/api-guidelines/checklist.html) is a useful companion for consistent, ergonomic design. +This document collects the design principles that shape `cachekit`. Each +section pairs a principle with the concrete artifact in the source tree +that realizes it, so the prose stays grounded in the code rather than +floating as advice. + +For a worked example that applies every principle below to one feature, +see the [TTL design doc](ttl.md). 
For interface conventions, the +[Rust API Guidelines checklist](https://rust-lang.github.io/api-guidelines/checklist.html) +is the companion reference; module-level documentation follows the +[doc style guide](style-guide.md). ## 1. Workload First, Policy Second Cache policy only matters relative to workload. Identify access patterns: -- Hotset-heavy traffic: skewed keys, high churn. -- Scan-heavy traffic: large working sets, weak locality. +- Hot-set traffic: skewed keys, low churn on the hot set, high churn at the tail. +- Scan-heavy traffic: large working sets, weak temporal locality. - Mixed traffic: bursts of hot data over large cold sets. Measure: @@ -17,29 +26,43 @@ Measure: - Temporal vs spatial locality. Choose policies accordingly: -- LRU: good for temporal locality, bad for scans. -- LRU-K / 2Q (roadmap): better at filtering one-off accesses. -- Clock / ARC (roadmap): lower overhead, more adaptive. +- `LRU` / `Clock`: good for temporal locality, vulnerable to scans. +- `LRU-K` / `2Q` / `SLRU`: better at filtering one-off accesses. +- `ARC` / `CAR`: adaptive recency/frequency balance without manual tuning. +- `S3-FIFO` / `Heap-LFU`: strong general-purpose defaults under scans. + +All of the above ship today; see [`docs/policies/`](../policies/README.md) +for the implemented catalog and [`docs/policies/roadmap/`](../policies/roadmap/README.md) +for planned policies (LIRS, TinyLFU, SIEVE, GDS/GDSF, etc.). -Never design a "general purpose" cache first; design for the workload you expect. +When picking a policy or tuning a cache, design for the workload you +expect — not the average of all workloads. ## 2. Memory Layout Matters More Than Algorithms In a cache, memory layout often dominates policy. Prefer: -- Contiguous storage (Vec, slabs, arenas). +- Contiguous storage (`Vec`, slabs, arenas). - Index-based indirection over pointer chasing. Avoid: -- Excessive Box, Arc, linked lists. -- HashMap lookups in hot paths if avoidable. 
+- Excessive `Box`, `Arc`, linked lists with heap-allocated nodes. +- `HashMap` lookups in hot paths if avoidable. Techniques: - Store metadata (recency, freq, flags) in tightly packed structs. - Separate hot metadata from cold payloads. - Use slab allocators for fixed-size entries. +cachekit realizes this through reusable building blocks under +[`src/ds/`](../../src/ds): [`SlotArena`](../../src/ds/slot_arena.rs) +hands out stable `Handle`s backed by a `Vec`, [`IntrusiveList`](../../src/ds/intrusive_list.rs) +threads recency lists through those slots without per-node allocation, +and [`ClockRing`](../../src/ds/clock_ring.rs) keeps Clock-style state in +a single contiguous array. See [`docs/policy-ds/`](../policy-ds/README.md) +for the full primitive catalog. + Cache misses caused by your own data structure are as bad as upstream misses. ## 3. Concurrency Strategy Is Core Design, Not a Wrapper @@ -48,35 +71,54 @@ Locking strategy shapes everything. Options: - Global lock: simple, often fast enough for small cores, dies under high contention. -- Sharded caches: hash key -> shard, each shard independently locked. +- Sharded caches: hash key → shard, each shard independently locked. - Lock-free or mostly-lock-free: hard in Rust, only worth it if contention dominates. +cachekit ships the first option today via the `concurrency` feature: +`Concurrent*` wrappers (e.g. `ConcurrentLruCache`, `ConcurrentSlotArena`, +`ConcurrentClockRing`) place a `parking_lot::RwLock` around the +single-threaded core. The wrappers deliberately do **not** implement +`Cache` directly when that would force returning `&V` across a +lock boundary — they expose `Option>` style APIs instead. See +[`src/policy/lru.rs`](../../src/policy/lru.rs), +[`src/ds/slot_arena.rs`](../../src/ds/slot_arena.rs), and +[`src/ds/clock_ring.rs`](../../src/ds/clock_ring.rs). + Rust-specific notes: -- When `std` is available, prefer `parking_lot` locks over `std::sync` for lower overhead and better ergonomics. 
-- Avoid Arc> in hot paths. -- Consider per-thread caches with periodic merge. -- Consider RCU-style read paths for read-heavy caches. +- For `RwLock`, prefer `parking_lot` for fairness control and lower + uncontended overhead. For `Mutex`, the futex-based `std::sync::Mutex` + on Rust 1.85+ is competitive on Linux/macOS; `parking_lot::Mutex` + still wins on raw uncontended speed and offers nicer guard ergonomics. +- Avoid `Arc>` in hot paths. + +Future directions worth exploring but **not currently implemented**: +sharded caches (hash key → shard, per-shard lock), per-thread caches with +periodic merge, and RCU-style read paths for read-heavy workloads. ## 4. Avoid Per-Operation Allocation Allocations kill throughput. Pre-allocate: -- Entry pools. -- Node arrays. +- Entry pools — see [`SlotArena`](../../src/ds/slot_arena.rs) and the + free-list discipline in [`src/store/slab.rs`](../../src/store/slab.rs). +- Node arrays — intrusive lists thread through arena slots rather than + allocating per-node (see [`src/ds/intrusive_list.rs`](../../src/ds/intrusive_list.rs)). Reuse: -- Free lists. -- Slabs. +- Free lists (slab-backed). +- Slabs sized once at construction time via `CacheBuilder::new(capacity)`. Use: -- Vec with capacity management. -- Custom allocators if necessary. +- `Vec` with explicit capacity management. +- `rustc-hash` (via the `rustc-hash` dep) for cheap key hashing in + hot-path lookups. Avoid: -- Creating new Arc, String, Vec per lookup. +- Creating new `Arc`, `String`, `Vec` per lookup. +- Hidden clones of `K` on the eviction path. -If malloc shows up in your flamegraph, your cache is already slow. +If `malloc` shows up in your flamegraph, your cache is already slow. ## 5. Eviction Must Be Predictable and Cheap @@ -87,12 +129,17 @@ O(1) eviction is the goal. Avoid unbounded tree walks or scans in eviction paths. Maintain: -- Direct pointers/indices to eviction candidates. -- Eviction lists or clock hands. 
+- Direct indices / `Handle`s to eviction candidates (see + [`src/store/handle.rs`](../../src/store/handle.rs) and the + [`Cache`](../../src/store/traits.rs) trait). +- Eviction lists or clock hands (intrusive list head, `ClockRing` hand). +- Lazy heaps where amortized O(log n) is acceptable + ([`LazyMinHeap`](../../src/ds/lazy_heap.rs); used by Heap-LFU and TTL). Be careful with: - Background eviction threads (synchronization overhead). -- Lazy cleanup that grows unbounded. +- Lazy cleanup that grows unbounded; bound it with rebuild thresholds + (e.g. `LazyMinHeap::with_auto_rebuild`). Eviction cost must be comparable to lookup cost, not orders of magnitude higher. @@ -102,13 +149,21 @@ You cannot tune what you do not measure. Track at least: - Hit / miss rate. -- Eviction count and reason. +- Eviction count and reason (capacity vs. expiration). - Insert/update rate. + +cachekit exposes these through [`StoreMetrics`](../../src/store/traits.rs) +and per-policy metric structs (e.g. `LruMetrics`), gated behind the +`metrics` feature so non-instrumented builds pay nothing. The +`expirations` counter on `Expiring` follows the same pattern (see +[`src/policy/expiring.rs`](../../src/policy/expiring.rs)). + +Roadmap counters: - Scan pollution rate. -- Lock contention or wait time (roadmap). +- Lock contention or wait time. Expose: -- Lightweight counters in hot path. +- Lightweight counters in the hot path. - Optional detailed metrics behind feature flags. Metrics should guide design decisions, not justify them afterward. @@ -116,14 +171,24 @@ Metrics should guide design decisions, not justify them afterward. ## 7. Separate Policy From Storage Design in layers: -- Storage layer: how entries live in memory, allocation, layout, indexing. -- Policy layer: LRU, FIFO, LFU, LRU-K (roadmap: Clock/ARC/2Q, etc; see [Policy roadmap](../policies/roadmap/README.md)); only manipulates metadata and ordering. 
-- Integration layer: ties application objects, payloads, or IDs into cache entries. +- Storage layer: how entries live in memory, allocation, layout, + indexing — [`src/store/`](../../src/store). +- Policy layer: LRU, FIFO, LFU, LRU-K, 2Q, ARC, CAR, Clock, Clock-PRO, + S3-FIFO, … — manipulates metadata and ordering only + ([`src/policy/`](../../src/policy)). +- Capability layer: opt-in extension traits ([`RecencyTracking`](../../src/traits.rs), + `FrequencyTracking`, `HistoryTracking`, `ExpiringCache`) that policies + implement when the underlying signal exists. This is how `Expiring` + composes over any policy without touching policy code. +- Integration layer: ties application objects, payloads, or IDs into + cache entries via [`CacheBuilder`](../../src/builder.rs) and the + `DynCache` runtime dispatcher. Related docs: - [Policy overview](../policies/README.md) - [Policy roadmap](../policies/roadmap/README.md) - [Policy data structures](../policy-ds/README.md) +- [Read-only traits](../guides/read-only-traits.md) This makes: - Benchmarking easier. @@ -135,15 +200,16 @@ This makes: Ergonomics often cost performance. Avoid in critical loops: -- Heavy generics causing code bloat. +- Heavy generics causing code bloat across many monomorphizations. - Trait objects for hot dispatch. - Closures capturing state. -- Iterator chains instead of simple loops. +- Iterator chains where a plain `for` loop would do. Prefer: - Explicit loops. -- Concrete types. -- Monomorphized fast paths. +- Concrete types and monomorphized fast paths. +- Enum dispatch over `Box` when polymorphism is needed at the + edges — this is exactly the trade `DynCache` makes (see §13). You can wrap fast internals in nice APIs at the edges. @@ -154,15 +220,17 @@ In scan-heavy workloads: Large sequential reads destroy LRU-style caches. Solutions: -- Scan-resistant policies (LRU-K, 2Q/ARC are roadmap). 
+- Scan-resistant policies: `LRU-K`, `2Q`, `SLRU`, `ARC`, `CAR`, + `Clock-PRO`, `S3-FIFO`, `Heap-LFU` — all implemented today. - Explicit "scan mode" hints from the caller or workload layer. - Bypass cache for known one-shot reads. -If you ignore scans, your cache will look great in microbenchmarks and terrible in production. +If you ignore scans, your cache will look great in microbenchmarks and +terrible in production. ## 10. Benchmark Like a System, Not a Library -Do not rely on random key benchmarks. +Do not rely on uniform-random key benchmarks. Use: - Zipfian distributions. @@ -176,22 +244,31 @@ Measure: - Memory overhead. - Eviction cost. -A cache that is 5% faster on random keys but 50% worse under scans is a bad cache. +cachekit's benchmark harness covers these dimensions; see +[`docs/benchmarks/workloads.md`](../benchmarks/workloads.md) and the +runners under [`benches/`](../../benches). + +A cache that is 5 % faster on uniform-random keys but 50 % worse under +scans is a bad cache. -## 11. Rust-Specific Pitfalls +## 11. Rust Hot-Path Hazards Beyond Allocation -Arc is expensive in hot paths. +`Arc` is expensive in hot paths; minimize it and lift `Arc::clone` out +of inner loops. -Borrow checker can push you toward indirection—fight it with: -- Index-based access. -- Interior mutability only where unavoidable. +The borrow checker can push you toward indirection — fight it with: +- Index-based access (`Handle`s, slot indices) instead of `&mut` chains. +- Interior mutability only where unavoidable; prefer `Cell` over + `RefCell` when `T: Copy`, and atomics when the value lives behind + a shared reference. Beware of: -- Hidden clones. -- Trait object dispatch. -- Over-generic designs. +- Hidden clones, particularly of keys on the eviction path. +- Trait object dispatch on read/insert. +- Over-generic designs whose monomorphization cost dwarfs their benefit. -Rust can be as fast as C, but only if you design like a systems programmer, not a library author. 
+Rust can match C on hot paths, but only when systems-level discipline +survives contact with the type system. ## 12. Design for Failure Modes @@ -207,13 +284,79 @@ Add: A cache that collapses under stress is worse than no cache. +## 13. Compile-Time and Runtime Composition + +cachekit's externally visible surface is shaped by two composition +mechanisms that together let users pay only for what they use. + +**Per-policy feature flags.** Every policy is behind a Cargo feature +(`policy-lru`, `policy-s3-fifo`, …), with `policy-all` for "everything" +and a small default of `policy-s3-fifo`, `policy-lru`, `policy-fast-lru`, +`policy-lru-k`, `policy-clock`. Optional capabilities are gated the +same way: `metrics`, `concurrency`, `serde`, and `ttl`. Downstream +crates can disable defaults and select the minimum surface they need; +see [`Cargo.toml`](../../Cargo.toml). + +**Capability traits + runtime dispatch.** Extension traits +([`RecencyTracking`](../../src/traits.rs), `FrequencyTracking`, +`HistoryTracking`, `ExpiringCache`) keep optional behavior off the +core `Cache` trait. For ergonomic builder construction without +forcing trait objects on the user, [`CacheBuilder`](../../src/builder.rs) +returns a [`DynCache`](../../src/builder.rs) that dispatches via +an internal enum match rather than `Box`. When TTL is +enabled, the builder returns a sibling `DynExpiringCache` that +threads the expiry check around each variant's `Cache` call — a worked +example of capability composition. See [`docs/design/ttl.md`](ttl.md) +for the full design and [`src/policy/expiring.rs`](../../src/policy/expiring.rs) +for the decorator itself. + ## Bottom Line -High-performance caches are not about clever algorithms—they are about: +High-performance caches are not about clever algorithms — they are about: - Memory layout. - Allocation discipline. - Contention control. - Eviction predictability. - Workload realism. 
-In Rust, your main enemy is not safety—it is abstraction overhead and accidental allocation. Design from the metal upward, then wrap it in something pleasant to use. +In Rust, your main enemy is not safety — it is abstraction overhead and +accidental allocation. Design from the metal upward, then wrap it in +something pleasant to use. + +## See Also + +Design docs: +- [Concurrency](concurrency.md) — `Concurrent*` wrappers, `RwLock` + discipline, sharded primitives, `ConcurrentCache` marker +- [Cache trait hierarchy](trait-hierarchy.md) — `Cache` kernel, + capability traits, read/mutate split, object safety +- [Builder and runtime dispatch](builder-and-dyn-dispatch.md) — + `CachePolicy`, `DynCache`, enum-vs-`Box` trade-off, adding new + policies +- [Weighted eviction](weighted-eviction.md) — `WeightStore` dual + limits, weight function contract, GDS/GDSF pre-staging +- [Metrics](metrics.md) — recorder / snapshot / exporter split, + `MetricsCell`, Prometheus exporter, feature gating +- [Error model](error-model.md) — panic vs `Result` discipline, + four error types, debug-only invariant checks +- [Benchmarking](benchmarking.md) — benchmark layers, monomorphic policy + registry, JSON artifact schema, reproducibility rules +- [Hashing and key identity](hashing.md) — hasher choices, `KeyInterner`, + `ShardSelector`, HashDoS trade-offs +- [Sharding](sharding.md) — current sharded primitives, routing, + capacity semantics, roadmap for sharded caches +- [Serialization](serialization.md) — current `serde` surface, cache-state + persistence boundaries, TTL and hash-seed rules +- [Non-goals](non-goals.md) — explicit boundaries for what cachekit does + not try to be +- [TTL](ttl.md) — applied example of every principle above +- [Doc style guide](style-guide.md) + +Reference docs: +- [Policy overview](../policies/README.md) and [roadmap](../policies/roadmap/README.md) +- [Policy data structures](../policy-ds/README.md) +- [Stores](../stores/README.md) +- [Read-only 
traits](../guides/read-only-traits.md) +- [Choosing a policy](../guides/choosing-a-policy.md) +- [Benchmarks overview](../benchmarks/overview.md) and [workloads](../benchmarks/workloads.md) +- [Rust API Guidelines checklist](https://rust-lang.github.io/api-guidelines/checklist.html) diff --git a/docs/design/error-model.md b/docs/design/error-model.md new file mode 100644 index 0000000..4f73a8f --- /dev/null +++ b/docs/design/error-model.md @@ -0,0 +1,341 @@ +# Error Model + +> Status: design rationale for cachekit's panic-vs-`Result` discipline, +> the four error types in the public API, and the debug-only invariant +> checks. Companion to [`design.md`](design.md) and [`src/error.rs`](../../src/error.rs). + +cachekit treats error handling as a design question, not an ergonomics +question. The rule is: + +> **Panic on programming errors. Return `Result` for user-supplied +> input. Reserve invariant checks for `debug_assertions`.** + +This document explains where each side of that rule applies, why the +four shipped error types each exist as separate types, and what +discipline a new error type needs to follow. + +## The three tiers + +cachekit divides every failure mode into one of three tiers, each with +its own response: + +| Tier | Cause | Response | Example | +|---|---|---|---| +| 1. Programming error | Bug in the caller's code, statically detectable in principle | Panic | `LruK::with_k(10, 0)` (k = 0) | +| 2. User-supplied input | Configuration arriving from outside the program | `Result<_, ErrorType>` | `S3FifoCache::try_with_ratios(_, 2.0, _)` | +| 3. Invariant violation | Internal data-structure corruption (cannot reach in normal use) | `debug_assert` + `InvariantError` (test/debug only) | `pop_front` while queue length is zero | + +The tiers are not opinions — they map to specific Rust constructs and +runtime behaviours. 
Mixing them (panicking on tier 2, returning +`Result` from tier 3) produces APIs that are either ergonomically +heavy or operationally unsafe. + +## Tier 1: panic on programming errors + +A "programming error" is a precondition violation the caller could +have prevented with an `if` or a type. cachekit panics in this case +rather than returning `Result`, because: + +- The bug is in **the caller's code**, not in untrusted input the + caller is forwarding. +- The right fix is for the caller to fix their code, not to handle + an error path at the call site. +- Forcing every call site to handle `Result<_, "you passed 0 for capacity">` + for a bug they could have prevented adds friction without + catching anything new. + +The shipped examples: + +- `CacheBuilder::build` panics on `capacity == 0`, `k == 0` for LRU-K, + and `probation_frac > 1.0` for 2Q. The validation is centralised in + `validate_policy` ([`src/builder.rs`](../../src/builder.rs)). +- Direct constructors (`LruCore::new`, `S3FifoCache::new`) panic on + invalid arguments. The fallible counterparts (`try_with_ratios`, + `try_with_capacity`) exist for tier 2. +- `assert!(*k > 0, "LruK: k must be greater than 0")` in + `CacheBuilder::validate_policy` is the canonical shape: a clear + message that identifies the parameter and the constraint. + +The cost is that a panicking call site terminates the process under the crate's +default `panic = "abort"` release profile. This is intentional: +cachekit's `panic = "abort"` is documented in the +[`Cargo.toml`](../../Cargo.toml) release profile, and the rationale +is that a panic in cache code under load is a bug worth surfacing +through the supervisor / restart strategy, not unwinding. + +## Tier 2: `Result` for user-supplied input + +When the failure mode is "user passes us configuration we don't +recognise as valid," return `Result`. 
The shipped error types each +cover a specific surface: + +### `ConfigError` — invalid configuration parameters + +```rust,ignore +pub struct ConfigError(String); +``` + +Defined in [`src/error.rs`](../../src/error.rs). Returned by fallible +constructors that accept user-tunable knobs: + +- `S3FifoCache::try_with_ratios(capacity, small_ratio, ghost_ratio)` +- Future `try_build` variants on `CacheBuilder` + +The contained `String` carries a human-readable description of which +parameter failed validation. By convention messages are lowercase, +unpunctuated, and identify the parameter: `"capacity must be greater +than zero"`, `"small_ratio must be in 0.0..=1.0"`. + +`ConfigError`'s presence on a constructor signals that the parameter +set can legitimately come from outside the program — a config file, +a CLI flag, an HTTP request — and the caller should handle invalid +input gracefully rather than crashing the process. + +### `StoreFull` — capacity-bound failure + +```rust,ignore +pub struct StoreFull; +``` + +Zero-sized type defined in +[`src/store/traits.rs`](../../src/store/traits.rs). Returned by +`StoreMut::try_insert` and `ConcurrentStore::try_insert` when the +store is at capacity and the insert would exceed it. The contract: + +- **`StoreFull` is not a panic.** A full store under capacity + pressure is the **expected** outcome of `try_insert`. The caller — + typically a policy layered on top — must respond by evicting and + retrying. +- **The store does not evict on its own.** `StoreFull` is the + signal that says "you, policy, decide who to evict." This is the + core of the policy/storage separation rule from + [`design.md`](design.md) §7. +- **The error carries no data.** The caller knows what they tried + to insert; `StoreFull` adds nothing useful by retaining it. + +`StoreFull` is **not** in `src/error.rs` despite being an error +type. 
It lives alongside the trait that returns it because the +two are co-evolving and the surface is small enough that the +co-location aids readability. + +### `LazyMinHeapError` — `ds`-layer fallible construction + +```rust,ignore +pub enum LazyMinHeapError { + CapacityTooLarge { requested: usize, max: usize }, + Allocation(std::collections::TryReserveError), +} +``` + +Defined in [`src/ds/lazy_heap.rs`](../../src/ds/lazy_heap.rs). +Returned by `LazyMinHeap::try_with_capacity` when: + +- The requested capacity exceeds the internal `MAX_CAPACITY` bound, + or +- The allocator cannot satisfy the reservation. + +The enum exposes both failure modes distinctly because a caller may +want to retry on `Allocation` (transient memory pressure) but not on +`CapacityTooLarge` (logic bug or genuinely-too-big request that +won't recover). + +The pattern generalises: a future "fallible-construction" error type +on any `ds` primitive that pre-allocates should distinguish "you +asked for too much" from "we couldn't get what you asked for." + +### `std::collections::TryReserveError` — passthrough + +Some `try_new` constructors (`HashMapStore::try_new`, +`ConcurrentHashMapStore::try_new`) return the standard +`TryReserveError` directly rather than wrapping it. The reason: the +only failure mode is allocator pressure, and `TryReserveError` +already says exactly that. Wrapping it would add a layer for no +information. + +The shape is: if cachekit has a distinct failure mode of its own +(`CapacityTooLarge`, `StoreFull`), wrap or define a new type; if the +only failure mode is "the allocator said no," return the standard +type and let the caller's error-handling stack absorb it. + +## Tier 3: invariant checks (debug-only) + +```rust,ignore +pub struct InvariantError(String); +``` + +Defined in [`src/error.rs`](../../src/error.rs). 
Returned by +`check_invariants` methods on internal data structures: + +```rust,ignore +impl S3FifoCache { + #[cfg(any(debug_assertions, test))] + pub fn check_invariants(&self) -> Result<(), InvariantError> { + if self.small.len() + self.main.len() != self.map.len() { + return Err(InvariantError::new("queue length mismatch")); + } + // … + Ok(()) + } +} +``` + +Three properties define the tier: + +- **Off the hot path.** `check_invariants` is called from tests, + fuzz harnesses, and `debug_assertions` paths. It is never called + from normal `insert` / `get` / `evict`. +- **Internal-only.** The invariants are about data-structure + integrity: "the queue length matches the map length", "the heap + is in heap order", "the ghost list hasn't grown past its bound." + No caller program would meaningfully react to one of these + failing — the cache is corrupted, the right response is to + capture state and bail. +- **Returns `Result`, not panics.** Counter-intuitive given the + tier-1 rule. The reason: `check_invariants` is called by + diagnostic code that wants to **report** the violation (in a test + failure message, a fuzz reproducer, a debug-mode assertion's + output) rather than crash. Returning `Result` lets the caller + format the failure; if they want to panic, they `unwrap()`. + +`InvariantError` carries the same `String`-message shape as +`ConfigError`, by the same convention: lowercase, unpunctuated, +identifying the specific invariant. + +## Why four error types, not one + +A single `CachekitError` enum could in principle subsume all four. +cachekit doesn't ship one, deliberately. Three reasons: + +- **Each surface has different recovery semantics.** `StoreFull` + means "evict and retry"; `ConfigError` means "fix your config"; + `LazyMinHeapError::Allocation` means "back off and retry"; + `InvariantError` means "we have a bug, capture state." 
A unified + enum forces every caller to either match exhaustively (most of + which can't happen at their call site) or use a catch-all that + loses information. +- **Each lives near the trait that uses it.** `StoreFull` lives in + `src/store/traits.rs`; `LazyMinHeapError` lives in + `src/ds/lazy_heap.rs`; `ConfigError` and `InvariantError` live + in `src/error.rs`. Co-location helps maintenance — adding a new + failure mode to one surface doesn't ripple through the others. +- **Sum types compose poorly across abstractions.** A unified + enum would propagate every variant up through every layer that + touched it. The current shape lets a layer convert (or + re-wrap) only the errors it cares about. + +The cost is that downstream code wanting to catch "any cachekit +error" has to enumerate all four. The mitigation is that no +realistic downstream code wants that — each call site touches one +surface at a time and handles that surface's error. + +## Operational contract: panic profile + +The crate's release profile sets `panic = "abort"`: + +```toml +[profile.release] +panic = "abort" +``` + +Two implications worth naming: + +- **A panic terminates the process.** No unwind, no destructors, + no observer recovery. A panicking weight function in + `ConcurrentWeightStore` (see + [`weighted-eviction.md`](weighted-eviction.md)) kills the + process; a `parking_lot` lock-poisoning concern is moot under + `panic = "abort"` because the process is gone before any + observer can read poisoned state. +- **Callers who override the profile take on more contract.** + Callers building with `panic = "unwind"` get unwind safety up + to the documented invariants. The + [`weighted-eviction.md`](weighted-eviction.md) clear-ordering + rule and the + [`concurrency.md`](concurrency.md#failure-modes) panic-safety + notes apply only to this mode. 
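The difference between the two profiles described in the bullets above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `checked_capacity` function that follows the tier 1 rule; it is not part of cachekit. When the binary is built with `panic = "unwind"`, `std::panic::catch_unwind` observes the panic as an `Err`; under `panic = "abort"` the process would terminate before `catch_unwind` returns.

```rust
use std::panic;

/// Hypothetical tier 1 check (not a cachekit API): a zero capacity is a
/// caller bug, so the function panics instead of returning `Result`.
fn checked_capacity(capacity: usize) -> usize {
    assert!(capacity > 0, "capacity must be greater than zero");
    capacity
}

/// Returns true if `checked_capacity` panicked and the panic was caught.
/// Only meaningful when the binary is built with `panic = "unwind"`.
fn panics_on(capacity: usize) -> bool {
    panic::catch_unwind(|| checked_capacity(capacity)).is_err()
}

fn main() {
    // Valid input: no panic.
    println!("capacity 16 panicked: {}", panics_on(16));
    // Invalid input: the panic unwinds and is observed as `Err`.
    // Under `panic = "abort"` the process would exit here instead.
    println!("capacity 0 panicked: {}", panics_on(0));
}
```

Catching the panic here is for demonstration only; the panic still marks a bug in the calling code.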
+ +The interplay matters for error model design: under `abort`, tier 1 +panics are terminal and need to be debugged at development time; +under `unwind`, they are catchable but should still be treated as +bugs because the cache may be in an unspecified-but-not-corrupt +state. + +## What `Result` does **not** cover + +Three failure modes are deliberately not represented as `Result`: + +- **OOM in non-`try_*` constructors.** `LruCore::new(huge)` aborts + on allocator failure. Use `try_with_capacity` to get a `Result` + surface (where available). +- **Logic errors in policy code.** Eviction picking the wrong + victim is a bug, not a return value. Detected (when detected) by + `check_invariants` or by the policy's tests. +- **Concurrent contention.** `parking_lot::RwLock` doesn't poison, + doesn't time out by default, and doesn't return `Result`. A + contended cache blocks until it can proceed. Callers who need + timeouts wrap the cache themselves with a wider locking + discipline. + +## Adding a new error + +Checklist for a new failure mode: + +1. **Decide the tier.** Programming error, user-supplied input, or + internal invariant? +2. **Pick or define the type.** + - Tier 1: use `assert!` / `debug_assert!` / `panic!`. No new + type needed. + - Tier 2: define a new type if the failure has data the caller + needs and no existing type fits. Otherwise reuse `ConfigError` + (with a clear message) or pass through `TryReserveError`. + - Tier 3: add a `check_invariants` method on the affected type + that returns `Result<(), InvariantError>`. +3. **Co-locate.** Types specific to a trait live with the trait + (`StoreFull` in `src/store/traits.rs`). Types specific to a + primitive live with the primitive (`LazyMinHeapError`). + Cross-cutting types (`ConfigError`, `InvariantError`) live in + `src/error.rs`. +4. **Implement `Display` and `Error`.** Both are required for + `?` interop with `Box<dyn Error>`. 
The convention is: + ```rust,ignore + impl fmt::Display for MyError { … } + impl std::error::Error for MyError {} + ``` + `Display` writes the message; `Error` is empty unless the type + wraps another error (then `source` returns the inner error). +5. **`Send + Sync + Clone`.** All existing error types satisfy this. + The convention is `#[derive(Debug, Clone, PartialEq, Eq, Hash)]` + for value types and matching impls for enums. Errors that flow + between threads must be `Send + Sync`; errors that get cloned + into snapshots / test fixtures must be `Clone`. + +## Compatibility with `?` and `anyhow`/`thiserror` + +The cachekit error types are intentionally **plain types, not +`thiserror`-derived**, to avoid forcing a `thiserror` dependency on +downstream users. They implement `std::error::Error` directly, so +they work with `?`, `Box<dyn Error>`, and any error-aggregation +crate (including `anyhow` and `thiserror::Error` in user code). + +A downstream `thiserror`-derived enum that includes a `#[from] +cachekit::ConfigError` works. A downstream `anyhow::Result<_>` that +absorbs cachekit errors via `?` works. The choice not to bundle +either crate keeps the error layer dependency-free and gives +downstream the standard `From` and `Display` shape they expect. 
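The plain-type convention can be sketched with the standard library alone. In this sketch `ConfigError` is a local stand-in with the same shape as the type shown earlier, and `validate_ratio`, `load`, `load_typed`, and `AppError` are hypothetical names introduced for illustration; none of them is cachekit's actual code.

```rust
use std::error::Error;
use std::fmt;

/// Local stand-in with the same shape as the `ConfigError` shown earlier;
/// defined here so the sketch compiles without cachekit.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ConfigError(String);

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

// Empty impl: there is no wrapped inner error, so `source` stays `None`.
impl Error for ConfigError {}

/// Hypothetical tier 2 validation of a user-supplied ratio.
fn validate_ratio(ratio: f64) -> Result<f64, ConfigError> {
    if (0.0..=1.0).contains(&ratio) {
        Ok(ratio)
    } else {
        Err(ConfigError("small_ratio must be in 0.0..=1.0".to_string()))
    }
}

/// `?` converts both error types into `Box<dyn Error>` through the
/// standard blanket `From` impl; no error-handling crate is involved.
fn load(raw: &str) -> Result<f64, Box<dyn Error>> {
    let parsed: f64 = raw.trim().parse()?; // ParseFloatError
    let ratio = validate_ratio(parsed)?; // ConfigError
    Ok(ratio)
}

/// Hypothetical downstream enum written by hand; a `thiserror` derive
/// with `#[from]` would generate an equivalent `From` impl.
#[derive(Debug)]
enum AppError {
    Config(ConfigError),
}

impl From<ConfigError> for AppError {
    fn from(e: ConfigError) -> Self {
        AppError::Config(e)
    }
}

fn load_typed(ratio: f64) -> Result<f64, AppError> {
    Ok(validate_ratio(ratio)?)
}

fn main() {
    println!("{:?}", load("0.25"));
    println!("{:?}", load("2.0").map_err(|e| e.to_string()));
    match load_typed(2.0) {
        Ok(r) => println!("ratio {r}"),
        Err(AppError::Config(e)) => println!("config error: {e}"),
    }
}
```

The same `From` and `Display` impls are what `anyhow` and `thiserror` rely on in user code, which is why the crate does not need to depend on either.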
+ +## See also + +- [Design overview](design.md) — §12 frames failure modes at the + principles level +- [Concurrency](concurrency.md) — `parking_lot` non-poisoning, + atomic check-and-act, lock-acquisition failure modes +- [Builder and runtime dispatch](builder-and-dyn-dispatch.md) — + panic-in-`build` validation, `try_build`-deliberately-absent + rationale +- [Weighted eviction](weighted-eviction.md) — `StoreFull`'s role + and unwind-safety in `clear` +- [`src/error.rs`](../../src/error.rs) — `ConfigError`, + `InvariantError` +- [`src/store/traits.rs`](../../src/store/traits.rs) — `StoreFull` +- [`src/ds/lazy_heap.rs`](../../src/ds/lazy_heap.rs) — + `LazyMinHeapError` diff --git a/docs/design/hashing.md b/docs/design/hashing.md new file mode 100644 index 0000000..5af6b4b --- /dev/null +++ b/docs/design/hashing.md @@ -0,0 +1,166 @@ +# Hashing and Key Identity + +> Status: design rationale for hasher choices, key interning, and hash-based +> routing. Companion to [`concurrency.md`](concurrency.md), [`sharding.md`](sharding.md), +> and the security notes in store/data-structure modules. + +cachekit uses hashing in three different roles: + +- Lookup indexes (`HashMapStore`, policy maps, ghost indexes). +- Compact key identity (`KeyInterner`). +- Shard routing (`ShardSelector`). + +Those roles have different threat models. Some code paths choose `FxHash` for +speed on trusted keys; others default to `RandomState` or keyed SipHash because +untrusted keys can create HashDoS or single-shard contention. This document +explains those choices and when callers should override them. + +## The Decision Matrix + +| Component | Default hasher | Why | Caller override? 
| +|---|---|---|---| +| `HashMapStore` | `RandomState` | public store API, safer default | yes, `with_hasher` | +| `ClockRing` | `RandomState` | can be keyed by user input | yes, with explicit trust acknowledgement | +| `KeyInterner` | `FxBuildHasher` | hot internal mapping, trusted-key bias | yes, `with_hasher` | +| `WeightStore` | `FxHashMap` | speed, large-value target | no generic hasher today | +| Policy internals | mostly `FxHashMap` | hot metadata paths | generally no | +| `ShardSelector` | keyed SipHash-1-3 | routing must resist shard pinning | seed or randomized constructor | + +The rule: **default to DoS-resistant hashing at public boundaries; use faster +hashing inside policy metadata when keys are trusted or already admitted.** + +## `RandomState`: Safe Public Default + +`HashMapStore` and `ClockRing` default to +`std::collections::hash_map::RandomState`. This is the right public default +because callers often pass keys derived from request paths, tenant ids, URLs, +or filenames. Randomized hashing prevents an attacker from precomputing many +keys that collide in one bucket. + +The cost is per-hash overhead. For workloads with fully trusted keys (for +example, dense integer ids generated by the process), callers can use +`with_hasher` to opt into a faster hasher. That opt-in is intentionally explicit: +the call site documents the threat-model decision. + +`ClockRing` goes further by using a `KeysAreTrusted` acknowledgement for faster +non-randomized hashers. The extra marker makes the security trade visible in +review rather than hidden in a type alias. + +## `FxHash`: Hot Internal Default + +Many policy internals use `rustc_hash::FxHashMap`: + +- LRU-family maps from key to node pointer / slot id. +- LFU/MFU frequency maps. +- 2Q / SLRU / Clock-PRO resident and ghost indexes. +- `WeightStore`'s index. +- `KeyInterner`'s default index. + +`FxHash` is fast and deterministic. It is also non-cryptographic and not +HashDoS-resistant. 
The intended use is trusted, already-admitted keys where the +hash map is not directly exposed as an unbounded public endpoint. + +The sharp edge is `WeightStore`: its target use case (variable-size objects +like images, documents, blobs) often has user-derived keys. Its module docs call +this out directly: pre-hash keys with a keyed hash or use `HashMapStore` if the +key source is adversarial. + +## `KeyInterner`: Identity Compression, Not Security + +`KeyInterner` maps external keys to compact `u64` handles: + +```text +index: HashMap<K, u64>   keys: Vec<K> +"user:123" -> 0          keys[0] = "user:123" +``` + +The design goals: + +- Avoid repeated key cloning in hot paths. +- Use compact handles in policy metadata and frequency maps. +- Resolve a handle back to a key in O(1). + +Handles are **not capability tokens**. They are sequential integers. A handle +from one interner can silently resolve to a different key in another interner, +and handles are reused after `clear`. Callers that store handles externally +must pair them with `generation()` and reject stale generations. + +Security implications: + +- The default `FxBuildHasher` is for trusted input. +- Use `with_hasher` / `with_capacity_and_hasher` with `RandomState` when keys + are derived from untrusted input. +- `KeyInterner` is append-only until `clear`, so unique-key attacks can drive + memory growth. Use `try_intern` and your own admission bound for untrusted + keys. +- `Debug` intentionally omits interned keys to avoid leaking URLs, user ids, or + auth material into logs. + +## `ShardSelector`: Hashing for Routing + +Shard routing has a different failure mode than lookup maps. A lookup hash +collision slows one map; a routing collision pins the whole workload to one +shard and defeats concurrency. + +`ShardSelector` therefore uses keyed SipHash-1-3: + +- `ShardSelector::randomized(shards)` draws key material from `RandomState`. + Use this for normal production sharding. 
+- `ShardSelector::new(shards, seed)` is deterministic and reproducible. Treat + `seed` as secret key material if adversaries can influence keys. + +The selector reduces hash output to `[0, shards)` using fast range reduction +rather than `%`, keeping distribution unbiased and cheap. The shard count is +clamped to `[1, MAX_SHARDS]` to prevent user-controlled configs from allocating +an unbounded number of locks or vectors. + +## Custom Hasher Rules + +When adding a hasher parameter to a public type: + +1. Default to `RandomState` unless the type is clearly internal-only. +2. Expose `with_hasher` and `try_with_hasher` if callers have legitimate + trusted-key fast paths. +3. Document the threat model at the constructor, not only at module level. +4. Never hide a non-randomized hasher behind a harmless-sounding `new`. +5. If the hasher affects shard routing, prefer `ShardSelector` over ad hoc + hashing so the keyed-routing contract stays centralized. + +When using `FxHashMap` internally: + +1. Keep it behind the policy or data-structure boundary. +2. Do not expose arbitrary insertions from untrusted users without a separate + capacity/admission guard. +3. Mention the assumption in the module's security notes if keys may be user + controlled. + +## Serialization and Hash Seeds + +Do not serialize hash seeds or hasher state unless the type is explicitly a +deterministic routing artifact. `ShardSelector::new(shards, seed)` is the one +place where reproducible routing is part of the public contract. `RandomState` +and policy-internal hash maps should be reconstructed on deserialization. + +Serializing raw hash-map order is also wrong. Hash-map iteration order changes +with seeds and implementation details; serialized cache state should use stable +semantic fields (keys, values, policy order) rather than map buckets. + +## Future Direction: Hasher Audit + +The codebase intentionally mixes `RandomState`, `FxHashMap`, and SipHash. 
That +mix is valid only while every use site has a documented threat model. A useful +future hardening pass: + +- List every public constructor that accepts a key type. +- Classify whether keys are trusted, user-supplied, or mixed. +- Ensure user-supplied defaults are randomized. +- Add `KeysAreTrusted`-style acknowledgement to any public non-randomized path. + +## See Also + +- [Sharding](sharding.md) - shard routing and contention trade-offs +- [Weighted eviction](weighted-eviction.md) - `WeightStore` HashDoS caveat +- [`src/ds/interner.rs`](../../src/ds/interner.rs) +- [`src/ds/shard.rs`](../../src/ds/shard.rs) +- [`src/store/hashmap.rs`](../../src/store/hashmap.rs) +- [`src/ds/clock_ring.rs`](../../src/ds/clock_ring.rs) diff --git a/docs/design/metrics.md b/docs/design/metrics.md new file mode 100644 index 0000000..88ace2b --- /dev/null +++ b/docs/design/metrics.md @@ -0,0 +1,510 @@ +# Metrics + +> Status: design rationale for the metrics infrastructure under +> [`src/metrics/`](../../src/metrics), gated by the `metrics` Cargo +> feature. Companion to [`design.md`](design.md) §6. + +cachekit's metrics surface is bigger than "two counters behind a +feature flag." It mirrors the cache trait hierarchy — recorder / +snapshot / exporter — so each concern lives in the smallest trait +that captures it, and policy code stays free of monitoring plumbing. +This document explains the three-trait separation, the +`&self`-vs-`&mut self` split, the `MetricsCell` interior-mutability +escape hatch, the Prometheus exporter contract, and what guarantees +counters do and do not provide. + +## Goals and non-goals + +The metrics module is shaped for: + +- **Lightweight in-process counters** that a policy can increment on + its hot path without measurable overhead when enabled. +- **Zero overhead when disabled.** The entire `metrics` module + compiles away under `#[cfg(feature = "metrics")]`. 
+- **Decoupled consumption.** Tests, benchmarks, and production + monitoring should each consume metrics in the shape they need + without dragging recording concerns along. +- **Per-policy specificity.** A Clock policy's `hand_advance` count + matters; a FIFO's `pop_oldest_empty_or_stale` count matters. The + trait surface preserves these signals rather than flattening to + one shape. + +It is **not** shaped for: + +- **High-cardinality labels.** Counters are flat scalars. Tag + dimensions (per-key, per-tenant) are out of scope. +- **Histograms or sliding windows.** Counters and gauges only. + Latency distributions live in the user's monitoring stack via + external instrumentation. +- **Audit-grade accounting.** Counters use `Relaxed` atomics + ([`src/store/weight.rs`](../../src/store/weight.rs)) and wrap on + overflow in release. Best-effort observability, not financial + ledger. + +## Three-trait separation + +```text + ┌─────────────────────────────┐ + │ CoreMetricsRecorder │ + │ record_get_hit, _miss, │ + │ _insert_*, _evict_*, │ + │ _clear │ + └──────────────┬──────────────┘ + │ extends + ┌──────────┬───────────┬───────────────┼───────────┬────────────┐ + ▼ ▼ ▼ ▼ ▼ ▼ + FifoRec LruRec LfuRec ArcRec ClockRec S3FifoRec + │ … + ▼ + LruKRec + (further extends LruRec) + + Consumption (decoupled from recording): + ┌──────────────────────────────┐ ┌──────────────────────────────┐ + │ MetricsSnapshotProvider │ │ MetricsExporter │ + │ + MetricsReset │ │ PrometheusTextExporter │ + │ (bench / test) │ │ (production monitoring) │ + └──────────────────────────────┘ └──────────────────────────────┘ +``` + +Three responsibilities, three trait families: + +- **Record.** Per-policy `*MetricsRecorder` traits live in + [`src/metrics/traits.rs`](../../src/metrics/traits.rs). Every + policy-specific recorder extends `CoreMetricsRecorder` and adds + policy-specific methods (`record_hand_advance` for Clock, + `record_b1_ghost_hit` for ARC, etc.). 
The policy itself calls + these methods on its hot path. +- **Snapshot.** `MetricsSnapshotProvider` returns a `Copy` + `*MetricsSnapshot` struct ([`src/metrics/snapshot.rs`](../../src/metrics/snapshot.rs)) + — a point-in-time scalar copy of every counter. Snapshots are + `#[non_exhaustive]` for SemVer headroom and gated on `serde` for + cross-process transport. +- **Export.** `MetricsExporter` consumes a snapshot and pushes it + to an external system. The shipped implementation, + `PrometheusTextExporter` ([`src/metrics/exporter.rs`](../../src/metrics/exporter.rs)), + writes Prometheus exposition format to any `W: Write + Send`. + +Splitting these three lets: + +- **Policy code stay minimal.** A policy needs only the recorder + trait. It does not import snapshots or exporters. +- **Tests bypass production.** Bench harnesses use + `MetricsSnapshotProvider` + `MetricsReset` and never touch + `MetricsExporter`. Production code does the inverse. +- **Exporters multiply without policy churn.** Adding a StatsD or + OpenTelemetry exporter is a new `impl MetricsExporter` for the + snapshot types — no policy changes. + +## Per-policy recorder traits + +Every policy gets its own recorder trait extending +`CoreMetricsRecorder`. 
The shipped set: + +| Trait | Adds counters for | +|---|---| +| `FifoMetricsRecorder` | scan steps, stale skips, `pop_oldest` calls | +| `LruMetricsRecorder` | `pop_lru`, `peek_lru`, `touch`, `recency_rank` | +| `LruKMetricsRecorder` | extends `LruMetricsRecorder` + K-distance counters | +| `LfuMetricsRecorder` | `pop_lfu`, `peek_lfu`, frequency reads / mutates | +| `MfuMetricsRecorder` | mirrors LFU for most-frequent eviction | +| `ArcMetricsRecorder` | T1→T2 promotions, B1/B2 ghost hits, `p` movement | +| `CarMetricsRecorder` | recent→frequent, ghost hits, hand sweeps | +| `ClockMetricsRecorder` | hand advances, ref-bit resets | +| `ClockProMetricsRecorder` | cold↔hot transitions, test entries | +| `NruMetricsRecorder` | sweep steps, ref-bit resets | +| `SlruMetricsRecorder` | probationary→protected, protected evictions | +| `TwoQMetricsRecorder` | A1in→Am promotions, A1out ghost hits | +| `S3FifoMetricsRecorder` | promotions, main reinserts, ghost hits | + +Two design principles drive the granularity: + +- **Each counter answers a tuning question.** "Are my LRU-K + promotions worth the metadata?" "Is my ARC ghost list catching + meaningful hits?" Generic `evictions: u64` cannot answer either. +- **Counters live near their semantics.** `record_a1in_to_am_promotion` + belongs to 2Q because A1in/Am are 2Q concepts. Putting it on + `CoreMetricsRecorder` would force every other policy to either + implement a meaningless method or document a no-op. + +The trade is API surface: 14 recorder traits with ~5-10 methods +each. The mitigation is that **users do not implement them** — they +implement the shipped `*Metrics` structs through inherent methods on +each policy, and they read snapshots, not recorders. + +## The `&self`-vs-`&mut self` split + +Several `Cache` methods take `&self`: +[`trait-hierarchy.md`](trait-hierarchy.md#peek-vs-get--the-readmutate-split) +explains why. The metrics system has to honour this — a `&self` +read path cannot call a `&mut self` recorder. 
The shipped solution +is a parallel `*MetricsReadRecorder` family for each policy whose +read paths increment counters: + +| Mutable trait | Read-only counterpart | +|---|---| +| `FifoMetricsRecorder` | `FifoMetricsReadRecorder` | +| `LruMetricsRecorder` | `LruMetricsReadRecorder` | +| `LruKMetricsRecorder` | `LruKMetricsReadRecorder` | +| `LfuMetricsRecorder` | `LfuMetricsReadRecorder` | +| `MfuMetricsRecorder` | `MfuMetricsReadRecorder` | + +The read-only traits take `&self` on every method. They are +implemented through interior mutability on the concrete metrics +struct, specifically `MetricsCell`, the internal type that wraps +`Cell<u64>` with an `unsafe impl Sync` (covered below). + +Two questions this design answers: + +- **"Why not put `Cell<u64>` directly on the metrics struct?"** + Because `Cell<u64>` is `!Sync`, which propagates and prevents + every policy struct that embeds metrics from being `Sync`. The + thin `MetricsCell` wrapper makes the synchronisation discipline + explicit at one site instead of N. +- **"Why not just `AtomicU64` for everything?"** Because counters + on `&mut self` paths (the majority: `insert`, `get`, `evict`) + do not need atomic semantics; the policy already holds exclusive + access. However, `MetricsCell` is only sound when `&self` metric + increments are protected by exclusive synchronization or are known + to be single-threaded. It is **not** a substitute for atomics under + shared `RwLock::read` access. + +## `MetricsCell`: interior mutability under external lock + +```rust,ignore +#[repr(transparent)] +#[derive(Debug, Default, Clone, PartialEq, Eq)] +pub(crate) struct MetricsCell(Cell<u64>); + +// Sound only under exclusive external synchronization; see the contract below. +unsafe impl Sync for MetricsCell {} +unsafe impl Send for MetricsCell {} +``` + +This is the only `unsafe impl Sync` in the metrics surface, so its +contract must be narrow: + +- **Exclusive external synchronization is required.** A shared + `RwLock::read` guard does **not** serialize readers, so it is not + sufficient protection for `Cell<u64>`. 
`MetricsCell` may be used + on single-threaded policy paths, or behind a write lock / mutex, + but not for counters mutated concurrently through read-locked + `&self` methods. +- **Observation-only does not relax Rust's aliasing rules.** It is + acceptable for metrics to be approximate; it is not acceptable for + approximation to be implemented as unsynchronized `Cell` mutation. + Concurrent read-path counters must use `AtomicU64`, take an + exclusive lock, or be disabled for that path. +- **`pub(crate)`.** The type does not escape the crate. + Down-stream code can read counters through the snapshot API but + cannot construct `MetricsCell` itself, which prevents misuse from + outside the codebase. + +The alternatives considered and rejected: + +- `Mutex` — cost dominates the counter increment. +- `AtomicU64` — the correct choice for counters that can be + incremented concurrently through shared references; unnecessary + for single-threaded or exclusively locked counters. +- `RefCell` — runtime borrow checking with panic on contention; + not desirable on a metrics increment path. + +`MetricsCell` is the smallest tool for single-threaded or exclusively +locked metric counters. Any policy or wrapper that records metrics +from a read-locked path must not rely on `MetricsCell` for soundness. 
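For the case that `MetricsCell` does not cover, a counter incremented through `&self` under a shared read guard, the atomic alternative can be sketched as below. `ReadCounters` and `count_concurrent_peeks` are hypothetical names, and the sketch uses `std::sync::RwLock` in place of `parking_lot` so that it has no dependencies.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;
use std::thread;

/// Hypothetical read-path counter that is sound under shared access.
#[derive(Debug, Default)]
struct ReadCounters {
    peek_calls: AtomicU64,
}

impl ReadCounters {
    /// `&self` increment. `Relaxed` is enough because the value is only
    /// observed; it is never used to synchronize other memory.
    fn record_peek(&self) {
        self.peek_calls.fetch_add(1, Ordering::Relaxed);
    }

    fn peek_calls(&self) -> u64 {
        self.peek_calls.load(Ordering::Relaxed)
    }
}

/// Spawns `threads` readers that each hold a shared read guard while
/// recording `per_thread` peeks, then returns the final count.
fn count_concurrent_peeks(threads: usize, per_thread: u64) -> u64 {
    let shared = RwLock::new(ReadCounters::default());
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                // Read guards do not exclude each other, so several
                // threads may be incrementing at the same time.
                let guard = shared.read().unwrap();
                for _ in 0..per_thread {
                    guard.record_peek();
                }
            });
        }
    });
    let total = shared.read().unwrap().peek_calls();
    total
}

fn main() {
    println!("peeks recorded: {}", count_concurrent_peeks(4, 1_000));
}
```

No increment is lost here because `fetch_add` is a single atomic read-modify-write. The same loop over a `Cell<u64>` would be a data race.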
+ +## Snapshots: cheap, copyable, optionally serializable + +Every snapshot struct in [`src/metrics/snapshot.rs`](../../src/metrics/snapshot.rs) +follows the same shape: + +```rust,ignore +#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)] +#[non_exhaustive] +#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))] +pub struct LruMetricsSnapshot { + pub get_calls: u64, + pub get_hits: u64, + pub get_misses: u64, + pub insert_calls: u64, + pub insert_updates: u64, + pub insert_new: u64, + pub evict_calls: u64, + pub evicted_entries: u64, + pub pop_lru_calls: u64, + pub pop_lru_found: u64, + pub peek_lru_calls: u64, + pub peek_lru_found: u64, + pub touch_calls: u64, + pub touch_found: u64, + pub recency_rank_calls: u64, + pub recency_rank_found: u64, + pub recency_rank_scan_steps: u64, + pub cache_len: usize, + pub insertion_order_len: usize, + pub capacity: usize, +} +``` + +Five intentional properties: + +- **`Copy`.** A snapshot is a flat block of `u64`s and `usize`s. + Copying is a `memcpy` and snapshots can flow through channels, + futures, and test assertions without ceremony. +- **`Default`.** Equivalent to "no operations recorded." Useful for + test fixtures and explicit reset comparisons. +- **`#[non_exhaustive]`.** Adding a new counter (e.g. when a + policy variant gains a new internal step) is a minor version + bump. Downstream code matching on the struct must accept new + fields gracefully — the standard `non_exhaustive` discipline. +- **`PartialEq + Eq`.** Snapshot equality is well-defined and + useful in tests. Two snapshots compare equal iff every counter + matches. +- **Optionally `serde`.** Gated on `serde`, not unconditional, so + the metrics module doesn't drag serde into builds that don't + want it. + +Gauges (`cache_len`, `insertion_order_len`, `capacity`) live +alongside counters and snapshot together. The Prometheus exporter +writes the right `# TYPE` line for each, which matters for the +scraper. 
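A small sketch of why the counter/gauge distinction matters at export time. `MiniSnapshot` is a hypothetical, trimmed-down stand-in for a snapshot struct, and the metric names that `render` produces are illustrative assumptions, not the output of the shipped `PrometheusTextExporter`.

```rust
use std::fmt::Write;

/// Hypothetical snapshot with two counters and one gauge.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct MiniSnapshot {
    get_hits: u64,
    get_misses: u64,
    cache_len: usize,
}

/// Renders the snapshot in Prometheus text exposition format, emitting a
/// `# TYPE` line before each sample.
fn render(prefix: &str, s: MiniSnapshot) -> String {
    let mut out = String::new();
    // Counters only ever increase; gauges may go up or down.
    let rows = [
        ("get_hits", "counter", s.get_hits),
        ("get_misses", "counter", s.get_misses),
        ("cache_len", "gauge", s.cache_len as u64),
    ];
    for (name, kind, value) in rows {
        // Writing into a `String` cannot fail, so the results are ignored.
        let _ = writeln!(out, "# TYPE {prefix}_{name} {kind}");
        let _ = writeln!(out, "{prefix}_{name} {value}");
    }
    out
}

fn main() {
    let snapshot = MiniSnapshot { get_hits: 3, get_misses: 1, cache_len: 2 };
    // `Copy`: passing the snapshot by value leaves the original usable.
    print!("{}", render("myapp_cache", snapshot));
    assert_ne!(snapshot, MiniSnapshot::default());
}
```

Declaring `cache_len` as a gauge tells the scraper that a decrease is a normal change, whereas a counter that decreases is interpreted as a reset.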
+ +## Recording is push, consumption is pull + +Two operating models coexist: + +- **Recording is push from the policy.** The policy calls + `m.record_get_hit()` directly. The recorder method has the + cheapest possible body (one `+= 1`). This is the hot-path + contract. +- **Consumption is pull from the consumer.** Tests / benches / + exporters call `m.snapshot()` whenever they want a value, and + `MetricsReset::reset_metrics(&self)` when they want to clear. + Nothing about the policy timing depends on consumption. + +Specifically, the policy does **not** push to the exporter. There +is no observer-pattern hook from the recorder to the exporter, no +synchronous flush on every increment, and no async channel between +them. The pull model lets benches consume at known checkpoints +(once per iteration), and lets production scrapers poll on their +own cadence (every 10 s, every minute, etc.). + +The cost of the pull model is that an exporter cannot react to a +specific event (e.g. "evictions spiked above N"). cachekit users +who need event-driven reactions instrument at the application +layer, not the metrics layer. + +## Prometheus text exporter + +The shipped exporter (`PrometheusTextExporter` in +[`src/metrics/exporter.rs`](../../src/metrics/exporter.rs)) writes +the Prometheus text exposition format to any `W: Write + Send`: + +```rust,ignore +let exporter = PrometheusTextExporter::new("myapp_cache", io::stdout()); +let snapshot = lru_cache.snapshot(); +exporter.export(&snapshot); +``` + +Three design choices worth naming: + +- **Per-prefix instance.** The prefix (`myapp_cache`) is set at + construction, not per call. This keeps the call site simple and + enforces a single metric namespace per exporter instance. +- **I/O errors are silently dropped.** A failing write does not + panic the cache or surface a `Result`. The contract is + "fire-and-forget monitoring" — a transient `EPIPE` to a metrics + socket must not interrupt cache operations. 
Callers who need + guaranteed delivery should wrap their writer in something with + retry semantics and accept the cost. +- **The writer is `Mutex`, not `RwLock`.** Writing is + always exclusive; there's no read path. Using `Mutex` here is + the right primitive even though most of cachekit uses + `parking_lot::RwLock`. (Note: this is `std::sync::Mutex`, + poisoning-aware. `export` panics on poisoning. This is a + deliberate divergence from `parking_lot` — the exporter is on + the cold path and the std mutex's poisoning behaviour is fine + there.) + +Other exporters (StatsD, OpenTelemetry, custom) plug in by +implementing `MetricsExporter` for each snapshot type they +care about. No changes elsewhere in the crate are required. + +## Feature gating: all-or-nothing at compile time + +The entire metrics subsystem is gated on the `metrics` Cargo +feature: + +```rust +// src/lib.rs +#[cfg(feature = "metrics")] +pub mod metrics; +``` + +Inside each policy, recorder calls are wrapped: + +```rust,ignore +#[cfg(feature = "metrics")] +self.metrics.record_get_hit(); +``` + +When `metrics` is **off**: + +- The entire `metrics` module disappears from the build. +- Every `record_*` call site becomes a no-op (the `#[cfg]` block + compiles away). +- Snapshot types are not in the public API. +- Build time drops; binary size drops; no runtime cost. + +When `metrics` is **on**: + +- Recording costs one `u64 += 1` per call (or one `Cell::set` for + read-only counters). For a 17-policy `DynCache` that records on + every `get` / `insert`, the overhead is sub-nanosecond and shows + up in benches as flat regression. +- The `metrics::snapshot` and `metrics::exporter` modules are in + the public API and exporting infrastructure is available. + +The trade-off is deliberate. No "low-cardinality always-on, +detailed-on-demand" two-tier scheme exists — every counter is +either always present (feature on) or absent (feature off). 
The +discipline that keeps "always present" cheap is the recorder +contract: methods do no work beyond incrementing a counter. + +## What about `StoreMetrics`? + +`StoreMetrics` ([`src/store/traits.rs`](../../src/store/traits.rs)) +is a **separate**, simpler structure that ships unconditionally +(not behind `metrics`). It carries the universal counters every +store-layer implementation tracks: + +```rust,ignore +pub struct StoreMetrics { + pub hits: u64, + pub misses: u64, + pub inserts: u64, + pub updates: u64, + pub removes: u64, + pub evictions: u64, +} +``` + +The two systems coexist: + +- `StoreMetrics` is the store-layer baseline. Always present, always + cheap, six counters. +- `src/metrics/` (feature-gated) is the policy-layer detailed + metrics — recorder traits, snapshots, exporter, per-policy signals. + +A store typically backs `StoreMetrics` with `AtomicU64` counters +(see `StoreCounters` in [`src/store/weight.rs`](../../src/store/weight.rs)), +because stores are often behind concurrent wrappers and the +increment paths can be `&self`. The split mirrors the +sequential-vs-concurrent split at the trait level +([`concurrency.md`](concurrency.md)). + +## Counter discipline + +Three rules every recorder method follows: + +1. **No allocation.** Counter increments are O(1) and allocation-free. +2. **No fallible operations.** A counter must not be in a position + where it can fail: in release builds `+= 1` cannot fail, and silent + `u64` wraparound is acceptable (about 585 years at one billion + events per second). +3. **No conditional logic beyond the counter itself.** A recorder + method that branches on cache state belongs in the policy, not + in metrics. + +The corollary: a policy that wants a derived counter ("number of +evictions where the victim's recency rank was > 10") computes the +condition itself and calls one of two existing methods accordingly. +Putting the branching inside the recorder would couple metrics to +policy state. 
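The corollary about derived counters can be sketched as follows. This is an illustration under stated assumptions: `EvictMetrics`, its two fields, the two `record_*` methods, and `on_evict` are hypothetical names, not cachekit's recorder API. The point is only where the branch lives.

```rust
/// Hypothetical metrics struct; names are illustrative, not cachekit's.
#[derive(Debug, Default)]
pub struct EvictMetrics {
    pub evictions_shallow: u64,
    pub evictions_deep: u64,
}

impl EvictMetrics {
    /// Recorder bodies stay at a single increment: no allocation,
    /// no branching on cache state.
    #[inline]
    pub fn record_evict_shallow(&mut self) {
        self.evictions_shallow += 1;
    }

    #[inline]
    pub fn record_evict_deep(&mut self) {
        self.evictions_deep += 1;
    }
}

/// Policy-side call site: the policy evaluates its own condition and
/// then picks one of the two recorder methods.
pub fn on_evict(metrics: &mut EvictMetrics, victim_recency_rank: usize) {
    if victim_recency_rank > 10 {
        metrics.record_evict_deep();
    } else {
        metrics.record_evict_shallow();
    }
}
```

The threshold (`> 10`) is policy knowledge, so it stays in the policy; the metrics struct never sees a recency rank.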
+ +## Adding a new metric + +Checklist for adding a per-policy counter: + +1. **Add the field.** Plain `u64` if it's updated on `&mut self` + paths; `MetricsCell` if it's updated on `&self` paths. Place it + in the corresponding `*Metrics` struct under + [`src/metrics/metrics_impl.rs`](../../src/metrics/metrics_impl.rs). +2. **Add the recorder method.** On the relevant `*MetricsRecorder` + trait (or its `*ReadRecorder` counterpart for `&self`). +3. **Implement on the policy's metrics struct.** One-line + `+= 1` body. +4. **Wire the call site in the policy.** Wrap with + `#[cfg(feature = "metrics")]`. +5. **Add the field to the snapshot.** In + [`src/metrics/snapshot.rs`](../../src/metrics/snapshot.rs). The + snapshot's `From<&*Metrics>` (or equivalent) needs the new + field. +6. **Update the exporter.** Add a `write_counter` / + `write_gauge` call in `PrometheusTextExporter::export` for the + new field. + +Six locations is a lot of friction for a new counter. The friction +is intentional — adding a counter is rarely the right answer to a +debugging question, and the friction encourages reuse of existing +counters where possible. + +## Adding a new metric **type** (gauge vs counter, histogram) + +Histograms and sliding windows are deliberately out of scope. Adding +either is a wider design change: + +- The recorder traits assume `&mut u64 += 1` semantics. A histogram + needs `observe(value)` semantics and an aggregation strategy. +- The snapshot types assume `Copy` and `u64` fields. A histogram + snapshot needs bucket arrays. +- The Prometheus exporter writes counters and gauges only. + +If histograms become needed (the most likely use case is latency +distribution per policy), the design has space: introduce a +`HistogramRecorder` trait alongside `CoreMetricsRecorder` and a +matching `HistogramSnapshot`. The existing exporter stays counter- +and-gauge-only; a new `PrometheusHistogramExporter` handles the +new shape. 
The current omission is a coverage decision, not a +foundation problem. + +## Guarantees and non-guarantees + +What the metrics system guarantees: + +- **Eventual consistency in single-threaded builds.** Every recorded + event eventually appears in `snapshot()` for the same thread. +- **Snapshot atomicity per counter.** A snapshot reads each + counter as a single load; no torn `u64` reads on 64-bit + platforms. +- **No cache correctness impact.** Metrics never block, panic + (except `PrometheusTextExporter` on poisoned mutex), or alter + cache state. + +What it does **not** guarantee: + +- **Cross-counter snapshot consistency.** A snapshot reads counters + sequentially. A reader can observe `hits = 100, misses = 99` + while a concurrent writer is mid-update; the next snapshot may + show `hits = 100, misses = 101`. There is no "snapshot epoch." +- **Concurrent `MetricsCell` recording.** `MetricsCell` must not be + incremented from multiple read-locked callers. Shared read locks do + not serialize readers, so those paths must use atomics or acquire an + exclusive lock before recording. Metrics may be best-effort, but + the implementation still has to be data-race-free. +- **Wrap-safe arithmetic in release.** Release profile sets + `overflow-checks = false`. Counters wrap silently. At one billion + events per second, `u64` wraps in ~585 years — practically a + non-issue, formally not a guarantee. 
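The wrap estimate in the last bullet is plain arithmetic, and it can be checked with a few lines. The helper names below are hypothetical and the year length (365.25 days) is an assumption made for this sketch; the result lands at roughly 585 years for one billion events per second.

```rust
/// Approximate years until a `u64` counter wraps at a constant event rate.
/// Assumes a 365.25-day year.
pub fn years_until_u64_wrap(events_per_second: f64) -> f64 {
    const SECONDS_PER_YEAR: f64 = 365.25 * 24.0 * 60.0 * 60.0;
    (u64::MAX as f64) / events_per_second / SECONDS_PER_YEAR
}

/// What an unchecked `+= 1` does at the boundary, written with an explicit
/// wrapping add so it behaves the same in debug and release builds.
pub fn increment_wrapping(counter: u64) -> u64 {
    counter.wrapping_add(1)
}
```

The second helper makes the "formally not a guarantee" part concrete: a counter at `u64::MAX` goes back to zero without any error being reported.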
+ +## See also + +- [Design overview](design.md) — §6 frames metrics at the + principles level +- [Cache trait hierarchy](trait-hierarchy.md) — `&self` / `&mut self` + split that drives the read-vs-mutate recorder fork +- [Concurrency](concurrency.md) — read/write lock model that + constrains where `MetricsCell` may be used +- [Error model](error-model.md) — panic discipline shared by the + exporter's poisoning behaviour +- [`src/metrics/`](../../src/metrics) — the canonical implementation +- [`src/store/traits.rs`](../../src/store/traits.rs) — + `StoreMetrics`, the unconditional store-layer counterpart diff --git a/docs/design/non-goals.md b/docs/design/non-goals.md new file mode 100644 index 0000000..369e28c --- /dev/null +++ b/docs/design/non-goals.md @@ -0,0 +1,166 @@ +# Non-Goals + +> Status: explicit boundaries for cachekit's design. Companion to +> [`design.md`](design.md), which states what the crate optimizes for. + +Good design needs negative space. This document records what cachekit is **not** +trying to be, so future features can be judged against the same boundaries. + +## Not a Distributed Cache + +cachekit is an in-process cache library. It does not provide: + +- network protocols; +- replication; +- cluster membership; +- consistent hashing across processes; +- cross-node invalidation; +- persistence guarantees. + +Use Redis, Memcached, or a database/cache service when those are the problem. +cachekit can still be useful inside a node in front of those systems. + +## Not a Full Application Cache Framework + +cachekit does not manage: + +- request coalescing / singleflight; +- background refresh; +- cache stampede suppression; +- application-specific invalidation rules; +- loader functions or read-through APIs as the primary abstraction. + +The library provides cache primitives and policies. Application frameworks can +compose them into higher-level behaviours. 
+ +## Not Async-Native Today + +`AsyncCacheFuture` exists as a placeholder, but the shipped policies are +synchronous. Async-native traits are not currently implemented. + +The reason is not that async is unimportant. It is that async cache APIs need +owned values, cancellation semantics, loader lifetime rules, and executor +integration. Adding `async fn get_or_insert_with` to the core trait would break +object safety and pull async choices into every policy. + +Future async support should be a separate layer, not a mutation of +`Cache`. + +## Not `no_std` + +cachekit uses: + +- `std::collections`; +- `std::sync::Arc`; +- `std::time` in planned TTL work; +- `parking_lot` for concurrent wrappers; +- benchmark and metrics tooling built around `std`. + +`no_std` would require a different allocator story, different synchronization +surface, and feature-gated alternatives for large parts of the crate. It is not +a current target. + +## Not Lock-Free + +The concurrency design is explicit and lock-based: + +- `Concurrent*` wrappers use `parking_lot::RwLock`; +- sharded structures use one lock per shard; +- future lock-free reads are a research direction, not current design. + +Lock-free structures would need a separate memory reclamation strategy, +different value ownership rules, and a much larger unsafe surface. The current +crate favours predictable, reviewable lock boundaries. + +## Not a HashDoS Firewall + +Some public surfaces use DoS-resistant hashing by default (`HashMapStore`, +`ClockRing`, `ShardSelector::randomized`). Other hot internal surfaces use +`FxHashMap` for speed. + +cachekit documents those choices, but it is not a general-purpose security +boundary. Callers with adversarial keys must choose safe constructors, bound +admission, and avoid exposing interned handles or `total_weight` across trust +boundaries. + +## Not a Serialization Format for Live Caches + +The `serde` feature supports metrics snapshots and `StoreMetrics`, not live +cache state. 
Serializing a policy means deciding what to do with recency lists, +ghost history, clock hands, hash seeds, `Arc` identity, and TTL deadlines. + +Until a policy has an explicit restore contract, do not derive serde for it. + +## Not a General Metrics Platform + +The metrics layer provides counters, gauges, snapshots, reset, and a Prometheus +text exporter. It does not provide: + +- high-cardinality labels; +- histograms; +- sampling; +- streaming events; +- tracing spans; +- alerting. + +Use your monitoring stack for those. cachekit exposes enough counters to make +policy tuning possible without making the cache own observability. + +## Not a Policy Research Playground at the Cost of Hot Paths + +New policies are welcome, but they must fit the crate's constraints: + +- no per-operation allocation in hot paths; +- predictable eviction cost; +- feature-gated implementation; +- docs and benchmarks; +- clear workload motivation. + +A clever algorithm that needs tree walks, heap allocation on every access, or +opaque trait-object dispatch in the hot loop belongs in a research branch until +benchmarks justify it. + +## Not a Replacement for Workload Analysis + +cachekit ships many policies, but it cannot choose your workload for you. +`CachePolicy::Lru` or `CachePolicy::S3Fifo` are defaults, not guarantees. Users +still need to measure reuse distance, scan rate, write ratio, object sizes, and +tail latency under representative traffic. + +The benchmark suite provides workload generators to help, but it cannot infer +production behaviour automatically. + +## Not a Stability Promise for Internal Layout + +Public traits and documented constructors follow SemVer. Internal layout does +not: + +- slot ids; +- intrusive-list node fields; +- heap tombstone representation; +- ghost-list internals; +- metric recorder implementation details; +- `DynCache`'s private `CacheInner` enum. + +If downstream code depends on private layout, it is outside the compatibility +contract. 
+ +## How To Use This Doc + +When proposing a feature, ask: + +1. Does it violate one of these non-goals? +2. If yes, is it a new layer that keeps the core intact? +3. Can it be feature-gated so users who do not need it pay nothing? +4. Does it preserve hot-path constraints? +5. Does it belong in cachekit, or in an application/framework crate above it? + +If the answer is unclear, write a design doc before implementation. + +## See Also + +- [Design overview](design.md) +- [Concurrency](concurrency.md) +- [Serialization](serialization.md) +- [Metrics](metrics.md) +- [Benchmarking](benchmarking.md) diff --git a/docs/design/serialization.md b/docs/design/serialization.md new file mode 100644 index 0000000..a6fc80f --- /dev/null +++ b/docs/design/serialization.md @@ -0,0 +1,195 @@ +# Serialization + +> Status: design rationale for the current `serde` feature and the boundaries +> around future cache-state persistence. Companion to [`metrics.md`](metrics.md), +> [`ttl.md`](ttl.md), and [`builder-and-dyn-dispatch.md`](builder-and-dyn-dispatch.md). + +cachekit has a narrow serialization surface today. The `serde` feature derives +`Serialize` / `Deserialize` for metrics snapshots and `StoreMetrics`; it does +**not** serialize cache contents, policy metadata, hash-map state, locks, or +builder dispatchers. + +That boundary is intentional. Metrics are stable observations. Cache state is +live data with policy invariants, hash seeds, pointer-like handles, and optional +time semantics. + +## Current Surface + +With `features = ["serde"]`, these public value types derive serde: + +- `StoreMetrics` in [`src/store/traits.rs`](../../src/store/traits.rs). +- Every metrics snapshot in [`src/metrics/snapshot.rs`](../../src/metrics/snapshot.rs). + +Properties: + +- They are flat value types (`u64`, `usize`, optional nested stats). +- They are `#[non_exhaustive]`, so new fields are SemVer-compatible at the Rust + API level but still require schema discipline for serialized consumers. 
+- They carry observations, not live handles into cache internals. + +No policy type implements serde today. No store type serializes entries today. + +## Why Metrics Are Safe To Serialize + +Metrics snapshots are point-in-time copies: + +```rust,ignore +#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)] +#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))] +pub struct LruMetricsSnapshot { + pub get_calls: u64, + pub get_hits: u64, + // ... +} +``` + +Serializing a snapshot cannot corrupt a cache on restore because there is no +restore into a running policy. At most, a downstream dashboard sees old or +partial counters. That matches the metrics contract: best-effort observability. + +## Why Cache State Is Not Serialized + +Serializing a cache is not just "serialize a map." A policy may contain: + +- Intrusive list pointers or slot ids. +- Ghost-list history. +- Clock hand position and reference bits. +- ARC/CAR adaptive target parameters. +- Lazy heap tombstones. +- Hash seeds and randomized map order. +- `Arc` sharing state. +- TTL deadlines based on monotonic time. + +Restoring only keys and values discards policy warm state. Restoring every +internal field exposes private representation and risks accepting corrupted +state from disk. + +The default position: **do not serialize policy internals until there is a +specific restore contract for that policy.** + +## Two Possible Future Modes + +If cache-state serialization lands later, it should choose one of two modes per +type. + +### Data-only restore + +Serialize only entries (`K`, `V`) plus capacity/config. On restore, rebuild the +policy as if entries were inserted in serialized order. + +Pros: + +- Simple and robust. +- No private invariants exposed. +- Cross-version friendly. + +Cons: + +- Loses recency/frequency/ghost history. +- Warm cache may behave cold after restore. +- Restore order becomes a semantic choice. 
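A data-only restore could look roughly like the sketch below. Everything in it is hypothetical: `DataOnlyImage` is an illustrative image type, and `ToyFifo` is a toy policy standing in for a real cachekit cache. No serde is involved, so the sketch shows only the rebuild-by-reinsertion step.

```rust
use std::collections::VecDeque;

/// Hypothetical data-only image: capacity plus entries, no policy metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct DataOnlyImage<K, V> {
    pub capacity: usize,
    /// Entries in the order they are re-inserted on restore.
    pub entries: Vec<(K, V)>,
}

/// Toy FIFO cache standing in for a real policy.
pub struct ToyFifo<K, V> {
    capacity: usize,
    queue: VecDeque<(K, V)>, // front = oldest
}

impl<K: PartialEq + Clone, V: Clone> ToyFifo<K, V> {
    pub fn new(capacity: usize) -> Self {
        // No pre-allocation: the capacity may come from an untrusted image.
        ToyFifo { capacity, queue: VecDeque::new() }
    }

    pub fn insert(&mut self, key: K, value: V) {
        if let Some(i) = self.queue.iter().position(|(k, _)| *k == key) {
            self.queue[i].1 = value; // update in place, order unchanged
            return;
        }
        if self.capacity == 0 {
            return;
        }
        if self.queue.len() == self.capacity {
            self.queue.pop_front(); // evict the oldest entry
        }
        self.queue.push_back((key, value));
    }

    pub fn contains(&self, key: &K) -> bool {
        self.queue.iter().any(|(k, _)| k == key)
    }

    pub fn len(&self) -> usize {
        self.queue.len()
    }

    /// Capture entries oldest-first; policy internals are not exported.
    pub fn to_image(&self) -> DataOnlyImage<K, V> {
        DataOnlyImage {
            capacity: self.capacity,
            entries: self.queue.iter().cloned().collect(),
        }
    }

    /// Rebuild through the public `insert` path, in serialized order.
    pub fn from_image(image: DataOnlyImage<K, V>) -> Self {
        let mut cache = Self::new(image.capacity);
        for (key, value) in image.entries {
            cache.insert(key, value);
        }
        cache
    }
}
```

Because restore goes through `insert`, an image with more entries than `capacity` is trimmed by the policy's normal eviction rather than rejected, which is one way the "restore order becomes a semantic choice" point shows up in practice.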
+ +### Warm-state restore + +Serialize policy metadata too: list order, frequency counters, clock hand, +ghost lists, ARC target, etc. + +Pros: + +- Better post-restore hit rate. +- Useful for long-lived caches that restart often. + +Cons: + +- Representation becomes part of the serialization contract. +- Every restore must validate invariants. +- Version migration becomes policy-specific. + +Warm-state restore should be opt-in per policy, not a blanket derive. + +## TTL and Time + +TTL is the hardest serialization case because monotonic ticks are not portable +across process restarts. The TTL design doc recommends serializing **relative +remaining duration**, not raw `Instant`-derived ticks. + +Rules for future TTL serialization: + +- Never serialize raw monotonic `Tick` as if it were wall time. +- Capture remaining duration at serialization time. +- Restore by adding remaining duration to the new process clock. +- Expired-at-serialization entries should either be omitted or restored as + expired and immediately purged. Prefer omission for data-only restore. +- Wall-clock deadlines require a separate API and explicit drift semantics. + +This keeps `Clock` pluggable and avoids replaying meaningless old monotonic +values. + +## Hash Seeds and Map Order + +Do not serialize: + +- `RandomState` seeds. +- `ShardSelector::randomized` key material. +- Hash-map bucket order. +- Internal `FxHashMap` iteration order. + +Serialize semantic data only: keys, values, capacity, policy config, and, if +warm restore is explicitly chosen, policy metadata in a stable schema. + +`ShardSelector::new(shards, seed)` is the exception because deterministic +routing is its public contract. If a type exposes deterministic sharding as +part of serialized config, the seed is config data and must be treated as +secret if keys are attacker-controlled. + +## `Arc` and Sharing + +Several policies and stores use `Arc`. 
Serialization should treat `Arc` +as `V`, not as identity-preserving shared ownership: + +- Do not attempt to preserve `Arc::ptr_eq` relationships. +- Do not serialize refcounts. +- Do not serialize weak references. + +If multiple keys point at the same `Arc`, data-only serialization will +duplicate the value unless the caller provides a higher-level interning scheme. +That is acceptable; cachekit should not infer value identity. + +## Schema Discipline + +For serialized artifacts controlled by cachekit (benchmark JSON, metrics +snapshots), use explicit schema rules: + +- Additive optional fields are minor schema changes. +- Removing or renaming required fields is a major schema change. +- Stable identifiers should be constants, not string literals. +- Include enough metadata for interpretation: version, feature set where + relevant, timestamp, and config. + +For serde-derived Rust structs, `#[non_exhaustive]` is not enough for external +JSON compatibility. A downstream JSON consumer still sees fields. If stable +wire compatibility matters, introduce an explicit versioned artifact type +rather than serializing internal structs directly. + +## What Not To Derive + +Do not add `#[derive(Serialize, Deserialize)]` to a policy type just because it +compiles. Check: + +- Does the serialized form expose private pointers, slot ids, or tombstones? +- Can deserialization validate every invariant? +- What happens if the target version has different metadata layout? +- Are hash seeds or time ticks being persisted accidentally? +- Does restoring this type produce a live, safe cache or only a bag of entries? + +If the answer is not clear, add a separate DTO (`SerializableLruCache`) and a +fallible `try_from` restore path. 
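The DTO-plus-fallible-restore shape can be sketched as below. This is an assumption-laden illustration: the field layout of `SerializableLruCache`, the `RestoreError` variants, and the `ToyLru` target type are all hypothetical, and the two invariants checked are examples rather than the full set a real policy would need.

```rust
use std::collections::HashSet;
use std::convert::TryFrom;
use std::hash::Hash;

/// Hypothetical DTO: plain data, safe to derive serde on in a real crate.
#[derive(Debug, Clone, PartialEq)]
pub struct SerializableLruCache<K, V> {
    pub capacity: usize,
    pub entries_mru_first: Vec<(K, V)>,
}

/// Hypothetical restore errors; a real policy would likely need more.
#[derive(Debug, PartialEq, Eq)]
pub enum RestoreError {
    TooManyEntries { len: usize, capacity: usize },
    DuplicateKey,
}

/// Toy live type standing in for a real LRU policy.
#[derive(Debug)]
pub struct ToyLru<K, V> {
    capacity: usize,
    entries_mru_first: Vec<(K, V)>,
}

impl<K, V> ToyLru<K, V> {
    pub fn len(&self) -> usize {
        self.entries_mru_first.len()
    }

    pub fn capacity(&self) -> usize {
        self.capacity
    }
}

impl<K: Eq + Hash, V> TryFrom<SerializableLruCache<K, V>> for ToyLru<K, V> {
    type Error = RestoreError;

    fn try_from(dto: SerializableLruCache<K, V>) -> Result<Self, Self::Error> {
        let len = dto.entries_mru_first.len();
        if len > dto.capacity {
            return Err(RestoreError::TooManyEntries { len, capacity: dto.capacity });
        }
        {
            // Scoped so the borrowed key set is dropped before the move below.
            let mut seen = HashSet::new();
            for (key, _) in &dto.entries_mru_first {
                if !seen.insert(key) {
                    return Err(RestoreError::DuplicateKey);
                }
            }
        }
        Ok(ToyLru {
            capacity: dto.capacity,
            entries_mru_first: dto.entries_mru_first,
        })
    }
}
```

The live type never implements `Deserialize` itself in this arrangement; untrusted bytes only ever become a DTO, and the DTO only becomes a cache after validation succeeds.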
+ +## See Also + +- [Metrics](metrics.md) - current serde-supported snapshot types +- [TTL design](ttl.md) - relative TTL serialization recommendation +- [Hashing and key identity](hashing.md) - hash seeds and map order +- [Error model](error-model.md) - fallible restore should use `Result` +- [`src/metrics/snapshot.rs`](../../src/metrics/snapshot.rs) +- [`bench-support/src/json_results.rs`](../../bench-support/src/json_results.rs) diff --git a/docs/design/sharding.md b/docs/design/sharding.md new file mode 100644 index 0000000..ef5c3ea --- /dev/null +++ b/docs/design/sharding.md @@ -0,0 +1,157 @@ +# Sharding + +> Status: design rationale for sharded data structures that exist today and +> roadmap notes for sharded cache policies. Companion to +> [`concurrency.md`](concurrency.md) and [`hashing.md`](hashing.md). + +Sharding reduces contention by splitting one shared structure into N independent +substructures, each with its own lock and capacity accounting. cachekit already +uses this pattern at the data-structure and store layers. It does **not** yet +ship a generic `ShardedCache` or sharded policy wrapper. + +## Current Sharded Primitives + +| Type | Layer | Purpose | +|---|---|---| +| `ShardedHashMapStore` | store | N locked hash maps with global size counter | +| `ShardedSlotArena` | data structure | N arenas addressed by `ShardedSlotId` | +| `ShardedFrequencyBuckets` | data structure | N frequency bucket sets for concurrent LFU-style metadata | +| `ShardSelector` | helper | keyed hash routing from key to shard | + +The sharded primitives are building blocks, not full cache policies. A future +`ShardedLruCache` would have to compose a sharded key index, per-shard recency +metadata, and global capacity semantics. That composition is where the hard +policy questions live. + +## Why Shard? + +A single `RwLock` wrapper is simple and often fast enough. 
It fails when: + +- many threads mutate policy metadata (`get` on LRU, LFU, Clock); +- read paths still need atomics or lock acquisition; +- one hot lock dominates profile samples; +- cores spend more time waiting than doing cache work. + +Sharding turns one contended lock into N less-contended locks. The cost is that +each shard is now a smaller cache with less global knowledge. + +## Shard Routing + +All routing should go through [`ShardSelector`](../../src/ds/shard.rs): + +```rust,ignore +let selector = ShardSelector::randomized(16); +let shard = selector.shard_for_key(&key); +``` + +Routing requirements: + +- Deterministic within a selector: same key maps to same shard. +- Uniform: no systematic bias toward lower shards. +- Keyed: adversaries should not be able to craft keys that all land on shard 0. +- Bounded: shard count is clamped to `[1, MAX_SHARDS]`. + +Use `ShardSelector::randomized` unless reproducibility is required. If using +`ShardSelector::new(shards, seed)`, treat `seed` as secret when keys are +user-controlled. + +## Capacity Semantics + +Two capacity models are possible: + +| Model | Behaviour | Pros | Cons | +|---|---|---|---| +| Per-shard capacity | total capacity split across shards | simple, one lock per op | hit rate fragmentation | +| Global capacity | one shared capacity budget | better utilization | cross-shard locking or global victim selection | + +The primitives today mostly follow **per-shard local state with global gauges**: +each shard owns its data; aggregate `len` is tracked separately where needed. +This keeps operations single-lock. It also means a full shard can evict even if +another shard has spare room. + +That is acceptable for stores and metadata primitives. For a full cache policy, +it is a hit-rate trade-off and must be documented at the policy level. + +## Locking Discipline + +Current sharded operations acquire at most **one shard lock**. This is the most +important invariant: + +- No deadlock cycles. 
+- Lock hold time stays bounded by one shard operation. +- Callers do not need a global lock ordering table. + +Any future operation that touches two shards must define an ordering rule, for +example "lock lower shard index first." Avoid two-shard operations unless the +hit-rate improvement justifies the concurrency risk. + +## `ShardedSlotId` + +`ShardedSlotArena` cannot use a plain `SlotId`. A slot id must identify both +the shard and the local slot: + +```text +ShardedSlotId = (shard_index, local_slot_id) +``` + +This is why sharding lives at the data-structure layer instead of being hidden +behind a generic wrapper. Once a policy stores handles, the handle type is part +of the policy's metadata layout. + +## Global Metrics + +Sharded types should expose aggregate metrics but record locally when possible. +The rule: + +- Per-operation counters can be local or atomic. +- Gauges like total `len` need either an atomic aggregate or a shard scan. +- Snapshot consistency is best-effort; do not lock every shard just to make a + metrics snapshot globally atomic. + +This matches the metrics design: observability must not dominate the hot path. + +## Roadmap: `ShardedCache` + +A generic sharded cache wrapper would look roughly like: + +```rust,ignore +pub struct ShardedCache { + shards: Vec>, + selector: ShardSelector, + capacity_per_shard: usize, + _key: PhantomData, +} +``` + +Open questions: + +- Does `C` have to be constructible by `CacheFactory`, or does the builder own + all construction? +- Is capacity split evenly, weighted by shard traffic, or global? +- Do policies expose per-shard metrics only, or aggregate metrics too? +- How does `DynCache` integrate: `DynCache::Sharded(Box<...>)` or a sibling + `DynShardedCache`? +- Should shard count be caller-specified, CPU-count-derived, or both? + +The conservative first version should use per-shard capacity and one-lock +operations. Global victim selection should wait for benchmark evidence. 
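The conservative first version described above (per-shard capacity, one lock per operation) can be sketched with the standard library alone. This is not the roadmap type: `ToyShardedCache` and `Shard` are hypothetical, `std::sync::Mutex` stands in for `parking_lot::RwLock`, a randomly keyed `RandomState` stands in for `ShardSelector::randomized`, and each shard uses a toy FIFO policy.

```rust
use std::collections::hash_map::RandomState;
use std::collections::{HashMap, VecDeque};
use std::hash::{BuildHasher, Hash, Hasher};
use std::sync::Mutex;

/// One shard: a small bounded map with FIFO eviction (toy policy).
struct Shard<K, V> {
    map: HashMap<K, V>,
    order: VecDeque<K>, // front = oldest resident key
    capacity: usize,
}

/// Hypothetical sharded wrapper with per-shard capacity.
pub struct ToyShardedCache<K, V> {
    shards: Vec<Mutex<Shard<K, V>>>,
    hasher: RandomState, // randomly keyed routing hash
}

impl<K: Hash + Eq + Clone, V: Clone> ToyShardedCache<K, V> {
    pub fn new(shard_count: usize, capacity_per_shard: usize) -> Self {
        let shards = (0..shard_count.max(1))
            .map(|_| {
                Mutex::new(Shard {
                    map: HashMap::new(),
                    order: VecDeque::new(),
                    capacity: capacity_per_shard,
                })
            })
            .collect();
        ToyShardedCache { shards, hasher: RandomState::new() }
    }

    /// Deterministic within one cache instance: same key, same shard.
    fn shard_index(&self, key: &K) -> usize {
        let mut state = self.hasher.build_hasher();
        key.hash(&mut state);
        (state.finish() % self.shards.len() as u64) as usize
    }

    /// Locks exactly one shard; eviction is local to that shard.
    pub fn insert(&self, key: K, value: V) {
        let mut guard = self.shards[self.shard_index(&key)].lock().unwrap();
        let shard = &mut *guard;
        if shard.map.insert(key.clone(), value).is_some() {
            return; // existing key updated, FIFO order unchanged
        }
        shard.order.push_back(key);
        if shard.map.len() > shard.capacity {
            if let Some(oldest) = shard.order.pop_front() {
                shard.map.remove(&oldest);
            }
        }
    }

    /// Returns an owned clone because a reference cannot outlive the lock.
    pub fn get(&self, key: &K) -> Option<V> {
        let guard = self.shards[self.shard_index(key)].lock().unwrap();
        guard.map.get(key).cloned()
    }

    /// Best-effort aggregate: shards are locked one at a time.
    pub fn len(&self) -> usize {
        self.shards.iter().map(|s| s.lock().unwrap().map.len()).sum()
    }
}
```

Even this toy shows the trade-off from the capacity table: a shard evicts as soon as it is full, regardless of free room in its neighbours, so the total resident count is bounded by `shard_count * capacity_per_shard` but may sit well below it.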
+ +## When Not To Shard + +- Cache fits on one lock without contention. +- Hit rate matters more than write throughput. +- Workload has a small hot set: all hot keys may still map to one shard. +- Cache capacity is small: per-shard fragmentation dominates. +- You need globally strict eviction order (true global LRU, ARC target `p`). + +Sharding is a concurrency optimization, not a policy upgrade. + +## See Also + +- [Concurrency](concurrency.md) +- [Hashing and key identity](hashing.md) +- [Metrics](metrics.md) +- [`src/ds/shard.rs`](../../src/ds/shard.rs) +- [`src/store/hashmap.rs`](../../src/store/hashmap.rs) +- [`src/ds/slot_arena.rs`](../../src/ds/slot_arena.rs) +- [`src/ds/frequency_buckets.rs`](../../src/ds/frequency_buckets.rs) diff --git a/docs/design/trait-hierarchy.md b/docs/design/trait-hierarchy.md new file mode 100644 index 0000000..2eee6ea --- /dev/null +++ b/docs/design/trait-hierarchy.md @@ -0,0 +1,415 @@ +# Cache Trait Hierarchy + +> Status: design rationale for the trait surface in +> [`src/traits.rs`](../../src/traits.rs). Companion to the cross-cutting +> principles in [`docs/design/design.md`](design.md) §7 and the concurrency +> rationale in [`docs/design/concurrency.md`](concurrency.md). + +cachekit exposes its policies through a small, layered trait hierarchy. +One kernel trait (`Cache`) covers what every policy must do; +optional capability traits expose signals that some policies have and +others don't. This document explains why the surface is shaped this +way, what each trait promises, and how to add new capabilities without +breaking the kernel. + +## Goals + +The trait surface optimizes for four things, roughly in order: + +1. **Code written against the kernel survives a policy swap.** Users + writing `fn warm<C: Cache<K, V>>(c: &mut C, …)` can pick any of + the 18 implemented concrete policies without changing call sites. +2. 
**Optional behaviour is visible only when present.** A policy that + doesn't track frequency should not have a `frequency()` method that + returns garbage or panics. Capability traits exist so this remains + true. +3. **The kernel stays object-safe.** `Box<dyn Cache<K, V>>` is needed + for runtime dispatch (the `DynCache` enum is the chosen alternative, + but object safety keeps the door open and keeps the trait usable in + trait objects elsewhere). +4. **The read/mutate split is explicit.** `peek` and `contains` are + side-effect-free `&self` methods; `get` is `&mut self` because it + updates policy state. This drops out of point 3 but is worth naming + on its own because it shapes the concurrent surface + ([`docs/design/concurrency.md`](concurrency.md)). + +## Map of the hierarchy + +```text + ┌───────────────────────┐ + │ Cache │ object-safe kernel + │ contains, len, │ + │ capacity, peek, get, │ + │ insert, remove, │ + │ clear, is_empty │ + └───────────┬───────────┘ + │ extends + ┌───────────────┬───────────┼───────────┬──────────────────┐ + ▼ ▼ ▼ ▼ ▼ + EvictingCache VictimInspect RecencyTrack FrequencyTrack HistoryTrack + evict_one() peek_victim() touch, frequency() access_count, + recency_rank k_distance, + access_history, + k_value + + ConcurrentCache CacheFactory + CacheConfig AsyncCacheFuture (utility traits, + (unsafe marker) (constructor abstraction) (Phase 2) not extensions) +``` + +All capability traits in the upper row extend `Cache`. They +compose by being implemented additively — `LrukCache` implements +`Cache`, `RecencyTracking`, `FrequencyTracking`, **and** +`HistoryTracking` because it tracks all three signals. + +## Layer 1 — `Cache` + +The kernel trait. Every policy implements it. 
The full signature lives +in [`src/traits.rs`](../../src/traits.rs); the design decisions worth +naming are: + +```rust +pub trait Cache<K, V> { + fn contains(&self, key: &K) -> bool; + fn len(&self) -> usize; + fn is_empty(&self) -> bool { self.len() == 0 } + fn capacity(&self) -> usize; + + fn peek(&self, key: &K) -> Option<&V>; + fn get(&mut self, key: &K) -> Option<&V>; + fn insert(&mut self, key: K, value: V) -> Option<V>; + fn remove(&mut self, key: &K) -> Option<V>; + fn clear(&mut self); +} +``` + +### Object safety + +The signature deliberately avoids every feature that would break +object safety: + +- No generic methods. +- No `Self` in argument or return position outside the receiver. +- No `Self: Sized` supertrait bound, and no methods that need a + `where Self: Sized` opt-out. +- No `impl Trait` returns. + +This costs ergonomics — batch operations like `insert_many`, +`get_or_insert_with(closure)`, and `extend(iter)` stay as inherent +methods on each policy rather than landing on `Cache` itself. +That trade is intentional: keeping the trait object-safe means +`DynCache` is *able* to dispatch through it (even though the +shipped `DynCache` is an enum dispatcher rather than a trait object — +see [`design.md`](design.md) §13). It also keeps `Box<dyn Cache<K, V>>` +available for users writing test harnesses, factories, or registries +that need true type erasure. + +### `peek` vs `get` — the read/mutate split + +This is the most consequential design decision in the kernel: + +- **`peek(&self, …) -> Option<&V>`** does not update recency, frequency, + reference bits, segment placement, or any policy state. It is the + honest read. +- **`get(&mut self, …) -> Option<&V>`** is the policy-tracked read. + An LRU `get` moves the entry to MRU; an LFU `get` bumps the + frequency counter; a Clock `get` sets the reference bit. + +Three things fall out of the split: + +1. 
**`peek` is usable behind a read lock.** Concurrent wrappers + ([`docs/design/concurrency.md`](concurrency.md)) implement their + `peek` with `RwLock::read`, allowing multiple readers to proceed + in parallel. `get` requires `RwLock::write` because it mutates. +2. **`peek` is testable as a pure function.** Hit-rate measurements, + invariant assertions, and debug prints can use `peek` without + perturbing the policy. +3. **`len` / `contains` / `capacity` are also `&self`.** They live + alongside `peek` in the read-locked surface of concurrent wrappers, + for the same reason. + +`contains` is its own method — not `peek(key).is_some()` — because +some policies (S3-FIFO, ARC, CAR with ghost lists) can answer +"is this key resident?" cheaper than they can return a value reference. + +### `&V` return positions + +Returning `&V` rather than `V`-by-value or `Arc<V>` is the right +choice **for the sequential trait**. Callers who need ownership can +clone; callers who don't pay nothing. The cost shows up in concurrent +wrappers, which cannot return `&V` across a lock boundary — that's +why `Concurrent*` types deviate from `Cache` (covered in detail +in [`concurrency.md`](concurrency.md)). + +### Default methods + +Only `is_empty` has a default. Adding more defaults — even ones that +seem obviously implementable in terms of other methods — risks +silently pushing slower generic paths onto policies that have cheaper +specialised implementations but do not override the default. A +hashmap-backed `contains` is faster than a default +`peek(…).is_some()` would be because it skips fetching the value, and +that difference matters on hot lookup paths. + +## Layer 2 — Capability traits + +Each capability trait extends `Cache` and exposes a signal that +**some but not all** policies have. The rule is: + +> Implement the capability trait only when the policy genuinely +> exposes that signal. Do not stub out the methods with sentinel +> returns. 
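The rule in the quote above can be sketched with a trimmed-down pair of traits. The declarations here are re-stated locally so the example is self-contained: this `Cache` has only two of the kernel methods, `ToyLfu` is a hypothetical policy with no eviction, and `total_frequency` is an illustrative helper, none of which are cachekit items.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Trimmed-down stand-in for the kernel trait (two methods only).
pub trait Cache<K, V> {
    fn len(&self) -> usize;
    fn peek(&self, key: &K) -> Option<&V>;
}

/// Capability trait: only policies that really count accesses implement it.
pub trait FrequencyTracking<K, V>: Cache<K, V> {
    fn frequency(&self, key: &K) -> Option<u64>;
}

/// Toy frequency-counting policy (no eviction; illustration only).
pub struct ToyLfu<K, V> {
    entries: HashMap<K, (V, u64)>,
}

impl<K: Hash + Eq, V> ToyLfu<K, V> {
    pub fn new() -> Self {
        ToyLfu { entries: HashMap::new() }
    }

    pub fn insert(&mut self, key: K, value: V) {
        self.entries.insert(key, (value, 0));
    }

    /// Policy-tracked read: bumps the access counter.
    pub fn get(&mut self, key: &K) -> Option<&V> {
        let entry = self.entries.get_mut(key)?;
        entry.1 += 1;
        Some(&entry.0)
    }
}

impl<K: Hash + Eq, V> Cache<K, V> for ToyLfu<K, V> {
    fn len(&self) -> usize {
        self.entries.len()
    }

    /// Honest read: does not touch the counter.
    fn peek(&self, key: &K) -> Option<&V> {
        self.entries.get(key).map(|(v, _)| v)
    }
}

impl<K: Hash + Eq, V> FrequencyTracking<K, V> for ToyLfu<K, V> {
    fn frequency(&self, key: &K) -> Option<u64> {
        self.entries.get(key).map(|(_, n)| *n)
    }
}

/// Compiles only for caches that expose the frequency capability.
pub fn total_frequency<K, V, C: FrequencyTracking<K, V>>(cache: &C, keys: &[K]) -> u64 {
    keys.iter().filter_map(|k| cache.frequency(k)).sum()
}
```

A policy without a frequency signal simply does not implement `FrequencyTracking`, so passing it to `total_frequency` is a compile error rather than a runtime sentinel value.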
+ +### `EvictingCache: Cache` + +```rust +fn evict_one(&mut self) -> Option<(K, V)>; +``` + +Forces a single eviction by policy. Returns the evicted entry or +`None` if the cache is empty. Useful for benchmarks ("evict 1 % of +the cache and measure"), background cleanup, and capacity-on-demand +patterns. Implemented by FIFO, LIFO, LRU, FastLRU, Heap-LFU, S3-FIFO, +Clock, Clock-PRO, LRU-K, MFU, MRU, plus the LFU variants. + +Policies that **do not** implement it: ARC, CAR, NRU, Random, SLRU, +2Q. The rustdoc on `EvictingCache` lists this set explicitly. The +reason is policy-specific: + +- ARC / CAR evict via adaptive choice across two queues; "evict one + by policy" is ambiguous without an insertion that drives the + adaptation. +- NRU sweeps reference bits; an isolated `evict_one` may scan the + whole cache. +- Random has no order; users who want random eviction should call + `remove(random_key)` themselves. +- SLRU / 2Q's victim depends on which segment is over-quota, + which only happens under capacity pressure. + +The trait is `#[must_use]` on its return because dropping the evicted +entry on the floor is rarely what callers want. + +### `VictimInspectable: Cache` + +```rust +fn peek_victim(&self) -> Option<(&K, &V)>; +``` + +Read-only access to the entry that would be evicted next. Only +implemented by policies whose victim is cheap and stable to identify +without mutating state — FIFO, LIFO, LRU, FastLRU. Clock-family +policies don't implement it because identifying the victim requires +advancing the hand (a mutation). LFU-family policies don't implement +it because the heap top can be a stale entry that hasn't been popped +yet ([`LazyMinHeap`](../../src/ds/lazy_heap.rs)). + +The signature is deliberately `&self`-only. Anything that would force +`&mut self` (lazy heap rebuild, clock-hand advance, ARC adaptation) +disqualifies the policy from implementing it. 
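
To make the `&self` versus `&mut self` distinction between `peek_victim` and `evict_one` concrete, here is a small sketch over a toy FIFO. The two traits are simplified stand-ins (the real ones extend the kernel trait), and `ToyFifo` is a hypothetical type invented for illustration; the generic parameters `K` and `V` are assumed.

```rust
use std::collections::{HashMap, VecDeque};
use std::hash::Hash;

/// Simplified stand-in: the real trait extends the kernel trait.
pub trait EvictingCache<K, V> {
    /// Removes and returns the policy-chosen victim, or `None` when empty.
    #[must_use]
    fn evict_one(&mut self) -> Option<(K, V)>;
}

/// Simplified stand-in: read-only view of the next victim.
pub trait VictimInspectable<K, V> {
    fn peek_victim(&self) -> Option<(&K, &V)>;
}

/// Hypothetical toy FIFO: insertion order in a queue, values in a map.
pub struct ToyFifo<K, V> {
    order: VecDeque<K>,
    values: HashMap<K, V>,
}

impl<K: Eq + Hash + Clone, V> ToyFifo<K, V> {
    pub fn new() -> Self {
        Self { order: VecDeque::new(), values: HashMap::new() }
    }

    /// New keys go to the back of the queue; updates keep their position.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        let old = self.values.insert(key.clone(), value);
        if old.is_none() {
            self.order.push_back(key);
        }
        old
    }
}

impl<K: Eq + Hash + Clone, V> EvictingCache<K, V> for ToyFifo<K, V> {
    fn evict_one(&mut self) -> Option<(K, V)> {
        let key = self.order.pop_front()?;
        let value = self.values.remove(&key)?;
        Some((key, value))
    }
}

impl<K: Eq + Hash + Clone, V> VictimInspectable<K, V> for ToyFifo<K, V> {
    /// The queue front is the victim; finding it needs no mutation.
    fn peek_victim(&self) -> Option<(&K, &V)> {
        let key = self.order.front()?;
        self.values.get_key_value(key)
    }
}
```

Because the FIFO victim is just the queue front, repeated `peek_victim` calls return the same entry until something mutates the cache. A Clock-style policy could not offer the same guarantee without advancing its hand, which is the disqualifying mutation described above.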
+ +### `RecencyTracking: Cache` + +```rust +fn touch(&mut self, key: &K) -> bool; +fn recency_rank(&self, key: &K) -> Option; +``` + +For policies that order entries by access recency: LRU, FastLRU, +LRU-K. `touch` is `get` without the value lookup — useful when you +want to refresh recency for a key whose value you already have. +`recency_rank` returns 0 for the MRU entry and `len() - 1` for the +LRU. Both are stable across `peek`/`contains`/`len` calls but invalidate +on any `&mut` call. + +### `FrequencyTracking: Cache` + +```rust +fn frequency(&self, key: &K) -> Option; +``` + +For policies that track access frequency: LFU, Heap-LFU, MFU, LRU-K. +The `u64` return is intentional even though some policies use smaller +counters internally (LFU uses small saturating counters under +[`FrequencyBuckets`](../../src/ds/frequency_buckets.rs)) — exposing +`u64` keeps the trait stable across counter-width changes. + +### `HistoryTracking: Cache` + +```rust +fn access_count(&self, key: &K) -> Option; +fn k_distance(&self, key: &K) -> Option; +fn access_history(&self, key: &K) -> Option>; +fn k_value(&self) -> usize; +``` + +LRU-K style access-history inspection. Currently implemented only by +`LrukCache`. The `access_history` return is a `Vec` because the +history is bounded by K and callers typically inspect it as a unit; +exposing the underlying [`FixedHistory`](../../src/ds/fixed_history.rs) +would couple consumers to an internal type. + +`k_value()` is on the trait rather than as a constructor argument +witness because LRU-K's K is policy-configured and consumers writing +generic code over `HistoryTracking` need to read it without knowing +the concrete type. + +## Why capability traits, not feature flags? + +cachekit could expose recency / frequency / history through methods +on `Cache` itself, gated by Cargo features. 
It doesn't, for +three reasons: + +- **Compile-time gating doesn't match the actual gating signal.** + Whether a method is meaningful depends on the **policy**, not on + the **build**. A `policy-all` build still has policies that can't + answer `frequency()`. +- **Method-level defaults that return `None` are a footgun.** Code + that calls `cache.frequency(&k)` on an LRU cache would silently + return `None` and pass through review. +- **Trait bounds carry information.** `fn warm()` + documents at the type-system level that the function only makes + sense for frequency-tracking caches. + +The trade is one extra `use` statement at call sites — `use +cachekit::traits::{Cache, RecencyTracking};` — which is a small price +for the correctness gain. + +## Utility traits + +Three traits live alongside the hierarchy but are not extensions of +`Cache`. + +### `unsafe trait ConcurrentCache: Send + Sync` + +Marker trait, no methods. Implementing it asserts that the type +handles internal synchronization safely. Covered in detail in +[`concurrency.md`](concurrency.md#concurrentcache-marker-trait-not-capability-trait). + +### `CacheFactory` and `CacheConfig` + +```rust +pub trait CacheFactory { + type Cache: Cache; + fn new(capacity: usize) -> Self::Cache; + fn with_config(config: CacheConfig) -> Self::Cache; +} +``` + +Constructor abstraction for generic code that needs to build caches +without naming the concrete type. `CacheConfig` is a `#[non_exhaustive]` +struct with builder-style `with_*` methods, mirroring the wider +`CacheBuilder` shape in [`src/builder.rs`](../../src/builder.rs). + +In practice most code constructs caches directly (`LruCache::new(…)`) +or through `CacheBuilder`. `CacheFactory` mostly exists for test +harnesses and benchmark runners that want to parameterise across +policies; the trait's `Cache` associated type makes that ergonomic. + +### `AsyncCacheFuture: Send + Sync` + +Phase 2 placeholder. 
The methods (`supports_async_get`, +`supports_async_insert`) default to `false` and no policy overrides +them. The trait exists so that async-native policies can be added in +the future without breaking the existing surface. + +## Read/mutate split rationale (recapitulated) + +Worth stating once more in one place: the methods on `Cache` +split cleanly into two groups: + +| `&self` (read-locked-safe) | `&mut self` (write-locked) | +|----------------------------|----------------------------| +| `contains`, `len`, `is_empty`, `capacity` | `get`, `insert`, `remove`, `clear` | +| `peek` | | +| (capability) `peek_victim`, `recency_rank`, `frequency`, `access_count`, `k_distance`, `access_history`, `k_value` | (capability) `evict_one`, `touch` | + +This is the contract the concurrent wrappers rely on. Adding a new +`Cache` method that mutates state through `&self` (interior mutability) +would break the lock-granularity story; adding one that takes `&mut +self` but doesn't logically mutate would prevent the read-lock fast +path in `Concurrent*` wrappers. + +## Object safety vs. ergonomic methods + +Some operations naturally belong on `Cache` but would break +object safety. They live as inherent methods on each policy instead: + +- `extend>(&mut self, iter: I)` +- `get_or_insert_with V>(&mut self, key: K, f: F) -> &V` +- `insert_many(&mut self, items: impl IntoIterator)` + with buffer reuse + +The rule: anything taking a generic closure, generic iterator, or +returning `impl Trait` is an inherent method, not a trait method. +The trait stays object-safe; the policy types stay ergonomic. + +## Adding a new capability trait + +Checklist for new capability traits: + +1. **The signal must exist in the implementing policy's metadata.** + No defaults that return `None`/`0`/`false` for "doesn't apply." +2. **Bound on `Cache`.** Capability traits compose with the + kernel; they don't replace it. +3. **Object safety is optional for capability traits** but + recommended. 
Trait objects of capability traits show up rarely; + ergonomic generic methods are fine. +4. **Name follows the noun-of-the-signal pattern.** `RecencyTracking`, + `FrequencyTracking`, `HistoryTracking`. New ones should follow + suit: `WeightTracking`, `CostTracking`, `AdmissionTracking`. +5. **Re-export from `prelude`.** Capability traits live in the same + `use cachekit::prelude::*;` namespace as the kernel. +6. **Document the implementing-policy set.** The rustdoc on + `EvictingCache` lists policies that opt out; new traits should + do the same for the smaller set that opts in. + +## Future capability traits + +Sketched in priority order: + +- **`ExpiringCache: Cache`** — TTL surface, per + [`docs/design/ttl.md`](ttl.md) §4(a). Signature: + + ```rust + fn insert_with_ttl(&mut self, key: K, value: V, ttl: Duration) -> Option; + fn ttl_status(&self, key: &K) -> TtlStatus; + fn set_ttl(&mut self, key: &K, ttl: Duration) -> bool; + fn purge_expired(&mut self) -> usize; + ``` + + Implemented by the `Expiring` decorator over any `Cache`. + +- **`WeightTracking: Cache`** — surface for weight-aware + caches built on [`WeightStore`](../../src/store/weight.rs). Likely + signature: + + ```rust + fn weight(&self, key: &K) -> Option; + fn total_weight(&self) -> usize; + fn weight_capacity(&self) -> usize; + ``` + + Needed before GDS/GDSF (roadmap policies) can be expressed + generically. + +- **`AdmissionTracking: Cache`** — exposes ghost-list / + admission-history state for ARC, CAR, S3-FIFO, Clock-PRO, + TinyLFU. Specifically: was this key ever resident, and if so when + did it leave? Useful for adaptive workloads where the caller + wants to know whether a miss is a one-hit-wonder or a returning + member of the working set. + +The trait is intentionally not added until a second policy implements +it. The `RecencyTracking` / `FrequencyTracking` / `HistoryTracking` +naming established the convention; adding `WeightTracking` only when +GDS lands keeps the surface honest. 
+ +## See also + +- [Design overview](design.md) — §7 frames the layering at the + principles level, §13 covers `DynCache` runtime dispatch +- [Concurrency](concurrency.md) — read/mutate split + `ConcurrentCache` +- [TTL design](ttl.md) — applied example: `ExpiringCache` as a new + capability trait +- [Read-only traits](../guides/read-only-traits.md) — user-facing + guidance on the `peek` / `get` split +- [`src/traits.rs`](../../src/traits.rs) — the canonical definitions +- [`src/store/traits.rs`](../../src/store/traits.rs) — parallel + trait family at the store layer (sequential + concurrent) diff --git a/docs/design/weighted-eviction.md b/docs/design/weighted-eviction.md new file mode 100644 index 0000000..7a8dbac --- /dev/null +++ b/docs/design/weighted-eviction.md @@ -0,0 +1,388 @@ +# Weighted Eviction + +> Status: design rationale for [`WeightStore`](../../src/store/weight.rs) +> and [`ConcurrentWeightStore`](../../src/store/weight.rs). Companion to +> [`design.md`](design.md), [`concurrency.md`](concurrency.md), and the +> [`stores`](../stores/README.md) reference. + +Entry-count caps are the wrong tool when entries vary in size. A cache +sized "max 1 000 entries" that holds a mix of 100-byte thumbnails and +10 MB blobs will either overshoot its memory budget by orders of +magnitude (when blobs dominate) or waste capacity (when thumbnails do). +`WeightStore` exists to give callers a second, byte-denominated budget +alongside the entry count. + +This document explains the dual-limit model, the contract on the +user-supplied weight function, where weight integrates with eviction +policies today (it does not), and how it pre-stages GDS/GDSF on the +roadmap. + +## The problem + +A typical entry-count cache: + +- Fails to bound memory when value sizes differ by orders of magnitude. +- Cannot answer "how many bytes am I caching?" without iterating. 
+- Treats a 1 KB and a 1 MB entry as equal eviction candidates, which + is wrong when memory pressure is the binding constraint. + +The complementary failure mode (a pure byte-budgeted cache) has its +own problems: + +- Highly variable entry counts make per-entry metadata budgeting hard. +- A pathological "one giant entry fills the cache" case is the byte + version of the "millions of one-byte entries fill the cache" + problem in entry-count caches. +- Some policies (LFU bucket arrays, S3-FIFO ratios) are sized by entry + count and need a stable upper bound. + +`WeightStore` therefore enforces **both** an entry-count cap and a +weight cap; whichever is hit first triggers `StoreFull`. The user +picks the units of "weight" via a closure. + +## Dual-limit model + +```text +try_insert(key, value): + │ + ├─► Existing key (update) + │ │ + │ ├── new_weight = weight_fn(&value) + │ ├── next_total = total_weight - old_weight + new_weight + │ │ + │ └── next_total > capacity_weight? ──► Err(StoreFull) + │ └──► Ok(Some(old_value)) + │ + └─► New key (insert) + │ + ├── len() >= capacity_entries? ──► Err(StoreFull) + ├── new_weight = weight_fn(&value) + ├── total_weight + new_weight > capacity_weight? ──► Err(StoreFull) + │ + └── Ok(None) +``` + +Three properties worth naming: + +- **Pre-checked, not retroactive.** `try_insert` returns + `Err(StoreFull)` rather than silently evicting; the **store** is + full, so the caller (or the policy layered above it) decides what + to evict. +- **Updates can fail too.** Replacing a 1 MB value with a 2 MB value + grows the total by 1 MB, so on a store with only 0.5 MB of + remaining headroom it returns `StoreFull`: + the update is rejected and the original entry stays resident. This + is the only sensible behaviour when an update can push the store + past its budget. +- **Atomic weight bookkeeping.** `total_weight` is the live sum of + every resident entry's weight. Every successful `try_insert` / + `remove` / `clear` updates it; reads (`get`, `peek`) do not. 
The + invariant `total_weight == sum(entries.weight)` is debug-asserted. + +## The weight function: contract and hazards + +```rust,ignore +F: Fn(&V) -> usize +``` + +The user supplies a closure. Three pieces of the contract matter: + +- **Cheap.** Ideally O(1). The function is called on every insert and + every update. A weight function that traverses the value to compute + bytes (`|tree: &BTreeMap| tree.iter().map(…).sum()`) makes + insert latency proportional to value size. +- **Deterministic.** The same value must yield the same weight every + time. A non-deterministic weight breaks `total_weight` accounting — + the store remembers `old_weight` from the *previous* insert, so a + changed weight on update leaks `(new_actual - old_recorded)` bytes + of budget per update. +- **Non-panicking.** The function is invoked while a write lock is + held in [`ConcurrentWeightStore`](../../src/store/weight.rs). A + panicking weight function under `panic = "unwind"` poisons-by- + unwind the inner state (the lock itself is `parking_lot`'s + non-poisoning variant; what is "poisoned" is the call site, + which never completes the insert). Under the crate's default + `panic = "abort"` release profile this terminates the process. + +Common shapes: + +```rust,ignore +|v: &Vec| v.len() +|s: &String| s.len() +|img: &Image| img.width * img.height * 4 +|_: &T| 1 // entry-count only +|v: &Cow<[u8]>| v.len() // works for borrowed/owned +``` + +The "weight = 1" specialization deserves a note: it makes +`WeightStore` behave exactly like a count-only store, at the cost of +an `Arc` round-trip and per-entry weight slot. Use +`HashMapStore` for that case unless you specifically want the +`ConcurrentWeightStore` API. + +## Precomputation: weight stored per entry + +Each entry holds its weight in a small wrapper: + +```rust,ignore +struct WeightEntry { + value: Arc, + weight: usize, +} +``` + +Weight is computed **once** at insert/update time and stored alongside +the value. 
Three consequences: + +- Reads (`get`, `peek`, `contains`, `len`, `total_weight`) never + invoke the weight function. They cannot — they only have a + reference to the stored entry, and the stored entry already knows + its weight. +- `remove` updates `total_weight` by subtracting the stored weight, + with no recompute. +- Memory overhead per entry is `sizeof(usize)` + `sizeof(Arc)` — + one extra word plus the Arc header. Acceptable for variable-size + caches where the value itself dominates the per-entry footprint. + +The alternative — recomputing weight on every read for the sake of +"freshness" — would only matter if the weight function were +non-deterministic, which the contract forbids. + +## `Arc` everywhere + +`WeightStore` stores `Arc` even in the single-threaded variant: + +```rust,ignore +pub fn try_insert(&mut self, key: K, value: Arc) -> Result>, StoreFull> +pub fn get(&mut self, key: &K) -> Option> +pub fn peek(&self, key: &K) -> Option> +``` + +This is a deliberate divergence from `StoreCore` / `StoreMut` (which +return `V` directly). Three reasons: + +- **Cheap shared ownership.** Large `V`s (images, blobs) are the + target use case. Returning `Arc` lets callers hold or share the + value without forcing `V: Clone`. +- **Surface alignment with `ConcurrentWeightStore`.** The concurrent + variant must return `Arc` (the `&V`-across-lock problem from + [`concurrency.md`](concurrency.md)). Keeping the single-threaded + variant on the same shape lets callers swap between them by + changing one type without re-plumbing returns. +- **`V: !Clone` is supported.** Callers who don't want to require + `Clone` on their value type get the `Arc` round-trip "for free." + +The cost is that `WeightStore` does **not** implement `StoreCore` / +`StoreMut`. 
It is a sibling, not a subtype, of the entry-count stores +([`HashMapStore`](../../src/store/hashmap.rs), +[`SlabStore`](../../src/store/slab.rs)), and code generic over those +traits cannot accept a `WeightStore` without adaptation. This is the +single sharpest API edge in the store layer, called out explicitly in +the module documentation. + +## Why weight is at the **store** layer, not the policy layer + +The 18 implemented policies in `src/policy/` are all weight-unaware. +They count entries and evict by entry. `WeightStore` is below them in +the layering: + +```text + ┌─────────────────────────────┐ + │ policy (weight-unaware) │ evicts by recency/frequency/etc + └──────────────┬──────────────┘ + │ Cache uses store underneath + ┌──────────────▼──────────────┐ + │ WeightStore (dual limits) │ refuses inserts past weight cap + └─────────────────────────────┘ +``` + +This separation has two consequences worth understanding: + +- **The policy decides who to evict; the store decides whether the + result fits.** A policy operating over a `WeightStore` evicts its + policy-chosen victim, then attempts the insert. If the insert + still doesn't fit (one large value cannot be made room for by + evicting a single small victim), the policy must evict again or + surface `StoreFull` to the caller. +- **No policy in the tree today consumes `WeightStore` directly.** + `WeightStore` is reachable only through its own concrete API, not + through the `Cache` trait or `DynCache`. Users who want a + weight-aware cache today build one themselves on top of + `WeightStore` plus a chosen eviction strategy. + +The reason for this layering is forward compatibility. Weight-aware +**policies** (GDS, GDSF, LFU-DA, see roadmap) need this store as +their substrate. Coupling weight directly into a policy locks the +weight model to that policy; keeping it at the store layer keeps the +substrate reusable. 
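
The division of labour described above (the policy picks victims, the store decides whether the result fits) can be sketched as follows. Everything here is hypothetical and simplified for illustration: `ToyWeightStore` uses `u32` keys, plain `Vec<u8>` values, and byte length as the weight, and `insert_evicting` stands in for a FIFO policy layered on top. It is not the `WeightStore` API.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, PartialEq)]
pub struct StoreFull;

/// Hypothetical toy store with dual limits; weight is precomputed per entry.
pub struct ToyWeightStore {
    entries: HashMap<u32, (Vec<u8>, usize)>,
    total_weight: usize,
    capacity_entries: usize,
    capacity_weight: usize,
}

impl ToyWeightStore {
    pub fn new(capacity_entries: usize, capacity_weight: usize) -> Self {
        Self { entries: HashMap::new(), total_weight: 0, capacity_entries, capacity_weight }
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn total_weight(&self) -> usize {
        self.total_weight
    }

    /// Mirrors the dual-limit check: updates only test the weight cap,
    /// new keys test the entry cap and the weight cap.
    fn would_fit(&self, key: u32, new_weight: usize) -> bool {
        match self.entries.get(&key) {
            Some((_, old)) => self.total_weight - old + new_weight <= self.capacity_weight,
            None => {
                self.entries.len() < self.capacity_entries
                    && self.total_weight + new_weight <= self.capacity_weight
            }
        }
    }

    /// Pre-checked insert: never evicts, reports `StoreFull` instead.
    pub fn try_insert(&mut self, key: u32, value: Vec<u8>) -> Result<Option<Vec<u8>>, StoreFull> {
        let new_weight = value.len();
        if !self.would_fit(key, new_weight) {
            return Err(StoreFull);
        }
        let previous = self.entries.insert(key, (value, new_weight));
        let old_weight = previous.as_ref().map_or(0, |(_, w)| *w);
        self.total_weight = self.total_weight - old_weight + new_weight;
        Ok(previous.map(|(v, _)| v))
    }

    pub fn remove(&mut self, key: u32) -> Option<Vec<u8>> {
        let (value, weight) = self.entries.remove(&key)?;
        self.total_weight -= weight;
        Some(value)
    }
}

/// Toy FIFO "policy": evict the oldest keys until the insert fits.
/// Returns how many entries were evicted to make room.
pub fn insert_evicting(
    store: &mut ToyWeightStore,
    order: &mut VecDeque<u32>,
    key: u32,
    value: Vec<u8>,
) -> Result<usize, StoreFull> {
    // A value larger than the whole budget can never fit; do not drain for it.
    if value.len() > store.capacity_weight {
        return Err(StoreFull);
    }
    let mut evicted = 0;
    while !store.would_fit(key, value.len()) {
        let victim = order.pop_front().ok_or(StoreFull)?;
        store.remove(victim);
        evicted += 1;
    }
    if store.try_insert(key, value)?.is_none() {
        order.push_back(key);
    }
    Ok(evicted)
}
```

With limits of 3 entries and 10 bytes, three 2-byte entries followed by an 8-byte insert force two evictions: the first frees an entry slot, the second frees enough weight. A single policy-chosen victim is not always enough, which is why the loop is needed.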
+ +## Concurrent variant + +`ConcurrentWeightStore` follows the wrapper pattern from +[`concurrency.md`](concurrency.md): + +```rust,ignore +pub struct ConcurrentWeightStore { + inner: Arc>>, +} +``` + +`parking_lot::RwLock`; `peek` / `contains` / `len` / `total_weight` +take the read lock; `try_insert` / `remove` / `clear` take the write +lock; metrics counters live in `AtomicU64` so the read-locked paths +can still increment them without escalating. + +The weight function runs **inside the write lock** on every insert +and update. A slow `F` therefore stalls every reader and writer in +the cache — a DoS amplification vector when caching user-supplied +values. The mitigation is the cheapness contract; the rustdoc on +`ConcurrentWeightStore::try_insert` says so. + +`ConcurrentWeightStore` implements `ConcurrentStoreRead` and +`ConcurrentStore`. Unlike the single-threaded variant — which +deliberately does not implement `StoreCore`/`StoreMut` — the +concurrent variant *does* fit the concurrent trait family because +both already use `Arc` returns. The asymmetry is awkward but +honest: the trait family is shaped around the constraints the +concurrent path imposes, and the single-threaded variant happens to +borrow that shape rather than the sequential one. + +## Lock-poisoning and total-weight integrity + +Under `panic = "abort"` (the crate's release default) lock poisoning +is moot — the process exits. Under `panic = "unwind"`, the order of +operations in `clear()` matters: + +```rust,ignore +fn clear(&mut self) { + self.total_weight = 0; // (1) reset first + self.entries.clear(); // (2) then drop entries (may panic) +} +``` + +If (2) panics during entry drop, `total_weight = 0` and `len() == 0` +remain consistent post-panic. Individual values may leak through the +unwinding drop but the store's accounting cannot be corrupted into +"says it has 1 GB resident when actually empty" — which would +silently reject all future inserts. 
The module documentation calls +this out so callers who override `panic = "abort"` know what they +get. + +## Failure mode: weight cap, not entry cap + +When the weight budget is hit but the entry count is not: + +- `try_insert` returns `StoreFull` for any new key whose value would + push `total_weight` past `capacity_weight`. +- `len() < capacity_entries` — the entry budget has headroom that + cannot be used. +- `total_weight == capacity_weight` (approximately, depending on + insert sizes). + +The reverse — entry cap hit, weight cap not — produces `StoreFull` +on any new insert regardless of weight, including tiny values. + +Both are correct. The store does not silently demote either limit; +the caller's intent is "neither budget shall be exceeded," and the +store enforces it literally. + +## Capacity tuning + +The dual limits give callers two knobs: + +| Setting | Effect | +|---|---| +| `capacity_entries` finite, `capacity_weight = usize::MAX` | Behaves like an entry-count store; weight is observable but unconstrained | +| `capacity_entries = usize::MAX`, `capacity_weight` finite | Behaves like a pure byte-budget store; entry count is observable but unconstrained | +| Both finite | Hard dual limit | + +The first row is rarely what callers want (use `HashMapStore` +instead — no per-entry weight slot). The second is a legitimate +configuration for callers who genuinely want bytes-only accounting +and accept the per-entry overhead. The third is the design intent. + +## Security considerations + +The module rustdoc is unusually long on security; the points worth +naming at the design-doc level: + +- **Hasher.** `WeightStore`'s key index uses `FxHashMap`, which is + **not** HashDoS-resistant. Callers caching variable-size values + keyed by request paths, tenant IDs, or filenames — i.e. 
exactly + the use case `WeightStore` targets — should pre-hash keys with a + keyed hash (`siphasher` with a per-process key) or migrate to + `HashMapStore`'s `RandomState`-backed default. +- **Side channel.** `total_weight` is publicly readable. Callers + with access to the counter can infer the size of other tenants' + cached entries from before/after differentials. Avoid exposing + `total_weight` across trust boundaries when caching tenant-keyed + variable-size records. +- **Sensitive values.** Dropped `V`s are not zeroized. Wrap `V` in + `zeroize::Zeroizing` (or equivalent) when caching credentials. +- **Counters.** Metrics use `Relaxed` ordering and wrap on overflow + in release. Best-effort observability, not audit-grade. + +## Pre-staging GDS/GDSF + +GreedyDual-Size (GDS) and its frequency-aware variant GDSF evict by +**cost ÷ size** rather than recency or frequency alone. Both +require: + +- A per-entry size (`WeightStore` already stores it). +- A per-entry cost (caller-supplied at insert time). +- An eviction priority queue ordered by `cost / size + age`. + +`WeightStore` provides the size half today. The cost half and the +priority-queue substrate ([`LazyMinHeap`](../../src/ds/lazy_heap.rs) +is a natural fit) are the missing pieces. When GDS lands, the +expected shape is: + +```rust,ignore +pub struct GdsCache { + store: WeightStore, + queue: LazyMinHeap, + aging: AgingCounter, +} +``` + +The trait surface would be `Cache` plus a future +`WeightTracking` capability trait (sketched in +[`trait-hierarchy.md`](trait-hierarchy.md#future-capability-traits)), +giving generic code the ability to consult `weight(key)` and +`total_weight()` regardless of which policy is doing the evicting. + +The non-trivial design question, when GDS lands, is whether the +priority queue stores cost / size at insert time (cheap, can become +stale if the value's "true" cost diverges from insert-time cost) or +recomputes on demand (more expensive, but always current). 
The +current expectation is "store at insert time, document the +staleness window" — matching the precomputed-weight discipline this +store already follows. + +## When not to use `WeightStore` + +- **Uniform value sizes.** Use `HashMapStore` or `SlabStore`. The + weight slot is overhead with no benefit. +- **Hot-path latency dominates.** The weight function runs on every + insert. If `F` is non-trivial, insert latency is `F`-dominated. +- **You need a policy.** `WeightStore` is a store; policies sit + above it. A bare `WeightStore` evicts nothing on its own — it + surfaces `StoreFull` and the caller decides what to remove. Use + this directly only when the caller knows the eviction strategy + better than any built-in policy would. + +## See also + +- [Design overview](design.md) — §2 (memory layout) and §5 + (eviction) frame the trade-offs at the principles level +- [Concurrency](concurrency.md) — `ConcurrentWeightStore` follows + the standard wrapper pattern documented there +- [Cache trait hierarchy](trait-hierarchy.md) — future + `WeightTracking` capability trait sketched in + "Future capability traits" +- [Stores](../stores/README.md) and [`weight.md`](../stores/weight.md) + — reference docs for the runtime behaviour +- [Error model](error-model.md) — `StoreFull` semantics +- [`src/store/weight.rs`](../../src/store/weight.rs) — the canonical + implementation +- [Roadmap: GDS](../policies/roadmap/gds.md) and + [GDSF](../policies/roadmap/gdsf.md) — the planned consumers diff --git a/docs/index.md b/docs/index.md index 4a893c7..cbc9c14 100644 --- a/docs/index.md +++ b/docs/index.md @@ -12,6 +12,18 @@ Key features: - [Quickstart](getting-started/quickstart.md) — Install and build your first cache - [Integration guide](getting-started/integration.md) — CacheBuilder API, policy selection, thread safety - [Design overview](design/design.md) — Architectural decisions and performance principles +- [Cache trait hierarchy](design/trait-hierarchy.md) — Kernel trait, 
capability traits, read/mutate split +- [Concurrency](design/concurrency.md) — `Concurrent*` wrappers, lock discipline, sharded primitives +- [Builder and runtime dispatch](design/builder-and-dyn-dispatch.md) — `CachePolicy`, `DynCache`, enum dispatch +- [Weighted eviction](design/weighted-eviction.md) — `WeightStore`, dual limits, GDS/GDSF pre-staging +- [Metrics](design/metrics.md) — Recorder / snapshot / exporter split, Prometheus integration +- [Error model](design/error-model.md) — Panic vs `Result` discipline, four error types +- [Benchmarking design](design/benchmarking.md) — Benchmark layers, policy registry, JSON artifacts +- [Hashing and key identity](design/hashing.md) — Hasher choices, key interning, shard routing +- [Sharding](design/sharding.md) — Sharded primitives, routing, capacity semantics +- [Serialization](design/serialization.md) — `serde` surface and cache-state persistence boundaries +- [Non-goals](design/non-goals.md) — Explicit boundaries and out-of-scope features +- [TTL design](design/ttl.md) — Worked example of every principle in one feature - [API surface](guides/api-surface.md) — Module map and entrypoints ## Policies