Summary
emergent-time is published to crates.io (v2.2.4) and is on branch claude/emergent-time-quantum-db4z51 (PR #561). This issue tracks its current state and remaining work.
Related ADR: docs/adr/ADR-251-agentic-time.md
Current state
- 72 tests pass; dependency-free; clippy clean.
- Physics: Page–Wootters and thermal time faithful (independent recomputation); Wheeler–DeWitt and entropic time have discriminating tests added.
- Performance: Page–Wootters evolution ~53× faster via cached eigenbasis, physics values unchanged.
- Agentic Time: diagnostic layer (per-channel attribution, ATI, 7-state classifier) implemented and tested.
Measured benchmark result
On the two available real agent traces, with pre-registered thresholds and the agentic clock blinded to the predicted error signal:
| Detector | Agentic clock vs fair baseline |
|---|---|
| Fixed-window mean + 3σ | 0 win / 1 tie / 1 loss |
| Adaptive Page–Hinkley | 0 win / 0 tie / 2 loss |
The agentic clock does not lead the fair baseline. The ADR baseline-dominance gate is marked unmet.
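For readers unfamiliar with the two baselines, the following is a minimal sketch of how a fixed-window mean + 3σ detector and a Page–Hinkley detector can be written, and how alarm times might be scored as win/tie/loss. It is not the code in examples/real_trace_eval.rs: the function names, the `delta`/`lambda` parameters, and the "earlier alarm wins" scoring rule are assumptions made for illustration only.

```rust
use std::cmp::Ordering;

/// Fixed-window baseline (illustrative): estimate mean and standard deviation
/// from the first `window` samples, then return the index of the first later
/// sample that exceeds mean + 3 sigma.
pub fn fixed_window_alarm(signal: &[f64], window: usize) -> Option<usize> {
    if window < 2 || signal.len() <= window {
        return None;
    }
    let head = &signal[..window];
    let n = window as f64;
    let mean = head.iter().sum::<f64>() / n;
    // Sample variance with the (n - 1) denominator.
    let var = head.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let threshold = mean + 3.0 * var.sqrt();
    for (i, &x) in signal.iter().enumerate().skip(window) {
        if x > threshold {
            return Some(i);
        }
    }
    None
}

/// Page-Hinkley test for an upward shift (illustrative): accumulate deviations
/// from the running mean minus a tolerance `delta`, and alarm when the
/// cumulative sum rises more than `lambda` above its running minimum.
pub fn page_hinkley_alarm(signal: &[f64], delta: f64, lambda: f64) -> Option<usize> {
    let mut mean = 0.0_f64;
    let mut cum = 0.0_f64;
    let mut min_cum = 0.0_f64;
    for (i, &x) in signal.iter().enumerate() {
        mean += (x - mean) / (i as f64 + 1.0); // running mean including x
        cum += x - mean - delta;
        min_cum = min_cum.min(cum);
        if cum - min_cum > lambda {
            return Some(i);
        }
    }
    None
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Win,
    Tie,
    Loss,
}

/// Assumed scoring rule: the clock wins if it alarms strictly earlier than the
/// baseline, ties on equal alarm index (or if neither alarms), loses otherwise.
pub fn compare_lead(clock: Option<usize>, baseline: Option<usize>) -> Outcome {
    match (clock, baseline) {
        (Some(c), Some(b)) => match c.cmp(&b) {
            Ordering::Less => Outcome::Win,
            Ordering::Equal => Outcome::Tie,
            Ordering::Greater => Outcome::Loss,
        },
        (Some(_), None) => Outcome::Win,
        (None, Some(_)) => Outcome::Loss,
        (None, None) => Outcome::Tie,
    }
}
```

Under this assumed rule, each trace contributes one outcome per baseline detector: run the agentic clock and the baseline over the same trace, pass both alarm indices to `compare_lead`, and tally the results. The thresholds (`window`, `delta`, `lambda`) would be fixed before looking at the traces, in keeping with the pre-registration described above.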
Follow-up work
- Larger real-trace corpus. The early-warning evaluation has n=2. A larger, pre-registered, instrumented corpus of real agent failures is the only path to establishing (or ruling out) an early-warning lead. The harness exists (examples/real_trace_eval.rs); the gap is data.
- Channel instrumentation. The current channel mappings are heuristic proxies extracted from session transcripts. Instrumenting an agent runtime directly (emitting belief/memory/retrieval/goal/contradiction/plan deltas natively) would remove proxy noise.
- Provenance hash. The witness chain uses FNV-1a, which is an integrity/corruption check, not adversarial tamper-resistance. If the provenance is to resist a motivated tamperer, replace it with a keyed cryptographic hash (e.g. BLAKE3 keyed mode, as in rvm-witness). The current docstring states this limitation; the README wording "tamper-evident" should be narrowed to "integrity/provenance."
- Wheeler–DeWitt non-trivial kernel. The discriminating test confirms that a generic clock yields an empty kernel. A stronger demonstration would construct a kernel that arises from a genuine (non-energy-matched) resonance rather than by construction.
- Scale regime. Performance and recall numbers were measured only at small n (≤32). The numerical core's behavior at larger n (eigensolver convergence, exponentiation cost) is untested.
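To make the provenance-hash item concrete, here is a minimal sketch of an FNV-1a hash chain. It shows why an unkeyed hash catches accidental corruption but not a tamperer who recomputes the chain. The chaining layout (previous link as little-endian bytes followed by the record) and all function names are hypothetical; the crate's actual witness format may differ. Only the 64-bit FNV-1a constants are standard.

```rust
// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a: XOR each byte into the state, then multiply by the prime.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h = FNV_OFFSET;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}

/// Hypothetical link rule: hash(previous link as LE bytes || record).
pub fn chain_link(prev: u64, record: &[u8]) -> u64 {
    let mut buf = Vec::with_capacity(8 + record.len());
    buf.extend_from_slice(&prev.to_le_bytes());
    buf.extend_from_slice(record);
    fnv1a64(&buf)
}

/// Build one link per record, starting from an all-zero genesis link.
pub fn build_chain(records: &[&[u8]]) -> Vec<u64> {
    let mut links = Vec::with_capacity(records.len());
    let mut prev = 0u64;
    for r in records {
        prev = chain_link(prev, r);
        links.push(prev);
    }
    links
}

/// Recompute the chain and compare it against the stored links.
pub fn verify_chain(records: &[&[u8]], links: &[u64]) -> bool {
    build_chain(records).as_slice() == links
}
```

If a record is altered while the stored links are left alone, `verify_chain` returns false, which is the integrity check. Because the hash has no secret key, anyone who can rewrite the records can also call `build_chain` on the altered records and store the new links, after which verification passes. A keyed cryptographic hash closes that gap only for parties who do not hold the key.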
Non-goals
- The crate does not claim an early-warning lead over fair baselines; that is unproven on real data.
- The physics modules are finite-dimensional toy realizations of the formalisms, not claims of new physics.