fix(cluster): sign DAG actions on cluster nodes + deterministic genesis (#99) #100
Merged
Fixes #99. Cluster nodes rejected DAG writes with "missing Ed25519 signature": the DAG enable + author + signing-key setup lived only in `main()`, so library and integration-test callers that go through `init_cluster` got an un-enabled, unsigned DAG. Extract that setup into `cluster_init::ensure_dag_ready` and call it from `init_cluster` (and from `main` only in standalone mode, to avoid double initialization). That exposed a second, deeper bug: each node created its DAG genesis with a wall-clock timestamp, so genesis hashes diverged across nodes and replicated actions failed parent validation on followers (`DagStore::put` requires every parent to exist locally). Make the genesis deterministic (fixed epoch timestamp) so every fresh node computes the same genesis hash and cross-node replication validates. Existing persistent DAGs are unaffected (`init_or_migrate` is a no-op once actions exist). Also harden the 3-node integration test to poll for replication instead of assuming a fixed 2s delay. Verified: all 3 cluster integration tests pass; aingle_graph DAG tests (76) pass; default and `dag` builds compile cleanly.
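The switch from a fixed delay to polling can be sketched as follows. This is a minimal illustration, not the code from the PR: `poll_until` is a hypothetical helper, and the closure is a stand-in for whatever query the integration test actually issues against a follower.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: re-run `check` every `interval` until it returns true
/// or `timeout` has elapsed. Returns whether the condition was ever observed.
fn poll_until<F: FnMut() -> bool>(timeout: Duration, interval: Duration, mut check: F) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}

fn main() {
    // Stand-in for "does the follower see the replicated triple yet?".
    // Here the condition simply becomes true on the third probe.
    let mut probes = 0;
    let replicated = poll_until(Duration::from_secs(15), Duration::from_millis(10), || {
        probes += 1;
        probes >= 3
    });
    println!("replicated={replicated} after {probes} probes");
}
```

Compared with a fixed sleep, this returns as soon as the condition holds and only fails after the full timeout, so a slow CI machine gets more headroom without slowing down the common case.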
Summary

Fixes #99. Cluster DAG writes were rejected with a 500 ("DagAction rejected: missing Ed25519 signature"), failing the cluster integration tests.

Two root causes, both fixed:

1. Signing setup was binary-only. The DAG enable + author + Ed25519 signing-key initialization lived only in `main()`. Library/test callers that go through `cluster_init::init_cluster` (bypassing `main`) got an un-enabled, unsigned DAG, so cluster writes were emitted unsigned and rejected by the Raft state machine. → Extracted into `cluster_init::ensure_dag_ready`, called from `init_cluster` and from `main` (standalone mode only, to avoid double init).
2. Genesis hash diverged across nodes (exposed once root cause 1 was fixed). Each node built its DAG genesis with a wall-clock `Utc::now()` timestamp, so genesis hashes differed per node. A follower then failed to validate a replicated action's parent (the leader's genesis), since `DagStore::put` requires every parent to exist locally, so the triple never materialized on followers. → Made the genesis deterministic (fixed epoch timestamp) so every fresh node computes the same genesis hash and cross-node replication validates. Existing persistent DAGs are unaffected (`init_or_migrate` is a no-op once actions exist).

Also hardened `test_three_node_cluster_replication` to poll for replication (up to 15s) instead of assuming a fixed 2s delay.

Test plan

- `cargo test -p aingle_cortex --features dag --test cluster_integration_test`: 3/3 pass (`single_node_bootstrap`, `three_node_replication`, `wal_stats`); previously 1/3.
- `cargo test -p aingle_graph --features dag dag`: 76 pass (deterministic genesis breaks no existing DAG test).
- `cargo build -p aingle_cortex` (default) and `--features dag`: compile cleanly.

Notes

- `aingle_graph` genesis semantics (core crate): the change only affects newly created genesis actions and is required for cluster DAG consistency.
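
To illustrate why a per-node genesis timestamp breaks replication, here is a toy model, assuming a content-addressed store whose `put` rejects actions with unknown parents. `Action`, `ToyStore`, `action_hash`, and `replicates` are all hypothetical names for this sketch and do not mirror the real `aingle_graph` types; `DefaultHasher` stands in for the real content hash and is not cryptographic.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Hypothetical, simplified action: its hash covers timestamp, parents and payload.
#[derive(Hash)]
struct Action {
    timestamp_secs: u64,
    parents: Vec<u64>,
    payload: String,
}

// Stand-in content hash. `DefaultHasher::new()` uses fixed keys, so equal
// inputs always hash to the same value.
fn action_hash(a: &Action) -> u64 {
    let mut h = DefaultHasher::new();
    a.hash(&mut h);
    h.finish()
}

fn genesis(timestamp_secs: u64) -> Action {
    Action { timestamp_secs, parents: vec![], payload: "genesis".to_string() }
}

// Toy store that only remembers which action hashes exist locally.
struct ToyStore {
    known: HashSet<u64>,
}

impl ToyStore {
    fn with_genesis(timestamp_secs: u64) -> Self {
        let mut known = HashSet::new();
        known.insert(action_hash(&genesis(timestamp_secs)));
        ToyStore { known }
    }

    // Like the rule described above: every parent must already exist locally.
    fn put(&mut self, a: &Action) -> Result<u64, String> {
        if let Some(missing) = a.parents.iter().find(|p| !self.known.contains(*p)) {
            return Err(format!("missing parent {missing:x}"));
        }
        let h = action_hash(a);
        self.known.insert(h);
        Ok(h)
    }
}

// Does a write parented on the leader's genesis validate on a follower?
fn replicates(leader_genesis_ts: u64, follower_genesis_ts: u64) -> bool {
    let leader_genesis = action_hash(&genesis(leader_genesis_ts));
    let write = Action {
        timestamp_secs: 1_700_000_000,
        parents: vec![leader_genesis],
        payload: "triple".to_string(),
    };
    let mut follower = ToyStore::with_genesis(follower_genesis_ts);
    follower.put(&write).is_ok()
}

fn main() {
    // Wall-clock genesis: nodes start at slightly different times.
    println!("wall-clock genesis replicates: {}", replicates(1_700_000_000, 1_700_000_001));
    // Deterministic genesis: every fresh node uses the same fixed timestamp.
    println!("fixed-epoch genesis replicates: {}", replicates(0, 0));
}
```

In this model the only thing that changes between the two calls is the genesis timestamp, which is enough to flip parent validation on the follower from failure to success.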