fix: autobahn InitChain, GetValidators, and mempool TTL (CON-249)#3243
Merged
wen-coding merged 9 commits intomainfrom Apr 16, 2026
Merged
fix: autobahn InitChain, GetValidators, and mempool TTL (CON-249)#3243wen-coding merged 9 commits intomainfrom
wen-coding merged 9 commits intomainfrom
Conversation
|
The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3243 +/- ##
==========================================
- Coverage 59.29% 59.26% -0.03%
==========================================
Files 2070 2070
Lines 169775 169701 -74
==========================================
- Hits 100663 100578 -85
- Misses 60321 60329 +8
- Partials 8791 8794 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
Contributor
|
this pr seems to be a noop, no? |
Contributor
Author
Sorry, still working on it, Claude pushed before it's ready. I'll ping you when it's ready for review. |
d272836 to
85e690c
Compare
2 tasks
Without Commit after InitChain, the staking params are not persisted to the committed store. When executeBlock calls app.GetValidators(), it reads from the committed store and panics with "UnmarshalJSON cannot decode empty bytes". Also removes the testApp Committed check in FinalizeBlock since the extra Commit after InitChain changes the Committed state flow. Note: TestGigaRouter_FinalizeBlocks is pre-broken on main (times out at 120s) — this is not caused by this change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
85e690c to
8be74ab
Compare
GetValidators now reads from the committed store first and only falls back to DeliverContext when the committed store is empty (autobahn-only path after InitChain before first Commit). This avoids changing the code path for CometBFT consensus. Remove the Committed field from testApp. The original boolean conflated two states: "post-InitChain" and "post-Commit". It blocked autobahn's valid flow (InitChain → FinalizeBlock without intermediate Commit) because InitChain set Committed=true, causing the next Commit after FinalizeBlock to fail with "double commit". The real app guards against actual double commit structurally (deliverState becomes nil after Commit, so a second Commit panics on nil pointer) — not via a boolean flag. The testApp is too simple to replicate that mechanism and doesn't need to. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CometBFT handshaker (consensus/replay.go) always calls InitChain when appHeight==0. Having runExecute also call InitChain caused a double-init that corrupted state. Now runExecute skips InitChain and relies on the handshaker having already set up deliverState. GetValidators: at height 0, read from DeliverContext first since the committed store is empty (staking params panic on MaxValidators). After the first Commit, the committed store has the params and the normal path is used. Test: simulate the handshaker by calling InitChain before router.Run(). Verified with docker AUTOBAHN=true cluster: all 4 nodes produce blocks, bank transfer works, fault tolerance (1-down continues, 2-down halts). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Match the real SDK behavior: InitChain without Commit leaves LastBlockHeight=0. Previously the testApp returned InitialHeight-1 after InitChain with 0 blocks, causing runExecute to enter the restart path and call PushAppHash with a block number below the data layer's starting point. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5982d5f to
469a247
Compare
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The mempool starts at height=-1, so txs submitted before the first block get height=-1. With TTLNumBlocks=10, purgeExpiredTxs instantly evicts them on the first Update(blockHeight=InitialHeight) because -1 < InitialHeight-10 when InitialHeight is large (random up to 100k). This is a pre-existing race from #3224: the producer must drain the mempool before the first block is executed, otherwise txs are lost. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sei-will
approved these changes
Apr 15, 2026
pompon0
reviewed
Apr 16, 2026
pompon0
reviewed
Apr 16, 2026
pompon0
reviewed
Apr 16, 2026
- Move TTL disable (TTLNumBlocks=0, TTLDuration=0) into TestConfig() so all mempool tests benefit, per reviewer request. - Restore Committed field in testApp to verify GigaRouter calls Commit at the right moments. InitChain now sets Committed=true (matching the handshaker flow: InitChain → FinalizeBlock → Commit). - Document TestInitChainCommitThenFinalize as a contract test for testApp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pompon0
reviewed
Apr 16, 2026
| cfg := DefaultConfig() | ||
| cfg.CacheSize = 1000 | ||
| cfg.DropUtilisationThreshold = 0.0 | ||
| // Disable TTL purging in tests: the mempool starts at height=-1, so txs |
Contributor
There was a problem hiding this comment.
nit: this test config is used in various context, so the comment doesn't seem relevant.
pompon0
approved these changes
Apr 16, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
yzang2019
added a commit
that referenced
this pull request
Apr 20, 2026
* main: (51 commits) Giga storage integration test (#3268) test(flatkv): add flatkv integration testings (#3262) perf(app): reuse decoded transactions across ProcessProposalHandler hot path (#3257) Fix of the proto conv testing (#3261) FlatKV refactor for state sync import + export (#3250) Validate block part index matches proof index (CON-20) (#3256) fix: add retry to apt-get update in Docker CI (#3264) fix: autobahn InitChain, GetValidators, and mempool TTL (CON-249) (#3243) Fix buffer offset in ProposerPriorityHash (CON-200) (#3255) Handle error case in light client divergence detector (#3254) perf(evmrpc): eliminate redundant block fetches in simulate backend (#3208) fix(evmrpc): omit notifications from legacy JSON-RPC batch responses per spec (#3246) fix: deduplicate block fetch in getTransactionReceipt (#3244) Made autobahn producer use TxMempool (#3224) Skip signature event building during Cosmos CheckTx/ReCheckTx (#3230) Regenerate changelog in prep to tag v6.4.2 (#3240) Fix receipt default retention (#3237) feat(flatkv): introduce module-prefix physical keys across all FlatKV DBs (#3229) added a ProposerAddress check to setProposal CON-250 (#3232) feat: add AUTOBAHN option to local docker cluster (CON-247) (#3220) ...
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes autobahn
runExecuteso the docker cluster (AUTOBAHN=true) starts and produces blocks, and fixes a test hang inTestGigaRouter_FinalizeBlocks.Root cause 1 — double InitChain: The CometBFT handshaker (
consensus/replay.go) always callsInitChainwhenappHeight==0. The autobahnrunExecutewas also callingInitChain— this double-init corrupted app state. Additionally,GetValidatorspanicked reading staking params from the empty committed store before the firstCommit.Root cause 2 — mempool TTL eviction: The mempool starts at
height=-1. Txs submitted before the first block inheritheight=-1. WithTTLNumBlocks=10,purgeExpiredTxsinstantly evicts all txs on the firstUpdate(blockHeight=InitialHeight)because-1 < InitialHeight-10whenInitialHeightis large (random up to 100k in the test). This is a race: the producer must drain the mempool before the first block executes, otherwise txs are silently lost.What changed
giga_router.go— remove InitChain/Commit fromrunExecute: The CometBFT handshaker ownsInitChain.runExecutenow just setsnext = InitialHeightwhenlast == 0, relying on thedeliverStatethe handshaker already set up. Includes aWARNINGcomment about state sync incompatibility.app/app.go—GetValidatorsusesDeliverContextat height 0: After the handshaker'sInitChainbut before the firstCommit, the committed store is empty (staking params not persisted). Reading from it panics inMaxValidatorswith "UnmarshalJSON cannot decode empty bytes". AtLastBlockHeight() == 0, we read fromDeliverContextinstead, which has the uncommitted staking state.baseapp/test_helpers.go— addDeliverContext()accessor: Returns the currentdeliverStatecontext, or nil if not set.giga_router_test.go— align testApp with real SDK:Committedfield:InitChainnow setsCommitted=truesoFinalizeBlockcan follow without an intermediateCommit(matching the CometBFT handshaker flow). Double-commit is still caught.Info()returnsheight=0before first Commit (matches real SDK).TestGigaRouter_FinalizeBlockscallsInitChainbeforerouter.Run()to simulate the CometBFT handshaker.TestInitChainCommitThenFinalizecontract test for testApp.mempool/testonly.go— disable TTL inTestConfig(): SetsTTLNumBlocks=0andTTLDuration=0to prevent spurious tx eviction in all mempool tests.Test plan
TestInitChainCommitThenFinalizepassesTestGigaRouter_FinalizeBlockspasses (was hanging in CI)AUTOBAHN=truecluster: all 4 nodes produce blocks in sync🤖 Generated with Claude Code