```go
	panic("unreachable")
}

func (a *testApp) CheckTx(context.Context, *abci.RequestCheckTxV2) (*abci.ResponseCheckTxV2, error) {
```
Do we have a test where a block failed to execute and commit somewhere?
Only happy-path tests for now, until we get this to work somehow.
```go
return scope.Run(ctx, func(ctx context.Context, s scope.Scope) error {
	// Spawn outbound connections dialing.
	for _, addr := range r.cfg.ValidatorAddrs {
		s.Spawn(func() error {
```
I was trying this PR on #3234. There was one error and the whole Giga Router stopped, with no retries.
AI explains:
Spawn uses SpawnBg, which calls s.Cancel(err) if the task returns an error. This cancels the entire scope, so all other goroutines get cancelled.
So if runExecute (spawned with SpawnNamed, which uses Spawn) returns an error early, the whole scope is cancelled, including the dial retry loops.
Does that make sense?
Execution failure is a valid termination reason: it means that we are not able to execute a finalized block.
**Testing PR #3224 with autobahn integration tests**

Tested this PR using the autobahn docker cluster setup (PR #3220 + integration tests from #3234). Found the following issues:

1. Panic in
**Follow-up on Bug #1: Missing Commit after InitChain**

Dug deeper into the root cause. The panic in

```go
// giga_router.go runExecute()
if last == 0 {
	if _, err := app.InitChain(ctx, r.cfg.GenDoc.ToRequestInitChain()); err != nil {
		return fmt.Errorf("App.InitChain(): %w", err)
	}
	// No Commit() here!
}
for n := next; ; n += 1 {
	b, err := r.data.GlobalBlock(ctx, n)
	// ...
	if vals := r.cfg.App.GetValidators(); len(vals) > 0 { // PANIC
```

In CometBFT's normal flow, the handshaker (
**Follow-up: Height mismatch after InitChain + Commit fix**

Tested the latest at

The issue is in

```go
if last == 0 {
	if _, err := app.InitChain(ctx, ...); err != nil { ... }
	if _, err := app.Commit(ctx); err != nil { ... }
	next, ok = utils.SafeCast[atypes.GlobalBlockNumber](r.cfg.GenDoc.InitialHeight) // next = 1
}
for n := next; ; n += 1 {
	// executeBlock at height 1, but app already committed height 1 via InitChain+Commit
	// app expects height 2
```

Fix: after
```go
	return fmt.Errorf("app.InitChain(): %w", err)
}
if _, err := app.Commit(ctx); err != nil {
	return fmt.Errorf("app.Commit(): %w", err)
```
Tried again and hit another panic:

```
panic: invalid height: 1; expected: 2
```

BeginBlock at app/abci.go:33 expects height 2 but got height 1. This is because InitChain + Commit already advanced the app state to height 1, so when executeBlock tries to finalize block 1, the app expects block 2.
Force-pushed from 6453164 to 6a2cd95.
The test produces blocks but hangs waiting for all txs to finalize. This is a pre-existing consensus issue introduced in #3224, not related to the InitChain fix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The mempool starts at height=-1, so txs submitted before the first block get height=-1. With TTLNumBlocks=10, purgeExpiredTxs instantly evicts them on the first Update(blockHeight=InitialHeight) because -1 < InitialHeight-10 when InitialHeight is large (random up to 100k). This is a pre-existing race from #3224: the producer must drain the mempool before the first block is executed, otherwise txs are lost. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* main: (51 commits)
  - Giga storage integration test (#3268)
  - test(flatkv): add flatkv integration testings (#3262)
  - perf(app): reuse decoded transactions across ProcessProposalHandler hot path (#3257)
  - Fix of the proto conv testing (#3261)
  - FlatKV refactor for state sync import + export (#3250)
  - Validate block part index matches proof index (CON-20) (#3256)
  - fix: add retry to apt-get update in Docker CI (#3264)
  - fix: autobahn InitChain, GetValidators, and mempool TTL (CON-249) (#3243)
  - Fix buffer offset in ProposerPriorityHash (CON-200) (#3255)
  - Handle error case in light client divergence detector (#3254)
  - perf(evmrpc): eliminate redundant block fetches in simulate backend (#3208)
  - fix(evmrpc): omit notifications from legacy JSON-RPC batch responses per spec (#3246)
  - fix: deduplicate block fetch in getTransactionReceipt (#3244)
  - Made autobahn producer use TxMempool (#3224)
  - Skip signature event building during Cosmos CheckTx/ReCheckTx (#3230)
  - Regenerate changelog in prep to tag v6.4.2 (#3240)
  - Fix receipt default retention (#3237)
  - feat(flatkv): introduce module-prefix physical keys across all FlatKV DBs (#3229)
  - added a ProposerAddress check to setProposal CON-250 (#3232)
  - feat: add AUTOBAHN option to local docker cluster (CON-247) (#3220)
  - ...
Also