Skip to content

[DEV-1818] Fix batchAppend cross-stream error contamination#523

Merged
w1am merged 3 commits into
masterfrom
fix/batch-append-stale-response
Jun 25, 2026
Merged

[DEV-1818] Fix batchAppend cross-stream error contamination#523
w1am merged 3 commits into
masterfrom
fix/batch-append-stale-response

Conversation

@w1am

@w1am w1am commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator

batchAppend used a single process-global promise bank shared by every client and stream. A transport error on one stream rejected and cleared in-flight appends belonging to unrelated streams, and a late data response after the bank was cleared crashed with TypeError: undefined is not iterable.

Each stream now owns its own promise bank, so an error only affects appends on the stream that actually failed. A data response with no matching correlation id is dropped (and logged via debug.command_grpc) instead of crashing.

Regression test asserts that when one stream errors:

  • a sibling stream's in-flight append still succeeds, and
  • the erroring stream's own in-flight append rejects.

Both appends are asserted still-pending immediately before the error is emitted, so a fast server response surfaces as a failure rather than a silent pass.

@w1am w1am force-pushed the fix/batch-append-stale-response branch 4 times, most recently from 7c46e23 to 27afe28 Compare June 24, 2026 06:27
@w1am w1am force-pushed the fix/batch-append-stale-response branch from 27afe28 to bb9cb4a Compare June 24, 2026 06:30
@w1am w1am marked this pull request as ready for review June 25, 2026 08:06
@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Scope batchAppend promise bank per stream to prevent cross-stream contamination
🐞 Bug fix 🧪 Tests 🕐 20-40 Minutes

Grey Divider

Description

• Make batchAppend correlation tracking per-stream so transport errors only reject that stream’s
 appends.
• Drop and debug-log late/unknown data responses instead of crashing on missing correlation ids.
• Add regression test ensuring sibling streams remain unaffected by another stream’s error.
Diagram

graph TD
  A["App code"] --> B["StreamsClient"] --> C["batchAppend()"] --> D["gRPC batch stream"] --> E["data/error events"] --> C
  C --> F["PromiseBank (per stream)"]
  E --> G["debug.command_grpc"]

  subgraph Legend
    direction LR
    _mod["Module/Function"] ~~~ _svc["Client/Service"] ~~~ _ext["External/Runtime"]
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Global PromiseBank with composite key (streamId + correlationId)
  • ➕ Avoids additional WeakMap state keyed by stream instance
  • ➕ Keeps all in-flight tracking centralized
  • ➖ Requires a robust stream identifier and careful lifecycle cleanup
  • ➖ Still risks accidental cross-stream rejection if error handling clears broadly
2. Never clear on transport error; only reject known in-flight entries
  • ➕ Eliminates crash risk from late responses without needing per-stream banks
  • ➕ Preserves tracking for unrelated items if error handling is too aggressive
  • ➖ Can leak unresolved promises if the stream is terminally broken
  • ➖ More complex to reason about; still benefits from per-stream scoping

Recommendation: The chosen per-stream PromiseBank is the most robust and simplest isolation boundary: it prevents cross-stream contamination by construction, aligns with stream lifecycle (WeakMap keyed by the stream object), and makes error handling localized. Keeping the added guard for unknown correlation ids (drop + debug log) is also appropriate to harden against late/out-of-order events.

Files changed (2) +82 / -7

Bug fix (1) +35 / -7
batchAppend.tsIsolate batchAppend correlation tracking per gRPC stream +35/-7

Isolate batchAppend correlation tracking per gRPC stream

• Replaces the process-global correlation PromiseBank with a per-stream bank stored in a WeakMap. Adds a guard to drop and debug-log responses whose correlationId is not found (e.g., late data after an error cleared state), preventing 'undefined is not iterable' crashes.

packages/db-client/src/streams/appendToStream/batchAppend.ts

Tests (1) +47 / -0
appendToStream-batch-append.test.tsRegression test for sibling streams surviving a stream transport error +47/-0

Regression test for sibling streams surviving a stream transport error

• Adds coverage verifying that a transport error on one batch stream rejects only that stream’s in-flight append, while a sibling client/stream’s append still succeeds. Explicitly asserts both appends are still pending immediately before emitting the error to avoid false positives from fast server responses.

packages/test/src/streams/appendToStream-batch-append.test.ts

@qodo-code-review

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Grey Divider

Great, no issues found!

Qodo reviewed your code and found no material issues that require review

Grey Divider

Qodo Logo

@w1am w1am changed the title Fix batchAppend cross-stream error contamination [DEVEX-1818] Fix batchAppend cross-stream error contamination Jun 25, 2026
@w1am w1am changed the title [DEVEX-1818] Fix batchAppend cross-stream error contamination [DEV-1818] Fix batchAppend cross-stream error contamination Jun 25, 2026
@linear-code

linear-code Bot commented Jun 25, 2026

Copy link
Copy Markdown

DEV-1818

@w1am w1am merged commit 12099c3 into master Jun 25, 2026
85 of 88 checks passed
@w1am w1am deleted the fix/batch-append-stale-response branch June 25, 2026 08:35
w1am added a commit that referenced this pull request Jun 25, 2026
* Scope batchAppend promise bank per stream
* Add debug log for dropped batchAppend responses
* Add batchAppend coverage for erroring stream's own appends
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant