fix(stream): reduce keepalive interval and handle sink closed gracefully #68
85339098-afk wants to merge 1 commit into
Two root causes identified for 'stream disconnected before completion' errors when using MiMo with CodeX (issues 7as0nch#60, 7as0nch#65):

1. The 15-second keepalive interval is too long for MiMo thinking mode, which can have a 30s+ first-token delay. Reduced to 5 seconds and made configurable via the MIMO2CODEX_KEEPALIVE_MS env var.
2. When the sink closes (CodeX times out), the for-await loop returns immediately without yielding to the event loop, leaving the upstream ChatStream generator cancelled without proper cleanup. Added a setImmediate-based yield for graceful cancellation.
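For illustration only, a minimal sketch of the keepalive mechanism in item 1. It assumes the stream is delivered as server-sent events and that a comment frame is an acceptable keepalive; `startKeepalive`, `KEEPALIVE_FRAME`, and the frame text are hypothetical names, not taken from the PR diff.

```typescript
// Hypothetical names throughout; the real PR may structure this differently.
// In SSE, a line starting with ":" is a comment, which clients ignore.
const KEEPALIVE_FRAME = ": keepalive\n\n";

function startKeepalive(
  write: (chunk: string) => void,
  intervalMs: number,
): () => void {
  // Write one keepalive frame every intervalMs until stopped.
  const timer = setInterval(() => write(KEEPALIVE_FRAME), intervalMs);
  // The caller must invoke the returned function when the stream ends
  // or the sink closes, otherwise the timer keeps firing.
  return () => clearInterval(timer);
}

// Usage with an in-memory sink standing in for the HTTP response.
const sent: string[] = [];
const stopKeepalive = startKeepalive((chunk) => sent.push(chunk), 5000);
stopKeepalive(); // cleared before the first tick, so nothing is written
```

With a 5-second interval, a 30s+ first-token delay is covered by several keepalive frames rather than one or two.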
Thanks for digging into #60/#65 🙏 Splitting the two changes:

Change 1 (keepalive 15s→5s + env override): 👍 happy to take this. Lowering the keepalive cadence is very plausibly the actual root cause for the disconnects under MiMo thinking mode. One nit: Number(x) || 5000 doesn't guard against negatives. MIMO2CODEX_KEEPALIVE_MS=-1 becomes setInterval(fn, -1), which Node clamps to 1ms (keepalive storm). Could you wrap it, e.g. Math.max(1000, Number(...) || 5000)?

Change 2 (setImmediate yield): I don't think this does what the comment says. Could you double-check that it actually helps in isolation? Walking the call chain: upstream cancellation is already handled. req.on("close") → ac.abort() (server.ts:737-738) aborts the upstream fetch(..., { signal }), so the connection is torn down regardless of this yield.

Last thing: per the repo's change-log rule, this needs entries in doc/tag-log.md + doc/tag-log.zh.md and a ReleaseHighlight in web/src/release-notes.tsx. Could you add those for the keepalive change? Then this is good to merge. Thanks again!
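A minimal sketch of the clamped parsing suggested for Change 1. The helper name `resolveKeepaliveMs` and the two constants are hypothetical; the 5000 ms default and the 1000 ms floor come from the suggestion above. The extra `Number.isFinite` check is an addition beyond that suggestion, so that a value like `Infinity` also falls back to the default.

```typescript
// Hypothetical helper, not part of the PR diff.
const DEFAULT_KEEPALIVE_MS = 5000;
const MIN_KEEPALIVE_MS = 1000;

function resolveKeepaliveMs(raw: string | undefined): number {
  const parsed = Number(raw);
  // Unset, empty, non-numeric, zero, and non-finite values use the default.
  const value =
    Number.isFinite(parsed) && parsed !== 0 ? parsed : DEFAULT_KEEPALIVE_MS;
  // Negative and very small values are raised to the floor, which avoids
  // passing them to setInterval.
  return Math.max(MIN_KEEPALIVE_MS, value);
}

console.log(resolveKeepaliveMs(undefined)); // 5000
console.log(resolveKeepaliveMs("-1")); // 1000
console.log(resolveKeepaliveMs("2500")); // 2500
```

At the call site this would be fed from `process.env.MIMO2CODEX_KEEPALIVE_MS`.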
Root cause: req.on('close') -> ac.abort() is too aggressive for streaming

I've been debugging the same stream disconnect issue on v0.5.28 and found another root cause beyond the keepalive fix. In server.js, three places register […]

This explains why the keepalive fix alone doesn't fully resolve it: keepalive prevents idle socket timeout, but […]

Fix (tested on v0.5.28)

After receiving the upstream response, remove the close listener so mid-stream client disconnects don't kill the upstream.

Applied to both streaming endpoints. This plus the keepalive fix resolves the disconnect completely.
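The idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual patch: `abortOnClose` is a hypothetical helper, and a plain `EventEmitter` stands in for the incoming request.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical helper: abort the controller when the client closes, and
// return a function that removes that wiring again.
function abortOnClose(req: EventEmitter, ac: AbortController): () => void {
  const onClose = () => ac.abort();
  req.once("close", onClose);
  return () => req.off("close", onClose);
}

// Simulated request; in the server this would be the incoming HTTP request.
const req = new EventEmitter();
const ac = new AbortController();
const detach = abortOnClose(req, ac);

// ... the upstream fetch(..., { signal: ac.signal }) would resolve here ...

detach(); // upstream response received: stop aborting on client close
req.emit("close"); // a mid-stream disconnect no longer triggers abort
console.log(ac.signal.aborted); // false
```

One trade-off worth checking: once the listener is detached, a client disconnect no longer cancels the upstream request, so the upstream stream keeps running unless something else aborts it.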
Fix two root causes for stream disconnect. See full description in commit.