web terminal: adopt session-unknown fast-path once Tower emits a browser-visible WS close code #971

@amrmelsayed

Background

The VSCode terminal fast-paths a "this session no longer exists" close: Node's ws client surfaces Tower's upgrade-stage 404 as Error.message "Unexpected server response: 404", which classifyUpgradeError (in @cluesmith/codev-core/reconnect-policy, added in #961) classifies as permanent → give up immediately instead of burning the backoff budget.
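The string-form classification described above can be sketched roughly as follows. This is an illustration only: `classifyUpgradeMessage` and `Verdict` are hypothetical names, and the real `classifyUpgradeError` in @cluesmith/codev-core/reconnect-policy may parse the message differently. The sketch assumes the string path uses the same 4xx range that the issue describes for the object form.

```typescript
// Hypothetical sketch; not the actual classifyUpgradeError implementation.
type Verdict = "permanent" | "transient";

function classifyUpgradeMessage(message: string): Verdict {
  // Node's ws client reports a failed upgrade as
  // "Unexpected server response: <status>".
  const match = /Unexpected server response: (\d{3})/.exec(message);
  if (!match) {
    // No HTTP status recoverable: treat as a transient drop and keep retrying.
    return "transient";
  }
  const status = Number(match[1]);
  // Assumption: any 4xx at the upgrade stage means retrying cannot help.
  return status >= 400 && status < 500 ? "permanent" : "transient";
}
```

With this shape, a caller would stop reconnecting as soon as the verdict is "permanent" and would only spend backoff attempts on "transient" failures.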

The web terminal cannot do this today. Tower rejects an unknown session at the HTTP-upgrade stage (packages/codev/src/agent-farm/servers/tower-websocket.ts: socket.write('HTTP/1.1 404 Not Found\r\n\r\n'); socket.destroy()). A browser WebSocket whose upgrade fails only sees onerror + onclose with code 1006 and has no access to the HTTP status, because browsers deliberately hide failed-upgrade response details from JS. So the web terminal stays on blind retry (it gives up only after exhausting the 6-attempt budget, same as any transient drop).
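To make the limitation concrete, here is a minimal sketch of what "blind retry" amounts to. The names `blindRetryDecision` and `MAX_ATTEMPTS` are hypothetical, and the exact attempt accounting in Terminal.tsx is an assumption; only the 6-attempt budget and the 1006 close code come from the description above.

```typescript
// Hypothetical sketch of the current web-terminal behaviour.
const MAX_ATTEMPTS = 6; // retry budget named in this issue

function blindRetryDecision(
  _closeCode: number, // always 1006 when the upgrade fails, so it is ignored
  attemptsSoFar: number,
): "retry" | "give-up" {
  // A stale session (upgrade-stage 404) and a transient network drop both
  // surface as close code 1006, so the only usable input is the attempt count.
  return attemptsSoFar < MAX_ATTEMPTS ? "retry" : "give-up";
}
```

Because the close code carries no signal, a stale session costs the full budget before the terminal gives up.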

This was design-call #2/#3 in #961, explicitly deferred there.

Work

  1. Tower side: instead of (or in addition to) the upgrade-stage 404, accept the WS upgrade and immediately close with an app-range close code (e.g. 4404) + reason for the session-unknown case, so a browser can read it via CloseEvent.code / .reason. This must not regress the VSCode/Node path, which currently relies on the "Unexpected server response: 404" upgrade error (see #936, "terminal-adapter: WebSocket close-handler spams 'Connection lost' in a tight loop with no backoff, no give-up, no actual reconnect").
  2. Classifier side: classifyUpgradeError's object form currently treats 400 <= code < 500 as permanent — that's an HTTP range and will not match a WebSocket close code like 4404. Add a branch for the agreed app-range WS close code (and decide whether the HTTP-range check stays for the Node string-or-code path).
  3. Dashboard side: wire Terminal.tsx's onclose to consult classifyUpgradeError({ code: event.code }) and give up immediately on a permanent close, flipping the web terminal off blind retry for stale sessions. (#961, "core: extract transport-agnostic reconnect policy; adopt in vscode + dashboard terminals + tunnel client", deliberately did not add this seam, to avoid shipping a dormant code path with a provisional numeric contract.)
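The classifier and dashboard items above could fit together along these lines. This is a sketch under stated assumptions, not the proposed implementation: `classifyCloseOrStatus`, `onCloseDecision`, and `WS_CLOSE_SESSION_UNKNOWN` are hypothetical names, 4404 is only the example code from item 1 (the 4000-4999 range is reserved for private use by RFC 6455), and keeping the HTTP-range check is one of the open decisions in item 2.

```typescript
// Hypothetical sketch; names and the 4404 value are illustrative only.
type CloseVerdict = "permanent" | "transient";

const WS_CLOSE_SESSION_UNKNOWN = 4404; // example app-range code, not yet agreed

function classifyCloseOrStatus(input: { code: number }): CloseVerdict {
  const { code } = input;
  // Existing behaviour per this issue: HTTP 4xx statuses are permanent.
  if (code >= 400 && code < 500) return "permanent";
  // New branch: the app-range WebSocket close code for "session unknown".
  if (code === WS_CLOSE_SESSION_UNKNOWN) return "permanent";
  return "transient";
}

// Dashboard-side decision, kept free of DOM types so it is easy to unit test.
function onCloseDecision(
  event: { code: number; reason: string },
  attemptsSoFar: number,
  maxAttempts = 6,
): "retry" | "give-up" {
  // Fast path: a permanent close skips the backoff budget entirely.
  if (classifyCloseOrStatus({ code: event.code }) === "permanent") {
    return "give-up";
  }
  // Otherwise fall back to the existing budgeted retry.
  return attemptsSoFar < maxAttempts ? "retry" : "give-up";
}
```

In Terminal.tsx, the onclose handler would pass the CloseEvent's code and reason into a function like this. A 1006 close still goes through the normal retry budget, so transient drops behave as they do today.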

Acceptance

Context

Deferred from #961 (extract transport-agnostic reconnect policy). The pure-logic seam (classifyUpgradeError with an object/code form) already exists in core; this issue is the Tower close-code + dashboard wiring that makes it live.

Labels

area/cross-cutting: touches multiple areas, needs coordinated handling