Skip to content

feat(connectivity): graceful offline UX — banner, action gating, auto-recovery#2932

Open
k11kirky wants to merge 16 commits into
mainfrom
posthog-code/connectivity-ux
Open

feat(connectivity): graceful offline UX — banner, action gating, auto-recovery#2932
k11kirky wants to merge 16 commits into
mainfrom
posthog-code/connectivity-ux

Conversation

@k11kirky

Copy link
Copy Markdown
Contributor

Stacked on #2931 (posthog-code/auth-code-access-resilience). Review/merge that first; this PR's base is set to its branch, so the diff here is only the connectivity work. GitHub will retarget this to main automatically once #2931 merges.

Why

The code-access fix (#2931) made one recoverable error (a transient invite-check failure) stop locking users out. This PR generalises that instinct: during intermittent connectivity, block the actions that need the network — not the whole app — keep everything offline-capable working, explain what's paused, and self-heal on reconnect.

The app already had the right instincts unevenly: agent-send is gated on isOnline, and detection is robust and server-side (workspace-server polls Google/Cloudflare with a 2-failure hysteresis + backoff, streamed over tRPC). The gaps were: no always-visible status, git remote actions firing blind offline, and no recovery trigger on the reconnect event itself. This PR closes those, reusing the existing machinery rather than rebuilding it.

What

1. Connectivity banner

ConnectivityBanner mounted in the shell (__root.tsx, all three layouts: main / channels / settings).

  • Offline: amber strip, "You're offline — network actions are paused, reconnecting automatically", with a Retry that forces an immediate probe via a new ConnectivityClient.checkNow() (wired to the existing connectivity.checkNow route). While the probe runs it shows "Checking connection…".
  • Recovery: brief green "Back online", then collapses.
  • Animated height collapse, modeled on UpdateBanner. No re-debouncing — detection is already debounced server-side, so the banner renders the settled state.

2. Consistent git-action gating (reusing the "No internet connection" idiom)

  • computeGitInteractionState (core, pure) gains an isOnline input; push / sync / publish and create-PR return the offline reason, which the existing disabledReason plumbing renders as a tooltip on the primary button and both dropdown variants — one edit, all sites. Local actions (commit, new branch) stay enabled offline.
  • usePrActions (PR lifecycle: merge / close / convert draft / ready) and useInboxCloudTaskRunner (inbox Create PR / Discuss, scout Discuss) get an imperative offline guard that shows the offline toast instead of firing a doomed request — the same pattern SessionView's send already uses.

3. Auto-recovery on reconnect

NetworkReconnectContribution subscribes to the connectivity store and, on an offline→online transition, calls a new SessionService.recoverAfterReconnect() (retry errored cloud streams + flush stranded cloud message queues). This is the same recovery the window-focus and auth-restored paths already perform — now also driven by the reconnect event, which previously fired neither. Local sessions need no new wiring: reconcileLocalConnection already re-runs when isOnline flips.

Design choices

  • Disable specific actions, not a global modal. A user can't fix wifi from inside a scrim, and reading a task / reviewing a diff / scrolling code all work offline. Trapping the whole app would be hostile; gating the unavailable actions (with a reason) and self-healing is what "graceful" means here.
  • No new store/transport plumbing. Reuses connectivityStore, useConnectivity, the existing checkNow route, and the existing cloud-recovery service methods.

Tests

  • gitInteractionLogic.test.ts: offline gates push & create-PR; commit & new-branch stay enabled.
  • network-reconnect.contribution.test.ts: recovers only on offline→online, once per cycle, never on online→offline or redundant updates.
  • core + ui + apps/code typecheck clean; biome clean; touched core/ui suites green; host-boundary check shows no new violations.

k11kirky added 2 commits June 25, 2026 12:07
…enied

`updateCodeAccessFromSession` previously collapsed every non-success path
into `hasCodeAccess: false`: a transient throw (network blip, timeout) and a
non-`true` body (e.g. `{}` from a 401) both denied access. Because this runs
on every token refresh, resume, and reconnect via `syncAuthenticatedSession`,
a single transient failure overwrote a previously-valid `true` and stranded a
valid user on the invite screen until restart.

Introduce a discriminated `CodeAccessOutcome` (`resolved` vs `indeterminate`)
and a `checkCodeAccess` helper:

- Only a clean 2xx response carrying an explicit boolean `has_access` is
  authoritative and may set `hasCodeAccess`. A genuine `{ has_access: false }`
  still correctly shows the invite screen.
- Offline status, network errors, timeouts, non-2xx responses (including
  401/403), and unparseable/incomplete bodies are indeterminate and preserve
  the prior value — a confirmed `true` stays `true`, a pending `null` stays
  `null` — letting the next sync re-check.
- Transient failures self-heal within the sync via a bounded retry
  (CODE_ACCESS_MAX_ATTEMPTS = 3) using the existing sleepWithBackoff +
  REFRESH_BACKOFF, with a fast exit when offline.

The freshly synced token is used directly rather than routing through
authenticatedFetch, because this runs inside the bootstrap/refresh flow where
re-entering the token-refresh machinery would deadlock.

Adds a `code access resilience` test block (5 cases) covering explicit denial,
transient network/401 failures preserving `true`, and inconclusive restores
staying `null`.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…-recovery

Builds on the code-access resilience fix to make intermittent connectivity
graceful across the app instead of letting individual network actions fail
opaquely. Three coordinated pieces, leaning on the existing server-side
connectivity detection (which already debounces with a 2-failure hysteresis):

1. Connectivity banner (`ConnectivityBanner`, mounted in the shell `__root`
   for the main, channels, and settings layouts). While offline it explains
   that network actions are paused and offers a manual Retry that forces an
   immediate reachability probe via a new `ConnectivityClient.checkNow()`
   (wired to the existing `connectivity.checkNow` route); on recovery it
   briefly confirms "Back online" then collapses. Modeled on `UpdateBanner`.

2. Consistent gating of network-requiring git actions, reusing the existing
   "No internet connection" idiom:
   - `computeGitInteractionState` gains an `isOnline` input and returns the
     offline reason for push/sync/publish and create-PR, which propagates to
     the primary button and both dropdown variants automatically. Local
     actions (commit, new branch) stay enabled offline.
   - `usePrActions` (PR lifecycle: merge/close/draft/ready) and
     `useInboxCloudTaskRunner` (inbox Create PR / Discuss, scout Discuss) gain
     an imperative offline guard that toasts instead of firing a doomed call.

3. Auto-recovery on reconnect (`NetworkReconnectContribution` +
   `SessionService.recoverAfterReconnect()`). On an offline→online transition
   it retries errored cloud streams and flushes stranded cloud message queues —
   the same recovery the window-focus and auth-restored paths already perform,
   for the network-reconnect event that previously had no trigger. Local
   sessions already recover via `reconcileLocalConnection` re-firing on the
   `isOnline` flip, so they need no extra wiring.

Tests: offline cases in `gitInteractionLogic.test.ts` (push/create-pr gated,
commit/branch still allowed) and a new `network-reconnect.contribution.test.ts`
(recovers only on offline→online, once per cycle). Core + UI + apps/code
typecheck, biome, and the touched test suites all pass.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown

React Doctor found 2 issues in 1 file · 2 warnings.

2 warnings

src/features/connectivity/ConnectivityBanner.tsx

Reviewed by React Doctor for commit ca59ee4.

@greptile-apps

greptile-apps Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Reviews (1): Last reviewed commit: "feat(connectivity): graceful offline UX ..." | Re-trigger Greptile

Comment thread packages/core/src/git-interaction/gitInteractionLogic.test.ts Outdated
Comment thread packages/core/src/git-interaction/gitInteractionLogic.test.ts
k11kirky added 13 commits June 25, 2026 12:51
The `quality` CI check (`biome ci .`) flagged the `code access resilience`
test helpers (`authFetchWithCheck` mock bodies and the `okBody` helper) as
unformatted — they were committed after a `biome lint` (which skips
formatting) rather than a full `biome check`. Whitespace only; no behavior
change. All 41 auth tests still pass.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…code/connectivity-ux

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…ze offline tests

Addresses Greptile review feedback on the offline-gating tests:

- Add `createPrDisabledReason` to `GitComputed`, mirroring `pushDisabledReason`.
  A disabled create-pr action is dropped from `actions`, so its reason was
  previously unreadable by consumers and tests; the offline create-pr test
  could only assert the action's absence (which any disabled reason would
  satisfy). The named field now lets callers and tests read the actual reason.
- Parameterize the four offline cases into two `it.each` tables — "remote
  actions gated offline" (push, create-pr) asserting the no-internet reason via
  the named field, and "local actions still allowed offline" (commit, new
  branch) — per the team's preference for parameterized tests over duplicated
  bodies.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Condense the over-long comments/docstrings added with the connectivity
feature (banner, reconnect contribution, recoverAfterReconnect,
createPrDisabledReason, offline tests) to keep the "why" without the prose.
No behavior change.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Replace the single-use `CodeAccessOutcome` discriminated union with a
`boolean | null` return from `checkCodeAccess` — `null` already means
"indeterminate" in the store's `hasCodeAccess: boolean | null`, so the union
was redundant structure. Align the retry loop with the file's other loops
(`if (isLastAttempt) break;`). Condense the verbose docstrings/comments to
keep the rationale (the deadlock-avoidance note, the indeterminate-preserves-
prior-value note) without the prose. No behavior change; all 41 auth tests
pass.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…code/connectivity-ux

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Addresses Greptile review feedback on the code-access tests:

- Collapse the three single-phase cases (explicit deny, throw → indeterminate,
  2xx without has_access → indeterminate) into one `it.each`, and add the
  previously-uncovered positive grant (has_access: true → true) as a fourth row.
- Add a test for the retry loop's recovery path: attempt 1 throws, attempt 2
  returns has_access: true, hasCodeAccess becomes true — guarding against a
  regression that short-circuits after the first failure.
- Add the missing blank line between `updateCodeAccessFromSession` and the
  `checkCodeAccess` JSDoc.

No production behavior change; 43 auth tests pass.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…code/connectivity-ux

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Fold the code-access tests' bespoke `authFetchWithCheck` helper into the
existing `stubAuthFetch` via an optional `checkAccess` override, removing a
duplicated copy of the user/org URL-routing stub. Parameterize the two
"keeps a confirmed grant on a later transient failure" cases (network error,
401) into a single `it.each`. Test-only; 43 auth tests pass.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…code/connectivity-ux

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Drop comments the code already makes clear: the `isOnline` field note (the
gating is visible in the disabled-reason helpers), the offline-test note
(redundant with the source-side comment), the `checkNow` doc (the name says
it), and trim the reconnect test's singleton note to one line. No behavior
change.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Drop two comments the test names and rows already convey: the "null (not
false)" note above the parameterized code-access cases, and the "transient
blip" note above the grant-preservation cases. Test-only; no behavior change.

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
…code/connectivity-ux

Generated-By: PostHog Code
Task-Id: 8a950c40-02d5-444a-b653-0485cd4f1756
Base automatically changed from posthog-code/auth-code-access-resilience to main June 26, 2026 06:06

@tatoalo tatoalo left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, approving to unblock ya

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants