[codex] Structure managed endpoint allocation failures #3421

Merged: juliusmarminge merged 1 commit into main from codex/relay-environments-error-audit on Jun 20, 2026
Conversation

@juliusmarminge juliusmarminge commented Jun 20, 2026

Member

Summary

  • attach operation, user, environment, resource identifiers, and exact causes to managed-endpoint allocation, provisioning, deprovisioning, tunnel-client, and DNS-client failures
  • replace the synthetic reserve Error and valueless persistence-error constructor wrapper with structural errors; invalid tunnel responses are a distinct cause-free error nested under the provisioning failure
  • use Effect.catchTags for tunnel/DNS recovery and only treat structured nested NotFound failures as idempotent fallback
  • preserve unrelated checkpoint update failures instead of silently retrying through a different DNS path
  • export the real service make values
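As a rough illustration of the first two bullets, here is a minimal dependency-free sketch of a structural failure that carries identifiers and the exact cause, in place of a synthetic `Error`. The names `AllocationFailure`, `allocationFailure`, and `reserve` are hypothetical and not taken from the PR; the real code uses Effect and its own error types.

```typescript
// Hypothetical shape: the PR's actual error types are not shown in this conversation.
interface AllocationFailure {
  readonly _tag: "AllocationFailure";
  readonly operation: "reserve" | "provision" | "deprovision";
  readonly userId: string;
  readonly environmentId: string;
  readonly resourceId?: string;
  readonly cause: unknown; // the exact underlying failure, kept by reference
}

function allocationFailure(
  context: Omit<AllocationFailure, "_tag" | "cause">,
  cause: unknown,
): AllocationFailure {
  return { _tag: "AllocationFailure", ...context, cause };
}

// Wrap a persistence call so a thrown value becomes a structured failure
// instead of a new Error with a made-up message.
function reserve(
  userId: string,
  environmentId: string,
  persist: () => string,
): string | AllocationFailure {
  try {
    return persist();
  } catch (cause) {
    return allocationFailure({ operation: "reserve", userId, environmentId }, cause);
  }
}
```

Because the cause is kept by reference, a caller can still inspect the original failure rather than parsing a message string.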

Verification

  • vp test infra/relay/src/environments/ManagedEndpointAllocations.test.ts infra/relay/src/environments/ManagedEndpointProvider.test.ts (19 passed)
  • vp check (passes with 20 unrelated existing warnings)
  • vp run typecheck

Scope

Excludes relay orchestration/projection and files owned by #3331, #3334, #3335, and #3392.


Note

Medium Risk
Changes live in Cloudflare tunnel/DNS provisioning paths; misclassified errors could break retries or hide outages, though behavior is narrowed and tested for non-not-found checkpoint failures.

Overview
Tightens managed endpoint provisioning so transient DNS “not found” cases still fall back, but real failures surface correctly.

Checkpoint DNS updates no longer swallow every updateRecord error via orElseSucceed; only structured NotFound causes are treated as “record gone, reconcile elsewhere.” Other errors (e.g. Cloudflare outage) fail at ensure-dns-record instead of silently taking another DNS path—covered by a new test.
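The narrowed fallback described above could be sketched as follows, in plain TypeScript rather than Effect. The `Result` type, `updateFromCheckpoint`, and the `Unavailable` cause are hypothetical stand-ins; only the `NotFound` tag, `updateRecord`, and the `ensure-dns-record` stage come from this PR's description.

```typescript
// Hypothetical cause and error shapes for illustration only.
type DnsCause =
  | { readonly _tag: "NotFound" }
  | { readonly _tag: "Unavailable"; readonly status: number };

interface DnsUpdateError {
  readonly _tag: "DnsUpdateError";
  readonly cause: DnsCause;
}

interface EnsureDnsFailure {
  readonly stage: "ensure-dns-record";
  readonly cause: DnsUpdateError;
}

type Result<A, E> =
  | { readonly ok: true; readonly value: A }
  | { readonly ok: false; readonly error: E };

type CheckpointOutcome = "updated" | "record-missing";

function updateFromCheckpoint(
  updateRecord: () => Result<void, DnsUpdateError>,
): Result<CheckpointOutcome, EnsureDnsFailure> {
  const result = updateRecord();
  if (result.ok) return { ok: true, value: "updated" };
  // Only a structured NotFound cause means "record gone, reconcile elsewhere".
  if (result.error.cause._tag === "NotFound") {
    return { ok: true, value: "record-missing" };
  }
  // Anything else (for example an outage) is surfaced, not swallowed.
  return { ok: false, error: { stage: "ensure-dns-record", cause: result.error } };
}
```

The contrast with a blanket `orElseSucceed` is that the fallback branch is selected by the tag of the nested cause, so an unrelated failure cannot take it by accident.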

Tunnel validation now requires the created tunnel’s name to match the expected stable name (not merely non-null), with returnedTunnelName on the provisioning failure when it mismatches.
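A minimal sketch of that stricter check, assuming a response with a nullable `name` field. `TunnelResponse`, `ProvisioningFailure`, `expectedTunnelName`, and `validateTunnelResponse` are hypothetical names; `returnedTunnelName` and the `validate-tunnel-response` stage are the ones mentioned in this PR.

```typescript
// Hypothetical response and failure shapes for illustration only.
interface TunnelResponse {
  readonly id: string;
  readonly name: string | null;
}

interface ProvisioningFailure {
  readonly _tag: "ProvisioningFailure";
  readonly stage: "validate-tunnel-response";
  readonly expectedTunnelName: string;
  readonly returnedTunnelName: string | null;
}

function validateTunnelResponse(
  expectedTunnelName: string,
  response: TunnelResponse,
): TunnelResponse | ProvisioningFailure {
  // A non-null name is not enough: it must equal the expected stable name.
  if (response.name === expectedTunnelName) return response;
  return {
    _tag: "ProvisioningFailure",
    stage: "validate-tunnel-response",
    expectedTunnelName,
    returnedTunnelName: response.name,
  };
}
```

Recording the returned name on the failure makes a mismatch diagnosable from the error alone, without re-querying the tunnel API.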

Exports and layers: make is exported on allocations and provider; tunnel/DNS layer* helpers pass client implementations directly without extra of() wrappers; Cloudflare binding layer is flattened accordingly. Test fakes use tagged NotFound causes for missing DNS records.

Reviewed by Cursor Bugbot for commit 10793eb.

Note

Harden managed endpoint allocation failure handling in ManagedEndpointProvider

  • ensureDnsRecord now rethrows DNS updateRecord errors unless the cause is NotFound; previously all errors were swallowed and treated as a miss, masking real failures
  • Tunnel response validation now checks that the returned tunnel name matches the expected name, not just that it is truthy; mismatches fail with stage validate-tunnel-response and include returnedTunnelName in the error
  • Exports the make factory from both ManagedEndpointProvider and ManagedEndpointAllocations for external construction
  • Removes the makeTunnelClient/makeDnsClient wrapper functions; service objects are now passed directly to layerTunnelClient/layerDnsClient
  • Risk: the ensureDnsRecord change is a behavioral breaking change — callers that previously succeeded despite DNS update errors will now see failures propagated

Macroscope summarized 10793eb.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026


Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


@github-actions github-actions Bot added the vouch:trusted (PR author is trusted by repo permissions or the VOUCHED list) and size:L (100-499 changed lines, additions + deletions) labels Jun 20, 2026
macroscopeapp Bot previously approved these changes Jun 20, 2026
@macroscopeapp

macroscopeapp Bot commented Jun 20, 2026

Contributor

Approvability

Verdict: Approved

Bug fix that improves error handling by properly surfacing DNS client failures instead of silently swallowing them. Changes include stricter tunnel response validation and mechanical removal of unnecessary wrapper functions, all covered by tests.


@juliusmarminge juliusmarminge force-pushed the codex/relay-environments-error-audit branch from 575a4b3 to 300483c June 20, 2026 18:21
@macroscopeapp macroscopeapp Bot dismissed their stale review June 20, 2026 18:21

Dismissing prior approval to re-evaluate 300483c

@juliusmarminge juliusmarminge force-pushed the codex/relay-environments-error-audit branch 2 times, most recently from 71c8e09 to 9e1cf09 June 20, 2026 18:28
@github-actions github-actions Bot added the size:XL (500-999 changed lines, additions + deletions) label and removed the size:L (100-499 changed lines, additions + deletions) label Jun 20, 2026
@juliusmarminge juliusmarminge force-pushed the codex/relay-environments-error-audit branch from 9e1cf09 to b2cef03 June 20, 2026 18:30
macroscopeapp Bot previously approved these changes Jun 20, 2026
@juliusmarminge juliusmarminge force-pushed the codex/relay-environments-error-audit branch from b2cef03 to 2d14ea0 June 20, 2026 18:57
Co-authored-by: codex <codex@users.noreply.github.com>
@juliusmarminge juliusmarminge force-pushed the codex/relay-environments-error-audit branch from 2d14ea0 to 10793eb June 20, 2026 19:35
@macroscopeapp macroscopeapp Bot dismissed their stale review June 20, 2026 19:35

Dismissing prior approval to re-evaluate 10793eb

@github-actions github-actions Bot added the size:S (10-29 changed lines, additions + deletions) label and removed the size:XL (500-999 changed lines, additions + deletions) label Jun 20, 2026
@juliusmarminge juliusmarminge merged commit eb5eb0d into main Jun 20, 2026
16 checks passed
@juliusmarminge juliusmarminge deleted the codex/relay-environments-error-audit branch June 20, 2026 19:58

Labels

  • size:S: 10-29 changed lines (additions + deletions)
  • vouch:trusted: PR author is trusted by repo permissions or the VOUCHED list
