feat(deploy): serve static.polygon.technology from Cloudflare Workers static assets #190

Open
MaximusHaximus wants to merge 1 commit into master from feat/cloudflare-migration

@MaximusHaximus MaximusHaximus commented Jun 29, 2026

Why

The static.polygon.technology HTTP endpoint serves a verbatim tree of contract ABIs and network JSON (network/, ~3 MB / 477 files) plus a health-check page — no application logic. Today it ships as an nginx Docker image on AWS ECS, with a planned migration to GKE. Both manage cluster infrastructure for what is just static files with permissive CORS. The team already runs five frontends on Cloudflare Workers static assets via per-repo wrangler configs; static should join them and stop needing AWS ECS or GCP.

This is a T0 service: @maticnetwork/maticjs hardcodes static.polygon.technology as its ABI store and fetches it uncached at every client init, as do bridge-api-services and portal. The migration therefore validates on a test domain before any apex cutover, preserves the served paths byte-for-byte, and cuts over with zero downtime.
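
One way to check the byte-for-byte requirement is to fetch the same paths from the legacy origin and from the candidate hostname and compare the raw bodies. The sketch below is illustrative only and is not part of this PR: `findMismatches`, `sameBytes`, `FetchBytes`, and `fetchOverHttp` are hypothetical names, and the list of paths to compare is assumed to come from walking the network/ tree.

```typescript
// Hypothetical parity check; none of these names exist in the repository.
type FetchBytes = (url: string) => Promise<{ status: number; body: Uint8Array }>;

// True only when both buffers have the same length and identical bytes.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Returns every path whose status or body differs between the two origins.
async function findMismatches(
  paths: string[],
  legacyOrigin: string,
  candidateOrigin: string,
  fetchBytes: FetchBytes,
): Promise<string[]> {
  const mismatches: string[] = [];
  for (const path of paths) {
    const [oldRes, newRes] = await Promise.all([
      fetchBytes(legacyOrigin + path),
      fetchBytes(candidateOrigin + path),
    ]);
    if (oldRes.status !== newRes.status || !sameBytes(oldRes.body, newRes.body)) {
      mismatches.push(path);
    }
  }
  return mismatches;
}

// Default fetcher built on the global fetch (Node 18+ or a Worker runtime).
const fetchOverHttp: FetchBytes = async (url) => {
  const res = await fetch(url);
  return { status: res.status, body: new Uint8Array(await res.arrayBuffer()) };
};
```

An empty result from `findMismatches(paths, "https://static.polygon.technology", "https://static-cf.polygon.technology", fetchOverHttp)` would indicate parity for the paths checked. Paths that do not exist are expected to differ, since the Cloudflare deployment returns a real 404 where nginx fell back to index.html.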

What

  • wrangler.toml — Workers static assets, not_found_handling = none (missing paths → real 404). A thin Worker entry (src/index.ts) delegates to the ASSETS binding so the deployment can be bound via a route (see cutover below); asset hits are still served from the edge without invoking it.
  • scripts/assemble-cdn.sh (pnpm run build:cdn) — stages network/ + index.html + public/_headers into dist/, preserving the exact /network/... paths consumers depend on.
  • public/_headers sets Access-Control-Allow-Origin: * and Access-Control-Allow-Methods: GET, OPTIONS, reflecting the read-only GET surface (the legacy nginx config advertised POST/Authorization/Content-Type for a surface that never existed). Adds Cache-Control: public, max-age=300.
  • .github/workflows/deploy.yml — trunk-based deploy (see below). Uses the org-level CF_WORKER_ACCOUNT_ID / CF_WORKER_API_TOKEN secrets, matching the other CF frontends.
  • The @polygonlabs/meta npm surface under packages/meta/ is untouched.
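
For readers unfamiliar with the pattern, a thin Worker entry that delegates to the assets binding can be as small as the sketch below. This is an illustration of the general shape, not the src/index.ts in this PR: the `AssetsBinding` and `Env` interfaces are written inline so the snippet stands alone (a real Worker would normally take the `Fetcher` type from @cloudflare/workers-types), and it assumes the binding is named ASSETS in wrangler.toml.

```typescript
// Minimal shape of the static-assets binding, declared inline for this sketch.
interface AssetsBinding {
  fetch(request: Request): Promise<Response>;
}

interface Env {
  // Assumption: wrangler.toml exposes the assets binding under this name.
  ASSETS: AssetsBinding;
}

const worker = {
  // Hand every request to the static-assets layer unchanged. With
  // not_found_handling = none, a missing path comes back as a plain 404.
  async fetch(request: Request, env: Env): Promise<Response> {
    return env.ASSETS.fetch(request);
  },
};

export default worker;
```

The entry adds no logic of its own; per the description above, it exists so the deployment has a Worker that a route can be bound to.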

The legacy nginx/ECS/GCP paths (Dockerfile, nginx.conf, deployment.yml, build_and_deploy.yml, deployment_gcp.yml) are kept as rollback and removed in a follow-up PR once the apex is verified on Cloudflare. No changeset — every changed file is outside packages/meta/.

Deployment model (trunk-based)

master is the only branch.

  • Staging ← every push to master, deploying static-cf.polygon.technology (a custom_domain on a fresh hostname). Staging always reflects trunk.
  • Production ← workflow_dispatch only, for now. Production binds the apex via a route, so the first prod deploy is the cutover and must be deliberate. After it's verified, a release-tag trigger can be re-added so prod auto-deploys on each @polygonlabs/meta release (the route is idempotent once bound, and the CDN + the published package serve the same network/ tree, so the release is the right gate).

Cutover mechanism — why a route, not custom_domain

static.polygon.technology already exists as a proxied, externally-managed DNS record (pointing at AWS). Both prior migrations (portal #1194, staking-ui) hit Cloudflare error 100117 trying to bind a custom_domain over such a record, and override_existing_dns_record does not fix it — it was tried as a route property (wrangler rejects it) and via a direct API call with every override flag set, and still failed. Root cause: our CI token has Workers permissions but not Zone:DNS:Edit, and the records are owned by the SPEC/enablement team. Both prior cutovers needed that team to manually delete the records — which for a T0 endpoint would mean a downtime window.

A route sidesteps all of this: it attaches the Worker in front of the existing proxied record and touches no DNS, so it needs only Workers-Routes permission (which our token plausibly has — it already creates custom domains), the cutover is zero-downtime, and rollback is just removing the route (traffic falls straight back to the AWS origin). Staging stays custom_domain because its hostname is new (no record to attach a route to).

How this gets tested (CI-only — no local Cloudflare creds, ever)

  1. Merge → push-to-master deploys staging in CI, validating the whole mechanism (auth, Worker upload, custom_domain bind, worker+assets serving) against static-cf.polygon.technology.
  2. workflow_dispatch → production is the apex cutover, and the run tells us the one remaining unknown: whether the token can create the route. If yes → zero-downtime self-serve cutover. If denied on permissions → fall back to the SPEC team, but as a record repoint (still zero-downtime), not a delete-and-redeploy.
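
Once staging is live, the header policy described under "What" can be spot-checked against a real response. The function below is a sketch of such a check, not something this PR ships: `checkStaticHeaders` is a hypothetical name, and it assumes the _headers rules apply to the path being requested.

```typescript
// Hypothetical smoke check. Returns a list of problems; an empty list means
// the response headers match the policy described in this PR.
function checkStaticHeaders(headers: Headers): string[] {
  const problems: string[] = [];

  if (headers.get("access-control-allow-origin") !== "*") {
    problems.push("Access-Control-Allow-Origin should be *");
  }

  // Normalise the method list so ordering and spacing do not matter.
  const methods = (headers.get("access-control-allow-methods") ?? "")
    .split(",")
    .map((m) => m.trim().toUpperCase())
    .filter((m) => m.length > 0)
    .sort();
  if (methods.join(",") !== "GET,OPTIONS") {
    problems.push("Access-Control-Allow-Methods should be exactly GET, OPTIONS");
  }

  // Compare whole directives so that max-age=3000 does not pass as max-age=300.
  const directives = (headers.get("cache-control") ?? "")
    .split(",")
    .map((d) => d.trim().toLowerCase());
  if (!directives.includes("public") || !directives.includes("max-age=300")) {
    problems.push("Cache-Control should include public and max-age=300");
  }

  return problems;
}
```

A caller would pass `response.headers` from a `fetch` against static-cf.polygon.technology and treat any returned entry as a failure. A response carrying the legacy nginx method list (with POST) would be reported as a problem.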

A canonical apps-team-ops runbook capturing this whole process (100117, the override dead-end, the route pattern) will follow.

@MaximusHaximus MaximusHaximus force-pushed the feat/cloudflare-migration branch 3 times, most recently from 0ee36ca to eeb11d0 Compare June 29, 2026 16:52
@MaximusHaximus MaximusHaximus marked this pull request as ready for review June 29, 2026 16:58
@MaximusHaximus MaximusHaximus requested a review from a team as a code owner June 29, 2026 16:58
claude Bot commented Jun 29, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.


… static assets

The static HTTP endpoint serves a verbatim tree of contract ABIs and network JSON
with permissive CORS — no application logic — yet ships as an nginx Docker image on
AWS ECS, with a planned migration to GKE. Both manage cluster infrastructure for what
is just static files. This moves it to Cloudflare Workers static assets via wrangler,
joining the team's other frontends, so neither AWS ECS nor GCP is needed.

scripts/assemble-cdn.sh stages network/, index.html, and public/_headers into dist/,
which wrangler.toml serves. public/_headers exposes the read-only GET CORS surface —
dropping the legacy nginx POST and Authorization/Content-Type headers, which
advertised a write/auth surface that never existed — and adds a short shared-cache
TTL. Missing paths now return a real 404 instead of nginx's index.html fallback. A
thin Worker entry (src/index.ts) delegates to the assets binding so the deployment
can be bound via a route.

Deployment is trunk-based. Every push to master deploys staging
(static-cf.polygon.technology, a custom_domain on a fresh hostname). Production is
workflow_dispatch-only and binds the apex via a ROUTE, not a custom_domain:
static.polygon.technology already has an externally-managed proxied record, so a
custom_domain fails with Cloudflare error 100117 and our CI token lacks Zone:DNS:Edit
to override it (proven in the portal and staking-ui migrations). A route attaches in
front of the existing record without touching DNS — zero downtime, reversible (remove
the route → falls back to AWS), and needs only Workers-Routes permission. The first
prod deploy is the deliberate apex cutover; a release-tag trigger for
auto-prod-on-release can be re-added once verified.

The legacy nginx/ECS/GCP paths are kept as rollback until the apex is verified on
Cloudflare, then removed in a follow-up PR. The @polygonlabs/meta npm package is
unaffected.
@MaximusHaximus MaximusHaximus force-pushed the feat/cloudflare-migration branch from eeb11d0 to 7b5129a Compare June 29, 2026 17:44
