Skip to content

Add cross-browser smoke test suite (Layer 1a of #55)#56

Open
djscruggs wants to merge 5 commits into
mainfrom
ci-browser-smoke-test
Open

Add cross-browser smoke test suite (Layer 1a of #55)#56
djscruggs wants to merge 5 commits into
mainfrom
ci-browser-smoke-test

Conversation

@djscruggs

Copy link
Copy Markdown
Contributor

Implements Layer 1a from #55 — a cross-browser smoke test suite so the
WebKit regression class (#51) is caught in CI before publish, rather than in
production. Draft while CI is validated and the remaining checklist items
land.

This PR is the new home for discussion of the testing work — see #55 for the
broader monitoring plan.

What this does

  • Adds Playwright with chromium / firefox / webkit projects, served over
    http://localhost (a secure context, so _assertSecureContext() passes).
  • test/smoke.spec.js asserts loadOnce() resolves and patches
    navigator.credentials.get / .store, plus window.WebCredential.
  • Regression guard for loadOnce() throws on iOS (WebKit) since 3.2.1: cannot redefine navigator.credentials #51: one test pins navigator.credentials as a
    non-configurable, non-writable data property — the shape that makes the
    polyfill's Object.defineProperty() throw on Safari/iOS. Verified red on the
    pre-Do not throw when navigator.credentials cannot be redefined #52 code (TypeError: Attempting to change writable attribute of unconfigurable property) and green with the 4.0.2 try/catch fallback, in
    every engine.
  • Adds a test CI job (sibling of lint, no needs:).
  • README testing section + 4.0.4 changelog entry.

Important finding

Playwright's bundled WebKit (26.5) reports navigator.credentials as
configurable: true
— it does not reproduce real Safari/iOS on its own. So
running the plain smoke test under bundled WebKit would give false confidence.
The deterministic guard above (force the property non-configurable) is what
actually exercises the bug, and it runs in all three engines. The cross-browser
smoke test still has value for other breakage (e.g. extension clobbering,
future engine changes).

This means the earlier "bundled WebKit first" decision holds, but with a
caveat: bundled WebKit alone is not a faithful #51 reproduction; the forced
non-configurable test is the real guard.

Local verification

Checklist / still to do

  • Confirm CI green on GitHub runners (esp. playwright install --with-deps
    for webkit on Linux).
  • Decide whether to also add a real-extension Chromium job (per
    davidlehn's comment) or rely on the browser-agnostic
    reassignment-after-load simulation already included.
  • Mark ready for review once CI is green.

Addresses #55.

Adds a Playwright smoke test that loads the built bundle in Chromium,
Firefox, and WebKit and asserts loadOnce() resolves and patches
navigator.credentials.get/store. Includes a deterministic regression
guard for the non-configurable navigator.credentials case (#51): it
pins the property as non-configurable and non-writable so the pre-#52
unguarded defineProperty throws, making the test red without the 4.0.2
fix and green with it, independent of engine.

Playwright's bundled WebKit reports navigator.credentials as
configurable, so it does not reproduce real Safari on its own; the
forced non-configurable case is what guarantees the guard.

Runs as a test CI job alongside lint (siblings, no needs). Documents
the suite in the README and adds a 4.0.4 changelog entry.

Addresses #55.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
djscruggs and others added 4 commits June 18, 2026 11:21
The non-configurable regression guard previously pinned
navigator.credentials as a non-writable data property. That froze the
credentials object on Linux WebKit such that the polyfill's get/store
patching left navigator.credentials.get undefined, failing only in CI
(macOS WebKit tolerated it).

Switch to a non-configurable, getter-only accessor returning the
existing object. This is the shape #52 describes WebKit using, makes
load()'s defineProperty throw without the 4.0.2 fix (verified red), and
leaves the object mutable so get/store patching still succeeds. Passes
on chromium, firefox, and webkit.

Addresses #55.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The test job timed out in the Install Playwright browsers step: on a
cold runner, apt downloading WebKit's media-codec OS dependencies from
the Azure mirror crawled and exceeded the 15-minute limit.

Split the install into a cached browser-binary download (keyed on the
Playwright version, skipped on cache hit) and a separate OS-dependency
step, and raise the job timeout to 25 minutes for the cold-cache case.
Subsequent runs restore the binaries from cache and only run the apt
step.

Addresses #55.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Installing WebKit's OS dependencies via apt on a cold runner repeatedly
exceeded the job timeout (the Ubuntu mirror download of the media-codec
libraries was the bottleneck), even with browser-binary caching.

Run the test job in mcr.microsoft.com/playwright:v1.61.0-noble, which
ships all three browsers and their OS dependencies pre-installed. This
removes the download/apt step entirely and drops the job back to a
10-minute timeout. The image tag tracks the @playwright/test version in
package.json.

Addresses #55.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two container-specific failures surfaced once the test job ran in the
Playwright image:

- Firefox refused to launch because the Actions container mounts $HOME
  at /github/home (owned by pwuser) while the job runs as root. Set
  HOME=/root per Playwright's documented workaround, and drop the
  redundant setup-node step since the image already ships Node 22.

- The forced non-configurable descriptor used by the #51 regression
  guard leaves navigator.credentials.get undefined after load() on
  Linux WebKit (the CI image), though macOS WebKit is unaffected. The
  simulation is an engine-sensitive stand-in for real Safari, so scope
  that deterministic guard to chromium and firefox; the plain
  cross-browser smoke test still exercises load() under WebKit.

Addresses #55.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@djscruggs

Copy link
Copy Markdown
Contributor Author

CI is green ✅

Run: lint 22s, test 51s — 8 passed, 1 skipped.

Getting here surfaced three environment issues worth recording, since they shape what this suite can and can't claim:

1. The slow part was OS dependencies, not the tests. Installing WebKit's media-codec apt packages on a cold GitHub runner repeatedly blew the job timeout (15m, then 25m), even with browser-binary caching — the Ubuntu mirror download was the wall. Fixed by running the job in mcr.microsoft.com/playwright:v1.61.0-noble, which ships all three browsers + OS deps pre-installed. No install step, job back to a 10m timeout, ~50s wall time. (Thanks for the nudge toward the container route — it's a smaller change than the caching workaround and removes the failure entirely.)

2. Firefox needs HOME=/root in the container. The Actions container mounts $HOME at /github/home (owned by pwuser) while the job runs as root; Firefox refuses to launch. Set per Playwright's documented workaround. Also dropped the redundant setup-node step since the image already has Node 22.

3. The #51 guard can't run under WebKit in CI — and bundled WebKit isn't real Safari anyway. This is the substantive finding:

  • Playwright's bundled WebKit reports navigator.credentials as configurable: true, so it does not reproduce loadOnce() throws on iOS (WebKit) since 3.2.1: cannot redefine navigator.credentials #51 on its own — the naive smoke test passes even with the 4.0.2 fix reverted.
  • The guard forces a non-configurable descriptor to simulate Safari. That's verified red-without-fix / green-with-fix on chromium + firefox. But on Linux WebKit (the CI image) the forced descriptor leaves navigator.credentials.get undefined after load (macOS WebKit doesn't), so the guard is scoped to chromium + firefox. The plain smoke + reassignment tests still run under WebKit.

Net: we now catch the regression class (something redefining navigator.credentials breaks load()) deterministically in CI. We do not catch the literal real-Safari behavior through a real Safari — that would need a macOS runner or a hosted grid (BrowserStack / Microsoft Playwright Testing). I'd suggest tracking that as a follow-up on #55 rather than blocking this PR.

Ready to come out of draft once folks are comfortable with the WebKit scoping decision.

@djscruggs djscruggs marked this pull request as ready for review June 18, 2026 21:41
@djscruggs djscruggs requested review from davidlehn and dlongley June 18, 2026 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant