Add CROSS_ORIGIN_STORAGE link flag #27066

Open
tomayac wants to merge 78 commits into emscripten-core:main from tomayac:cross-origin-storage

@tomayac tomayac commented Jun 8, 2026

Integrates the WICG Cross-Origin Storage API into Emscripten's standard Wasm loading path as a progressive enhancement.

When -sCROSS_ORIGIN_STORAGE=1 is set at link time, Emscripten computes the SHA-256 hash of the final .wasm binary and embeds it as a build-time constant. At runtime the generated JavaScript tries to retrieve the module from the cross-origin cache before falling back to the normal fetch() / WebAssembly.instantiateStreaming() path, so pages always load regardless of browser support.

New settings

  • -sCROSS_ORIGIN_STORAGE=1 — enables the feature (Web target only, disabled by default).
  • -sCROSS_ORIGIN_STORAGE_ORIGINS — controls which origins can read the cached file: '*' (default, globally available), an explicit HTTPS origin list, or [] (same-site only).

Because no browser ships the API natively yet, testing requires the COS Chrome extension (source), which polyfills navigator.crossOriginStorage on every page.

Example

Without the COS API

Loading goes through the normal fetch() / WebAssembly.instantiateStreaming() path:

Screenshot 2026-06-08 at 12 26 41

With the COS API

With the COS Chrome extension installed, its popup window shows the details of each load.

First time load

Loading goes through the normal fetch() / WebAssembly.instantiateStreaming() path, but the resource is stored in COS for the next time:

Screenshot 2026-06-08 at 12 27 41

Repeated load

Loading now goes through the COS path:

Screenshot 2026-06-08 at 12 29 34

Loading on a different origin

Loading now goes through the COS path and the resource is shared across origins:

Screenshot 2026-06-08 at 12 31 02

tomayac and others added 30 commits June 6, 2026 08:50
Adds a new -sCROSS_ORIGIN_STORAGE=1 compiler flag that enables
progressive enhancement of Wasm loading via the Cross-Origin Storage
browser API, targeting web environments only.

Changes:
- src/settings.js: adds CROSS_ORIGIN_STORAGE flag (default 0/off)
- src/settings_internal.js: adds WASM_SHA256 internal variable
- tools/link.py: computes SHA-256 of final .wasm after wasm-opt and
  injects it into settings as WASM_SHA256, available to JS templates
- src/preamble.js: in instantiateAsync(), attempts COS cache lookup
  by hash before falling back to the normal streaming fetch path;
  on a cache miss, stores the fetched bytes in COS for reuse

The COS path is guarded by #if ENVIRONMENT_MAY_BE_WEB and
#if CROSS_ORIGIN_STORAGE so no dead code ships in other targets
or when the feature is not enabled.
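
The link-time hashing step described in this commit can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in tools/link.py: the function name `compute_wasm_sha256` and the chunk size are hypothetical, and only the use of SHA-256 over the final .wasm bytes comes from the commit message.

```python
import hashlib
from pathlib import Path


def compute_wasm_sha256(wasm_path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 of a file as a 64-character lowercase hex string.

    Hypothetical helper for illustration; not the function used by Emscripten.
    """
    digest = hashlib.sha256()
    with Path(wasm_path).open("rb") as f:
        # Read in chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because the digest depends only on the bytes, it has to be taken after every tool that rewrites the binary has run, which is why the commit computes it after wasm-opt.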
The previous implementation used a fabricated API surface. This commit
aligns with the actual spec at https://github.com/WICG/cross-origin-storage

Key corrections:

- Feature detection: 'crossOriginStorage' in navigator
  (not typeof crossOriginStorage !== 'undefined')

- Hash format: { algorithm: 'SHA-256', value: '<lowercase hex>' }
  (not a bare string; spec requires this exact object shape and
   throws TypeError if value is not a valid lowercase hex string of
   length 64)

- Cache-hit read: requestFileHandles([hash]) → handle.getFile()
  → blob.arrayBuffer() → WebAssembly.instantiate()
  (not a .get() method that returns an ArrayBuffer directly)

- Cache-miss write: requestFileHandles([hash], { create: true, origins: '*' })
  → handle.createWritable() → writable.write(new Blob([bytes]))
  → writable.close()
  - origins:'*' makes the Wasm module globally available, which is
    appropriate for public Wasm assets used across many sites
  - the UA verifies the hash against the written Blob, so we must
    write a Blob (not a raw ArrayBuffer)
  - store is fire-and-forget (async IIFE) so instantiation is never
    blocked on the write completing

- Error handling: distinguish NotFoundError (cache miss, recoverable)
  from NotAllowedError / other errors (fall through to standard path)
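
The hash object shape listed under "Hash format" above can be checked on the build side with a small Python helper. This is only a sketch: `make_cos_hash` is a hypothetical name, it raises `ValueError` rather than the `TypeError` the browser API is described as throwing, and it mirrors only the rule stated in the commit message (lowercase hex, length 64).

```python
import re

# 64 lowercase hexadecimal characters, i.e. a hex-encoded SHA-256 digest.
_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def make_cos_hash(hex_value: str) -> dict:
    """Build the {algorithm, value} object described in the commit message.

    Hypothetical helper for illustration only.
    """
    if not _SHA256_HEX.fullmatch(hex_value):
        raise ValueError("expected 64 lowercase hex characters")
    return {"algorithm": "SHA-256", "value": hex_value}
```

Python's `hashlib` `hexdigest()` already returns lowercase hex, so a digest produced that way passes this check unchanged.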
Five tests in test/test_other.py covering all meaningful static build
output properties of the feature (no browser runtime required):

test_cross_origin_storage_js_output
  - Verifies all expected API surface is present in the emitted JS when
    -sCROSS_ORIGIN_STORAGE=1 -sENVIRONMENT=web is used:
    'crossOriginStorage' in navigator (feature detection)
    navigator.crossOriginStorage.requestFileHandles (correct API)
    algorithm: 'SHA-256' / value: '<hex>' (spec hash object shape)
    origins: '*' (globally-available flag for public Wasm)
    getFile() / createWritable() (read and write paths)
    'NotFoundError' / 'NotAllowedError' (error discrimination)
  - Reads the emitted .wasm file and verifies the embedded hash value
    exactly matches hashlib.sha256(wasm_bytes).hexdigest()

test_cross_origin_storage_disabled_by_default
  - Verifies crossOriginStorage is absent from JS output when the flag
    is not passed (default CROSS_ORIGIN_STORAGE=0)

test_cross_origin_storage_not_emitted_for_node_target
  - Verifies the #if ENVIRONMENT_MAY_BE_WEB preprocessor guard strips
    all COS code when -sENVIRONMENT=node, even with the flag enabled

test_cross_origin_storage_not_emitted_for_single_file
  - Verifies COS code is absent in SINGLE_FILE builds where there is
    no standalone .wasm file to hash

test_cross_origin_storage_hash_changes_with_content
  - Compiles two different source files and asserts their embedded hashes
    differ, confirming the hash reflects actual binary content

- site/source/docs/compiling/CrossOriginStorage.rst (new)
  Full guide covering: overview, usage, build-time / runtime behaviour,
  testing with the extension polyfill, hash verification, and relationship
  to other caching mechanisms. Links to the WICG explainer and extension.

- site/source/docs/compiling/index.rst
  Register CrossOriginStorage in the section toctree and bullet list.

- src/settings.js
  Expand the CROSS_ORIGIN_STORAGE comment into RST-friendly prose that
  the auto-generated settings_reference.rst page will render cleanly,
  including a cross-reference to the new guide page.

A self-contained example in test/cross_origin_storage/ demonstrating the
-sCROSS_ORIGIN_STORAGE=1 flag:

  main.cpp      — minimal C++ source exporting a greet() function
  index.html    — browser shell that reports COS API availability,
                  hooks Module.print/printErr onto the page, and
                  calls greet() after the runtime initialises
  README.md     — build instructions, prerequisites (COS extension),
                  run instructions, and hash-verification one-liner

The example follows the same pattern as test/vite/ and test/webpack/:
a directory you build and serve locally to exercise a browser-only feature.

src/preamble.js
  Add three optional Module callbacks invoked at each COS runtime event:
  - Module['onCOSCacheHit'](hash)   — Wasm served from cross-origin cache
  - Module['onCOSCacheMiss'](url)   — Wasm not in COS, fetched from network
  - Module['onCOSStore'](hash)      — Wasm successfully written to COS

test/cross_origin_storage/index.html
  Rewrite to use the new callbacks exclusively. No mention of any
  particular implementation. Reports:
  - whether the COS API is active or not
  - on miss: the network URL the Wasm was fetched from, then the hash
    once it has been stored
  - on hit: the SHA-256 hash of the Wasm resource served from COS

test/cross_origin_storage/README.md
  Remove extension references; describe only the observable behaviour
  (which path was taken, hash, URL).

site/source/docs/compiling/CrossOriginStorage.rst
  Document the three callbacks with a usage example, and fold them into
  the runtime flow description.

test/test_other.py
  Assert all three Module callback strings are present in the emitted JS
  when CROSS_ORIGIN_STORAGE=1.

…ORAGE

tools/link.py
  - CROSS_ORIGIN_STORAGE + SINGLE_FILE → hard error at link time
    (wasm is inlined as base64; there is no file to hash)
  - CROSS_ORIGIN_STORAGE + WASM_ASYNC_COMPILATION=0 → warning
    (sync instantiation bypasses the async COS fetch path entirely)
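
The two link-time checks listed above can be sketched as a small Python function. The function name, the shape of the `settings` dictionary, and the message texts are assumptions made for this example; the real checks live in tools/link.py and may be worded and structured differently.

```python
def check_cos_settings(settings: dict) -> tuple:
    """Return (errors, warnings) for the two combinations listed above.

    Hypothetical helper; message texts are illustrative, not Emscripten's.
    """
    errors = []
    warnings = []
    if not settings.get("CROSS_ORIGIN_STORAGE"):
        return errors, warnings  # feature disabled: nothing to check
    if settings.get("SINGLE_FILE"):
        # No standalone .wasm file exists, so there is nothing to hash.
        errors.append("CROSS_ORIGIN_STORAGE cannot be combined with SINGLE_FILE")
    # Assumption: async compilation is on unless explicitly set to 0.
    if settings.get("WASM_ASYNC_COMPILATION", 1) == 0:
        # Synchronous instantiation never reaches the async COS lookup.
        warnings.append(
            "CROSS_ORIGIN_STORAGE has no effect with WASM_ASYNC_COMPILATION=0"
        )
    return errors, warnings
```

Returning errors and warnings separately keeps the hard failure (no file to hash) distinct from the no-op case, which matches the error versus warning split in the commit.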

test/test_other.py
  Replace test_cross_origin_storage_not_emitted_for_single_file with
  two sharper tests:
  - test_cross_origin_storage_error_with_single_file: asserts the
    expected error message via assert_fail
  - test_cross_origin_storage_warning_without_async_compilation: asserts
    the warning appears in stderr

ChangeLog.md
  Add entry under 6.0.1 (in development) describing the new flag,
  its build-time/runtime behaviour, the three Module callbacks, and
  the two incompatible flag combinations.

SINGLE_FILE uses a custom UTF-8 binary embedding by default
(SINGLE_FILE_BINARY_ENCODE=1), not base64. The point that matters is
that the wasm is inlined directly into the JS output with no standalone
.wasm file or fetchable URL — the encoding method is irrelevant to COS.

Updated: settings.js comment, link.py error message, CrossOriginStorage.rst,
and ChangeLog.md.

…N_MODULE

The COS integration only hashes and caches the primary .wasm output.
Emscripten can produce additional Wasm files that are outside its scope:

- SPLIT_MODULE: produces .deferred.wasm / .<id>.wasm secondary files
  that are fetched lazily when a deferred function is first called.
- MAIN_MODULE: side modules loaded at runtime via dlopen are separate
  .wasm files fetched through the normal network path.

Both combinations now emit a link-time warning explaining that only
the primary .wasm is covered, so developers are not surprised when
secondary files are always fetched from the network.

Tests added for both warning cases in test/test_other.py.
CrossOriginStorage.rst updated to document all four limitations
(SINGLE_FILE, WASM_ASYNC_COMPILATION=0, SPLIT_MODULE, MAIN_MODULE).

SIDE_MODULE builds output only a .wasm file with no JS glue, so there
is nothing to embed the hash into or to perform the COS lookup at
runtime. Add a warning to make this explicit.

Also corrects the earlier reasoning about SPLIT_MODULE: the secondary
.deferred.wasm files are produced by a user-run offline wasm-split step
with profiling data, not during the Emscripten link, so they genuinely
cannot be hashed at link time.

… binaries

The key insight is that COS only delivers a benefit when the .wasm binary
is byte-identical across many origins — i.e. a publicly distributed
library (SQLite Wasm, Pyodide, CanvasKit, ffmpeg.wasm) served from a
CDN. Application-specific Wasm gains nothing that the HTTP cache does
not already provide, because no other origin will ever have the same hash.

- src/settings.js: lead the CROSS_ORIGIN_STORAGE comment with a
  'When to use this flag' paragraph listing good candidates and
  explicitly warning against using it for app-specific Wasm.
- src/preamble.js: add the same guidance to the runtime comment block.
- CrossOriginStorage.rst: add a 'When to use this flag' subsection
  before Usage with a concrete list of good candidates and a clear
  'Do not' statement for application-specific code.

Exposes the 'origins' field of the COS API's
CrossOriginStorageRequestFileHandleOptions dictionary as a new
-sCROSS_ORIGIN_STORAGE_ORIGINS linker setting.

Three modes, matching the spec:

  ['*']  (default) — globally available; any origin can read the file.
         Appropriate for widely-shared public CDN assets (SQLite Wasm,
         Pyodide, CanvasKit, …).

  ['https://app.example.com', 'https://api.example.com']
         — restricted to a specific set of trusted HTTPS origins.
         For proprietary resources shared across related sites without
         making them globally enumerable.

  []     — same-site only; the 'origins' field is omitted entirely.
         The file is available only to same-site origins.

Link-time validation:
  - '*' must not be mixed with explicit origins (error)
  - each explicit origin must match https://host[:port] with no path (error)
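
The two validation rules above can be sketched in Python as follows. The function name and the regular expression are assumptions for illustration; the pattern is deliberately simple (it does not handle IPv6 literals, for example) and the actual check in tools/link.py may differ.

```python
import re

# https://host[:port] with no path, query, or fragment.
# Assumed pattern for illustration only; not taken from tools/link.py.
_ORIGIN_RE = re.compile(r"https://[A-Za-z0-9.-]+(:\d+)?")


def validate_origins(origins: list) -> None:
    """Raise ValueError if the origin list breaks either rule above."""
    if "*" in origins and len(origins) > 1:
        raise ValueError("'*' must not be mixed with explicit origins")
    for origin in origins:
        if origin != "*" and not _ORIGIN_RE.fullmatch(origin):
            raise ValueError(f"invalid origin: {origin!r}")
```

Using `fullmatch` rather than `match` is what rejects a trailing path component such as `https://app.example.com/lib`.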

preamble.js emits the correct JS literal via {{{ JSON.stringify(...) }}}:
  - ['*']            → origins: '*'
  - ['https://...']  → origins: ["https://..."]
  - []               → { create: true }  (no origins key)
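
The mapping from the setting value to the emitted options literal can be mirrored in Python for illustration. The PR does this inside the preamble.js template via JSON.stringify; the function below is a hypothetical stand-in that reproduces only the three cases listed above, and the exact whitespace of the real output is an assumption.

```python
import json


def render_cos_write_options(origins: list) -> str:
    """Render the options object literal for the three modes listed above.

    Hypothetical helper; the real output is produced by the JS template.
    """
    if not origins:
        # Same-site only: the 'origins' key is omitted entirely.
        return "{ create: true }"
    if origins == ["*"]:
        # Globally available: a bare '*' string rather than an array.
        return "{ create: true, origins: '*' }"
    # Restricted list: emitted as a JSON array of origin strings.
    return "{ create: true, origins: " + json.dumps(origins) + " }"
```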

Tests (test/test_other.py):
  - default emits origins: '*'
  - explicit list emits a JS array
  - empty list emits { create: true } with no origins key
  - error on '*' mixed with explicit origins
  - error on non-HTTPS origin
  - error on origin with a path component

Docs (CrossOriginStorage.rst): new subsection with all three modes,
example command lines, and a note on the spec's visibility upgrade rule.
ChangeLog.md updated.

Previously the default value was ['*'] in settings.js, which made it
impossible to distinguish 'user did not pass the setting' from 'user
explicitly passed ["*"]'.

Now:
- Default in settings.js is [] (empty sentinel)
- link.py checks user_settings: if CROSS_ORIGIN_STORAGE_ORIGINS was not
  explicitly passed, it is resolved to ['*'] at link time (globally
  available — the right default for a public Wasm binary)
- Explicitly passing =[] means same-site only (origins field omitted)
- Explicitly passing =['https://...'] means restricted list

This means the common case requires no extra flags:
  emcc -sCROSS_ORIGIN_STORAGE=1   →  origins: '*'

Docs and tests updated to reflect the new sentinel semantics.
An extra test asserts that explicitly passing ['*'] gives the same
result as the implicit default.
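
The sentinel logic in this commit comes down to checking whether the key was passed at all, rather than checking its value. A minimal Python sketch, assuming `user_settings` is a dictionary that contains only the settings the user passed explicitly and that the value has already been parsed into a list (the function name is hypothetical):

```python
def resolve_origins(user_settings: dict) -> list:
    """Resolve CROSS_ORIGIN_STORAGE_ORIGINS as described in the commit above.

    Hypothetical helper: distinguishes 'not passed' from an explicit [].
    """
    if "CROSS_ORIGIN_STORAGE_ORIGINS" not in user_settings:
        # Setting not passed: default to globally available.
        return ["*"]
    # Explicit value, including [] (same-site only), is kept as given.
    return list(user_settings["CROSS_ORIGIN_STORAGE_ORIGINS"])
```

Testing for key presence is what lets an explicit `[]` keep its same-site meaning instead of being mistaken for the default.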
tools/link.py
  - Add warning when CROSS_ORIGIN_STORAGE=1 is used with a non-web
    environment (ENVIRONMENT=node, shell, etc.): navigator.crossOriginStorage
    is a browser API and is never available outside the browser.  This makes
    the non-web case consistent with all other no-op combinations which
    already warn (WASM_ASYNC_COMPILATION=0, SPLIT_MODULE, MAIN_MODULE,
    SIDE_MODULE).
  - Fix stale 'Inline / SINGLE_FILE builds' comment in the hash computation
    else-branch: SINGLE_FILE is now a hard error so that branch is only
    reached in unexpected build configurations.

src/preamble.js
  - Fix stale comment that hardcoded origins:'*'; replace with a reference
    to -sCROSS_ORIGIN_STORAGE_ORIGINS so the comment stays accurate
    regardless of the setting value.

test/test_other.py
  - Update test_cross_origin_storage_not_emitted_for_node_target to also
    assert the new warning is emitted (matching the pattern of all other
    warning tests).

CrossOriginStorage.rst
  - 'silently ignored' → 'emits a warning' for non-web targets.
  - Fix stale origins:'*' hardcode in the 'How it works / Cache miss' step;
    now references -sCROSS_ORIGIN_STORAGE_ORIGINS instead.

The warning was implemented in tools/link.py but never tested.
Added test_cross_origin_storage_warning_with_side_module to assert
the expected message appears in stderr.

Replace "distributed from a CDN" with "popular library loaded by many
independent sites", and add a short note explaining that COS cannot be
used as a timing oracle for restricted entries: a cache hit requires an
explicit prior write that provided the actual bytes.

…stantiateWasm loaders

When a program supplies its own Module['instantiateWasm'] callback,
Emscripten calls it directly and skips instantiateAsync(), so the
built-in COS fetch logic is never reached.  To give custom loaders the
information they need to implement their own COS-aware path, expose the
build-time SHA-256 as Module['wasmSHA256'] (set before instantiateWasm
is called) whenever -sCROSS_ORIGIN_STORAGE=1 is set.

- src/preamble.js: assign Module['wasmSHA256'] from the WASM_SHA256
  template literal, guarded by #if CROSS_ORIGIN_STORAGE, before the
  Module['instantiateWasm'] dispatch.
- test/test_other.py: two new tests — one that checks the property is
  present and matches the .wasm SHA-256, one that checks it is absent
  without the flag.
- site/source/docs/compiling/CrossOriginStorage.rst: new section
  "Custom Module['instantiateWasm'] implementations" documenting the
  bypass limitation and the Module['wasmSHA256'] escape hatch with a
  full worked example.

tomayac added 3 commits June 10, 2026 18:41
Build instructions belong in README.md, not the source file.
Copyright year corrected to 2026. Per reviewer feedback.

Not used by any automated test; browser tests use browser_test_hello_world.c
directly. The directory caused repeated confusion and build artifact issues.
Feature documentation lives in site/source/docs/compiling/CrossOriginStorage.rst.
Per reviewer feedback.
Comment thread ChangeLog.md Outdated
Comment thread tools/link.py Outdated
Comment thread test/test_browser.py Outdated
Comment thread test/test_browser.py Outdated
Comment thread test/browser_common.py
tomayac added 2 commits June 10, 2026 19:02
SIDE_MODULE is now a hard error (no JS glue emitted — genuinely
incompatible). SPLIT_MODULE and MAIN_MODULE partial-coverage warnings
are dropped; they are not true incompatibilities and add noise for an
experimental feature. Per reviewer feedback (r3383144746).

- Drop -sENVIRONMENT=web (web is included by default)
- Drop inline comment and docstrings
- Drop section banner comment

Per reviewer feedback.
@tomayac tomayac requested review from kripken and sbc100 June 10, 2026 17:24
Comment thread src/preamble.js Outdated
Comment thread src/preamble.js Outdated
Comment thread src/preamble.js Outdated
Comment thread ChangeLog.md Outdated
tomayac added 3 commits June 10, 2026 19:49
- Use globalThis.navigator?.crossOriginStorage (matches codebase pattern)
- Move var cosHash inside the if block
- Hardcode 'SHA-256' in the template; drop <<< WASM_HASH_ALGORITHM >>> placeholder

Per reviewer feedback.
@tomayac tomayac requested a review from sbc100 June 10, 2026 17:58

@sbc100 sbc100 left a comment

lgtm % final nits

Comment thread test/test_other.py Outdated
Comment thread test/test_other.py Outdated
Comment thread test/test_other.py Outdated
Comment thread test/test_browser.py
@tomayac

tomayac commented Jun 10, 2026

Thanks a lot for the reviews, @sbc100 and @kripken! PTAL.

@sbc100 sbc100 left a comment

lgtm!

I do have a question around when we might be able to remove the polyfill, and a slight concern about shipping features that are supported in zero browsers.

But since these are tagged as experimental it should be easy enough to rip them out if that feature doesn't pan out I guess?

Comment thread test/browser_common.py
@tomayac

tomayac commented Jun 10, 2026

As this is opt-in and marked as experimental, most developers will likely never touch it. At this stage, this is meant for "white glove" partners we may want to work with directly, like SQLite Wasm (and also see ffmpegwasm/ffmpeg.wasm#940), and of course interested power users.

For availability and a feature bug to track, keep an eye on ChromeStatus for current developer interest. (You're in really good company.) We know that we will have a chicken and egg problem rolling this out, so the sooner we start, the higher the chance of cache hits once it lands, and developers will need lead time.

For early adopters, there's the Cross-Origin Storage extension that accurately implements the proposed API today. (We use a similar approach with WebMCP, where the Model Context Tool Inspector extension mimics an actual agent's interactions.)

Thanks for the reviews again! Really appreciate you taking the time! I was really out of my comfort zone most of the time, so thanks for being my patient guard rails! <3

@sbc100

sbc100 commented Jun 10, 2026

To get the codesize tests passing you will want to do emsdk install tot then run ./tools/maint/rebaseline_tests.py (or ./test/runner codesize --rebase)

3 participants