
Commit 565d949

jahooma and claude committed
Self-heal missing tree-sitter.wasm by fetching from unpkg / jsdelivr
Round 11 shipped a binary that needs tree-sitter.wasm next to it (bun --compile asset embedding was broken on Windows for every mechanism we tried). The new freebuff/codebuff npm wrappers know to extract the wasm from the release tarball next to the binary, but the wrapper auto-updates only the binary, not itself. Users who installed a pre-fix wrapper therefore download the new binary, the wrapper strips the wasm along with the temp dir, and the new binary crashes on first run.

Closing that loop in the binary itself: when init-node.ts's locateFile fallback can't find a sibling tree-sitter.wasm, fetch it synchronously from a CDN (unpkg, with jsdelivr as backup) and cache it next to the binary. Subsequent runs short-circuit at the existsSync check, so the download only happens once.

Sync via execFileSync('curl', ...) because emscripten's locateFile callback must return a path immediately. curl is built into macOS, Linux, and Windows 10 1803+. If curl isn't available, we fall through to the existing thrown error with a clear message.

WEB_TREE_SITTER_VERSION is pinned to match sdk/package.json; a wasm built for a different web-tree-sitter runtime would crash with a much more confusing error than "missing wasm".

Verified locally: deleted the sibling wasm, ran the binary, download fired ("[tree-sitter] downloaded https://unpkg.com/..."), file cached next to the binary, init succeeded; second run used the cache and made no network calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 86ebd09 commit 565d949

1 file changed

Lines changed: 77 additions & 4 deletions

packages/code-map/src/init-node.ts

@@ -1,3 +1,4 @@
+import { execFileSync } from 'child_process'
 import * as fs from 'fs'
 import * as path from 'path'

@@ -6,6 +7,22 @@ import { Parser } from 'web-tree-sitter'
 const TREE_SITTER_WASM_ENV_VAR = 'CODEBUFF_TREE_SITTER_WASM_PATH'
 const WASM_BINARY_GLOBAL_KEY = '__CODEBUFF_TREE_SITTER_WASM_BINARY__'

+// Pinned to the version in sdk/package.json. If we bump web-tree-sitter,
+// update this too — fetching a wasm built for a different version of the
+// runtime would crash with a more confusing error than "missing wasm".
+const WEB_TREE_SITTER_VERSION = '0.25.10'
+
+// Self-heal endpoints for users on an old npm wrapper. The wrapper
+// auto-updates the binary but not itself, so users on pre-0.0.74
+// (freebuff) / pre-1.0.666 (codebuff) wrappers download the new binary
+// but their wrapper drops the sibling tree-sitter.wasm we tarball
+// alongside it. On missing wasm, the binary fetches it from one of
+// these CDNs and caches it next to itself for subsequent runs.
+const WASM_DOWNLOAD_URLS = [
+  `https://unpkg.com/web-tree-sitter@${WEB_TREE_SITTER_VERSION}/tree-sitter.wasm`,
+  `https://cdn.jsdelivr.net/npm/web-tree-sitter@${WEB_TREE_SITTER_VERSION}/tree-sitter.wasm`,
+]
+
 /**
  * Override the path to `tree-sitter.wasm` used during {@link initTreeSitterForNode}.
  *
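
Review note on the hunk above: the version pin is maintained by hand, so a small check could catch drift when web-tree-sitter is bumped. The following is only a sketch. `buildWasmUrls` and `pinMatchesDeclared` are hypothetical helpers that do not exist in this commit, and it assumes sdk/package.json declares the dependency as an exact version or with a leading `^` or `~`.

```typescript
// Hypothetical helpers for illustration; neither exists in this commit.

/** Build the two CDN URLs for a given web-tree-sitter version. */
function buildWasmUrls(version: string): string[] {
  return [
    `https://unpkg.com/web-tree-sitter@${version}/tree-sitter.wasm`,
    `https://cdn.jsdelivr.net/npm/web-tree-sitter@${version}/tree-sitter.wasm`,
  ]
}

/**
 * Compare the pinned constant against the range declared in package.json.
 * Assumption: the range is an exact version, optionally prefixed by ^ or ~.
 */
function pinMatchesDeclared(pinned: string, declaredRange: string): boolean {
  const declared = declaredRange.trim().replace(/^[\^~]/, '')
  return declared === pinned
}

// Usage: a unit test could read the declared range from sdk/package.json
// and fail when the two drift apart.
console.log(pinMatchesDeclared('0.25.10', '^0.25.10')) // true
console.log(pinMatchesDeclared('0.25.10', '0.25.9')) // false
```
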
@@ -30,6 +47,56 @@ function getEmbeddedWasmBinary(): Uint8Array | undefined {
   )[WASM_BINARY_GLOBAL_KEY]
 }

+/**
+ * Synchronously download tree-sitter.wasm from a public CDN and write it
+ * to `targetPath`. Returns the path on success, null on any failure.
+ *
+ * Sync rather than async because this is called from emscripten's
+ * locateFile callback, which must return a path immediately. We shell
+ * out to `curl` (built-in on macOS / Linux / Windows 10+); if that
+ * isn't available or the network's down, the caller falls through to
+ * the next resolution strategy and ultimately throws a clear error.
+ *
+ * Logs a one-line status to stderr so users see what's happening on
+ * the first run after an old-wrapper auto-update.
+ */
+function downloadWasmTo(targetPath: string): string | null {
+  // Print to stderr so it doesn't pollute machine-readable stdout.
+  // Visible to humans during the (briefly noticeable) first launch.
+  process.stderr.write(
+    `[tree-sitter] tree-sitter.wasm missing; downloading to ${targetPath}\n`,
+  )
+  for (const url of WASM_DOWNLOAD_URLS) {
+    try {
+      execFileSync(
+        'curl',
+        [
+          '-fsSL',
+          '--connect-timeout',
+          '10',
+          '--max-time',
+          '60',
+          '-o',
+          targetPath,
+          url,
+        ],
+        { stdio: 'pipe' },
+      )
+      if (fs.existsSync(targetPath) && fs.statSync(targetPath).size > 0) {
+        process.stderr.write(`[tree-sitter] downloaded ${url}\n`)
+        return targetPath
+      }
+    } catch (err) {
+      process.stderr.write(
+        `[tree-sitter] download from ${url} failed: ${
+          err instanceof Error ? err.message : String(err)
+        }\n`,
+      )
+    }
+  }
+  return null
+}
+
 function resolveTreeSitterWasm(scriptDir: string): string {
   // Only return paths that fs.existsSync confirms — emscripten will
   // fs.readFile whatever we hand it, and bunfs internal paths (the
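
Review note on `downloadWasmTo`: curl writes straight to `targetPath`, so a transfer cut off by `--max-time` may leave a non-empty partial file that the next run's existsSync check would accept as the cache. One possible mitigation is to download to a temporary sibling and rename it into place only after the size check. This is a sketch, not the commit's behavior: `downloadAtomically` is a hypothetical helper, and `fetchTo` stands in for the `execFileSync('curl', ...)` call.

```typescript
import * as fs from 'fs'
import * as os from 'os'
import * as path from 'path'

// Hypothetical helper, not part of this commit. `fetchTo` must write the
// payload to the path it is given, or throw (as execFileSync does when curl
// exits non-zero).
function downloadAtomically(
  targetPath: string,
  fetchTo: (tmpPath: string) => void,
): string | null {
  const tmpPath = `${targetPath}.${process.pid}.tmp`
  try {
    fetchTo(tmpPath)
    if (!fs.existsSync(tmpPath) || fs.statSync(tmpPath).size === 0) return null
    // On POSIX, a rename within one directory replaces the target atomically,
    // so readers see either no file or the complete one.
    fs.renameSync(tmpPath, targetPath)
    return targetPath
  } catch {
    return null
  } finally {
    // Remove any leftover partial download; `force` ignores a missing file.
    fs.rmSync(tmpPath, { force: true })
  }
}

// Usage: simulate a transfer that dies after writing some bytes.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'wasm-demo-'))
const demoTarget = path.join(demoDir, 'tree-sitter.wasm')
const demoResult = downloadAtomically(demoTarget, (tmp) => {
  fs.writeFileSync(tmp, 'partial bytes')
  throw new Error('simulated timeout')
})
console.log(demoResult, fs.existsSync(demoTarget)) // null false
```

With this shape, a failed attempt leaves nothing at the cache path, so the existsSync short-circuit only ever sees a file that passed the size check.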
@@ -56,13 +123,19 @@ function resolveTreeSitterWasm(scriptDir: string): string {
   // path later. emscripten calls this locateFile callback during
   // Parser.init's async work, by which time execPath has stabilized.
   try {
-    const sibling = path.join(
-      path.dirname(process.execPath),
-      'tree-sitter.wasm',
-    )
+    const siblingDir = path.dirname(process.execPath)
+    const sibling = path.join(siblingDir, 'tree-sitter.wasm')
     if (fs.existsSync(sibling)) {
       return sibling
     }
+
+    // Self-heal: download from a CDN and cache next to the binary. This
+    // is the path users on old npm wrappers take — their wrapper
+    // auto-updated the binary but didn't extract the tarballed wasm
+    // sibling, so the file isn't there on first run. Once we cache it,
+    // subsequent runs short-circuit at the existsSync above.
+    const downloaded = downloadWasmTo(sibling)
+    if (downloaded) return downloaded
   } catch {
     // process.execPath may be unavailable in exotic runtimes; fall through.
   }
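
Review note on the hunk above: the "download once, then short-circuit" behavior can be exercised without touching the network by injecting the filesystem check and the downloader. A minimal model of the control flow follows; `resolveSiblingWasm` and `ResolveDeps` are hypothetical names used only for illustration and are not the commit's code.

```typescript
// Hypothetical model of the sibling lookup, with the existence check and the
// downloader injected so the flow can be tested in memory.
type ResolveDeps = {
  exists: (p: string) => boolean
  download: (p: string) => string | null
}

function resolveSiblingWasm(sibling: string, deps: ResolveDeps): string | null {
  if (deps.exists(sibling)) return sibling // cached by an earlier run
  return deps.download(sibling) // null lets the caller fall through
}

// Usage: an in-memory "disk" shows the download firing only on the first run.
const disk = new Set<string>()
let downloadCount = 0
const fakeDeps: ResolveDeps = {
  exists: (p) => disk.has(p),
  download: (p) => {
    downloadCount += 1
    disk.add(p)
    return p
  },
}
const firstRun = resolveSiblingWasm('/opt/app/tree-sitter.wasm', fakeDeps)
const secondRun = resolveSiblingWasm('/opt/app/tree-sitter.wasm', fakeDeps)
console.log(firstRun === secondRun, downloadCount) // true 1
```
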
