Commit 9cee852

jahooma and claude committed
Embed tree-sitter wasm as base64 string literal in CLI binary

Freebuff 0.0.62 still crashed on Windows with the same "Internal error: tree-sitter.wasm not found" — surfaced this time through the late renderer-cleanup handler ("Unhandled rejection: error: ...") instead of the early one, so it appeared *after* the logo had rendered. CI Windows smoke passed because the rejection fires past the 5s kill timer (after React mounts and the renderer is up), and even when it does fire, the boot screen has already matched our positive signal.

Root cause: the previous fix's `fs.readFileSync(treeSitterWasmPath)` of the bunfs path silently fails on Windows for some user environments, its catch block falls through, globalThis stays unset, and init-node then hits the broken path-based fallback. CI Windows happened to pass fs.readFileSync — user Windows didn't.

Bypass the filesystem entirely: bake the wasm bytes into the JS source as a base64 string literal that bun --compile bundles into the binary's text segment. No runtime fs read, no path normalization, no platform quirks.

- cli/src/pre-init/tree-sitter-wasm-bytes.ts: committed stub with empty base64. Dev mode and unit tests see this and fall through to code-map's path-based resolution (which works locally because node_modules/web-tree-sitter/tree-sitter.wasm exists).
- cli/scripts/build-binary.ts: overwrites the stub with the real bytes before `bun build --compile`, restores it after. `process.on('exit', restore)` is a backstop so a crash mid-build doesn't leave a multi-MB diff in the working tree.
- cli/src/pre-init/tree-sitter-wasm.ts: drop the `with { type: 'file' }` + readFileSync path, decode the embedded base64 directly.
- cli/scripts/smoke-binary.ts: bump the run window from 5s to 10s and match the late-handler form ("Unhandled rejection:" / "Uncaught exception:") in addition to the early one. The 0.0.62 regression fired *after* the boot screen rendered, so a positive boot signal alone isn't enough — we need to keep watching for fatal markers through the full window.

Verified locally: full bun --compile build embeds 205KB of wasm as 274KB of base64, stub is restored after build (and after a simulated mid-build crash via the exit handler), binary boots cleanly to the chat surface with no wasm errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
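
The size figures in the message follow from how base64 works: every 3 input bytes become 4 output characters, so the encoded text is about a third larger than the raw bytes. The following is a minimal sketch of that arithmetic and of the encode/decode round trip; `base64Length` and `roundTrip` are illustrative names that do not appear in the commit.

```typescript
import { Buffer } from 'node:buffer'

/** Length in characters of the padded base64 encoding of `byteLength` bytes. */
export function base64Length(byteLength: number): number {
  // Every group of up to 3 input bytes becomes 4 output characters.
  return 4 * Math.ceil(byteLength / 3)
}

/** Round-trips bytes through base64, mirroring the build-time and runtime steps. */
export function roundTrip(bytes: Uint8Array): Uint8Array {
  const encoded = Buffer.from(bytes).toString('base64')
  const decoded = Buffer.from(encoded, 'base64')
  return new Uint8Array(decoded.buffer, decoded.byteOffset, decoded.byteLength)
}

// Assumed payload size of 205 KiB, taken from the commit message's rounded figure.
const sampleBytes = 205 * 1024
console.log(`${sampleBytes} bytes -> ${base64Length(sampleBytes)} base64 chars`)
```

The one-third overhead is the price paid for removing the runtime filesystem read.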
1 parent de7bfac commit 9cee852

4 files changed

Lines changed: 115 additions & 31 deletions

cli/scripts/build-binary.ts

Lines changed: 55 additions & 0 deletions
@@ -1,6 +1,7 @@
 #!/usr/bin/env bun
 
 import { spawnSync, type SpawnSyncOptions } from 'child_process'
+import { createRequire } from 'module'
 import {
   chmodSync,
   existsSync,
@@ -144,6 +145,11 @@ async function main() {
   patchOpenTuiAssetPaths()
   await ensureOpenTuiNativeBundle(targetInfo)
 
+  const restoreTreeSitterWasmStub = embedTreeSitterWasmAsBase64()
+  // Restore the stub even on build failure so a developer's git working
+  // tree doesn't end up with a multi-megabyte modified file.
+  process.on('exit', restoreTreeSitterWasmStub)
+
   const outputFilename =
     targetInfo.platform === 'win32' ? `${binaryName}.exe` : binaryName
   const outputFile = join(binDir, outputFilename)
@@ -185,6 +191,11 @@ async function main() {
 
   runCommand('bun', buildArgs, { cwd: cliRoot })
 
+  // Build done — restore the stub so a developer's working tree doesn't show
+  // a multi-megabyte diff. (The exit handler above is a backstop for crashes;
+  // the eager call here keeps a successful build clean.)
+  restoreTreeSitterWasmStub()
+
   if (targetInfo.platform !== 'win32') {
     chmodSync(outputFile, 0o755)
   }
@@ -203,6 +214,50 @@ main().catch((error: unknown) => {
   process.exit(1)
 })
 
+/**
+ * Inline the contents of `web-tree-sitter/tree-sitter.wasm` as a base64 string
+ * literal in `cli/src/pre-init/tree-sitter-wasm-bytes.ts`. The committed
+ * file is a stub; this overwrites it with the real bytes immediately before
+ * `bun build --compile`, so the bytes get baked into the binary's text
+ * segment instead of being placed at a bunfs path that has to be fs-read at
+ * runtime.
+ *
+ * Returns a function that restores the stub. Always invoke it (success or
+ * failure) so a developer's working tree doesn't show a multi-MB diff.
+ */
+function embedTreeSitterWasmAsBase64(): () => void {
+  const stubPath = join(cliRoot, 'src', 'pre-init', 'tree-sitter-wasm-bytes.ts')
+  const originalStub = readFileSync(stubPath, 'utf8')
+  let restored = false
+  const restore = (): void => {
+    if (restored) return
+    restored = true
+    try {
+      writeFileSync(stubPath, originalStub)
+    } catch (error) {
+      console.error('Failed to restore tree-sitter-wasm-bytes stub:', error)
+    }
+  }
+
+  // Resolve from the CLI workspace so monorepo hoisting differences don't
+  // matter — `web-tree-sitter` is an SDK dep, but the CLI imports it
+  // transitively and the bundler walks it from here.
+  const cliRequire = createRequire(join(cliRoot, 'package.json'))
+  const wasmPath = cliRequire.resolve('web-tree-sitter/tree-sitter.wasm')
+  const wasmBytes = readFileSync(wasmPath)
+  const base64 = wasmBytes.toString('base64')
+
+  const generated =
+    `// AUTO-GENERATED by cli/scripts/build-binary.ts during \`bun build --compile\`.\n` +
+    `// Restored to the empty stub after the build finishes — do not commit a\n` +
+    `// non-empty value here.\n` +
+    `export const TREE_SITTER_WASM_BASE64 = ${JSON.stringify(base64)}\n`
+
+  writeFileSync(stubPath, generated)
+  log(`Embedded tree-sitter.wasm (${wasmBytes.length} bytes → ${base64.length} chars base64)`)
+  return restore
+}
+
 function patchOpenTuiAssetPaths() {
   const coreDir = join(cliRoot, 'node_modules', '@opentui', 'core')
   if (!existsSync(coreDir)) {
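
The overwrite-then-restore pattern in this file can be tried in isolation. Below is a minimal sketch that runs against a throwaway file in the OS temp directory rather than the real stub; `overwriteWithRestore`, the `stub-demo-` directory prefix, and the `DATA` constant are hypothetical names chosen for illustration.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from 'node:fs'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

/**
 * Overwrites `path` with `generated` and returns a function that puts the
 * original contents back. The returned function is safe to call repeatedly.
 */
export function overwriteWithRestore(path: string, generated: string): () => void {
  const original = readFileSync(path, 'utf8')
  let restored = false
  const restore = (): void => {
    if (restored) return // second and later calls are no-ops
    restored = true
    writeFileSync(path, original)
  }
  writeFileSync(path, generated)
  return restore
}

// Hypothetical usage against a temp file, not the real stub.
const dir = mkdtempSync(join(tmpdir(), 'stub-demo-'))
const demoPath = join(dir, 'bytes.ts')
writeFileSync(demoPath, "export const DATA = ''\n")

const restoreDemo = overwriteWithRestore(demoPath, "export const DATA = 'AAAA'\n")
process.on('exit', restoreDemo) // backstop if the work below throws
try {
  // ... the build step would run here ...
} finally {
  restoreDemo() // eager restore on the normal path
}
```

The `restored` flag is what makes the eager call and the exit handler compatible: whichever runs first does the write, and the other does nothing. Note that the `'exit'` event only covers normal termination and `process.exit()`; a process killed with SIGKILL never runs the handler.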

cli/scripts/smoke-binary.ts

Lines changed: 16 additions & 1 deletion
@@ -57,14 +57,29 @@ const BOOT_SIGNAL_PATTERNS = [
 // regressions of bugs we've already seen. The boot-signal check above is
 // the real gate: it fails on *any* startup problem, including ones whose
 // error text we never thought to add here.
+//
+// Note both paths the cli error handlers print: "Fatal error during
+// startup" (earlyFatalHandler in cli/src/index.tsx, fires while main()
+// is still wiring up) and "Unhandled rejection:" / "Uncaught exception:"
+// (installProcessCleanupHandlers in cli/src/utils/renderer-cleanup.ts,
+// fires after the renderer is up). The wasm-load rejection on freebuff
+// 0.0.62 surfaced through the *late* renderer-cleanup path, after the
+// boot screen had already rendered.
 const FATAL_PATTERNS = [
   /Fatal error during startup/i,
+  /Unhandled rejection:/i,
+  /Uncaught exception:/i,
   /Internal error: tree-sitter\.wasm not found/i,
   /UnhandledPromiseRejection/i,
   /Cannot find module/i,
 ] as const
 
-const DEFAULT_RUN_SECONDS = 5
+// Long enough that an unhandled rejection from the eager Parser.init has
+// time to surface through the renderer-cleanup handler — that path is
+// what tripped freebuff 0.0.62 in the wild while a 5s window let CI pass.
+// Async wasm rejections can fire >5s after spawn (after React mounts and
+// the renderer is up).
+const DEFAULT_RUN_SECONDS = 10
 
 async function main(): Promise<void> {
   const binary = process.argv[2]
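
The key point of this change is that a fatal marker has to override a boot signal that was seen earlier in the same run. The sketch below shows one way to express that rule over the captured output; it is not the actual logic of `smoke-binary.ts`, and `classifyOutput`, `SmokeResult`, and the `/Welcome/` boot pattern are hypothetical.

```typescript
type SmokeResult = 'fatal' | 'booted' | 'no-boot-signal'

/**
 * Classifies the output captured over the whole run window. A fatal marker
 * wins even when a boot signal was also seen.
 */
export function classifyOutput(
  output: string,
  bootPatterns: readonly RegExp[],
  fatalPatterns: readonly RegExp[],
): SmokeResult {
  if (fatalPatterns.some((pattern) => pattern.test(output))) return 'fatal'
  if (bootPatterns.some((pattern) => pattern.test(output))) return 'booted'
  return 'no-boot-signal'
}

// Hypothetical patterns and output, for illustration only.
const boot = [/Welcome/i] as const
const fatal = [/Fatal error during startup/i, /Unhandled rejection:/i] as const

const lateFailure = 'Welcome\nUnhandled rejection: error: tree-sitter.wasm not found'
console.log(classifyOutput(lateFailure, boot, fatal)) // prints "fatal"
```

Checking fatal patterns first is what catches the 0.0.62 shape of failure, where the boot screen renders and the rejection arrives afterwards.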

cli/src/pre-init/tree-sitter-wasm-bytes.ts

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+// Stub committed for dev mode and tests. The real wasm bytes are inlined
+// here as base64 by `cli/scripts/build-binary.ts` immediately before
+// `bun build --compile`, then restored to the empty stub after the build
+// completes. Dev mode and unit tests see the empty stub and fall back to
+// path-based resolution in `packages/code-map/src/init-node.ts` (which
+// works locally because `node_modules/web-tree-sitter/tree-sitter.wasm`
+// exists on the filesystem).
+//
+// Why a string literal instead of `with { type: 'file' }` + readFileSync:
+// the file-import approach left the bytes in bunfs and required a runtime
+// fs read, which silently failed on Windows (`fs.readFileSync` for
+// `B:\~BUN\root\...` paths) and let the singleton fall through to a
+// path-based fallback that also failed there. A base64 string literal in
+// the JS source compiles into the bun binary's text segment, with no
+// filesystem step on the hot path.
+export const TREE_SITTER_WASM_BASE64 = ''

cli/src/pre-init/tree-sitter-wasm.ts

Lines changed: 28 additions & 30 deletions
@@ -1,37 +1,35 @@
 // Embed tree-sitter.wasm into the bun-compile binary so the SDK's tree-sitter
 // parser singleton can find it at runtime. Must be the very first import in
 // `index.tsx`: subsequent imports (the SDK / code-map) eagerly construct the
-// parser, and its init reads what we publish here on `globalThis` and `process.env`.
+// parser, and its init reads what we publish here on `globalThis`.
 //
-// Why not just `locateFile` + a path? On Windows, bun --compile reports the
-// embedded path as `B:\~BUN\root\...`, and `fs.existsSync` returns false for
-// that path inside the running binary even though `fs.readFileSync` works. So
-// we read the bytes once at startup and pass them straight to `Parser.init`
-// via `wasmBinary`, sidestepping filesystem resolution entirely.
-
-import * as fs from 'fs'
-
-// @ts-expect-error - Bun's `with { type: 'file' }` returns a string path; TS resolves
-// the .wasm file via web-tree-sitter's exports map and has no loader for it.
-import treeSitterWasmPath from 'web-tree-sitter/tree-sitter.wasm' with {
-  type: 'file',
-}
+// Why not `with { type: 'file' }` + a runtime fs read? That's what the prior
+// fix tried, and it silently failed on Windows: bun --compile reports the
+// embedded asset path as `B:\~BUN\root\...`, and on some Windows configs
+// `fs.readFileSync` of that path throws (caught silently), so the SDK fell
+// back to path-based resolution that also failed there.
+//
+// The base64 string in `tree-sitter-wasm-bytes.ts` is replaced with the real
+// wasm contents by `cli/scripts/build-binary.ts` right before `bun build
+// --compile` and restored after. The bytes end up in the binary's text
+// segment as a JS string literal — no filesystem step on the hot path. In
+// dev / unit tests the stub is empty and code-map falls back to the
+// node_modules wasm, which works because the file actually exists locally.
 
-if (treeSitterWasmPath) {
-  // Path stays for any consumer (tests, dev runs) that still resolves via fs.
-  process.env.CODEBUFF_TREE_SITTER_WASM_PATH = treeSitterWasmPath
+import { TREE_SITTER_WASM_BASE64 } from './tree-sitter-wasm-bytes'
 
-  try {
-    const binary = fs.readFileSync(treeSitterWasmPath)
-    // globalThis is the only cross-bundle channel: the SDK pre-built bundle
-    // inlines its own copy of `init-node.ts`, so a module-level variable in
-    // the source package wouldn't be visible to the singleton initialized
-    // via the SDK.
-    ;(globalThis as { __CODEBUFF_TREE_SITTER_WASM_BINARY__?: Uint8Array }).__CODEBUFF_TREE_SITTER_WASM_BINARY__ =
-      new Uint8Array(binary.buffer, binary.byteOffset, binary.byteLength)
-  } catch {
-    // readFileSync failure is unexpected (the file is supposed to be embedded)
-    // but we let init-node.ts fall back to path-based resolution and surface
-    // a clearer error if that also fails.
-  }
+if (TREE_SITTER_WASM_BASE64.length > 0) {
+  const buf = Buffer.from(TREE_SITTER_WASM_BASE64, 'base64')
+  // globalThis is the only cross-bundle channel: the SDK pre-built bundle
+  // inlines its own copy of `init-node.ts`, so a module-level variable in
+  // the source package isn't visible to the singleton initialized via the
+  // SDK. Slice into a fresh Uint8Array view instead of handing over the
+  // Buffer's shared underlying ArrayBuffer.
+  ;(
+    globalThis as { __CODEBUFF_TREE_SITTER_WASM_BINARY__?: Uint8Array }
+  ).__CODEBUFF_TREE_SITTER_WASM_BINARY__ = new Uint8Array(
+    buf.buffer,
+    buf.byteOffset,
+    buf.byteLength,
+  )
 }
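
The decode step above trusts that a non-empty string holds valid wasm. As a sketch of how that could be checked before publishing the bytes, the helper below decodes the base64 and verifies the 4-byte WebAssembly magic number. `decodeEmbeddedWasm` is a hypothetical helper and is not part of this commit, which performs no such validation.

```typescript
import { Buffer } from 'node:buffer'

// Every WebAssembly binary begins with the bytes 0x00 'a' 's' 'm'.
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d] as const

/**
 * Decodes an embedded base64 string. Returns undefined for the empty stub
 * or for data that does not start with the wasm magic number.
 */
export function decodeEmbeddedWasm(base64: string): Uint8Array | undefined {
  if (base64.length === 0) return undefined
  const buf = Buffer.from(base64, 'base64')
  // View over exactly this Buffer's byte range. No copy is made; the backing
  // ArrayBuffer may be a larger pool shared with other Buffers.
  const bytes = new Uint8Array(buf.buffer, buf.byteOffset, buf.byteLength)
  const looksLikeWasm = WASM_MAGIC.every((byte, i) => bytes[i] === byte)
  return looksLikeWasm ? bytes : undefined
}

// The 8-byte header of an empty wasm module (magic number plus version 1).
const header = Buffer.from([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])
console.log(decodeEmbeddedWasm(header.toString('base64'))?.byteLength) // prints 8
console.log(decodeEmbeddedWasm('')) // prints undefined
```

Passing `byteOffset` and `byteLength` matters because Node may allocate small Buffers out of a shared pool, in which case `buf.buffer` is larger than the decoded data.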
