fix(pkgmap+routes): resolve workspace imports + extract SvelteKit filesystem routes #369
… scan
Workspace imports (e.g. `import { x } from '@my/pkg'` inside a pnpm/yarn
monorepo) were silently dropped because the main discoverer filters
`package.json` (and `composer.json`) via IGNORED_JSON_FILES — meaning the
parallel/sequential pkgmap builders never saw a single manifest, the
hash table came back empty, and `cbm_pipeline_resolve_module` fell
through to the bare-string default. Every workspace import edge then
failed `cbm_gbuf_find_by_qn` and was silently dropped.
Empirical impact on an 885-file SvelteKit monorepo (Turborepo + 7 workspace
packages, ~1100 cross-package imports): edges 319 → 31802 (100×),
`updatePortalClient.in_degree` 0 → 7 (matches the 7 callers grep finds).
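
The failure mode above can be illustrated with a minimal sketch. The `pkg_entry_t` type and `resolve_specifier` function are hypothetical stand-ins, not the project's actual `cbm_pipeline_resolve_module`; the sketch only assumes that a manifest table maps package names to repo directories and that an unmatched specifier is returned unchanged.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical manifest entry: package name -> directory inside the repo. */
typedef struct {
    const char *name; /* e.g. "@my/pkg", taken from a package.json "name" */
    const char *dir;  /* e.g. "packages/pkg" */
} pkg_entry_t;

/* Return the repo-relative directory for an import specifier, or the
 * specifier itself when no entry matches (the bare-string default).
 * A specifier matches on the exact package name or on "name/subpath". */
const char *resolve_specifier(const pkg_entry_t *pkgs, size_t n,
                              const char *spec) {
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(pkgs[i].name);
        if (strncmp(spec, pkgs[i].name, len) == 0 &&
            (spec[len] == '\0' || spec[len] == '/')) {
            return pkgs[i].dir;
        }
    }
    return spec; /* with an empty table, every workspace import lands here */
}
```

With an empty table (`n == 0`) the function always returns the bare specifier, which is the situation described above: the returned string does not correspond to any Module node, so the edge has no target.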
Changes:
* `cbm_pkgmap_scan_repo` — recursive filesystem walker that finds
manifest basenames independently of the main discoverer, honouring
the shared `cbm_should_skip_dir` blocklist (node_modules, .git,
target, build, …). Pluggable via the existing `cbm_pkgmap_try_parse`
language dispatcher so future manifest types just plug in.
* `cbm_pkgmap_build_from_repo` — sequential-path entry point that
combines whatever the main discoverer surfaced with the filesystem
scan. Backwards-compatible: falls back to files-only when `repo_path`
is NULL.
* Parallel path: `merge_pkg_entries` now invokes `cbm_pkgmap_scan_repo`
before the per-worker entry merge, so manifest entries from the
filesystem scan flow through the existing build pipeline.
* Sequential `pass_definitions` was split into two phases:
Phase 1 extracts every file (creating all Module nodes) before
Phase 2 resolves IMPORTS edges. Previously the loop interleaved
extract+resolve per file, so file_i's import targeting file_j (j > i)
could not find file_j's Module node and was silently dropped.
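
The ordering problem fixed by the two-phase split can be sketched as follows. Everything here is hypothetical and simplified (`file_t`, `module_set_t`, one import per file); it is not the `pass_definitions` code, only an illustration of why forward references are lost when extraction and resolution are interleaved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODULES 16

/* Hypothetical stand-in for the graph buffer: a set of known module names. */
typedef struct {
    const char *names[MAX_MODULES];
    size_t count;
} module_set_t;

/* Hypothetical file record: its module name plus one import target (or NULL). */
typedef struct {
    const char *module;
    const char *imports;
} file_t;

bool has_module(const module_set_t *set, const char *name) {
    for (size_t i = 0; i < set->count; i++)
        if (strcmp(set->names[i], name) == 0) return true;
    return false;
}

/* Interleaved: extract + resolve per file. An import of a module that is
 * registered later in the loop finds nothing and produces no edge. */
size_t edges_interleaved(const file_t *files, size_t n) {
    module_set_t set;
    size_t edges = 0;
    memset(&set, 0, sizeof set);
    for (size_t i = 0; i < n && i < MAX_MODULES; i++) {
        set.names[set.count++] = files[i].module;
        if (files[i].imports && has_module(&set, files[i].imports)) edges++;
    }
    return edges;
}

/* Two-phase: register every module first, then resolve all imports. */
size_t edges_two_phase(const file_t *files, size_t n) {
    module_set_t set;
    size_t edges = 0;
    memset(&set, 0, sizeof set);
    for (size_t i = 0; i < n && i < MAX_MODULES; i++)
        set.names[set.count++] = files[i].module;
    for (size_t i = 0; i < n && i < MAX_MODULES; i++)
        if (files[i].imports && has_module(&set, files[i].imports)) edges++;
    return edges;
}
```

For three files where `a` imports `b`, `b` imports nothing, and `c` imports `a`, the interleaved loop yields one edge (only `c → a`), while the two-phase version yields both.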
Tested on:
* 6-file pnpm fixture (apps/* + packages/* + @my/db workspace) →
imports 0 → 4, in_degree of imported function 0 → 4.
* ano-server (885 TS files, 7 @ano/* workspace packages) →
pkgmap entries 0 → 18, imports 195 → 1178, edges 319 → 31802.
* ano-core (Rust) → no regression (matches manifests via Cargo.toml).
…ver, +layout.server)
SvelteKit derives REST endpoints and SSR loaders from filesystem layout
under `src/routes/`, not from call sites like Express/Fastify
`app.get(...)`. The existing pass_calls.c chain (which recognises
service-pattern callees and emits HANDLES edges from them) therefore
never produced a single Route node for any SvelteKit project — even
fully-routed apps would return `total=0` for `label=Route`.
Adds `create_sveltekit_routes` as Phase 5 of `cbm_pipeline_create_route_nodes`
(predump pass). It walks every File node in the graph, identifies
`+server.{ts,js}` / `+page.server.{ts,js}` / `+layout.server.{ts,js}`
files via path inspection (basename match + presence of `/routes/`
ancestor), derives the URL route from the filesystem path:
* `(group)/` segments → stripped
* `[slug]` → `:slug`
* `[...rest]` → `*rest`
* `[slug=matcher]` → `:slug` (matcher suffix discarded)
then finds the handler entries via existing DEFINES edges from the
File node to Function / Variable nodes whose names match SvelteKit's
documented exports:
* `+server.ts` → GET / POST / PUT / PATCH / DELETE / OPTIONS / HEAD / fallback
* `+page.server.ts` → load (Function) + actions (Variable, form actions object)
* `+layout.server.ts` → load
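
The four path-rewrite rules listed above can be sketched in a few lines of C. The function name `route_from_path` and its buffer-based signature are assumptions made for illustration, not the signature used by `create_sveltekit_routes`; the sketch covers only the rules named in this message.

```c
#include <stddef.h>
#include <string.h>

/* Append n bytes of src to out (capacity cap, current length *len).
 * Returns 0 on success, -1 if the result would not fit. */
static int append(char *out, size_t cap, size_t *len, const char *src, size_t n) {
    if (*len + n + 1 > cap) return -1;
    memcpy(out + *len, src, n);
    *len += n;
    out[*len] = '\0';
    return 0;
}

/* Hypothetical helper: derive a URL route from a file path that contains a
 * "/routes/" ancestor. Returns 0 on success, -1 otherwise. */
int route_from_path(const char *path, char *out, size_t cap) {
    const char *marker = "/routes/";
    const char *p = strstr(path, marker);
    size_t len = 0;
    if (!p || cap == 0) return -1;      /* not under a routes/ tree */
    p += strlen(marker);
    out[0] = '\0';

    for (;;) {
        const char *slash = strchr(p, '/');
        if (!slash) break;              /* what remains is the file basename */
        size_t n = (size_t)(slash - p);
        const char *seg = p;
        p = slash + 1;

        if (n == 0) continue;
        if (seg[0] == '(' && seg[n - 1] == ')') continue; /* (group): stripped */

        if (append(out, cap, &len, "/", 1) != 0) return -1;
        if (seg[0] == '[' && seg[n - 1] == ']') {
            const char *name = seg + 1;
            size_t name_len = n - 2;
            const char *prefix = ":";   /* [slug] -> :slug */
            if (name_len >= 3 && strncmp(name, "...", 3) == 0) {
                prefix = "*";           /* [...rest] -> *rest */
                name += 3;
                name_len -= 3;
            }
            const char *eq = memchr(name, '=', name_len);
            if (eq) name_len = (size_t)(eq - name); /* drop "=matcher" */
            if (append(out, cap, &len, prefix, 1) != 0) return -1;
            if (append(out, cap, &len, name, name_len) != 0) return -1;
        } else if (append(out, cap, &len, seg, n) != 0) {
            return -1;
        }
    }
    if (len == 0) return append(out, cap, &len, "/", 1); /* root route */
    return 0;
}
```

A file directly under `routes/` maps to `/`, and a path with no `/routes/` component is rejected rather than guessed at.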
For each (route, handler) pair the pass upserts a Route node with QN
`__route__<METHOD>__<path>` (consistent with the existing decorator/
infra route schema) and emits a HANDLES edge with `framework=sveltekit`
so downstream queries can distinguish them.
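
As a rough illustration of the file detection, export matching, and QN format described above, consider the sketch below. All names (`sk_kind_t`, `sk_classify`, `sk_is_handler`, `sk_route_qn`) are hypothetical. What string the real pass uses as `<METHOD>` for `load` and `actions` is not stated here, so the sketch takes the method as a parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { SK_NONE, SK_SERVER, SK_PAGE_SERVER, SK_LAYOUT_SERVER } sk_kind_t;

/* True when base is exactly stem + ".ts" or stem + ".js". */
static bool base_is(const char *base, const char *stem) {
    size_t n = strlen(stem);
    return strncmp(base, stem, n) == 0 &&
           (strcmp(base + n, ".ts") == 0 || strcmp(base + n, ".js") == 0);
}

/* Classify a path by basename; a "/routes/" ancestor is required. */
sk_kind_t sk_classify(const char *path) {
    const char *base;
    if (!strstr(path, "/routes/")) return SK_NONE;
    base = strrchr(path, '/') + 1; /* non-NULL: "/routes/" contains a slash */
    if (base_is(base, "+server")) return SK_SERVER;
    if (base_is(base, "+page.server")) return SK_PAGE_SERVER;
    if (base_is(base, "+layout.server")) return SK_LAYOUT_SERVER;
    return SK_NONE;
}

/* Is `name` one of the exports listed for this kind of file? */
bool sk_is_handler(sk_kind_t kind, const char *name) {
    static const char *const verbs[] = {"GET", "POST", "PUT", "PATCH",
                                        "DELETE", "OPTIONS", "HEAD", "fallback"};
    switch (kind) {
    case SK_SERVER:
        for (size_t i = 0; i < sizeof verbs / sizeof verbs[0]; i++)
            if (strcmp(name, verbs[i]) == 0) return true;
        return false;
    case SK_PAGE_SERVER:
        return strcmp(name, "load") == 0 || strcmp(name, "actions") == 0;
    case SK_LAYOUT_SERVER:
        return strcmp(name, "load") == 0;
    default:
        return false;
    }
}

/* Format "__route__<METHOD>__<path>"; returns -1 if out is too small. */
int sk_route_qn(const char *method, const char *route, char *out, size_t cap) {
    int n = snprintf(out, cap, "__route__%s__%s", method, route);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

For example, a `GET` export in `apps/x/src/routes/api/items/+server.ts` would classify as `SK_SERVER`, pass `sk_is_handler`, and format to `__route__GET__/api/items`.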
Empirical: 0 → 174 routes / 174 HANDLES on a 146-file SvelteKit monorepo
(ano-server: 4 apps, ~146 server-side routes including REST endpoints,
SSR loaders, form actions). Fixture test (3 server files, group + dynamic
param + REST + page server + layout server) covers all path-rewrite
edge cases.
Thank you, @utilia-ai-wox! 🙏 This is outstanding work — the 319→31,802 edges (100×) result on the SvelteKit+Turborepo monorepo speaks for itself, and the diagnosis was precise on both fronts:
One adaptation while landing, for transparency: your …
Landed in 9bcfaab, crediting you as author. Verified locally: build clean, all 3,622 tests pass including the pkgmap-scan, two-phase resolution, route, import-trace, and end-to-end indexing suites, with no regressions. This is a big step for the workspace/monorepo support tracked in #271 and #56. Thank you! 🙏
Two structural fixes for TS monorepos (pnpm/yarn workspaces) and SvelteKit projects, measured to take an 885-file SvelteKit+Turborepo repo from 319 edges to 31802 (100x) with no fixture regressions.

- pkgmap: package.json / composer.json live in IGNORED_JSON_FILES, so the manifest-driven pkgmap builders never saw them and every '@my/pkg' workspace import was silently dropped. Add a symlink-safe, skip-dir-aware filesystem walker (cbm_pkgmap_scan_repo / cbm_pkgmap_build_from_repo) that harvests manifests directly from the repo regardless of the discovery filter. Sequential and parallel paths both feed it now.
- definitions: extract every file's defs (creating Module nodes) BEFORE resolving imports, so a workspace import in the first file can resolve against the complete in-memory graph (two-phase, cache-backed).
- routes: synthesise Route nodes + HANDLES edges from SvelteKit's filesystem layout (+server / +page.server / +layout.server), which has no app.get(...) call site for pass_calls to pick up.

Distilled from #369 onto current main. The PR's CBM_DISABLE_LSP_CROSS escape hatch was rebased onto the v0.7.0 fused cross-LSP architecture: it now gates run_cross_lsp (NULL all_defs makes the fused resolver no-op cross-file resolution) instead of wrapping the removed standalone pass_lsp_cross call.

Relates to #271 / #56.
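
The skip-dir-aware manifest harvest mentioned in the pkgmap item can be sketched without touching the filesystem by applying the same filter to repo-relative paths. The function name `is_harvestable_manifest` and both lists are assumptions for illustration: the real walker uses `cbm_should_skip_dir` and `cbm_pkgmap_try_parse`, whose full blocklist and supported manifests are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when the n-byte string s equals one of the entries in list. */
static bool in_list(const char *s, size_t n,
                    const char *const *list, size_t count) {
    for (size_t i = 0; i < count; i++)
        if (strlen(list[i]) == n && strncmp(s, list[i], n) == 0) return true;
    return false;
}

/* Hypothetical filter: accept a repo-relative path when its basename is a
 * known manifest and no directory component is on the skip list. */
bool is_harvestable_manifest(const char *path) {
    /* Assumed subset of the shared blocklist. */
    static const char *const skip[] = {"node_modules", ".git", "target", "build"};
    /* Assumed manifest basenames. */
    static const char *const manifests[] = {"package.json", "composer.json",
                                            "Cargo.toml"};
    const char *p = path;
    const char *slash;
    while ((slash = strchr(p, '/')) != NULL) {
        if (in_list(p, (size_t)(slash - p), skip,
                    sizeof skip / sizeof skip[0]))
            return false;           /* inside a skipped directory */
        p = slash + 1;
    }
    return in_list(p, strlen(p), manifests,
                   sizeof manifests / sizeof manifests[0]);
}
```

Under these assumptions `packages/db/package.json` is accepted, while `apps/web/node_modules/x/package.json` is rejected because one of its ancestors is on the skip list.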
Summary
Two structural fixes for TypeScript monorepos with pnpm/yarn workspaces and SvelteKit projects, surfaced while indexing an 885-file production SvelteKit + Turborepo + 7 workspace-packages codebase (`ano-server`). On that codebase the patches take the graph from 21507 nodes / 319 edges (ratio 0.015) to 29254 nodes / 31802 edges (ratio 1.09, 100× more edges), all without a single regression on the other test fixtures (Rust + Python repos).

Two independent commits, easy to review separately:

* `fix(pkgmap)`: workspace package imports (`import { x } from '@my/pkg'`) were silently dropped because `package.json` and `composer.json` are listed in `IGNORED_JSON_FILES`. The pkgmap builders walked `files[]` looking for manifests they could never see, so the hash table was always empty, every `cbm_pipeline_resolve_module("@my/pkg")` fell through, the resulting target QN didn't match any Module node, and the IMPORTS edge was silently dropped.
* `feat(routes)`: SvelteKit projects (which derive REST endpoints + SSR loaders from filesystem layout, not from call sites like `app.get(...)`) returned `total=0` for `label=Route` regardless of size. Adds a Phase 5 to `cbm_pipeline_create_route_nodes` that walks File nodes, detects `+server.{ts,js}` / `+page.server.{ts,js}` / `+layout.server.{ts,js}`, derives the URL route path from the filesystem, and emits Route + HANDLES.

Bug #1: Workspace package imports
Root cause (3 cascading issues)
* `package.json` filtered out by `IGNORED_JSON_FILES` → pkgmap builders see 0 manifests (`src/discover/discover.c:91-101`)
* `pass_definitions` does extract + import-resolve in a single per-file loop → file_i can't resolve imports targeting file_j (j > i) because file_j's Module node doesn't exist yet (`src/pipeline/pass_definitions.c:324`)
* `src/pipeline/pass_parallel.c:552`

Fix
* `cbm_pkgmap_scan_repo(repo_path, entries)`: recursive filesystem walker that picks up manifest basenames independently of the main discoverer, honouring the shared `cbm_should_skip_dir` blocklist (node_modules, .git, target, build, .svelte-kit, dist, …). Uses the existing `cbm_pkgmap_try_parse` dispatcher so new manifest types just plug in.
* `cbm_pkgmap_build_from_repo(repo_path, files, file_count, project_name)`: sequential-path entry point that combines whatever the discoverer surfaced with the filesystem scan.
* `merge_pkg_entries` invokes `cbm_pkgmap_scan_repo` before the existing per-worker entry merge.
* `pass_definitions` split into two phases: Phase 1 extracts every file (Module nodes), Phase 2 resolves IMPORTS edges (all targets now exist).

Numbers (`ano-server`, 885 files, 7 `@ano/*` workspace packages)

* `pkgmap.build entries`
* `parallel.registry.done imports`
* `updatePortalClient.in_degree` (workspace function with 7 callers)

Bug #2: SvelteKit filesystem routes
Root cause
`pass_calls.c` recognises Express / Fastify / decorator-based route declarations through `cbm_service_pattern_match` (matching the callee name of a function call like `app.get("/foo", handler)`). SvelteKit has no such call: the file path itself IS the route declaration. So no Route nodes were ever created for SvelteKit, no HANDLES edges, no API surface exposed in the graph.

Fix
`create_sveltekit_routes` added as Phase 5 in `cbm_pipeline_create_route_nodes`. Walks File nodes, identifies SvelteKit server-side files via:

* `+server.{ts,js}`, `+page.server.{ts,js}`, `+layout.server.{ts,js}`
* `/routes/` segment in the path so we don't snag misnamed files outside the routes tree

Derives the URL route from the filesystem:
* `apps/x/src/routes/foo/+server.ts` → `/foo`
* `apps/x/src/routes/api/items/+server.ts` → `/api/items`
* `apps/x/src/routes/(grp)/foo/+server.ts` → `/foo` (group stripped)
* `apps/x/src/routes/[slug]/+server.ts` → `/:slug` (param rewrite)
* `apps/x/src/routes/[...rest]/+server.ts` → `/*rest` (rest param)
* `apps/x/src/routes/[slug=matcher]/+server.ts` → `/:slug` (matcher suffix discarded)
* `apps/x/src/routes/+server.ts` → `/` (root)

Finds handler entries through existing DEFINES edges from the File node to Function/Variable nodes whose names match SvelteKit's documented exports:
* `+server.{ts,js}` → `GET`, `POST`, `PUT`, `PATCH`, `DELETE`, `OPTIONS`, `HEAD`, `fallback` (Function)
* `+page.server.{ts,js}` → `load` (Function) + `actions` (Variable, form-actions object)
* `+layout.server.{ts,js}` → `load` (Function)

For each (route, handler) pair, upserts a Route node with QN `__route__<METHOD>__<path>` (same schema as existing decorator/infra routes) and emits a HANDLES edge with `framework=sveltekit` so downstream queries can distinguish SvelteKit routes from Express/etc.

Numbers (`ano-server`)

* `pass.sveltekit_routes files`
* `pass.sveltekit_routes routes`
* `pass.sveltekit_routes handlers`
* `search_graph label=Route` total
* `/action/:token`, `/action/:token/questionnaire`, `/api/v1/portal-auth/login`, …

Test plan
* … `@my/db` workspace + 3 SvelteKit files): covers group, dynamic param, REST + page-server + layout-server
* `ano-server` (885 TS files, 7 `@ano/*` packages, SvelteKit + REST endpoints)
* `ano-app-finance` (Rust Tauri): no regression
* `ano-core` (Rust SDK): no regression
* … `make -f Makefile.cbm test`): please run CI on your side; my macOS toolchain doesn't have all the test sanitizer prereqs

Notes for reviewers
* … `ctx->result_cache`: if the caller didn't supply a cache, the old (suboptimal) per-file ordering is preserved for callers that opted out of caching. No behaviour change for them.
* `cbm_pipeline_pass_lsp_cross` deadlocks on a specific `.ts` file in this codebase (≥ 600s no progress, ≤ 5s CPU). Reported as a separate issue with reproducer; unrelated to this PR.

Generated with Claude Code