Speed up warm native compile cache paths #4294

Open
andrewtdiz wants to merge 4 commits into PerryTS:main from andrewtdiz:codex/fix-native-warm-noop

@andrewtdiz
Contributor

Summary

  • stabilize native object-cache keys for logical type fingerprints by avoiding nondeterministic Debug serialization
  • add a native executable link-cache manifest under the compile cache root to skip relinking when linker inputs, object fingerprints, output identity/content, toolchain inputs, and Perry binary identity match
  • expose additive JSON link_cache stats and cover the warm native skip path with unit and integration tests
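
The first bullet can be illustrated with a minimal sketch. It assumes the instability came from hashing the `Debug` output of a value that contains a hash map, whose iteration order is not fixed; the PR text does not say so explicitly. `TypeShape`, `stable_fingerprint`, and the FNV-1a helper are hypothetical names for illustration, not Perry's actual types.

```rust
use std::collections::HashMap;

/// Standard 64-bit FNV-1a constants.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a over `bytes`, continuing from `state`.
/// A fixed algorithm keeps the key independent of process and std version.
fn fnv1a(mut state: u64, bytes: &[u8]) -> u64 {
    for b in bytes {
        state ^= u64::from(*b);
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Mix one string into the hash, length-prefixed so that adjacent
/// components cannot blur into each other ("ab","c" vs "a","bc").
fn mix(state: u64, part: &str) -> u64 {
    let with_len = fnv1a(state, &(part.len() as u64).to_le_bytes());
    fnv1a(with_len, part.as_bytes())
}

/// Hypothetical stand-in for a logical type description.
struct TypeShape {
    name: String,
    /// Field name -> type name. HashMap iteration order is unspecified,
    /// so hashing `format!("{:?}", fields)` would not be stable.
    fields: HashMap<String, String>,
}

/// Fingerprint that depends only on the logical content of `shape`.
fn stable_fingerprint(shape: &TypeShape) -> u64 {
    // Sort the entries first so insertion and bucket order do not matter.
    let mut fields: Vec<(&String, &String)> = shape.fields.iter().collect();
    fields.sort();

    let mut hash = mix(FNV_OFFSET, &shape.name);
    for (field, ty) in fields {
        hash = mix(hash, field);
        hash = mix(hash, ty);
    }
    hash
}
```

Two `TypeShape` values with the same fields inserted in a different order produce the same fingerprint here, which is the property an object-cache key needs.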

Verification

  • cargo check -p perry
  • cargo test -p perry link_cache_
  • cargo test -p perry --test native_link_cache native_compile_skips_link_on_identical_second_build -- --nocapture
  • cargo build -p perry --release
  • focused experiment:
    python3 scripts/run_arch_experiment.py --perry-bin /Users/andrew/.codex/worktrees/perry-native-warm-noop/target/release/perry --repeats 1 --tiers dep-wide-100k --modes native --keep-generated

Focused dep-wide-100k native sample

Baseline from post-rootcause report:

  • Perry noop: 2.22s
  • Perry small_change: 2.87s
  • Perry medium_change: 12.59s

This branch, latest kept run:

  • Perry cold: 8.53s
  • Perry noop: 2.46s
  • Perry small_change: 2.80s
  • Perry medium_change: 9.63s

An earlier kept run with the same fix shape sampled Perry noop at 1.93s, small_change at 2.75s, and medium_change at 8.10s, so the benchmark remains noisy. An immediate repeat compile of the generated project deterministically reports "link_cache":{"linked":false,"skipped":true} in the JSON output and ran in 2.39s. Because the link step is skipped in that run, the remaining warm no-op cost lies in pre-link graph/codegen/object orchestration rather than in the linker itself.

Remaining gap

This PR implements safe relink avoidance, not a full pre-graph build fingerprint. The next performance step is a build-level no-op manifest that can prove source graph/options/runtime/output currency before parsing/lowering/object orchestration.
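
As a rough sketch of what such a build-level no-op manifest could look like (this is not part of the PR): record the build options plus a stamp per input file, and compare against the previous manifest before doing any parsing. `InputStamp`, `BuildManifest`, `snapshot`, and `is_noop` are hypothetical names. The sketch uses file length and modification time for brevity; a real implementation would probably want content hashes and would also need to cover runtime and output state.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::time::SystemTime;

/// Cheap per-file stamp (hypothetical). Length plus mtime can miss
/// same-size edits within the filesystem's timestamp granularity.
#[derive(Debug, PartialEq)]
struct InputStamp {
    path: PathBuf,
    len: u64,
    modified: SystemTime,
}

/// Everything the sketch treats as "the build inputs" (hypothetical).
#[derive(Debug, PartialEq)]
struct BuildManifest {
    options: String,
    inputs: Vec<InputStamp>,
}

/// Stat every input and return a manifest sorted by path, so the
/// comparison does not depend on the order files were discovered in.
fn snapshot(options: &str, paths: &[PathBuf]) -> io::Result<BuildManifest> {
    let mut inputs = Vec::with_capacity(paths.len());
    for path in paths {
        let meta = fs::metadata(path)?;
        inputs.push(InputStamp {
            path: path.clone(),
            len: meta.len(),
            modified: meta.modified()?,
        });
    }
    inputs.sort_by(|a, b| a.path.cmp(&b.path));
    Ok(BuildManifest { options: options.to_string(), inputs })
}

/// The build is a no-op only if a previous manifest exists and matches.
fn is_noop(previous: Option<&BuildManifest>, current: &BuildManifest) -> bool {
    previous == Some(current)
}
```

Any error while taking the snapshot (for example a deleted source file) should be treated as "not a no-op" by the caller, so the full build path still runs.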

@andrewtdiz force-pushed the codex/fix-native-warm-noop branch from 1bb54d6 to e407b68 on June 3, 2026 19:55
@andrewtdiz marked this pull request as ready for review on June 3, 2026 19:57
@andrewtdiz
Contributor Author

Codex review findings:

  1. P1 - Link cache can reuse stale executables when -L/-l or frameworks change.

    The fingerprint only hashes explicit file args it recognizes, but Perry supports native-library inputs resolved indirectly through -L... + -l..., /LIBPATH: + foo.lib, -F + -framework, and pkg-config flags. Those real archives/frameworks are not content-hashed, so changing vendor/lib/libfoo.a or a vendored static framework can leave the command string unchanged and make the warm build skip relinking against an old executable.

    Relevant spots:

    • crates/perry/src/commands/compile/link/link_cache.rs: file_inputs_from_arg only recognizes a small set of explicit file-shaped args.
    • crates/perry/src/commands/compile/link/mod.rs: native library metadata adds -F/-framework, -L/-l, /LIBPATH:, and pkg-config flags that can resolve link inputs without a stable file arg in the command.

    DeepWiki's main-branch reference also calls out these native libs/frameworks as link-affecting inputs, so the manifest needs to resolve and hash them or decline to cache this shape.

  2. P2 - Toolchain and inherited linker environment are not actually fingerprinted for common paths.

    compute_link_cache_state hashes only Command's explicit env overrides and only hashes the linker executable if cmd.get_program() itself is an existing path. On Linux native builds Perry uses Command::new("cc"), so the cache records the literal string cc but not the resolved executable from PATH; on Windows, an existing LIB env is inherited rather than added to cmd.get_envs(). A compiler, SDK, LIB, LIBRARY_PATH, or PATH change can therefore produce a different link if run, while this cache skips and preserves the old binary.
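
One way to address the second finding, sketched under assumptions: resolve a bare program name such as `cc` through `PATH` so the actual executable can be hashed, and record the inherited values of the linker-relevant variables. `resolve_program`, `inherited_link_env`, and the `LINK_ENV_VARS` list are hypothetical and not taken from Perry; the variable list is illustrative rather than exhaustive, and the lookup ignores Windows `PATHEXT` and Unix executable bits.

```rust
use std::env;
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};

/// Inherited variables assumed to influence linking (illustrative only).
const LINK_ENV_VARS: &[&str] = &["PATH", "LIB", "LIBRARY_PATH"];

/// Resolve `program` roughly the way a shell would: a name containing a
/// path separator is used as-is, a bare name is searched in `path_var`.
fn resolve_program(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    let candidate = Path::new(program);
    if candidate.components().count() > 1 {
        // Explicit path such as "./cc" or "/usr/bin/cc".
        return candidate.is_file().then(|| candidate.to_path_buf());
    }
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|path| path.is_file())
}

/// Capture the inherited value (or absence) of each linker-relevant
/// variable. `lookup` is injected so the function is easy to test;
/// production code could pass `|k| std::env::var_os(k)`.
fn inherited_link_env(
    lookup: impl Fn(&str) -> Option<OsString>,
) -> Vec<(String, Option<OsString>)> {
    LINK_ENV_VARS
        .iter()
        .map(|&key| (key.to_string(), lookup(key)))
        .collect()
}
```

The resolved path's file contents and the captured variable list would then be fed into the link fingerprint. If `resolve_program` returns `None`, the conservative choice is to skip the cache and relink.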

Suggested fix direction: make the link cache either conservative by disabling itself for unresolved/linker-search inputs, or expand the fingerprint to include resolved library/framework files, resolved linker executable, and relevant inherited linker environment.
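
The conservative variant of that fix direction could look roughly like the sketch below. It resolves `-l` names against the `-L` directories on the command line so the archives can be content-hashed, and declines to cache whenever it meets something it cannot pin down. `LinkInputs` and `resolve_search_inputs` are hypothetical names. The sketch only looks for static `lib<name>.a` archives and ignores default system search paths, shared libraries, and pkg-config, all of which a real implementation would have to handle or reject.

```rust
use std::path::PathBuf;

/// Outcome of trying to pin down every file the search flags refer to.
#[derive(Debug, PartialEq)]
enum LinkInputs {
    /// Every `-l` name mapped to a concrete file that can be hashed.
    Resolved(Vec<PathBuf>),
    /// Something could not be resolved; the caller should relink.
    Uncacheable(String),
}

fn resolve_search_inputs(args: &[&str]) -> LinkInputs {
    let mut dirs: Vec<PathBuf> = Vec::new();
    let mut names: Vec<&str> = Vec::new();

    for &arg in args {
        if arg == "-framework" || arg.starts_with("-F") || arg.starts_with("/LIBPATH:") {
            // Shapes this sketch does not try to resolve: decline to cache.
            return LinkInputs::Uncacheable(format!("unsupported search flag: {}", arg));
        } else if let Some(dir) = arg.strip_prefix("-L") {
            if dir.is_empty() {
                return LinkInputs::Uncacheable("detached -L argument".to_string());
            }
            dirs.push(PathBuf::from(dir));
        } else if let Some(name) = arg.strip_prefix("-l") {
            if name.is_empty() {
                return LinkInputs::Uncacheable("detached -l argument".to_string());
            }
            names.push(name);
        }
        // Other arguments (object files, output path) are out of scope here.
    }

    let mut files = Vec::new();
    for name in names {
        let file_name = format!("lib{}.a", name);
        // First matching directory wins, in command-line order.
        match dirs.iter().map(|dir| dir.join(&file_name)).find(|p| p.is_file()) {
            Some(path) => files.push(path),
            None => {
                return LinkInputs::Uncacheable(format!("could not resolve -l{}", name));
            }
        }
    }
    LinkInputs::Resolved(files)
}
```

With this shape, a change to `vendor/lib/libfoo.a` changes a hashed input, and any command the resolver does not understand falls back to a real link instead of a stale skip.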

@andrewtdiz marked this pull request as draft on June 3, 2026 21:14
@andrewtdiz marked this pull request as ready for review on June 3, 2026 21:41