Skip to content

feat: cache iOS runner artifacts during prepare#688

Merged
thymikee merged 7 commits into
mainfrom
codex/ios-runner-prepare-cache-ci
Jun 4, 2026
Merged

feat: cache iOS runner artifacts during prepare#688
thymikee merged 7 commits into
mainfrom
codex/ios-runner-prepare-cache-ci

Conversation

@thymikee
Copy link
Copy Markdown
Member

@thymikee thymikee commented Jun 4, 2026

Summary

prepare ios-runner now owns Apple runner artifact restore/build/health behavior: cache hits launch from the restored .xctestrun, misses build through prepare, and bad restored artifacts are marked, cleaned, rebuilt once, and health-checked again.

The runner path is now organized around an Apple runner runtime/provider contract with runCommand, optional prepare, and optional best-effort prewarm. Local XCUITest execution implements that runtime, while provider-backed integration scenarios can own prepare directly without reducing it to a raw uptime command.

Apple platform policy moved out of the xctestrun artifact module into apple-runner-platform.ts, so SDK names, derived-cache folders, xctestrun hints, and destination strings are table-driven for iOS, tvOS targets, and macOS. That keeps future iPad/watchOS-style expansion additive instead of spreading conditionals through artifact/cache code.

The iOS CI and nightly replay workflows dogfood this path by restoring/saving runner cache and running prepare ios-runner --timeout 300000 --json as the single build-or-health gate.

Touched files: 23. Scope expanded from runner runtime to CI workflow/action wiring so this repo uses the new prepare-owned cache path.

Validation

pnpm check:quick passed after the final refactor.

Focused provider/runner coverage passed: pnpm exec vitest run src/platforms/ios/__tests__/runner-provider.test.ts src/platforms/ios/__tests__/runner-client.test.ts src/platforms/ios/__tests__/runner-command-retry.test.ts test/integration/provider-scenarios/macos-desktop.test.ts test/integration/provider-scenarios/ios-lifecycle.test.ts.

pnpm build:xcuitest passed for iOS and macOS runner builds.

pnpm check:unit passed outside the sandbox. The sandboxed attempt failed on unrelated local listener/process restrictions such as listen EPERM, so it was rerun with escalation.

pnpm format was run; unrelated existing formatter churn was reverted so this diff stays scoped.

Earlier PR validation also covered fallow, provider integration, smoke, and coverage for the cache/prepare implementation.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Size Report

Metric Base Current Diff
JS raw 1.1 MB 1.1 MB +5.7 kB
JS gzip 368.6 kB 370.3 kB +1.7 kB
npm tarball 474.6 kB 476.7 kB +2.1 kB
npm unpacked 1.6 MB 1.6 MB +5.7 kB

Startup median (7 runs, lower is better):

Scenario Base Current Diff
CLI --version 25.6 ms 25.4 ms -0.2 ms
CLI --help 41.3 ms 40.9 ms -0.4 ms

Top changed chunks:

Chunk Raw diff Gzip diff
dist/src/2415.js +5.5 kB +1.6 kB
dist/src/session.js +126 B +53 B
dist/src/9542.js +25 B +10 B
dist/src/args.js +2 B +2 B
dist/src/1352.js +14 B +2 B

@thymikee thymikee force-pushed the codex/ios-runner-prepare-cache-ci branch 4 times, most recently from 6e6a593 to 73713ab Compare June 4, 2026 01:21
Copy link
Copy Markdown
Member Author

@thymikee thymikee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: prepare-owned Apple runner cache

Reviewed for regressions, perf, data-flow, platform parity, and API usage. This is a strong refactor — sharing the high-signal points.

What works well

  • Cleaner data flow / decomposition. runner-client.ts shrinks from ~611 lines to a thin facade; orchestration moves into runner-lifecycle.ts, transport-error recovery into runner-command-recovery.ts, and platform policy into apple-runner-platform.ts. The AppleRunnerProvider contract (runCommand + optional prepare/prewarm) is a genuinely simpler seam — provider-backed scenarios can own prepare instead of faking an uptime command.
  • Platform parity. The table-driven RUNNER_PLATFORM_PROFILES (iOS/tvOS/macOS) plus the isApplePlatform gate in the daemon prepare handler makes future iPad/watchOS-style additions additive instead of conditional sprawl. The moved resolveRunner* helpers look behavior-equivalent to the originals.
  • Correct replay safety. The recovery path correctly distinguishes before-send failures (Runner did not accept connection, readiness-preflight) — which it safely replays on a fresh session — from after-send transport loss, which goes through status-probe recovery and never blindly replays a mutating command. That distinction is the crux of correctness here and it's preserved.
  • Cache correctness win. Folding Xcode/SDK fingerprints into the cache metadata means cached .xctestrun artifacts now invalidate across toolchain upgrades — a real bug class avoided.

Worth addressing

  • failureReason semantics on a successful recovery (highest signal). In prepareLocalIosRunner, the bad-cache→rebuild→health-check-passes path returns a result that still carries failureReason: reason, recordPrepareResult then logs it at warn, and the daemon handler spreads ...result into the --json response. So a prepare that ultimately succeeded surfaces a failureReason and a warn-level log. CI/agents parsing the JSON could read that as a failure. Consider renaming to recoveredFromReason/recoveryReason (or omitting it on success) to keep "prepare succeeded" unambiguous.
  • resolveExpectedRunnerCacheMetadata now spawns subprocesses. It shells out to xcodebuild -version and two xcrun calls. It's memoized per-process via appleToolFingerprintCache, which is the right call — but it converts a previously-pure metadata function into one doing blocking subprocess I/O on first use. Worth confirming this is never invoked on a latency-sensitive per-command path (it looks build/cache-key scoped, which is fine).
  • Schema version not bumped. RUNNER_CACHE_SCHEMA_VERSION stays 1 despite five new metadata fields. It still works (full-JSON comparableRunnerCacheMetadata equality + the hash-derived cache key both naturally invalidate stale entries), but bumping it would make the invalidation intent explicit and avoid confusing "same schemaVersion, different shape" metadata in logs.
  • Order-sensitive metadata equality duplicated across two files. Reuse hinges on JSON.stringify(comparableRunnerCacheMetadata(...)) equality, and the derived cache key is a hash of the same. The CI .mjs (write-xcuitest-cache-metadata.mjs) must emit byte-identical key order, and this PR widens that implicit contract by 5 fields across both files. They currently match, but a sorted-key canonicalization would make this robust against future reordering.
  • Vestigial clean-derived: "1". With build-on-miss: "false", that input now only feeds the skipped "Build replay artifacts" step (env is scoped there, not job-level), so it's effectively dead in ios.yml/replays-nightly.yml. Dropping it would avoid implying the prepare step cleans the restored cache (it correctly doesn't — AGENT_DEVICE_IOS_RUNNER_DERIVED_PATH is job-level and AGENT_DEVICE_IOS_CLEAN_DERIVED is not, so the restore→reuse flow is intact).

Nothing here looks blocking; the failureReason-on-success naming is the one I'd most want resolved before merge.


Generated by Claude Code

@thymikee
Copy link
Copy Markdown
Member Author

thymikee commented Jun 4, 2026

Addressed the review feedback in 08dbb4a26:

  • Successful bad-cache recovery now returns/logs recoveryReason instead of failureReason, and apple_runner_prepare stays info level. Actual failed recovery still uses failureReason.
  • Bumped runner artifact cache metadata schema to 2 in both agent-device and the CI metadata writer.
  • Made runner cache metadata comparison/cache-key hashing use stable-key JSON so field ordering is no longer an implicit contract.
  • Removed dead clean-derived: "1" from the build-on-miss: "false" iOS CI/nightly setup calls.

Validation:

  • pnpm exec vitest run src/platforms/ios/__tests__/runner-command-retry.test.ts src/platforms/ios/__tests__/runner-xctestrun.test.ts src/platforms/ios/__tests__/runner-client.test.ts passed.
  • pnpm check:quick passed.
  • pnpm check:unit was rerun outside the sandbox; it hit existing timing/tooling flakes in unrelated daemon-client/iOS status-bar/Android bundletool tests, while the focused runner/cache tests above passed.

@thymikee thymikee force-pushed the codex/ios-runner-prepare-cache-ci branch from 952f43e to 5807622 Compare June 4, 2026 22:20
@thymikee thymikee merged commit ad7b386 into main Jun 4, 2026
18 checks passed
@thymikee thymikee deleted the codex/ios-runner-prepare-cache-ci branch June 4, 2026 22:37
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-04 22:37 UTC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant