Sub-issue of #35661 (token-budget exhaustion). This tracks a distinct, separately-fixable robustness/observability bug that the token-cap failures expose: when a copilot-engine run fails and the harness tries to auto-emit a fallback safe-output, the write fails on a read-only filesystem, so the agent's failure signal is silently dropped.
Problem statement
After the Copilot CLI exits non-zero (here, because it exhausted its 3 --continue retries against the 25M effective-tokens cap), the copilot-harness attempts to auto-emit a missing_tool safe-output to record why the run failed. That emission fails:
missing_tool emission failed: EROFS: read-only file system,
open '/home/runner/work/_temp/gh-aw/safeoutputs/outputs.jsonl'
Because the write is dropped, the run surfaces only a generic non-zero exit — the structured failure reason (numerous permission-denials, token-cap) never reaches agent_output.json / safe-outputs, degrading downstream triage and auto-issue quality.
Affected workflows / runs (6h window, 2026-05-30)
EROFS fallback-emission failure observed in 2 of the 3 runs that hit the 25M token cap in this window:
| Workflow | Run | Token-cap peak | Harness state | EROFS on fallback |
|---|---|---|---|---|
| Daily Compiler Quality Check | §26673439829 | 25,003,302 / 25,000,000 | permissionDeniedCount=11, hasNumerousPermissionDenied=true, retries exhausted | ✅ yes |
| Copilot CLI Deep Research Agent | §26675076543 | 25,526,138 / 25,000,000 | retries exhausted (exitCode=1) | ✅ yes |
| Documentation Noob Tester | §26675130908 | 25,274,612 / 25,000,000 | 3 retries exhausted | not observed |
Harness diagnostic excerpt (run 26673439829)
[copilot-harness] attempt 1 failed: exitCode=1 isCAPIError400=false isMCPPolicyError=false
isModelNotSupportedError=false isAuthError=false permissionDeniedCount=11
hasNumerousPermissionDenied=true hasOutput=true retriesRemaining=3
[copilot-harness] missing_tool emission failed: EROFS: read-only file system,
open '/home/runner/work/_temp/gh-aw/safeoutputs/outputs.jsonl'
... Last error: CAPIError: 429 Maximum effective tokens exceeded (25003302.30 / 25000000).
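The fields in this excerpt are what the structured failure reason would need to carry. As a rough illustration only, the sketch below pulls them out of the log text. The function name `parse_harness_excerpt` and the output keys are hypothetical, not part of the harness, and the regexes assume the log format shown above.

```python
import re

# Matches key=value pairs such as permissionDeniedCount=11 or isAuthError=false.
FIELD_RE = re.compile(r"(\w+)=(\S+)")
# Matches the token-cap error, e.g. "(25003302.30 / 25000000)".
TOKEN_CAP_RE = re.compile(r"Maximum effective tokens exceeded \(([\d.]+) / (\d+)\)")


def _coerce(value: str):
    """Turn 'true'/'false' into bool and digit strings into int; leave the rest."""
    if value in ("true", "false"):
        return value == "true"
    return int(value) if value.isdigit() else value


def parse_harness_excerpt(text: str) -> dict:
    """Build a structured failure reason (hypothetical shape) from a log excerpt."""
    fields = {key: _coerce(val) for key, val in FIELD_RE.findall(text)}
    reason = {
        "exit_code": fields.get("exitCode"),
        "permission_denied_count": fields.get("permissionDeniedCount"),
        "numerous_permission_denied": fields.get("hasNumerousPermissionDenied"),
        "token_cap_exceeded": False,
    }
    cap = TOKEN_CAP_RE.search(text)
    if cap:
        reason["token_cap_exceeded"] = True
        reason["tokens_used"] = float(cap.group(1))
        reason["token_cap"] = int(cap.group(2))
    return reason
```

A record of this shape is the kind of payload that is currently lost when the fallback emission hits EROFS.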
Probable root cause
By the time the harness runs its post-failure fallback-emission path, the safeoutputs output file (/home/runner/work/_temp/gh-aw/safeoutputs/outputs.jsonl) is on a read-only mount, most likely because the safe-outputs collection window has closed or the bind mount was switched to read-only during agent-container teardown. The fallback missing_tool/diagnostic write therefore hits EROFS and is dropped rather than retried or redirected.
This is independent of the token-cap root cause in #35661: even after #35661's remediations land, any agent that fails after the safe-outputs window closes will lose its diagnostic via the same path.
Proposed remediation
- Keep the safe-outputs sink writable through fallback emission: ensure outputs.jsonl (or an equivalent fallback file) remains writable for the harness's own post-failure emission, even after the agent container is torn down.
- Fail-safe fallback path: if the primary outputs.jsonl is read-only, write the fallback diagnostic to a guaranteed-writable location (e.g. $RUNNER_TEMP/gh-aw/agent/) and have the safe_outputs job ingest it.
- Surface, don't swallow: log the EROFS at error level into the step summary so the lost diagnostic is at least visible, and emit a report_incomplete-style signal so the failure reason is not reduced to a bare exit code.
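The second and third items could look roughly like the following. This is a minimal sketch and not the harness's actual code: it is written in Python for illustration only, and `emit_with_fallback`, the `fallback-outputs.jsonl` file name, the `redirected_from` field, and the set of errno values treated as "not writable" are all assumptions. The `open_fn` parameter exists only so the read-only case can be simulated in tests.

```python
import errno
import json
import os
import sys

# errno values treated as "primary sink not writable" (an assumption of this sketch).
UNWRITABLE = {errno.EROFS, errno.EACCES, errno.EPERM}


def append_jsonl(path: str, record: dict, open_fn=open) -> None:
    """Append one JSON record as a single line to a JSONL file."""
    with open_fn(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def emit_with_fallback(record: dict, primary: str, fallback_dir: str, open_fn=open) -> str:
    """Try the primary sink; on a read-only style error, redirect to fallback_dir.

    Returns the path that actually received the record.
    """
    try:
        append_jsonl(primary, record, open_fn)
        return primary
    except OSError as exc:
        if exc.errno not in UNWRITABLE:
            raise  # unrelated I/O errors are not masked
        # Surface the failure instead of swallowing it.
        print(f"error: safe-output write to {primary} failed: {exc}", file=sys.stderr)
    os.makedirs(fallback_dir, exist_ok=True)
    fallback = os.path.join(fallback_dir, "fallback-outputs.jsonl")  # hypothetical name
    append_jsonl(fallback, {**record, "redirected_from": primary}, open_fn)
    return fallback
```

The redirect only helps if the safe_outputs job also reads the fallback location, so the ingest side would need a matching change.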
Success criteria / verification
- Over a 7-day window, zero copilot-engine runs log missing_tool emission failed: EROFS (or any EROFS ... outputs.jsonl).
- When an agent fails after retry exhaustion, the run's agent_output.json / safe-outputs contains the structured failure reason (token-cap, permission-denied count) rather than only a generic non-zero exit.
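The first criterion can be checked mechanically once the run logs for the window are collected. The sketch below assumes the logs are already available as a mapping from run ID to log text; how they are fetched is out of scope, and `runs_with_erofs` is a hypothetical helper name.

```python
import re

# Either the exact harness message, or any EROFS line that mentions outputs.jsonl.
EROFS_RE = re.compile(r"missing_tool emission failed: EROFS|EROFS\b.*outputs\.jsonl")


def runs_with_erofs(logs_by_run: dict[str, str]) -> list[str]:
    """Return the IDs of runs whose log contains an EROFS emission failure."""
    return [run_id for run_id, log in logs_by_run.items() if EROFS_RE.search(log)]
```

The criterion is met when this returns an empty list for every copilot-engine run in the 7-day window.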
Context — parent #35661 token-cap is still recurring
This window (2026-05-30 ~01:40–07:40 UTC) confirms that the parent's 25M effective-tokens cap failure mode is not abating: 3 distinct copilot-engine workflows hit it (table above). One of them, Copilot CLI Deep Research Agent, was already named in #35661's original 2026-05-29 sample, which makes it a repeat offender. #35661's proposed remediations (per-workflow max-turns, MCP tool-surface trimming, --no-resume on retry after a 429) remain unimplemented and unverified. The related infra tracker #35780 (awf-squid unhealthy) did not recur in this window.
Confidence & unknowns
- High confidence the EROFS emission failure is real and occurs in ≥2 distinct workflows (direct log evidence).
- Medium confidence on the exact mount-lifecycle cause (read-only at teardown) — not directly inspected; inferred from path + timing. The remediation should be validated against the actual mount setup in the agent job.
References: §26673439829 · §26675076543 · §26675130908
Related to #35661
Generated by 🔍 [aw] Failure Investigator (6h) · opus48 4M · ◷