Increase CPU verify_timeout default from 600s to 1200s. #709

Open
Xreki wants to merge 5 commits into PaddlePaddle:develop from Xreki:opt_extract_agent

Xreki (Collaborator) commented May 15, 2026

PR Category

Other

Description

  • CPU forward verification often takes 1000s+ for large models
  • Update --verify-timeout help text accordingly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
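
For readers without the diff at hand, the described change can be sketched roughly as follows. This is a minimal illustration, not the PR's code: only the `--verify-timeout` flag and the 1200s value come from the description, while `build_parser` and `DEFAULT_CPU_VERIFY_TIMEOUT_S` are hypothetical names.

```python
import argparse

# Value taken from the PR title; the constant name is an assumption.
DEFAULT_CPU_VERIFY_TIMEOUT_S = 1200


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical CLI parser exposing --verify-timeout."""
    parser = argparse.ArgumentParser(description="Hypothetical extraction CLI")
    parser.add_argument(
        "--verify-timeout",
        type=int,
        default=DEFAULT_CPU_VERIFY_TIMEOUT_S,
        # %(default)s keeps the help text in sync with the default value.
        help="Timeout in seconds for CPU forward verification "
        "(default: %(default)s).",
    )
    return parser


# With no flag given, the raised default applies.
default_timeout = build_parser().parse_args([]).verify_timeout
# An explicit flag still overrides it.
override_timeout = (
    build_parser().parse_args(["--verify-timeout", "1800"]).verify_timeout
)
```

Deriving the help text from the default with `%(default)s` is one way to avoid the help string and the actual value drifting apart again.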
paddle-bot (Bot) commented May 15, 2026

Thanks for your contribution!

Xreki and others added 4 commits May 15, 2026 12:15
- LLMCodeFixer: support Optional[int] timeout, default 360s when None
- GraphNetAgent: add llm_timeout parameter (default: 600s)
- Remove download_timeout from previous iteration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
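
The `Optional[int]` handling in this commit might look something like the sketch below. The 360s fallback is the value stated in the commit message; `resolve_llm_timeout` and `FALLBACK_LLM_TIMEOUT_S` are hypothetical names, and the real `LLMCodeFixer` may resolve the value differently.

```python
from typing import Optional

# Fallback stated in the commit message; the constant name is an assumption.
FALLBACK_LLM_TIMEOUT_S = 360


def resolve_llm_timeout(timeout: Optional[int]) -> int:
    """Return the caller's timeout, or the fallback when none is given."""
    if timeout is None:
        return FALLBACK_LLM_TIMEOUT_S
    return timeout


# None falls back to 360s; an explicit value (such as the agent-level
# default of 600s mentioned above) passes through unchanged.
fallback_value = resolve_llm_timeout(None)
explicit_value = resolve_llm_timeout(600)
```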
- Raise default llm_timeout from 600s to 900s to reduce ducc -p timeout failures.
- Treat forward verification timeout as pass for large models on CPU.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
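
Treating a verification timeout as a pass could be sketched as below, assuming the forward check runs in a subprocess (as the later commit message suggests). `VerifyResult` and `run_forward_check` are hypothetical names, not the PR's API; the `timed_out` field plays the role the later commit assigns to `last_timeout_success`.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class VerifyResult:
    passed: bool
    timed_out: bool  # True when the pass is only due to a timeout


def run_forward_check(cmd: List[str], timeout_s: float) -> VerifyResult:
    """Run a verification command; count a timeout as a (flagged) pass."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s, capture_output=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return VerifyResult(passed=True, timed_out=True)
    return VerifyResult(passed=completed.returncode == 0, timed_out=False)


# A slow command exceeds the timeout and is recorded as a timeout pass.
slow = run_forward_check(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5
)
# A failing command within the timeout is still a failure.
failing = run_forward_check(
    [sys.executable, "-c", "raise SystemExit(1)"], timeout_s=30
)
```

Keeping the timeout flag separate from `passed` lets downstream reporting distinguish verified passes from passes that were never actually checked.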
- ForwardVerifier now records last_timeout_success when eager forward
  passes are skipped due to subprocess timeout.
- GraphNetAgent propagates this flag via last_timeout_success attribute.
- parallel_extract worker reports timeout_success per model.
- PROGRESS line format: success=xx%(timeout_success=xx)%
- Summary and per-GPU stats also include timeout counts/rates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
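
The per-model reporting described in this commit could be aggregated along these lines. This is a sketch under assumptions: `ModelOutcome`, `summarize`, and `progress_line` are hypothetical names, and the placement of the percent sign in the commit message's PROGRESS format is ambiguous, so the sketch assumes `timeout_success` is a percentage of all models, printed inside the parentheses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModelOutcome:
    name: str
    success: bool
    timeout_success: bool  # success reached only via verification timeout


def summarize(outcomes: List[ModelOutcome]) -> Dict[str, float]:
    """Count successes and timeout successes, with rates over all models."""
    total = len(outcomes)
    ok = sum(o.success for o in outcomes)
    timeout_ok = sum(o.success and o.timeout_success for o in outcomes)

    def pct(n: int) -> float:
        return 100.0 * n / total if total else 0.0

    return {
        "total": total,
        "success": ok,
        "timeout_success": timeout_ok,
        "success_rate": pct(ok),
        "timeout_success_rate": pct(timeout_ok),
    }


def progress_line(stats: Dict[str, float]) -> str:
    """Format a PROGRESS line; the exact layout is an assumption."""
    return (
        f"PROGRESS success={stats['success_rate']:.0f}%"
        f"(timeout_success={stats['timeout_success_rate']:.0f}%)"
    )


outcomes = [
    ModelOutcome("model_a", success=True, timeout_success=False),
    ModelOutcome("model_b", success=True, timeout_success=True),
    ModelOutcome("model_c", success=True, timeout_success=False),
    ModelOutcome("model_d", success=False, timeout_success=False),
]
stats = summarize(outcomes)
line = progress_line(stats)
```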