Support extract use CPU and optimize some codes. #707
Merged
- GraphNetAgent: add extract_timeout and verify_timeout parameters
- parallel_extract: add --extract-timeout and --verify-timeout CLI args
- Default timeouts differ by device:
  - GPU: extract=1000s, verify=300s
  - CPU: extract=2000s, verify=600s
- Fix typo: os.envion -> os.environ

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
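The device-dependent defaults listed in this commit could be resolved roughly as follows. This is a minimal sketch, not the PR's actual code: the names DEFAULT_TIMEOUTS, build_parser and resolve_timeouts, and the use of None as the "not set" sentinel, are assumptions made for illustration. Only the two flag names and the four default values come from the commit message.

```python
import argparse

# Default timeouts in seconds, as stated in the commit message.
DEFAULT_TIMEOUTS = {
    "gpu": {"extract": 1000, "verify": 300},
    "cpu": {"extract": 2000, "verify": 600},
}


def build_parser():
    """Hypothetical parser exposing the two timeout flags."""
    parser = argparse.ArgumentParser()
    # None means "not given on the command line"; the device default applies.
    parser.add_argument("--extract-timeout", type=int, default=None)
    parser.add_argument("--verify-timeout", type=int, default=None)
    return parser


def resolve_timeouts(use_gpu, extract_timeout=None, verify_timeout=None):
    """Fill in any unset timeout from the defaults for the active device."""
    defaults = DEFAULT_TIMEOUTS["gpu" if use_gpu else "cpu"]
    if extract_timeout is None:
        extract_timeout = defaults["extract"]
    if verify_timeout is None:
        verify_timeout = defaults["verify"]
    return extract_timeout, verify_timeout
```

With this shape, an explicit flag always wins and only the missing value falls back to the device default, so passing --extract-timeout 50 on a CPU host would still leave the verify timeout at 600s.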
- New CLI arg --use-llm (default: true) controls llm_retry in GraphNetAgent
- Pass llm_retry through worker_fn and _resolve_config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cess rates.
- Introduce ExtractionStatus(str, Enum): OK, VERIFY_FAILED, EXTRACT_FAILED, ERROR
- GraphNetAgent.extract_sample() now returns ExtractionStatus instead of bool
- parallel_extract tracks and prints both overall and extraction-only success rates
- Per-GPU/Worker summary also shows both rates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
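The status enum and the two rates introduced by this commit might look something like the sketch below. The member names are taken from the commit message; the string values, the success_rates helper, and its return shape are assumptions for illustration. The extraction-only rate counts VERIFY_FAILED as an extraction success, following the PR description.

```python
from collections import Counter
from enum import Enum


class ExtractionStatus(str, Enum):
    # Member names come from the commit message; the string values are assumed.
    OK = "ok"
    VERIFY_FAILED = "verify_failed"
    EXTRACT_FAILED = "extract_failed"
    ERROR = "error"


def success_rates(statuses):
    """Return (overall_rate, extract_rate) for a list of ExtractionStatus."""
    total = len(statuses)
    if total == 0:
        return 0.0, 0.0
    counts = Counter(statuses)
    ok = counts[ExtractionStatus.OK]
    # Extraction succeeded whenever the graph was produced, even if the
    # forward verification failed afterwards.
    extracted = ok + counts[ExtractionStatus.VERIFY_FAILED]
    return ok / total, extracted / total
```

Returning an enum rather than a bool is what makes the second rate computable at all: a bare False cannot distinguish a verification failure from an extraction failure.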
luotao1
approved these changes
May 15, 2026
PR Category
Other
Description
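This PR enhances the GraphNet parallel extraction pipeline, adding CPU mode support, configurable timeouts, fine-grained failure status tracking, and an optional LLM retry. The changes are as follows:

1. CPU mode support
- parallel_extract.py automatically detects a GPU or CPU environment via torch.cuda.is_available().
- The --num-workers argument is used in CPU mode (previously only GPU was supported).
- Worker labels: [Worker-{id} GPU:{id}] / [Worker-{id} CPU].

2. Configurable timeouts (defaults differ by device)
- --extract-timeout and --verify-timeout command-line arguments.
- GraphNetAgent, SubprocessGraphExtractor, and ForwardVerifier all accept an optional timeout parameter.

3. Fine-grained extraction status (ExtractionStatus enum)
- ExtractionStatus(str, Enum), with four states:
  - OK: both extraction and verification passed
  - VERIFY_FAILED: graph extraction succeeded, but forward verification failed
  - EXTRACT_FAILED: script generation or graph extraction failed
  - ERROR: unexpected runtime error
- The return type of GraphNetAgent.extract_sample() changes from bool to ExtractionStatus.

4. Split success-rate metrics
- The [PROGRESS] log now reports two success rates:
  - success= is the overall success rate (both extraction and verification passed)
  - extract= is the extraction success rate (including models whose extraction succeeded but whose verification failed)
- extract_success and extract_success_rate fields.

5. Optional LLM retry
- --use-llm argument (store_true, default False), which enables LLM script-repair retries when extraction fails.
- llm_retry=False.

6. Code refactoring
- The logic in main() is split into _parse_args(), _resolve_config(), and _load_model_ids().
- _worker() is renamed to worker_fn(), with explicit worker_id and llm_retry parameters added.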
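
For item 1 of the description, the device detection and worker labelling could be sketched as below. Only torch.cuda.is_available() and the two label formats come from the description; the helper names cuda_available, worker_label and plan_workers are hypothetical, and running one worker per GPU in GPU mode is an assumption rather than something the PR states.

```python
def cuda_available():
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query, so treat it as CPU-only.
        return False
    return torch.cuda.is_available()


def worker_label(worker_id, gpu_id=None):
    """Format a worker label; gpu_id=None means the worker runs on CPU."""
    if gpu_id is None:
        return f"[Worker-{worker_id} CPU]"
    return f"[Worker-{worker_id} GPU:{gpu_id}]"


def plan_workers(use_gpu, num_gpus, num_workers):
    """Return (worker_id, gpu_id) pairs for the chosen mode.

    Assumption: GPU mode runs one worker per GPU, while CPU mode runs
    num_workers workers with no GPU assigned.
    """
    if use_gpu:
        return [(i, i) for i in range(num_gpus)]
    return [(i, None) for i in range(num_workers)]
```

A caller would then do something like plan_workers(cuda_available(), num_gpus, args.num_workers) and pass each pair on to the worker function.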