feat: detect + send tool/model on submit (v0.6.0) #3
Open
RayirthDinesh wants to merge 14 commits into
feat: add cr fetch/submit commands for code review challenges
- Replace shell=True subprocess calls with list-based args to avoid shell quoting issues across platforms
- Add _restrict_key_permissions() using icacls on Windows, chmod on Unix
- Restrict config directory permissions on Windows via icacls
- Pre-check ssh-keygen availability with Windows-specific error message
- Quote SSH key paths with forward slashes for Windows compatibility
- Use /dev/null (not os.devnull) in GIT_SSH_COMMAND for MSYS2 ssh compat
- Use platform-appropriate file viewer hint (type on Windows, cat on Unix)
- Bump version to 0.3.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
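A minimal sketch of the list-based permission call described in this commit, assuming a helper shaped like `_restrict_key_permissions()`. The function name, the `USERNAME` lookup, and the exact `icacls` arguments are illustrative assumptions, not the code in this PR.

```python
import os
import stat
import subprocess
import sys
from pathlib import Path
from typing import List, Optional


def restrict_key_permissions(key_path: Path) -> Optional[List[str]]:
    """Make a private key accessible only to the current user.

    Returns the icacls argument list on Windows, or None on Unix,
    where os.chmod is used instead.
    """
    if sys.platform == "win32":
        # Assumption: the current account name is available in USERNAME.
        user = os.environ.get("USERNAME", "")
        # List-based args: no shell=True, so paths with spaces need no quoting.
        args = [
            "icacls", str(key_path),
            "/inheritance:r",          # drop inherited ACEs
            "/grant:r", f"{user}:F",   # replace grants with full control for the user
        ]
        subprocess.run(args, check=True, capture_output=True)
        return args
    # Unix: owner read/write only (0o600).
    os.chmod(key_path, stat.S_IRUSR | stat.S_IWUSR)
    return None
```

Passing the arguments as a list means the key path is never re-parsed by a shell, which is the point of dropping `shell=True`.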
On Windows, System32's OpenSSH (ssh.exe) is typically found before Git for Windows' bundled MSYS2 ssh on PATH. The native OpenSSH can trigger GUI credential dialogs or deadlock when stdout is captured, causing swe fetch to hang indefinitely during git clone.

- Add _find_git_ssh() to locate Git for Windows' bundled ssh.exe
- Use explicit ssh path in GIT_SSH_COMMAND to bypass System32 OpenSSH
- Add BatchMode=yes to prevent interactive prompts on all platforms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
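The approach in this commit could look roughly like the sketch below. It assumes Git for Windows keeps `ssh.exe` under `usr/bin` relative to the install root that holds `cmd/git.exe`. The function names, that layout, and the `IdentitiesOnly` option are assumptions for illustration, not the PR's actual code.

```python
import shutil
import sys
from pathlib import Path
from typing import Optional


def find_git_ssh() -> Optional[str]:
    """Return the ssh.exe bundled with Git for Windows, if it can be located."""
    if sys.platform != "win32":
        return None
    git = shutil.which("git")
    if git is None:
        return None
    # Assumed layout: <root>/cmd/git.exe next to <root>/usr/bin/ssh.exe.
    candidate = Path(git).resolve().parent.parent / "usr" / "bin" / "ssh.exe"
    return str(candidate) if candidate.is_file() else None


def build_git_ssh_command(key_path: str, ssh_path: Optional[str] = None) -> str:
    """Build a GIT_SSH_COMMAND value that never prompts interactively."""
    # Fall back to whatever "ssh" resolves to on PATH when no explicit path is given.
    ssh = (ssh_path or "ssh").replace("\\", "/")
    # Forward slashes plus quoting keep Windows paths intact under MSYS2.
    key = key_path.replace("\\", "/")
    return f'"{ssh}" -i "{key}" -o BatchMode=yes -o IdentitiesOnly=yes'
```

A caller would typically place the result in the environment passed to `git clone`, for example `env["GIT_SSH_COMMAND"] = build_git_ssh_command(key, find_git_ssh())`.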
Broadens git add exclusions to ignore all dot-prefixed files/directories (.*) and all markdown files (*.md) so scaffold metadata like .devcontainer, .swebench, .gitignore, problem_description.md, and hints_text.md are never included in submissions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
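To illustrate which paths the broadened exclusions are meant to keep out of a submission, here is a small pure-Python sketch of the rule as described. It is not the `git add` invocation used by the CLI, and the assumption that dot-prefixed components are excluded at any depth is mine.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def is_excluded(rel_path: str) -> bool:
    """Return True for dot-prefixed files/directories and markdown files."""
    parts = PurePosixPath(rel_path).parts
    if not parts:
        return False
    # Assumption: a dot-prefixed component at any depth excludes the path.
    if any(part.startswith(".") for part in parts):
        return True
    return parts[-1].endswith(".md")


def submittable(paths: Iterable[str]) -> List[str]:
    """Keep only the paths that would be staged for submission."""
    return [p for p in paths if not is_excluded(p)]
```

With this rule, `.devcontainer/devcontainer.json`, `.gitignore`, and `hints_text.md` are filtered out, while a path such as `src/solver.py` is kept.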
Downloads AI agent config files (.claudeignore, .cursorrules, CLAUDE.md, AGENTS.md, etc.) from AICodingGym/gym-environment into problem directories during swe fetch, cr fetch, and mle download. Also installs to the workspace root during configure. Downloaded files are added to .gitignore. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bump version to 0.5.1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, dotfile exclusion, gym-environment fetch)
- cli_env.py: allowlist-only tool/model detection (no full env dump); reads ANTHROPIC_MODEL, CLAUDE_CODE_MODEL, OPENAI_MODEL, AIDER_MODEL, GEMINI_MODEL, CURSOR_MODEL
- api.py: extend submit_notification / cr_submit_review / mlebench_submit_csv with tool/tool_version/ai_model; add notify_mle_progress to forward percentile + attribution to Prisma UserProgress
- cli.py: --tool / --tool-version / --ai-model flags on swe submit, cr submit, mle submit; MLE reads solution_log.json for accurate model record per CLAUDE.md
- bump version 0.5.1 -> 0.6.0
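A hedged sketch of what allowlist-only detection can look like: the function looks up specific variable names and never iterates over or copies the whole environment. The model variable list comes from this commit message. The marker-to-tool mapping is a partial, illustrative subset, and the tool names other than `claude-code` and `cursor` are assumptions.

```python
from typing import Mapping, Optional, Tuple

# Marker variable -> tool name (illustrative subset; some names are assumptions).
TOOL_MARKERS = (
    ("CLAUDECODE", "claude-code"),
    ("CURSOR_TRACE_ID", "cursor"),
    ("AIDER_MODEL", "aider"),
    ("GEMINI_CLI", "gemini-cli"),
)

# Model variables named in the commit message, checked in this order.
MODEL_VARS = (
    "ANTHROPIC_MODEL",
    "CLAUDE_CODE_MODEL",
    "OPENAI_MODEL",
    "AIDER_MODEL",
    "GEMINI_MODEL",
    "CURSOR_MODEL",
)


def detect_tool_and_model(env: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Read only allowlisted keys; unrelated variables are never touched."""
    tool = next((name for var, name in TOOL_MARKERS if env.get(var)), None)
    model = next((env[var] for var in MODEL_VARS if env.get(var)), None)
    return tool, model
```

Because only named keys are looked up, a secret such as an API token in the same environment cannot end up in the returned values.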
Summary
- Detect and send tool/model on `swe submit`, `cr submit`, `mle submit`. Forward to backend so tool/model leaderboards (companion PR on `aicodingymsite`) can rank.
- `cli_env.py` reads only an env allowlist, never the full environment, so secrets cannot leak into payloads.
- Bump version `0.5.1` -> `0.6.0`.

Detection sources
- Tool: `CLAUDECODE`, `CURSOR_TRACE_ID`, `TERM_PROGRAM=cursor`, `ANTIGRAVITY`, `AIDER_MODEL`, `CODEX_CLI`, `WINDSURF`, `CONTINUE_CLI`, `CLINE_CLI`, `GEMINI_CLI`.
- Model: `ANTHROPIC_MODEL`, `CLAUDE_CODE_MODEL`, `OPENAI_MODEL`, `AIDER_MODEL`, `GEMINI_MODEL`, `CURSOR_MODEL`.
- MLE: prefers `solution_log.json` (set by agent per CLAUDE.md) over env.
- `--tool` / `--tool-version` / `--ai-model` override auto-detection.

New API methods
- `submit_notification(...)`, `cr_submit_review(...)`, `mlebench_submit_csv(...)`: extended with `tool` / `tool_version` / `ai_model` kwargs.
- `notify_mle_progress(...)`: posts percentile + attribution to `/api/users/{id}/progress` so MLE rows feed the leaderboard aggregator.

Test plan
- `pip install -e .` from branch
- `CLAUDECODE=1 ANTHROPIC_MODEL=claude-opus-4-7 aicodinggym swe submit <id>`: output shows `Tool: claude-code · model=claude-opus-4-7`
- `aicodinggym swe submit <id> --tool cursor --ai-model gpt-5`: flags override env
- `aicodinggym cr submit <id> -f review.md`: same Tool line in output
- `aicodinggym mle submit <id> -F predictions.csv`: fires `notify_mle_progress` w/ percentile
- `/leaderboard` Tools + Models tabs populate after backend processes

Notes
- Companion: `aicodingymsite` PR #15.
- Deploy order: backend first (`prisma db push` for new `tool` / `aiModel` columns), then this CLI PR.
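The precedence described under Detection sources (explicit flags first, then `solution_log.json` for MLE, then environment detection) can be sketched as below. The function names, the `"model"` key inside `solution_log.json`, and the rule that unset fields are left out of the payload are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def read_logged_model(log_path: Path) -> Optional[str]:
    """Return the model recorded in solution_log.json, or None if unavailable.

    Assumption: the log is a JSON object with a string "model" key.
    """
    try:
        data = json.loads(log_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON: fall through to env.
        return None
    value = data.get("model") if isinstance(data, dict) else None
    return value if isinstance(value, str) and value else None


def resolve_attribution(
    flag_tool: Optional[str],
    flag_model: Optional[str],
    logged_model: Optional[str],
    env_tool: Optional[str],
    env_model: Optional[str],
) -> Dict[str, str]:
    """Apply precedence: CLI flags > solution_log.json > environment."""
    fields = {
        "tool": flag_tool or env_tool,
        "ai_model": flag_model or logged_model or env_model,
    }
    # Leave unset fields out so the payload carries only known values.
    return {key: value for key, value in fields.items() if value}
```

For the second test-plan case, `resolve_attribution("cursor", "gpt-5", None, "claude-code", "claude-opus-4-7")` yields the flag values rather than the detected ones.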