
Add Kwai-Klear SWE-smith mini agent trajectories dataset (#178) #190

Closed
neubig wants to merge 1 commit into main from openhands/issue-178-kwai-swe-smith-mini-swe-agent-plus

Conversation

@neubig
Contributor

@neubig neubig commented May 14, 2026

Closes #178

This PR was created by an AI agent (OpenHands) on behalf of the user.

Summary

  • Added a converter for Kwai-Klear/SWE-smith-mini_swe_agent_plus-trajectories-66k.
  • Added representative sample_raw.json, sample_std.json, and sample_sft.json generated from the conversion pipeline.
  • Updated the OpenHands SFT converter to preserve function_call and observation roles instead of rewriting them to legacy role names.

Dataset Source

  • Kwai-Klear/SWE-smith-mini_swe_agent_plus-trajectories-66k

Files Added

  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/README.md
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/requirements.txt
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/extract_raw.py
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/schema_raw.py
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/raw_to_standardized.py
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/sample_raw.json
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/sample_std.json
  • datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/sample_sft.json

Schema Mapping Summary

  • Raw system messages are omitted because the OpenHands SFT converter supplies the target system prompt.
  • The first raw user message maps to a TextObservation with source user.
  • Later raw user messages map to TextObservation with source environment because they are command execution results.
  • Raw assistant messages containing a fenced bash command map to CodeAction(language="bash"); the THOUGHT: text before the command is preserved as the action description.
  • Raw assistant messages without a parseable bash block map to MessageAction.
  • Trajectories containing the MINI_SWE_AGENT_FINAL_OUTPUT submission command receive a final success observation and <finish> message for OpenHands SFT conversion.
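
The mapping rules above can be sketched roughly as follows. This is a minimal illustration, not the converter shipped in this PR: the `TextObservation`, `CodeAction`, and `MessageAction` dataclasses are simplified stand-ins for the real ADP schema classes, and `convert_messages` and `BASH_BLOCK` are hypothetical names. The sketch assumes each raw message is a dict with `role` and `content` keys.

```python
import re
from dataclasses import dataclass

# Build the fence marker programmatically so this example can sit inside a fenced block.
FENCE = "`" * 3
# Matches the first fenced bash block and captures the command inside it.
BASH_BLOCK = re.compile(FENCE + r"bash\s*\n(.*?)\n?" + FENCE, re.DOTALL)


# Simplified stand-ins for the ADP schema classes (assumption, not the real definitions).
@dataclass
class TextObservation:
    content: str
    source: str


@dataclass
class CodeAction:
    language: str
    content: str
    description: str


@dataclass
class MessageAction:
    content: str


def convert_messages(messages):
    """Map raw mini-swe-agent style messages to standardized events."""
    events = []
    seen_user = False
    for msg in messages:
        role, text = msg["role"], msg["content"]
        if role == "system":
            # Omitted: the SFT converter supplies the target system prompt.
            continue
        if role == "user":
            # First user message is the task; later ones are command results.
            source = "environment" if seen_user else "user"
            seen_user = True
            events.append(TextObservation(content=text, source=source))
        elif role == "assistant":
            match = BASH_BLOCK.search(text)
            if match is None:
                events.append(MessageAction(content=text))
            else:
                # Text before the fenced command becomes the description.
                events.append(
                    CodeAction(
                        language="bash",
                        content=match.group(1),
                        description=text[: match.start()].strip(),
                    )
                )
    return events
```

Whether the real converter strips the `THOUGHT:` prefix from the description is not stated here, so the sketch leaves the text before the command unchanged apart from trimming whitespace.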

Design Decisions

  • Ambiguity: The raw dataset has no resolved field. Chosen approach: Treat trajectories with the explicit MINI_SWE_AGENT_FINAL_OUTPUT submission command as completed and append the final ADP success/finish events. Example: The sample trajectories end with echo MINI_SWE_AGENT_FINAL_OUTPUT && git add -A && git diff --cached. Alternatives rejected: Adding a finish event to every row unconditionally could mislabel interrupted rows; omitting finish events would produce samples without a terminal completion signal.
  • Ambiguity: Raw user messages include both the initial task and later shell outputs. Chosen approach: Map only the first user message to source user; map subsequent user messages to source environment. Example: <returncode>0</returncode>... messages become environment observations. Alternatives rejected: Treating every user message as source user loses the command/observation alternation.
  • Ambiguity: Assistant turns are plain text with a THOUGHT: section and a fenced bash block rather than structured tool calls. Chosen approach: Extract the fenced command as CodeAction(language="bash") and keep the thought text in description. Example: THOUGHT: ... ```bash\nfind ...\n``` becomes a bash code action. Alternatives rejected: Keeping the entire assistant turn as MessageAction would lose executable structure.
  • Ambiguity: The raw system prompt is mini-swe-agent specific. Chosen approach: Omit raw system messages in standardized data so the OpenHands SFT converter can provide the canonical OpenHands system prompt. Example: The raw instruction requiring exactly one bash block is not copied into standardized content. Alternatives rejected: Preserving it as a user/environment observation would mix source-agent formatting instructions into the target training conversation.
  • Ambiguity: The OpenHands converter was rewriting function_call and observation roles after producing them. Chosen approach: Preserve those roles to match current ADP SFT validation. Example: Generated bash calls remain from: function_call. Alternatives rejected: Post-processing generated JSON would not be reproducible from the converter.
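
The completion heuristic from the first design decision could look something like the sketch below. It assumes the bash commands have already been extracted as plain strings. The function names, the dict-based event shape, and the placeholder event contents are hypothetical and only illustrate the control flow; the actual ADP success/finish events are defined by the converter in this PR.

```python
SUBMIT_MARKER = "MINI_SWE_AGENT_FINAL_OUTPUT"


def is_completed(commands):
    """Return True if any extracted bash command contains the submission marker."""
    return any(SUBMIT_MARKER in cmd for cmd in commands)


def append_finish(events, commands):
    """Append terminal events only when the trajectory explicitly submitted.

    The two appended dicts are placeholders (assumption); the real events
    carry whatever content the ADP schema requires.
    """
    result = list(events)
    if is_completed(commands):
        result.append({"kind": "observation", "source": "environment", "content": "<success placeholder>"})
        result.append({"kind": "message", "content": "<finish> placeholder"})
    return result


if __name__ == "__main__":
    submitted = ["ls", "echo MINI_SWE_AGENT_FINAL_OUTPUT && git add -A && git diff --cached"]
    interrupted = ["ls", "pytest -q"]
    print(is_completed(submitted), is_completed(interrupted))
```

Rows without the marker pass through unchanged, which is the point of rejecting the unconditional finish event: an interrupted trajectory is not labeled as completed.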

Tests Run

  • python -m pytest tests/test_dataset_structure.py -v
  • python -m pytest tests/test_raw_schemas.py -v -k kwai
  • python -m pytest tests/test_standardized_schemas.py -v -k kwai
  • python -m pytest tests/test_std_to_sft_conversion.py -v -k kwai
  • python -m pytest tests/test_datasets_from_parameter.py -v
  • python -m pytest tests/test_std_to_sft_from_parameter_simple.py tests/test_std_to_sft_structure.py -v
  • python -m ruff check agents/openhands/std_to_sft.py datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k

Known Limitations

  • Full-corpus generation was not run in this PR; only the committed representative sample files were generated and validated.
  • Installing the full root requirements.txt in this Python 3.13 environment failed because browsergym-core attempted to build an incompatible greenlet; only the dependencies needed for the validation commands above were installed instead.

Co-authored-by: openhands <openhands@all-hands.dev>
Contributor Author

neubig commented May 14, 2026

Closing this as a duplicate of #192, which is the retained PR for issue #178.

This comment/action was created by an AI agent (OpenHands) on behalf of the user.

@neubig neubig closed this May 14, 2026

Development

Successfully merging this pull request may close these issues.

Add dataset: Kwai-Klear/SWE-smith-mini_swe_agent_plus-trajectories-66k
