Add Kwai-Klear SWE-smith mini agent trajectories dataset (#178) by neubig · Pull Request #190 · neulab/agent-data-protocol

neubig · 2026-05-14T03:41:53Z

Closes #178

This PR was created by an AI agent (OpenHands) on behalf of the user.

Summary

Added a converter for Kwai-Klear/SWE-smith-mini_swe_agent_plus-trajectories-66k.
Added representative sample_raw.json, sample_std.json, and sample_sft.json generated from the conversion pipeline.
Updated the OpenHands SFT converter to preserve function_call and observation roles instead of rewriting them to legacy role names.

Dataset Source

Source: https://huggingface.co/datasets/Kwai-Klear/SWE-smith-mini_swe_agent_plus-trajectories-66k
License: MIT
Split used: train
Size: 65,994 trajectories

Files Added

datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/README.md
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/requirements.txt
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/extract_raw.py
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/schema_raw.py
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/raw_to_standardized.py
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/sample_raw.json
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/sample_std.json
datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k/sample_sft.json

Schema Mapping Summary

Raw system messages are omitted because the OpenHands SFT converter supplies the target system prompt.
The first raw user message maps to a TextObservation with source user.
Later raw user messages map to TextObservation with source environment because they are command execution results.
Raw assistant messages containing a fenced bash command map to CodeAction(language="bash"); the THOUGHT: text before the command is preserved as the action description.
Raw assistant messages without a parseable bash block map to MessageAction.
Trajectories containing the MINI_SWE_AGENT_FINAL_OUTPUT submission command receive a final success observation and <finish> message for OpenHands SFT conversion.
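
The assistant-turn mapping above can be sketched as follows. This is a minimal illustration, not the code in raw_to_standardized.py: the CodeAction and MessageAction dataclasses are hypothetical stand-ins for the ADP schema classes, the regular expression is one assumed way to find the fenced block, and keeping the THOUGHT: prefix in the description is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union


# Hypothetical stand-ins for the ADP schema classes (assumed field names).
@dataclass
class CodeAction:
    language: str
    content: str
    description: Optional[str] = None


@dataclass
class MessageAction:
    content: str


# Matches the first fenced bash block and captures the command inside it.
BASH_BLOCK = re.compile(r"```bash\s*\n(.*?)\n?```", re.DOTALL)


def convert_assistant_turn(text: str) -> Union[CodeAction, MessageAction]:
    """Map one raw assistant message to a code action or a plain message."""
    match = BASH_BLOCK.search(text)
    if match is None or not match.group(1).strip():
        # No parseable bash block: keep the whole turn as a message.
        return MessageAction(content=text.strip())
    # Text before the fence (the THOUGHT: section) becomes the description.
    thought = text[: match.start()].strip()
    return CodeAction(
        language="bash",
        content=match.group(1).strip(),
        description=thought or None,
    )
```

For example, a turn consisting of a THOUGHT: line followed by a fenced `find` command yields a CodeAction whose content is the command alone, while a turn with no fence falls through to MessageAction.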

Design Decisions

Ambiguity: The raw dataset has no resolved field. Chosen approach: Treat trajectories with the explicit MINI_SWE_AGENT_FINAL_OUTPUT submission command as completed and append the final ADP success/finish events. Example: The sample trajectories end with echo MINI_SWE_AGENT_FINAL_OUTPUT && git add -A && git diff --cached. Alternatives rejected: Adding a finish event to every row unconditionally could mislabel interrupted rows; omitting finish events would produce samples without a terminal completion signal.
Ambiguity: Raw user messages include both the initial task and later shell outputs. Chosen approach: Map only the first user message to source user; map subsequent user messages to source environment. Example: <returncode>0</returncode>... messages become environment observations. Alternatives rejected: Treating every user message as source user loses the command/observation alternation.
Ambiguity: Assistant turns are plain text with a THOUGHT: section and a fenced bash block rather than structured tool calls. Chosen approach: Extract the fenced command as CodeAction(language="bash") and keep the thought text in description. Example: THOUGHT: ... ```bash\nfind ...\n``` becomes a bash code action. Alternatives rejected: Keeping the entire assistant turn as MessageAction would lose executable structure.
Ambiguity: The raw system prompt is mini-swe-agent specific. Chosen approach: Omit raw system messages in standardized data so the OpenHands SFT converter can provide the canonical OpenHands system prompt. Example: The raw instruction requiring exactly one bash block is not copied into standardized content. Alternatives rejected: Preserving it as a user/environment observation would mix source-agent formatting instructions into the target training conversation.
Ambiguity: The OpenHands converter was rewriting function_call and observation roles after producing them. Chosen approach: Preserve those roles to match current ADP SFT validation. Example: Generated bash calls remain from: function_call. Alternatives rejected: Post-processing generated JSON would not be reproducible from the converter.
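
The first, second, and fourth decisions above can be illustrated with a short sketch. It assumes each raw trajectory is a list of dicts with role and content keys (the raw layout is not spelled out here), and the function names and event dict shape are hypothetical, not taken from the converter.

```python
from typing import Dict, List

FINAL_MARKER = "MINI_SWE_AGENT_FINAL_OUTPUT"


def map_non_assistant_messages(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop system messages and assign a source to each user message."""
    events = []
    seen_user = False
    for msg in messages:
        if msg["role"] == "system":
            # Omitted: the SFT converter supplies the target system prompt.
            continue
        if msg["role"] == "user":
            # Only the first user message is the task; later ones are shell output.
            source = "environment" if seen_user else "user"
            seen_user = True
            events.append({"source": source, "content": msg["content"]})
    return events


def is_submitted(messages: List[Dict[str, str]]) -> bool:
    """True if any assistant turn contains the explicit submission command."""
    # Only assistant turns are checked, on the assumption that prompt text
    # may itself mention the marker when describing how to submit.
    return any(
        msg["role"] == "assistant" and FINAL_MARKER in msg["content"]
        for msg in messages
    )
```

A converter following this approach would append the final success observation and finish message only when `is_submitted` returns True, leaving interrupted trajectories without a terminal event.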

Tests Run

python -m pytest tests/test_dataset_structure.py -v
python -m pytest tests/test_raw_schemas.py -v -k kwai
python -m pytest tests/test_standardized_schemas.py -v -k kwai
python -m pytest tests/test_std_to_sft_conversion.py -v -k kwai
python -m pytest tests/test_datasets_from_parameter.py -v
python -m pytest tests/test_std_to_sft_from_parameter_simple.py tests/test_std_to_sft_structure.py -v
python -m ruff check agents/openhands/std_to_sft.py datasets/kwai-klear_swe-smith-mini_swe_agent_plus-trajectories-66k

Known Limitations

Full-corpus generation was not run in this PR; only the committed representative sample files were generated and validated.
Installing the full root requirements.txt in this Python 3.13 environment failed because browsergym-core attempted to build an incompatible greenlet; only the dependencies needed to run the relevant validation commands were installed.

Co-authored-by: openhands <openhands@all-hands.dev>

neubig · 2026-05-14T03:45:45Z

Closing this as a duplicate of #192, which is the retained PR for issue #178.

This comment/action was created by an AI agent (OpenHands) on behalf of the user.

Add Kwai-Klear SWE-smith mini agent trajectories dataset

9a14036

Co-authored-by: openhands <openhands@all-hands.dev>

neubig closed this May 14, 2026

neubig wants to merge 1 commit into main from openhands/issue-178-kwai-swe-smith-mini-swe-agent-plus