docs: add data generation tutorial for synthesized data pipeline #238
yvvonie wants to merge 7 commits into DexForce:main
yvvonie commented on Apr 17, 2026
- This PR introduces a new tutorial (data_generation.rst) to document the internal workflow for generating synthetic expert demonstration datasets. It aims to provide clear instructions for developers on how to configure and run the data generation pipeline using EmbodiChain's built-in environment launcher.
- Documentation Index: Registered the new tutorial in docs/source/tutorial/index.rst.
    "robot_meta": {
        "robot_type": "CobotMagic",
        "control_freq": 25,
        "control_parts": ["left_arm", "left_eef", "right_arm", "right_eef"]
We have removed control_parts from the dataset config. It is now a child of env.
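A minimal sketch of the move described above, for illustration only. The nesting is an assumption: it supposes the old layout kept control_parts under dataset.robot_meta and the new layout places it directly under a top-level env key. The helper name move_control_parts is hypothetical and not part of EmbodiChain.

```python
import copy


def move_control_parts(config: dict) -> dict:
    """Return a copy of config with control_parts moved from robot_meta to env.

    Assumed (hypothetical) layout:
      old: config["dataset"]["robot_meta"]["control_parts"]
      new: config["env"]["control_parts"]
    """
    new_config = copy.deepcopy(config)
    robot_meta = new_config.get("dataset", {}).get("robot_meta", {})
    parts = robot_meta.pop("control_parts", None)
    if parts is not None:
        # Create the env section if it does not exist yet.
        new_config.setdefault("env", {})["control_parts"] = parts
    return new_config


old_config = {
    "dataset": {
        "robot_meta": {
            "robot_type": "CobotMagic",
            "control_freq": 25,
            "control_parts": ["left_arm", "left_eef", "right_arm", "right_eef"],
        }
    }
}
new_config = move_control_parts(old_config)
```

The input dictionary is left untouched, so the old and new layouts can be compared side by side when updating the tutorial snippet.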
    "func": "LeRobotRecorder",
    "mode": "save",
    "params": {
        "save_path": "/root/workspace/Embodied_Challenge/lerobot_dataset/",
Remove the hard-coded path from the docs.
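One way the tutorial could avoid a machine-specific path is to show a placeholder that the user expands before launching. The sketch below assumes the caller resolves the path itself; whether LeRobotRecorder expands environment variables or ~ on its own is not confirmed here, and the DATA_ROOT variable name is hypothetical.

```python
import os
from pathlib import Path


def resolve_save_path(raw: str) -> Path:
    """Expand ~ and environment variables in a configured save_path."""
    return Path(os.path.expandvars(os.path.expanduser(raw)))


# DATA_ROOT is a hypothetical variable chosen for this example.
os.environ["DATA_ROOT"] = "/tmp/data"
save_path = resolve_save_path("${DATA_ROOT}/lerobot_dataset")
```

With this approach the documented value stays portable, and each user decides where the dataset lands.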
    - **Action Configuration**: Describes how the task-specific expert trajectory should be generated.
    - **Environment Launcher**: Builds the environment directly from configuration files.
    - **Expert Policy**: Each task provides ``create_demo_action_list()`` to generate a scripted trajectory.
    - **Dataset Manager**: Records observation-action pairs during ``env.step()``.
Using Environment Rollout would be better.
    basic_env
    modular_env
    rl
    data_generation
    Step 2: Prepare the Action Configuration
    ----------------------------------------

    The second input is the ``action_config.json`` file. This file defines the expert action graph used by the task. It is the main configuration entry for scripted trajectory generation. Take ``items_handover_place`` as example, the file is organized around ``scope``, ``node``, ``edge``, and ``sync``.
Add a subsection for the action bank (perhaps one the user can collapse).
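If such a subsection is added, a short sanity check could accompany it. The sketch below only verifies that the four sections named in the tutorial text (scope, node, edge, sync) are present. It assumes they are top-level keys of action_config.json, which the diff does not confirm, and missing_sections is a hypothetical helper.

```python
REQUIRED_SECTIONS = ("scope", "node", "edge", "sync")


def missing_sections(action_config: dict) -> list:
    """List the expected sections that are absent from a loaded action config.

    Assumes the sections are top-level keys (not confirmed by the source).
    """
    return [name for name in REQUIRED_SECTIONS if name not in action_config]


# A deliberately incomplete config to show the check.
partial_config = {"scope": {}, "node": {}}
missing = missing_sections(partial_config)
```

In practice the dictionary would come from json.load on the task's action_config.json.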
    :start-at: def cli():
    :end-at: main(args, env, gym_config)

    This means the runtime inputs of the whole data-generation pipeline are simply the task config files plus launcher arguments.
Add the CLI interface for running and previewing the env (see https://dexforce.github.io/EmbodiChain/guides/cli.html).
    This means the runtime inputs of the whole data-generation pipeline are simply the task config files plus launcher arguments.

    Step 4: Generate and Execute Expert Actions
Steps 4, 5, and 6 can be removed. Instead, we can introduce more parameters in gym_config that control data generation (e.g., max_episodes, ...).
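To make the suggestion concrete, here is a rough sketch of how a max_episodes setting could bound data generation. Only the name max_episodes comes from the comment above; its placement in gym_config, the max_attempts cap, and the run_episode callable are all hypothetical stand-ins, not the launcher's actual behavior.

```python
from typing import Callable, Optional


def collect_episodes(
    run_episode: Callable[[], bool],
    max_episodes: int,
    max_attempts: Optional[int] = None,
) -> int:
    """Run rollouts until max_episodes succeed or max_attempts is reached.

    run_episode is a hypothetical stand-in for one rollout; it returns True
    when the episode succeeded and should be counted as saved.
    """
    saved = 0
    attempts = 0
    while saved < max_episodes:
        if max_attempts is not None and attempts >= max_attempts:
            break
        attempts += 1
        if run_episode():
            saved += 1
    return saved


# Stub rollout that fails every other attempt.
outcomes = iter([False, True, False, True, True])
gym_config = {"max_episodes": 2}  # hypothetical key placement
saved = collect_episodes(lambda: next(outcomes), gym_config["max_episodes"])
```

Documenting parameters like this in one place would let the tutorial describe the run in terms of configuration rather than internal steps.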
    Outputs
    ~~~~~~~

    After successful execution, completed episodes are saved under the configured ``save_path``. A LeRobot dataset typically contains:
Also mention the default save path.
    - **Keep the config pair together**: Always version ``gym_config.json`` and ``action_config.json`` together for a task.
    - **Use valid scripted policies**: Make sure ``create_demo_action_list()`` returns executable trajectories for the current scene.
    - **Enable ``use_videos`` for visual tasks**: This is especially useful for downstream vision-based training.
We don't have this parameter?