docs: add data generation tutorial for synthesized data pipeline by yvvonie · Pull Request #238 · DexForce/EmbodiChain

yvvonie · 2026-04-17T07:39:22Z

This PR introduces a new tutorial (data_generation.rst) that documents the internal workflow for generating synthetic expert demonstration datasets. It gives developers clear instructions on how to configure and run the data generation pipeline using EmbodiChain's built-in environment launcher.
Documentation Index: Registered the new tutorial in docs/source/tutorial/index.rst.

yuecideng · 2026-04-17T08:50:18Z

+                       "robot_meta": {
+                           "robot_type": "CobotMagic",
+                           "control_freq": 25,
+                           "control_parts": ["left_arm", "left_eef", "right_arm", "right_eef"]


We have removed control_parts from the dataset config. It is now placed under env.
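
A minimal sketch of the change this comment asks for, assuming the config is a plain dict. The thread does not show the real nesting, so the top-level "dataset" and "env" keys and the helper name move_control_parts are hypothetical and used only for illustration:

```python
import copy


def move_control_parts(config: dict) -> dict:
    """Return a copy of config with control_parts moved out of robot_meta.

    Assumption: the new home is a top-level "env" key. Adjust the target
    to match the real schema.
    """
    new_config = copy.deepcopy(config)
    robot_meta = new_config.get("dataset", {}).get("robot_meta", {})
    # Remove the key from its old location, if present.
    control_parts = robot_meta.pop("control_parts", None)
    if control_parts is not None:
        # Hypothetical target location under "env".
        new_config.setdefault("env", {})["control_parts"] = control_parts
    return new_config


old = {
    "dataset": {
        "robot_meta": {
            "robot_type": "CobotMagic",
            "control_freq": 25,
            "control_parts": ["left_arm", "left_eef", "right_arm", "right_eef"],
        }
    }
}
new = move_control_parts(old)
```

The helper works on a deep copy, so the original dict is left untouched and the two layouts can be compared side by side.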

yuecideng · 2026-04-17T08:50:51Z

+                   "func": "LeRobotRecorder",
+                   "mode": "save",
+                   "params": {
+                       "save_path": "/root/workspace/Embodied_Challenge/lerobot_dataset/",


Remove the hard-coded path from the docs.
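
One way to avoid a machine-specific path in the example, sketched under assumptions: the recorder keys (func, mode, params, save_path) come from the diff above, while the EMBODICHAIN_DATA_ROOT variable, the "outputs" default, and the resolve_save_path helper are hypothetical and not part of EmbodiChain:

```python
import os
from pathlib import Path


def resolve_save_path(params: dict, default_root: str = "outputs") -> str:
    """Resolve save_path against a configurable root directory."""
    # EMBODICHAIN_DATA_ROOT is a hypothetical variable name for illustration.
    root = Path(os.environ.get("EMBODICHAIN_DATA_ROOT", default_root))
    path = Path(params.get("save_path", "lerobot_dataset"))
    # Absolute paths are respected; relative ones are joined to the root.
    return str(path if path.is_absolute() else root / path)


recorder = {
    "func": "LeRobotRecorder",
    "mode": "save",
    "params": {"save_path": "lerobot_dataset"},
}
resolved = resolve_save_path(recorder["params"])
```

With a relative save_path in the documented config, readers can copy the example without editing a path that only exists on the author's machine.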

yuecideng · 2026-04-21T12:51:22Z

+- **Action Configuration**: Describes how the task-specific expert trajectory should be generated.
+- **Environment Launcher**: Builds the environment directly from configuration files.
+- **Expert Policy**: Each task provides ``create_demo_action_list()`` to generate a scripted trajectory.
+- **Dataset Manager**: Records observation-action pairs during ``env.step()``.


Using "Environment Rollout" would be better.

yuecideng · 2026-04-22T02:25:16Z

   basic_env
   modular_env
   rl
+   data_generation


Move this above the rl section.

yuecideng · 2026-04-22T02:27:35Z

+Step 2: Prepare the Action Configuration
+----------------------------------------
+
+The second input is the ``action_config.json`` file. This file defines the expert action graph used by the task. It is the main configuration entry for scripted trajectory generation. Take ``items_handover_place`` as example, the file is organized around ``scope``, ``node``, ``edge``, and ``sync``.


Add a subsection for the action bank (perhaps one the user can fold).

yuecideng · 2026-04-22T02:28:46Z

+   :start-at: def cli():
+   :end-at:     main(args, env, gym_config)
+
+This means the runtime inputs of the whole data-generation pipeline are simply the task config files plus launcher arguments.


Add the CLI interface for running and previewing the env (see https://dexforce.github.io/EmbodiChain/guides/cli.html).

yuecideng · 2026-04-22T02:31:25Z

+
+This means the runtime inputs of the whole data-generation pipeline are simply the task config files plus launcher arguments.
+
+Step 4: Generate and Execute Expert Actions


Steps 4, 5, and 6 can be removed. Instead, we can introduce more parameters in gym_config that control data generation (e.g., max_episodes, ...).

yuecideng · 2026-04-22T02:31:53Z

+Outputs
+~~~~~~~
+
+After successful execution, completed episodes are saved under the configured ``save_path``. A LeRobot dataset typically contains:


Also mention the default save path.

yuecideng · 2026-04-22T02:32:25Z

+
+- **Keep the config pair together**: Always version ``gym_config.json`` and ``action_config.json`` together for a task.
+- **Use valid scripted policies**: Make sure ``create_demo_action_list()`` returns executable trajectories for the current scene.
+- **Enable ``use_videos`` for visual tasks**: This is especially useful for downstream vision-based training.


We don't have this parameter?

yvvonie added 3 commits April 17, 2026 15:29

docs: add data generation tutorial for synthesized data pipeline

23e4d75

Merge main

aa3d2ed

docs: update data generation tutorial paths

e42e49f

yuecideng requested changes Apr 22, 2026


yvvonie added 4 commits April 25, 2026 23:19

Merge branch 'main' of https://github.com/DexForce/EmbodiChain into docs/data-generation-tutorial

df75ead

Merge branch 'main' into GYY_Tutorial_Add

fab9843

docs: update data generation tutorial

d828c60

Merge docs/data-generation-tutorial into GYY_Tutorial_Add, keeping their docs files

ed454b4

yvvonie wants to merge 7 commits into DexForce:main from yvvonie:GYY_Tutorial_Add
