Skip to content

Pim/feat/g1 groot wbc#2033

Open
Nabla7 wants to merge 35 commits into
mainfrom
pim/feat/g1-groot-wbc
Open

Pim/feat/g1 groot wbc#2033
Nabla7 wants to merge 35 commits into
mainfrom
pim/feat/g1-groot-wbc

Conversation

@Nabla7
Copy link
Copy Markdown
Collaborator

@Nabla7 Nabla7 commented May 9, 2026

Problem

dimos doesn't have a whole-body locomotion controller for the Unitree G1 humanoid. The ControlCoordinator / WholeBodyAdapter machinery from #1954 is set up to host one — there just isn't a task that drives the legs and waist.

Closes DIM-XXX

Solution

A GrootWBCTask that runs the GR00T balance + walk ONNX policies inside the coordinator tick loop, and two blueprints (unitree-g1-groot-wbc real-hw + unitree-g1-groot-wbc-sim) that compose it. Sim and real run identical task code; only the WholeBodyAdapter underneath differs — sim uses an in-process MuJoCo adapter, real hardware uses Mustafa's transport_lcm bridge to G1WholeBodyConnection (no new G1 low-level adapter).

The coordinator gains activate / dry_run / twist_command ports duck-typed to any task exposing arm() / set_dry_run() / set_velocity_command(). TaskConfig gains hardware_id, default_positions, auto_arm, auto_dry_run, default_ramp_seconds, decimation. WholeBodyConfig carries per-joint kp/kd; ConnectedWholeBody resolves them when building MotorCommands. WebsocketVisModule exposes Arm + Dry-Run buttons that publish on the new ports — the operator activates real-hw runs from the dashboard.

Real-hardware safety profile: comes up unarmed + dry-run + 10-s ramp from current pose to the bent-knee default. Sim auto-arms with no ramp.

G1WholeBodyConnection patched for macOS: passive Init(None, 0) + Read()-per-tick (the callback-based variant doesn't fire reliably under cyclonedds on Darwin), and mode_machine hardcoded to the G1's static value with a one-shot warning if a future firmware reports something else.

Breaking Changes

None. Additive — existing blueprints unchanged. WholeBodyConfig is new (kp/kd were previously direct fields on HardwareComponent; the migration is a constructor rename).

How to Test

Unit:

uv run pytest dimos/control/tasks/test_groot_wbc_task.py \
              dimos/control/test_control.py \
              dimos/robot/unitree/g1/blueprints/basic/test_unitree_g1_groot_wbc.py \
              dimos/robot/test_all_blueprints_generation.py

Sim (no hardware needed):

DIMOS_MUJOCO_VIEW=1 dimos run unitree-g1-groot-wbc-sim

MuJoCo viewer pops, robot stands. Open http://localhost:7779/, WASD walks the robot.

Real hardware (requires a G1 + ethernet + the [unitree-dds] extra installed; CYCLONEDDS_HOME exported in the shell):

ROBOT_INTERFACE=<nic> dimos run unitree-g1-groot-wbc

Robot connects in dev-mode damping; dashboard at http://localhost:7779/. Click Arm (10-s ramp to default pose), inspect the dry-run command stream, then toggle Dry Run OFF to hand control to the policy. WASD = cmd_vel.

Contributor License Agreement

  • I have read and approved the CLA.

Nabla7 and others added 24 commits May 6, 2026 21:35
…era_pose, _publish_tf refactor, spec.py docstring expansion)
The DDS callback registered via ChannelSubscriber.Init(handler, 10)
doesn't fire reliably on macOS, so the adapter timed out waiting for
the first LowState to capture mode_machine.  Switch to passive-mode
Init(None, 0) and Read() per tick.  mode_machine is now hardcoded to
the static value for the 29-DOF G1 instead of read-back-then-echoed.

Also adds has_motor_states() to satisfy the post-refactor
WholeBodyAdapter Protocol, makes _release_sport_mode null-tolerant
(CheckMode returns (status, None) once nothing is active), drops the
now-unused _on_low_state callback.

WebsocketVisModule: restore the arm / disarm / set_dry_run socket.io
handlers and the matching activate / dry_run Out[Bool] ports that an
earlier cleanup pass dropped — the dashboard buttons now publish on
the LCM topics the coordinator already subscribes to.

TransportWholeBodyAdapter: trivial Protocol fix — implement read_odom
returning None so isinstance(adapter, WholeBodyAdapter) passes the
runtime_checkable type check in ControlCoordinator._setup_hardware.
The earlier cleanup pass over-deleted: openarm manipulator support,
the dimos/utils/workspace.py module, the audio STT node_whisper
edits, CI configs, README, command-center package-lock churn, and
two go2 LFS DBs.  None of those are part of the GR00T WBC story.

Restore each to origin/dev's content.  Drop one path that was added
on this branch but doesn't exist on dev (api/requirements.txt).

PR diff is now 24 files / +2623 / -210, all GR00T-WBC-relevant.
Both Mustafa's #1954 (already on dev) and the GR00T-WBC PR added
this method.  After the merge, both definitions live in
coordinator.py — Python uses the second, the first becomes dead
code, and the transport adapter (which depends on hardware_id
being passed) silently picks up its default ``"wholebody"`` topic
prefix instead of the component's id.

Collapse to one definition that passes the union of kwargs both
adapters need: dof, hardware_id, network_interface, domain_id,
address, plus **adapter_kwargs.  Adapters tolerate extras via
their constructor's **_ catch-all.
Switch unitree-g1-groot-wbc to the architecture Mustafa landed in
#1954 (G1WholeBodyConnection Module + TransportWholeBodyAdapter via
LCM bridge), matching unitree-g1-coordinator.  Single G1 low-level
pattern in the codebase now.

  * Delete dimos/hardware/whole_body/unitree/g1/adapter.py — the
    UnitreeG1LowLevelAdapter (single-process DDS) is gone.
  * Rewrite unitree_g1_groot_wbc.py to compose G1WholeBodyConnection
    + ControlCoordinator(adapter_type="transport_lcm") + dashboard
    via autoconnect.
  * Patch wholebody_connection.py to apply the macOS-cyclonedds fixes
    the dropped monolith was carrying:
      - Init(None, 0) instead of Init(callback, 10)
      - Hardcode mode_machine = 5 (the static value for 29-DOF G1)
      - publish_loop polls subscriber.Read() per tick
      - Drop the now-unused _on_low_state callback
_MJCF_PATH was a relative path "data/mujoco_sim/g1_gear_wbc.xml"
which only resolved when dimos was launched from the repo root.
The mujoco viewer subprocess inherits CWD from the launching shell,
so running ``dimos run unitree-g1-groot-wbc-sim`` from any other
directory tripped FileNotFoundError on viewer startup.

get_data("mujoco_sim") resolves the absolute install path and
auto-extracts the LFS tarball on first run.
  1. unitree_g1_groot_wbc_sim.py: resolve _MJCF_PATH via get_data so
     both MujocoSimModule and the DIMOS_MUJOCO_VIEW=1 subprocess open
     the file regardless of shell CWD.  Earlier the relative path
     "data/mujoco_sim/g1_gear_wbc.xml" only worked from the repo root.

  2. wholebody_connection.py: hardcoded mode_machine = 5 has no
     fallback if a future G1 firmware revision changes the value.
     Add a one-shot warning the first time we read a LowState whose
     mode_machine doesn't match.  Self-disables after the first log
     line so it doesn't spam the operator.

  3. test_unitree_g1_groot_wbc.py: composition smoke test verifying
     the three deployed modules, bridge adapter binding (transport_lcm,
     confirming the migration off the deleted unitree_g1 monolith),
     real-hw safety profile (auto_arm=False, auto_dry_run=True,
     default_ramp_seconds=10.0), servo_arms defaults, and all bridge
     + dashboard LCM topics.  No DDS, no hardware — just module
     imports.
Two safety fixes for back-to-back ``dimos run`` cycles on real
hardware:

  1. ``stop()`` now sends a final ``LowCmd_`` with ``mode=0x00``
     (motors disabled) for every motor slot before closing the DDS
     publisher.  Previously the last commanded pose lingered in the
     motor controllers — when the next process opened a publisher
     and re-released sport mode, those stale stiff commands fought
     the new participant during the handoff window, producing
     audible gearbox stress.

  2. ``_release_sport_mode()`` bails immediately if the first
     ``CheckMode()`` reports nothing active (``result is None`` or
     ``{"name": ""}``).  A clean second start now logs "Sport mode
     already released — skipping ReleaseMode" and skips the
     loop-and-poll dance entirely, removing the handoff window.
``dimos/project/test_no_sections.py`` forbids ``# -----`` separator
banners and ``# --- text ---`` inline section headers as a project
style rule.  Three files in this PR carried the pattern from earlier
drafts; convert inline-section to plain ``# text`` and drop pure
separators.

No code-behaviour change.
  * tick_loop.py / groot_wbc_task.py: import the class
    (``from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped``,
    same for ``Twist``), not the module — module-as-type tripped
    mypy ``[valid-type]``.
  * coordinator.py: when constructing ``GrootWBCTask``, narrow ``hw``
    to ``ConnectedWholeBody`` via isinstance before passing
    ``hw.adapter`` (else mypy unions the manipulator + whole-body
    adapter types).  Also raises a clear TypeError if a non-whole-body
    component is referenced by a groot_wbc task.
  * test_unitree_g1_groot_wbc.py: type ``_coordinator_kwargs() ->
    dict[str, Any]`` so list/iter usages on the values typecheck.

``mypy dimos`` is clean (670 files).
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 9, 2026

Greptile Summary

This PR introduces G1GrootWBCTask, a whole-body locomotion controller for the Unitree G1 that runs two GR00T ONNX policies (balance + walk) inside the coordinator tick loop, plus real-hardware and sim blueprints that compose it. The supporting infrastructure additions (per-joint WholeBodyConfig PD gains, ConnectedWholeBody, set_activated/set_dry_run coordinator RPCs, write_pd_tau_command atomic SHM write, WholeBodySimHooks) are all well-designed and address the transient-mode-flip race flagged in previous reviews.

  • G1GrootWBCTask: arming state machine, dry-run safety profile, _state_seen guard against zero-state inference, and _last_action obs-history warm-start are implemented correctly and faithfully mirror the reference backend.
  • G1WholeBodyConnection: patched for macOS by switching to a passive Init(None, 0) + per-tick Read() subscriber and hardcoding mode_machine; however, the publish loop emits all-zero motor states before the first real LowState arrives, which can set _state_seen = True with zero positions and send high-gain commands toward URDF zero before the robot's actual pose is known — see inline comment.
  • Sim path: SimMujocoG1WholeBodyAdapter.connect() correctly blocks on is_ready() (first engine state written), so the zero-startup issue does not affect simulation.

Confidence Score: 4/5

Safe to merge for sim use; the real-hardware path has a startup sequencing issue in G1WholeBodyConnection that should be fixed before running on a physical robot.

The zero-state publication in G1WholeBodyConnection._publish_loop can cause the WBC task to issue high-gain position commands toward URDF zero before the robot's actual joint positions are known. On a standing G1 this generates large corrective torques that could knock the robot over, entirely defeating the 10-second ramp safety profile. The sim adapter correctly blocks on is_ready() so simulation is unaffected, but the real-hardware path needs the guard described in the inline comment before deployment.

dimos/robot/unitree/g1/wholebody_connection.py — _publish_loop should suppress publication until _low_state is not None.

Important Files Changed

Filename Overview
dimos/robot/unitree/g1/wholebody_connection.py Major rewrite for macOS DDS compatibility (passive subscriber, hardcoded mode_machine). The _publish_loop unconditionally emits all-zero motor states before the first real LowState arrives from the robot, which can corrupt the WBC task's initial pose cache — see inline comment.
dimos/control/tasks/g1_groot_wbc_task.py New GR00T WBC task: well-structured ONNX policy wrapper with careful obs-caching, arming state machine, and dry-run safety. The _state_seen guard is sound but relies on the hardware layer not supplying all-zero states before the robot is ready.
dimos/simulation/engines/mujoco_shm.py New write_pd_tau_command method atomically writes all PD+tau arrays and sets mode once, eliminating the transient CMD_MODE_POSITION flash that a previous review identified. New WB SHM buffers (imu, kp_t, kd_t, tau_t) are correctly sized and named.
dimos/simulation/engines/wholebody_sim_hooks.py New pre/post step hooks for WB simulation. PD-tau latching is correct: gains are consumed when their seq-counters fire and retained across ticks; PD computation only activates once all three of pos, kp, kd are non-None.
dimos/simulation/adapters/whole_body/g1.py New sim WholeBodyAdapter for G1 over SHM. Connection correctly blocks until is_ready() so the adapter only succeeds when the engine has written at least one valid state — this is the right startup guard for simulation.
dimos/control/coordinator.py Gains set_activated / set_dry_run RPCs and auto_start support; _on_twist_command now duck-types to any task with set_velocity_command. TaskConfig extended with WBC-specific fields cleanly. Changes are additive and backward-compatible.
dimos/control/hardware_interface.py New ConnectedWholeBody wrapper resolves per-joint kp/kd from WholeBodyConfig at wire-up time; _try_initialize_last_commanded is non-blocking and correctly gates on has_motor_states().
dimos/robot/unitree/g1/blueprints/basic/unitree_g1_groot_wbc.py Clean dual-mode blueprint (real hw / sim) sharing identical task code. Safety defaults (unarmed, dry-run, 10 s ramp) are correctly separated from sim defaults (auto-arm, no ramp, decimation=1).
dimos/control/tasks/servo_task.py Adds default_positions support and timeout=0 enforcement for hold tasks. start() now resets _last_update_time to prevent an instant timeout on restart.
dimos/control/tick_loop.py Adds _read_all_imu() to the tick phase, exposing IMU through CoordinatorState.imu and decoupling WBC tasks from direct adapter access.

Sequence Diagram

sequenceDiagram
    participant Op as Operator
    participant WS as WebsocketVisModule
    participant Coord as ControlCoordinator
    participant Tick as TickLoop
    participant Task as G1GrootWBCTask
    participant HW as ConnectedWholeBody
    participant Adapt as transport_lcm / SimAdapter
    participant G1 as G1WholeBodyConnection

    Note over G1,Adapt: Real-HW startup (current — buggy)
    G1->>Adapt: /g1/motor_states (zeros, pre-LowState)
    Adapt->>HW: read_motor_states() → [0.0 x 29]
    Tick->>HW: read_state()
    HW-->>Tick: "all joints = 0.0"
    Tick->>Task: compute(state with zeros)
    Note over Task: _state_seen=True on zeros
    Task-->>Tick: "JointCommandOutput(positions=[0.0]*15)"
    Tick->>HW: write_command(zeros, SERVO_POSITION)
    HW->>Adapt: "write_motor_commands(q=0, kp=150, kd=2)"
    Adapt->>G1: /g1/motor_command
    G1->>G1: "LowCmd(q=0, kp=150) via DDS"

    Note over Op,Task: Normal armed flow (after fix)
    G1->>G1: DDS LowState received
    G1->>Adapt: /g1/motor_states (real positions)
    Op->>WS: click Arm
    WS->>Coord: set_activated(True)
    Coord->>Task: arm()
    Note over Task: ramp to default_15 over 10s
    Op->>WS: toggle Dry Run OFF
    WS->>Coord: set_dry_run(False)
    Coord->>Task: set_dry_run(False)
    Note over Task: ONNX policy outputs emitted
    Task-->>Tick: JointCommandOutput (policy)
    Tick->>HW: write_command
    HW->>Adapt: write_motor_commands
Loading

Reviews (6): Last reviewed commit: "[autofix.ci] apply automated fixes" | Re-trigger Greptile

@@ -284,20 +376,53 @@ def _apply_shm_commands(self, engine: MujocoEngine) -> None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Direct access to private _data attribute of MujocoEngine

engine._data reaches into MujocoEngine's private implementation. If that field is ever renamed or restructured, this call will break silently at runtime. A small @property data accessor on MujocoEngine would expose the same information without coupling to the private layout.

Comment thread dimos/control/tasks/groot_wbc_task.py Outdated
current_15 = np.zeros(_NUM_ACTIONS, dtype=np.float32)
for i, jname in enumerate(self._joint_names_list):
pos = state.joints.get_position(jname)
current_15[i] = pos if pos is not None else 0.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will return current_15 as [0*15] if there is a packet drop or a corrupted message.
This can cause the robot to snap because it thinks it is zero position when it is actually somewhere else.

Better to use a cached pose or handle this gracefully.

Comment thread dimos/control/tasks/groot_wbc_task.py Outdated
Comment on lines +375 to +376
q_29[i] = pos if pos is not None else 0.0
dq_29[i] = vel if vel is not None else 0.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use cached or safe values in case of None. Similar to above comment

Comment on lines +378 to +381
# IMU comes from the adapter, not CoordinatorState.
imu = self._adapter.read_imu()
gyro = np.asarray(imu.gyroscope, dtype=np.float32)
gravity = self._projected_gravity(imu.quaternion)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coordinator state should also hold IMU data.

Please create issue for this.

Comment thread dimos/control/coordinator.py Outdated
Attributes:
name: Task name (e.g., "traj_arm")
type: Task type ("trajectory", "servo", "velocity", "cartesian_ik", "teleop_ik")
type: Task type ("trajectory", "servo", "velocity", "cartesian_ik", "teleop_ik", "groot_wbc")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need better task management system. Something similar to registry for adapters

# Mismatched rates make the policy hold actions for too long and
# the robot tips over. ``None`` keeps the task's own default (10).
decimation: int | None = None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can skip so much verbose. one line comments are good enough in coordinator.

Comment thread dimos/hardware/whole_body/spec.py Outdated
Comment on lines +89 to +91
def read_odom(self) -> PoseStamped | None:
# Default: no base pose available. Sim/estimator adapters override.
return None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whole body adapters will likely never publish odom so we should modify the spec.

Should be addressed in separate PR

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this should just be part of the groot-wbc_task.py

Don't see a reason why we must have this here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One blueprint for real and sim, with just a simulation flag change would be what we would like

Comment on lines +233 to +253
# Drain the latest LowState frame (None means no fresh sample
# this tick — keep the previous one).
sub = self._subscriber
if sub is not None:
fresh = sub.Read()
if fresh is not None:
with self._lock:
self._low_state = fresh
# One-shot sanity check on the hardcoded mode_machine.
if not self._mode_machine_verified:
self._mode_machine_verified = True
actual = int(getattr(fresh, "mode_machine", -1))
if actual != self._mode_machine:
logger.warning(
f"mode_machine mismatch: hardcoded "
f"{self._mode_machine}, robot reports "
f"{actual}. Commands may be silently "
f"rejected by firmware — set "
f"_MODE_MACHINE_G1 to {actual} for this "
f"variant."
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

definitely needs a refactor. 2 tasks in 1 code block

sub is not none is not needed either.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't PR code like this pls, this shifts the burden of your agent output review to others

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't like this at all. we already have a mujoco sim module. We must not use another implementation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically we have 2 others.

  • simulation/mujoco/mujoco_process.py is the first one. It used to be a class, but Jeff noticed that Mujoco can only run on the main thread in a Mac. So I converted it to a standalone process with shared memory communication.
  • simulation/engines/mujoco_engine.py was added afterwards. It runs in side a Module, which means it's not running on the main thread.
  • This PR adds simulation/engines/mujoco_view_subprocess.py. From the comment below, I assume it was added to fix the same issue of Mujoco not working on Macs. @Nabla7 is this so?

@mustafab0 I agree that we shouldn't be adding another one. The ideal solution would be to convert simulation/engines/mujoco_engine.py to work as the main thread in a new process if possible.

logger.info("GPS goal points cleared and updated clients")

@self.sio.event # type: ignore[untyped-decorator]
async def arm(sid: str, data: dict[str, Any] | None = None) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to send these commands from the command center, right? Or are you sending from rerun?

Comment on lines +112 to +115
# Uses dimos.msgs.std_msgs.Bool to match the coordinator's
# ``activate`` / ``dry_run`` In[Bool] ports, rather than
# dimos_lcm.std_msgs.Bool used by ``explore_cmd`` — the LCM wire
# format is identical; what matters for autoconnect is type parity.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Uses dimos.msgs.std_msgs.Bool to match the coordinator's
# ``activate`` / ``dry_run`` In[Bool] ports, rather than
# dimos_lcm.std_msgs.Bool used by ``explore_cmd`` — the LCM wire
# format is identical; what matters for autoconnect is type parity.

Comment on lines +116 to +117
activate: Out[DimosBool]
dry_run: Out[DimosBool]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: you might want more specific names. This module is for all robots and t's not clear what activate means for any one particular robot. Maybe groot_activate?

data.qpos[free_qposadr + 3 : free_qposadr + 7] = base_wxyz
mujoco.mj_kinematics(model, data)
v.sync()
time.sleep(1.0 / 60.0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To achieve 60 hz you need to subtract the time it took to process the iteration of the loop.

Comment on lines +142 to +143
js_t.stop()
od_t.stop()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not good practice. You have to call stop even if the code fails in the middle. You need try-finally.

Comment on lines +90 to +111
def _on_joint_state(msg: JointState) -> None:
# Coordinator publishes dimos canonical names ("g1/left_hip_pitch")
# but the MJCF uses MuJoCo names ("left_hip_pitch_joint"); translate
# so the lookup against ``name_to_qposadr`` actually hits.
try:
for n, q in zip(list(msg.name), list(msg.position), strict=False):
latest["joints"][dimos_joint_to_mjcf(str(n))] = float(q)
except Exception:
pass

def _on_odom(msg: PoseStamped) -> None:
try:
latest["base_pos"] = np.array(
[msg.position.x, msg.position.y, msg.position.z], dtype=np.float64
)
latest["base_wxyz"] = np.array(
[msg.orientation.w, msg.orientation.x, msg.orientation.y, msg.orientation.z],
dtype=np.float64,
)
except Exception:
pass

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These calls are coming on different threads. You need a lock to modify latest. You also need the lock in the loop below where you access latest.

# from_xml_path can't find them on disk. Inject the dimos-bundled mesh
# bytes by name — same trick MujocoEngine uses.
try:
from dimos.simulation.mujoco.model import get_assets
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move all import everywhere in this file to the top.

)
self._imu_accel_slice = _find_sensor_slice(
self._engine.model,
"imu-pelvis-linear-acceleration",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are G1 specific sensor names, right? But this module isn't just for G1. I think this should be abstracted some how.

"DIMOS_MUJOCO_VIEW=1: couldn't locate mjpython/python on PATH; viewer not launched"
)
else:
_viewer_proc = subprocess.Popen(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blueprint files are for defining blueprints. You can't just start random process because a file is imported. In any case, you'd need to manage the process (shut it down when its not needed).

).transports(
{
("joint_state", JointState): LCMTransport("/coordinator/joint_state", JointState),
("odom", PoseStamped): LCMTransport("/odom", PoseStamped),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the default. No need to define it.

{
("cmd_vel", Twist): LCMTransport("/g1/cmd_vel", Twist),
("activate", DimosBool): LCMTransport("/g1/activate", DimosBool),
("dry_run", DimosBool): LCMTransport("/g1/dry_run", DimosBool),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.transports applies for all the matching modules in a blueprint so it doesn't make sense to define it twice (in _g1_coordinator too).

If you really want to change the default topic, it's better to only define it once:

unitree_g1_groot_wbc_sim = autoconnect(_g1_engine, _g1_coordinator, _g1_ws_vis).transports(
    ("dry_run", DimosBool): LCMTransport("/g1/dry_run", DimosBool)
)

But I think you should remove both and embrace the defaults. This applies to all. Why not just use the defaults?

Comment on lines +56 to +60
# Resolved to an absolute path so MujocoSimModule (parent) and the
# DIMOS_MUJOCO_VIEW=1 subprocess can both open the file regardless of
# the shell's CWD. get_data also auto-extracts the LFS tarball on
# first run.
_MJCF_PATH = str(get_data("mujoco_sim/g1_gear_wbc.xml"))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't call get_data at import time. This blocks python's imports until the data is downloaded. Import-time is just for defining (functions/variables/etc), not doing work.


_g1_connection = G1WholeBodyConnection.blueprint(
release_sport_mode=True,
network_interface=os.getenv("ROBOT_INTERFACE", "enp86s0"),
Copy link
Copy Markdown
Contributor

@paul-nechifor paul-nechifor May 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for environment variables.

You can see all the configs for a blueprint with:

$ uv run dimos run unitree-g1-groot-wbc --help
Blueprint arguments:
    g1wholebodyconnection:
      * g1wholebodyconnection.default_rpc_timeout: float (default: 120.0)
      * g1wholebodyconnection.frame_id_prefix: str | None (default: None)
      * g1wholebodyconnection.frame_id: str (default: g1_pelvis)
      * g1wholebodyconnection.network_interface: str (default: enp86s0)
      * g1wholebodyconnection.release_sport_mode: bool (default: True)
      * g1wholebodyconnection.publish_rate_hz: float (default: 500.0)
    controlcoordinator:
      * controlcoordinator.default_rpc_timeout: float (default: 120.0)
      * controlcoordinator.frame_id_prefix: str | None (default: None)
      * controlcoordinator.frame_id: str | None (default: None)
      * controlcoordinator.tick_rate: float (default: 500.0)
      * controlcoordinator.publish_joint_state: bool (default: True)
      * controlcoordinator.joint_state_frame_id: str (default: coordinator)
      * controlcoordinator.publish_odom: bool (default: True)
      * controlcoordinator.log_ticks: bool (default: False)
    websocketvismodule:
      * websocketvismodule.default_rpc_timeout: float (default: 120.0)
      * websocketvismodule.frame_id_prefix: str | None (default: None)
      * websocketvismodule.frame_id: str | None (default: None)
      * websocketvismodule.port: int (default: 7779)

So if you want to pass network_interface you just send:

uv run dimos run unitree-g1-groot-wbc -o g1wholebodyconnection.network_interface=enp86s0

See docs/usage/cli.md

type="groot_wbc",
joint_names=g1_legs_waist,
priority=50,
model_path=os.getenv("GROOT_MODEL_DIR", str(get_data("groot"))),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll want to use LfsPath instead of get_data so the file isn't downloaded at import time.


"""Composition smoke tests for ``unitree_g1_groot_wbc`` (real-hw blueprint).

These tests do not exercise DDS or actually run anything — they just
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests do not [...] run anything

A very damning statement. :P

Please delete the file. Blueprints are just declarations, not logic. There's nothing to test.

Comment on lines +164 to +171
[0:3] cmd_vel * cmd_scale # scaled velocity command
[3] height_cmd # fixed slot (0.74)
[4:7] (0, 0, 0) # rpy_cmd, zeros
[7:10] gyro * obs_ang_vel_scale # body-frame ang vel
[10:13] projected_gravity(quat) # gravity in body frame
[13:42] (q_29 - default_29) * dof_pos_scale
[42:71] dq_29 * dof_vel_scale
[71:86] last_action (15 dims)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these G1 specific values? If so, either abstract it to allow for other robots, or rename the class to make it clear it's only for G1.

Phase 1 — trivial cleanups:
- Delete test_unitree_g1_groot_wbc.py (the "tests don't run anything"
  blueprint test the reviewer asked to drop).
- MujocoEngine.data public property; mujoco_sim_module stops reaching
  into engine._data.
- JointServoTask.start() resets _last_update_time so a non-zero-timeout
  caller doesn't time out on the first tick.
- Split wholebody_connection's two-tasks-in-one publish loop into
  _drain_low_state, _verify_mode_machine_once, _snapshot_motor_imu,
  _publish_motor_state_and_imu helpers; dropped the `sub is not None`
  guard.
- MujocoSimModuleConfig.imu_gyro_sensor_names + imu_accel_sensor_names
  configurable instead of G1-hardcoded; default list still covers the
  common humanoid pelvis names.

Phase 2 — renames and moves:
- groot_wbc_task.py -> g1_groot_wbc_task.py; GrootWBCTask ->
  G1GrootWBCTask (Paul: G1-specific values, name it).
- Fold _groot_wbc_common.py (joint lists, gain tables, ARM_DEFAULT_POSE)
  into g1_groot_wbc_task.py and delete the common file.
- Move dimos/hardware/whole_body/mujoco/g1/adapter.py ->
  dimos/simulation/adapters/whole_body/g1.py. Adapter registry now
  scans both roots.

Phase 3 — safety: G1GrootWBCTask no longer falls back to zero
  when a joint is missing from CoordinatorState. Adds last-known-good
  caches (_cached_q_29, _cached_dq_29, _cached_q_15) and a _state_seen
  guard that refuses to emit a command until at least one fully-
  populated state has been observed. Stops the "packet drop -> snap
  to zero pose -> robot falls" failure mode.

Phase 4 — task registry: new dimos/control/tasks/registry.py.
  Each task module exposes a register(registry) hook; coordinator's
  _create_task_from_config drops 100+ lines of if/elif and becomes
  a single control_task_registry.create(cfg.type, cfg, hardware=...)
  call. Task type "g1_groot_wbc" is the new canonical name;
  "groot_wbc" kept as an alias.

Phase 5 — activate / dry_run from streams to RPC:
  - Drop the `activate: In[Bool]` and `dry_run: In[Bool]` ports on
    ControlCoordinator. Replaced with @rpc set_activated(engaged) and
    @rpc set_dry_run(enabled) -- these are one-shot configuration
    events, not continuous signals.
  - WebsocketVisModule's arm/disarm/set_dry_run socket.io handlers
    call the coordinator's RPC directly via RPCClient(None,
    ControlCoordinator); the matching Out[DimosBool] ports go away.
  - Blueprints lose their now-dead `("activate", DimosBool)` /
    `("dry_run", DimosBool)` LCM transport entries.

Phase 6 — hardware addressing: _create_whole_body_adapter passes a
  single canonical `address` (matching the manipulator/twist-base
  pattern) instead of overloaded address + network_interface.
  Adapter __init__ signatures updated accordingly.

Phase 7 — MuJoCo subprocess hygiene:
  - Move the standalone viewer's logic from mujoco_view_subprocess.py
    into mujoco_engine.view_main() with `python -m
    dimos.simulation.engines.mujoco_engine <mjcf>` entry point.
    One MuJoCo entry-point file, not two. Adds the missing lock
    around `latest_*` shared state, try-finally for LCM teardown,
    and tick-aware render sleep so 60 Hz holds under load.
  - New MujocoViewerModule wraps the subprocess lifecycle as a
    proper dimos Module (start() spawns, stop() terminates, atexit
    fallback). Blueprints declare it instead of running
    subprocess.Popen at import time. Configurable via -o
    mujocoviewermodule.enabled=true; no more DIMOS_MUJOCO_VIEW env
    var.

Phase 8 — one blueprint, --simulation flag:
  - Merge unitree_g1_groot_wbc.py + unitree_g1_groot_wbc_sim.py into
    a single unitree_g1_groot_wbc.py that branches on
    global_config.simulation. Real-hw path: G1WholeBodyConnection +
    transport_lcm adapter, 500 Hz, unarmed + dry-run, servo_arms.
    Sim path: MujocoSimModule + sim_mujoco_g1 adapter, 50 Hz,
    auto-arm, optional MujocoViewerModule.
  - Drop the unitree-g1-groot-wbc-sim registry entry; CLI runs
    `dimos --simulation run unitree-g1-groot-wbc`.
  - Replace get_data() at import time with LfsPath() (Paul's
    don't-block-imports note).
  - Drop ROBOT_INTERFACE / GROOT_MODEL_DIR / DIMOS_DDS_DOMAIN /
    DIMOS_MUJOCO_VIEW env-var reads from blueprints; users override
    via `-o module.field=value` like every other dimos blueprint.

Deferred (Mustafa flagged for separate PRs, captured as TODOs):
- CoordinatorState should carry IMU data so the task doesn't have to
  reach into the adapter for read_imu (TODO in g1_groot_wbc_task.py).
- Drop read_odom from WholeBodyAdapter Protocol -- most adapters
  shouldn't be a source of base pose (TODO in whole_body/spec.py).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +170 to +183
pitch = math.copysign(math.pi / 2.0, sinp) if abs(sinp) >= 1.0 else math.asin(sinp)
siny = 2.0 * (w * z + x * y)
cosy = 1.0 - 2.0 * (y * y + z * z)
yaw = math.atan2(siny, cosy)
return IMUState(
quaternion=quat,
gyroscope=gyro,
accelerometer=accel,
rpy=(roll, pitch, yaw),
)

def read_odom(self) -> PoseStamped | None:
# MujocoSimModule publishes the floating-base pose on its own
# ``odom`` Out port (TBD); the adapter doesn't currently route
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Command-mode transient race in write_motor_commands

Every call to write_motor_commands calls write_position_command first, which internally calls self._set_command_mode(CMD_MODE_POSITION) (setting the SHM control word to 0). write_kp_command then sets it back to CMD_MODE_PD_TAU (2). Because the coordinator process and the sim engine process run concurrently (separate processes, no shared GIL), the sim engine's _apply_shm_commands can read CTRL_COMMAND_MODE in the narrow window between those two adapter writes.

When that happens, shm.read_command_mode() == CMD_MODE_POSITION, so the engine falls into engine.write_joint_command(JointState(position=pos_cmd.tolist())) — direct position-servo mode, bypassing the PD-tau path. On the following tick _latest_pd_pos_target is stale (from two cycles ago) because the new position was consumed by the servo path. The net result is one tick of incorrect actuator-mode followed by one tick of a stale torque target — observable as an occasional control glitch in sim.

The fix is to stop having write_position_command touch the mode at all; let only the kp/kd/tau writes determine the mode, or latch CMD_MODE_PD_TAU once at connect time and never reset it during normal operation.

…Protocol

Promoted the two follow-up TODOs from the PR #2033 review (Mustafa
:381 and :91) into this PR after re-reading the actual scope:

1) IMU on CoordinatorState (Mustafa :381):
   - Add ``imu: dict[hardware_id, IMUState]`` field to CoordinatorState.
   - TickLoop polls every ConnectedWholeBody's read_imu() each tick via
     a new _read_all_imu() helper and stuffs the dict on the state.
   - G1GrootWBCTask reads from state.imu first, falls back to
     self._adapter.read_imu() only when state.imu is empty (e.g. unit
     tests that build a bare CoordinatorState). The state path is the
     non-coupled one going forward.

2) Drop read_odom from WholeBodyAdapter Protocol (Mustafa :91):
   - Remove read_odom from spec.py — the WholeBodyAdapter Protocol no
     longer promises base pose. Most adapters never did (real-hw G1's
     was hardcoded None, transport_lcm's likewise); the few that did
     (sim_mujoco_g1) reported pose the engine module already publishes
     itself, so the adapter polling was redundant.
   - Drop the read_odom impls from SimMujocoG1WholeBodyAdapter and
     TransportWholeBodyAdapter.
   - Drop ControlCoordinatorConfig.publish_odom + ControlCoordinator's
     ``odom: Out[PoseStamped]`` port + TickLoop._publish_odom +
     odom_callback wiring. ~50 lines deleted.
   - Sim path: MujocoSimModule's ``odom`` Out is now the sole producer
     of ``/odom``. Removed the previous ``/sim/odom`` topic override
     since the autoconnect default for ``(odom, PoseStamped)`` is
     ``/odom``.
   - Real-hw path: nothing publishes /odom (matches current behaviour
     — read_odom returned None). A future estimator Module subscribes
     to motor_states + imu and publishes to /odom directly, which is
     the architecturally correct seam.

Not addressed in this PR (genuinely pre-existing, not introduced here):
   Paul's "ideal solution" of converting mujoco_engine.py to run as
   the main thread in its own process (so mujoco_process.py and
   mujoco_engine.py can collapse to one). That's a refactor of the
   pre-existing two-MuJoCo-implementations situation; the PR review
   only flagged "don't add a third", which Phase 7 of the previous
   commit already addressed by folding the standalone viewer into
   mujoco_engine.view_main(). Unifying mujoco_process.py with
   mujoco_engine.py touches the manipulator stack (xarm/piper) and
   is out of scope for this G1-WBC PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +380 to +381
@self.sio.event # type: ignore[untyped-decorator]
async def set_dry_run(sid: str, data: dict[str, Any]) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The set_dry_run handler declares data: dict[str, Any] as a required argument with no default. In python-socketio, if a client emits set_dry_run without a payload (or the JS side sends null), the handler is invoked without the second argument, raising TypeError: missing 1 required positional argument: 'data'. Even when the payload is present but None, data.get("enabled", False) throws AttributeError. Compare to arm and disarm which both default data to None.

Suggested change
@self.sio.event # type: ignore[untyped-decorator]
async def set_dry_run(sid: str, data: dict[str, Any]) -> None:
@self.sio.event # type: ignore[untyped-decorator]
async def set_dry_run(sid: str, data: dict[str, Any] | None = None) -> None:

if client is None:
logger.warning("set_dry_run requested but ControlCoordinator RPC is unavailable")
return
enabled = bool(data.get("enabled", False))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 After the data parameter is made optional, the data.get() call needs a None guard to avoid an AttributeError when the payload is absent. Without this the fix above alone still crashes on a no-payload emission.

Suggested change
enabled = bool(data.get("enabled", False))
enabled = bool((data or {}).get("enabled", False))

Nabla7 and others added 5 commits May 11, 2026 16:56
Paul's "ideal solution" from the PR #2033 review (paraphrased):
"convert mujoco_engine.py to work as the main thread in a new process
if possible." Done.

WholeBodySimHooks (new):
- Extracts the per-step SHM-bridge logic out of MujocoSimModule into
  a standalone class: pre_step pulls motor commands from SHM and
  applies PD-with-feedforward; post_step writes motor state + gripper
  back to SHM. The Module's _apply_shm_commands + _publish_shm_state
  methods, the latched _latest_pd_* state, and _gripper_joint_to_ctrl
  all move here. Same hook bodies now run regardless of whether the
  engine lives in a thread or a subprocess.

mujoco_engine.engine_main (new entry point):
- Standalone "whole-body sim subprocess" loaded with `python -m
  dimos.simulation.engines.mujoco_engine <mjcf> <shm_key> <dof> [--view]`.
  Owns: MujocoEngine, WholeBodySimHooks, an LCM publisher for /odom
  and /imu so non-adapter consumers (viser viewer etc.) still get
  those topics, plus signal handlers that tear down SHM + LCM
  cleanly on SIGTERM/SIGINT.
- Replaces the previous view-only view_main(): the subprocess now
  runs full physics + viewer instead of just mirroring state from LCM.

MujocoSimModule:
- New config field `engine_mode: Literal["thread", "subprocess"]`
  (default "thread"). Thread mode: today's behaviour, untouched
  except for delegating SHM hooks to WholeBodySimHooks instead of
  inline methods. Subprocess mode: skips in-process engine entirely,
  spawns `engine_main` via subprocess.Popen, terminates it cleanly
  on stop(). macOS uses mjpython for the subprocess (Cocoa main-
  thread requirement); Linux uses sys.executable.
- Subprocess + cameras combo is intentionally rejected with a clear
  error — no cross-process frame buffer yet; users either disable
  cameras or pick thread mode.

MujocoViewerModule deleted:
- Now redundant. The engine subprocess can launch its own viewer
  directly with viewer.launch_passive (on its own main thread) when
  the Module is configured headless=False + engine_mode="subprocess".
  Blueprint loses its standalone _viewer module entry; the engine
  subprocess handles viewing.

What this does NOT do (deliberately):
- Touch the manipulator stack (dimos/simulation/mujoco/mujoco_process.py).
  Paul didn't ask for that — he asked specifically for mujoco_engine.py
  to support subprocess+main-thread, which is what landed. The pre-
  existing two-MuJoCo-implementations situation can collapse to one
  in a separate refactor that addresses xarm + piper too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t import

Blueprints can inspect ``global_config`` at module top level to pick
between real-hw and sim backends (``unitree_g1_groot_wbc.py`` branches
on ``global_config.simulation`` to decide which WholeBodyAdapter to
wire). Until now, the ``run`` command only handed the overrides off to
``ModuleCoordinator.build`` *after* the blueprint was already imported —
so the import-time branch read the stale defaults and the resulting
BlueprintConfig was missing the modules the user was trying to
configure via ``-o module.field=value``. Repro before this fix:

    $ dimos --simulation run unitree-g1-groot-wbc \\
        -o mujocosimmodule.engine_mode=subprocess
    ValidationError: mujocosimmodule
      Extra inputs are not permitted

(Real-hw branch was chosen at import; ``mujocosimmodule`` slot doesn't
exist in the resulting config.)

Apply the same ``global_config.update(**cli_config_overrides)`` the
``show_config`` command already uses, but earlier — before
``get_by_name_or_exit`` triggers the blueprint module import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +601 to +614
"""
if not self._active:
logger.warning(f"G1GrootWBCTask '{self._name}' arm() called before start() — ignoring")
return False
if self._armed:
logger.info(f"G1GrootWBCTask '{self._name}' already armed — arm() ignored")
return False
ramp = ramp_seconds if ramp_seconds is not None else self._config.default_ramp_seconds
self._arming_duration = max(0.0, float(ramp))
self._arm_pending = True
logger.info(
f"G1GrootWBCTask '{self._name}' arm requested (ramp={self._arming_duration:.1f}s)"
)
return True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 arm() can silently restart a live ramp mid-execution

arm() returns early only when _armed=True, but not when _arming=True. If an operator double-clicks Arm in the dashboard (or set_activated is called twice in quick succession), the second call overwrites _arming_duration, sets _arm_pending=True, and on the very next compute() tick the ramp restarts from the mid-ramp current pose. On real hardware this means the robot abruptly stops interpolating toward the target and begins a new, unexpected ramp from wherever it currently is. The disarm() guard checks all three flags (_armed or _arming or _arm_pending); arm() should follow the same convention.

Suggested change
"""
if not self._active:
logger.warning(f"G1GrootWBCTask '{self._name}' arm() called before start() — ignoring")
return False
if self._armed:
logger.info(f"G1GrootWBCTask '{self._name}' already armed — arm() ignored")
return False
ramp = ramp_seconds if ramp_seconds is not None else self._config.default_ramp_seconds
self._arming_duration = max(0.0, float(ramp))
self._arm_pending = True
logger.info(
f"G1GrootWBCTask '{self._name}' arm requested (ramp={self._arming_duration:.1f}s)"
)
return True
if self._armed:
logger.info(f"G1GrootWBCTask '{self._name}' already armed — arm() ignored")
return False
if self._arming or self._arm_pending:
logger.info(f"G1GrootWBCTask '{self._name}' arm in progress — arm() ignored")
return False

stop_explore_cmd: Out[Bool]
cmd_vel: Out[Twist]
tele_cmd_vel: Out[Twist]
movecmd_stamped: Out[TwistStamped]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 cmd_veltele_cmd_vel rename silently breaks an existing blueprint

WebsocketVisModule.cmd_vel was renamed to tele_cmd_vel, but dimos/robot/unitree/g1/blueprints/primitive/uintree_g1_primitive_no_nav.py still wires ("cmd_vel", Twist): LCMTransport("/cmd_vel", Twist) against WebsocketVisModule. After the rename no port matches that binding, so WASD teleoperation from the dashboard is silently dropped for any blueprint that imports that file. The fix is either to update that blueprint to use tele_cmd_vel, or to keep the old port name as an alias and deprecate it.

Comment on lines 321 to 338
def _publish_loop(self) -> None:
period = 1.0 / float(self.config.publish_rate_hz)
next_tick = time.perf_counter()
frame_id = self.config.frame_id

# Identity quaternion + zeros while LowState hasn't arrived (start() blocks
# for it, but the publish loop may also see _low_state cleared during stop()).
zero_quat = (1.0, 0.0, 0.0, 0.0)
zero_vec3 = (0.0, 0.0, 0.0)

while not self._stop_event.is_set():
with self._lock:
ls = self._low_state
if ls is None:
positions: list[float] = [0.0] * _NUM_MOTORS
velocities: list[float] = [0.0] * _NUM_MOTORS
efforts: list[float] = [0.0] * _NUM_MOTORS
quat = zero_quat
gyro = zero_vec3
accel = zero_vec3
else:
positions = [ls.motor_state[i].q for i in range(_NUM_MOTORS)]
velocities = [ls.motor_state[i].dq for i in range(_NUM_MOTORS)]
efforts = [ls.motor_state[i].tau_est for i in range(_NUM_MOTORS)]
quat = tuple(ls.imu_state.quaternion)
gyro = tuple(ls.imu_state.gyroscope)
accel = tuple(ls.imu_state.accelerometer)

now = time.time()
self.motor_states.publish(
JointState(
ts=now,
frame_id=frame_id,
name=G1_JOINT_NAMES,
position=positions,
velocity=velocities,
effort=efforts,
)
)

# Unitree quat is (w,x,y,z); dimos Quaternion is (x,y,z,w).
self.imu.publish(
Imu(
ts=now,
frame_id=frame_id,
orientation=Quaternion(quat[1], quat[2], quat[3], quat[0]),
angular_velocity=Vector3(gyro[0], gyro[1], gyro[2]),
linear_acceleration=Vector3(accel[0], accel[1], accel[2]),
)
self._drain_low_state()
positions, velocities, efforts, quat, gyro, accel = self._snapshot_motor_imu()
self._publish_motor_state_and_imu(
now=time.time(),
frame_id=frame_id,
positions=positions,
velocities=velocities,
efforts=efforts,
quat=quat,
gyro=gyro,
accel=accel,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Zero-state publication before first real LowState corrupts WBC initial pose cache

_publish_loop calls _publish_motor_state_and_imu unconditionally on every iteration. Before the robot's first LowState arrives, _snapshot_motor_imu returns 29 zeros (because _low_state is None). Those zeros reach the coordinator, where _refresh_state_caches sees 29 non-None positions, sets _state_seen = True, and fills _cached_q_15 with zeros.

The WBC task's unarmed-echo path then immediately returns JointCommandOutput(positions=[0.0]*15). ConnectedWholeBody.write_command initializes _last_commanded from these zero states (via _try_initialize_last_commanded) and sends MotorCommand(q=0, kp=150, kd=2, …) for all 29 joints. On a standing robot whose joints are nowhere near the URDF zero, this applies large corrective torques before any operator interaction — potential fall risk.

The safest fix is to skip publication until a real LowState has been cached.

Suggested change
def _publish_loop(self) -> None:
period = 1.0 / float(self.config.publish_rate_hz)
next_tick = time.perf_counter()
frame_id = self.config.frame_id
# Identity quaternion + zeros while LowState hasn't arrived (start() blocks
# for it, but the publish loop may also see _low_state cleared during stop()).
zero_quat = (1.0, 0.0, 0.0, 0.0)
zero_vec3 = (0.0, 0.0, 0.0)
while not self._stop_event.is_set():
with self._lock:
ls = self._low_state
if ls is None:
positions: list[float] = [0.0] * _NUM_MOTORS
velocities: list[float] = [0.0] * _NUM_MOTORS
efforts: list[float] = [0.0] * _NUM_MOTORS
quat = zero_quat
gyro = zero_vec3
accel = zero_vec3
else:
positions = [ls.motor_state[i].q for i in range(_NUM_MOTORS)]
velocities = [ls.motor_state[i].dq for i in range(_NUM_MOTORS)]
efforts = [ls.motor_state[i].tau_est for i in range(_NUM_MOTORS)]
quat = tuple(ls.imu_state.quaternion)
gyro = tuple(ls.imu_state.gyroscope)
accel = tuple(ls.imu_state.accelerometer)
now = time.time()
self.motor_states.publish(
JointState(
ts=now,
frame_id=frame_id,
name=G1_JOINT_NAMES,
position=positions,
velocity=velocities,
effort=efforts,
)
)
# Unitree quat is (w,x,y,z); dimos Quaternion is (x,y,z,w).
self.imu.publish(
Imu(
ts=now,
frame_id=frame_id,
orientation=Quaternion(quat[1], quat[2], quat[3], quat[0]),
angular_velocity=Vector3(gyro[0], gyro[1], gyro[2]),
linear_acceleration=Vector3(accel[0], accel[1], accel[2]),
)
self._drain_low_state()
positions, velocities, efforts, quat, gyro, accel = self._snapshot_motor_imu()
self._publish_motor_state_and_imu(
now=time.time(),
frame_id=frame_id,
positions=positions,
velocities=velocities,
efforts=efforts,
quat=quat,
gyro=gyro,
accel=accel,
)
def _publish_loop(self) -> None:
period = 1.0 / float(self.config.publish_rate_hz)
next_tick = time.perf_counter()
frame_id = self.config.frame_id
while not self._stop_event.is_set():
self._drain_low_state()
with self._lock:
has_state = self._low_state is not None
if not has_state:
# No LowState from the robot yet. Suppress publication so
# the coordinator never sees all-zero joint positions — the
# WBC task's _state_seen guard would otherwise fire on zeros,
# causing the unarmed-echo path to command q=0 with full
# kp/kd gains before the robot's actual pose is known.
next_tick += period
sleep_for = next_tick - time.perf_counter()
if sleep_for > 0:
time.sleep(sleep_for)
else:
next_tick = time.perf_counter()
continue
positions, velocities, efforts, quat, gyro, accel = self._snapshot_motor_imu()
self._publish_motor_state_and_imu(
now=time.time(),
frame_id=frame_id,
positions=positions,
velocities=velocities,
efforts=efforts,
quat=quat,
gyro=gyro,
accel=accel,
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants