
DimSim scene editing/authoring in JS + moving inside dimos#2187

Open
Viswa4599 wants to merge 34 commits into main from antim/sim-authoring-js

Conversation

@Viswa4599
Collaborator

@Viswa4599 Viswa4599 commented May 20, 2026

Brings DimSim into misc/DimSim/, wires it up as a first-class simulation backend, and replaces the legacy JSON authoring + sticky-state plumbing with a standard Three.js JS dev cycle for scenes and a JS-native eval system.

What this does

  • DimSim lives inside dimos. Continues Paul's vendoring from "feat: include DimSim" (#2081) with a cleaned-up source tree (misc/DimSim/{src,cli,evals,scenes,public,docs}). There is no separate repo to clone; dimos drives it directly via --simulation dimsim.
  • Scene authoring is now JS. Each scene is one file at scenes/<name>/index.js that default-exports async build(api) and uses the engine's THREE / Rapier / physics helpers — no JSON, no editor sidebar plumbing.
  • Apartment ported to JS. 97 MB apt.json decomposed into scenes/apartment/{data/*.js, textures/*} (~6 MB total) and fed through loadLevel; full interactivity (pickables, door states, TV toggle) preserved.
  • Eval system collapsed to one file per workflow. scenes/<env>/evals/<name>.js is a runnable program that imports runEval from '@dimsim/eval' and calls it. The same file works in the browser (via the Vite-pinned harness chunk) and under deno run (via a Deno import map): one file, two runtimes.
  • dimsim CLI shipped. dimsim dev, dimsim eval list, dimsim eval <workflow> (auto-connect), dimsim eval --headless ....

Docs

Validation checklist (against dimos#1691)

  • Write a scene from scratch in JS (scenes/warehouse/index.js)
  • Edit the existing apartment scene in JS (scenes/apartment/)
  • Robot embodiment API — setEmbodiment({...}) scene-side API + bridge ServerPhysics/ServerLidar live reconfigure
  • Load a third-party map (Sketchfab GLB) into a scene
  • Load an arbitrary GLB via loadGLTF
  • Load assets from LFS

Follow-ups (separate)

  • Decompose the apartment into pure Three.js + a registerable-object library so its interactive contents (pickables, doors, TV) can be authored in JS like everything else, not as apt-shape loadLevel data.
  • Move objects and textures to LFS

paul-nechifor and others added 30 commits May 14, 2026 07:47
Co-authored-by: Viswajit Nair <viswajitnair@gmail.com>
… scene

- DIMSIM_LOCAL=1 (or path) makes DimSim run from a local checkout instead
  of cloning into ~/.local/state — handy when iterating on DimSim itself
  alongside dimos.
- dimsim_headless flag in GlobalConfig (default True). When False, skip the
  Playwright install, drop --headless, and tell the user to open the URL
  manually.
- Default dimsim_scene "apt" → "apartment" to match the renamed scene
  directory in DimSim.
# Conflicts:
#	.gitignore
#	dimos/simulation/dimsim/dimsim_process.py
Replaces Paul's vendored snapshot with current standalone DimSim. Brings
in recent work that wasn't yet in his vendor:

- scenes/apartment/ now uses JS-authored data modules under data/* plus
  extracted texture files under textures/.  No more 97MB apt.json.
  loadLevel() in sceneApi.ts feeds the apt-shape blob to
  importLevelFromJSON, so E-key interactivity (pickup, multi-state
  cabinets, TV) works exactly as before.
- scripts/extract_apt_to_js.py — one-shot decomposer used to produce the
  data/ modules above.
- bridge/, src/, scene-api updates — newer code than what Paul vendored.

Preserved Paul's one functional adaptation: cli.ts.resolveDistDir() now
includes the tryBuildFromSource() fallback, so on first run inside the
dimos repo we materialize dist/ via Deno+Vite (dist/ is gitignored and
not committed in this layout).

Dropped Paul's redundant public/sims/apt.json snapshot — apartment data
lives in scenes/apartment/data/ now.
… legacy JSON scenes

Vendored DimSim doesn't need to publish itself to JSR, and the legacy
JSON scene format (public/sims/*.json) is superseded by JS-authored
scenes under scenes/<name>/index.js.  Removing:

- public/sims/  (3 stale JSON scenes — JS scenes replace these)
- docker/       (CI containers — out of scope here)
- dimos-cli/test/{add_godcam,add_purple_object,list_assets,list_scene}.py
                (one-off dev scripts, zero callers)
- dimos-cli/test/{diagnose_costmap.py,loopback.ts,rubrics_test.ts,
                  scene_editor_test.py,smoke.ts}
                (one-off dev scripts, zero callers)
- dimos-cli/mod.ts   (JSR ./mod export — only used by `deno publish`)
- dimos-cli/run-eval.ts  (parallel entry point; `dimsim eval` covers it)
- deno.json:    drop ./mod export + the JSR publish.include block

Survivors in dimos-cli/test/:
  dimos_integration.py — full bridge↔Python LCM smoke test
  lcm_cross_test.{py,ts} — Python↔TS LCM byte-compat regression check
In dimos mode an external Python agent drives the AiAvatar via LCM, so
the in-browser VLM client + prompt builder + vision capture pipeline
that powered standalone DimSim's auto-exploring agent is dead weight.
AiAvatar.js stays — it's the agent visual + collider, used in both
modes; only the behavior layer is removed.

Removed:
- src/ai/modelConfig.js
- src/ai/vlmClient.js
- src/ai/visionCapture.js
- src/ai/sim/vlmActions.js
- src/ai/sim/vlmPrompt.js

engine.js: removed the 5 ai/* imports, the 4 derived constants
(ACTIVE_VLM_*, resolveActiveVlmModel, buildActiveVlmPrompt), the
vlm: {...} config block passed to new AiAvatar(...) (~150 lines —
captureBase64 / onCapture / onRequest / onResponse / onTaskFinished /
etc.), and the agent-vision-capture branch in the render loop.

AiAvatar receives no vlm option and falls back to vlm = null; all its
internal `this.vlm?.X` calls are optional-chained so they no-op without
the behavior config.

Vite build: 25 → 20 modules; main bundle 897 → 878 kB.
AiAvatar.js was 1509 lines of in-browser wander + VLM behavior. In the
dimos integration path the agent's pose is driven externally over LCM
(server physics steps from cmd_vel → /odom → engine.js sets the agent's
kinematic body), and engine.js even overrides `agent.update` on the
dimos agent to skip everything but the visual sync.

Stripping ~1260 lines of unreachable code:

- VLM behavior: _vlmUpdate, _requestVlmDecision, _applyVlmDecision (the
  big ~390-line action dispatcher), _stepPlan, _extractBubbleText…,
  _tracePush.
- Wander state machine: _state/_target/INSPECT/WALK/IDLE, _pickWanderTarget,
  _applyIdleGravity, _computeConservativeMovement.
- Agent memory: _memoryKey/_loadMemory/_saveMemory/_rememberTag.
- Thought bubble: _setThought, _labelSprite, _labelCanvas, _labelCtx,
  _labelTex, _lastDecisionBubbleAt, plus the helpers (safeParseJson,
  roundRect, wrapTextLines).
- Character controller: was only used by the wander mover.
- Constructor params that fed the above: getWorldKey, getTags,
  getPlayerPosition, senseRadius, walkSpeed.

What remains in AiAvatar:
  constructor (group + fallback capsule + facing cone + rapier body +
               vertical & spine capsule colliders + GLB load kickoff)
  setPosition / getPosition
  update(dt)               — mixer + _syncVisual; engine.js overrides
                              in dimos mode to skip the mixer too
  dispose
  _syncVisual / _syncSpineCollider
  _loadGLB / _applyGLB     — fits the GLB to capsule height, rebuilds
                              the box collider to match the model bbox

engine.js: cleaned up the leftover `agent.vlm = …`, `agent._setThought(…)`,
`agent._plan = null`, `agent._pendingDecision = null` no-ops in
startAgentTask and stopAiAgent, and dropped the unused getWorldKey /
getTags / getPlayerPosition / senseRadius / walkSpeed constructor args.

Vite: main bundle 878 → 848 kB.
AiAvatar.js: 1509 → 243 lines.
…d/eval-create)

These cli subcommands all read or write the legacy public/sims/<name>.json
format, which we already removed in the earlier cleanup pass:

  dimsim setup                    Downloaded core+evals from a registry —
                                  vendored layout builds dist/ locally
                                  via tryBuildFromSource on first run.
  dimsim scene install/list/remove
                                  Scene registry — vendored ships scenes
                                  in scenes/<name>/ already.
  dimsim list objects --scene X   Reads dist/sims/<name>.json (gone).
  dimsim build eval --scene/--target
                                  Same legacy JSON format (gone).
  dimsim eval create              Interactive wizard backed by the same
                                  scene-index of the legacy JSON format.

Removed:
- dimos-cli/setup.ts
- dimos-cli/eval/builder.ts
- dimos-cli/eval/scene-index.ts
- ~360 lines of subcommand handlers + their imports in cli.ts
- IS_COMPILED / IS_REMOTE detection — vendored is always source-local

Survivors in dimos-cli/:
  bridge/{server,lidar,physics}.ts
  eval/runner.ts             — eval *runner* still intact
  headless/launcher.ts
  vendor/lcm/                — load-bearing on macOS (verified earlier)
  cli.ts                     — dev / eval list / eval / agent only
  agent.py / deno.json / deno.lock / README.md

cli.ts: 878 → 477 lines.  Vite bundle unchanged (these were CLI-side).
dimos_integration.py covers the same Deno-bridge ↔ Python LCM multicast
path more thoroughly (cmd_vel publish + odom/lidar/image subscribe), and
the integration test is what we actually run when verifying the LCM
transport works on macOS.  The cross-test pair was a quick one-off sanity
check that's no longer useful on its own.

Removed:
- dimos-cli/test/lcm_cross_test.py
- dimos-cli/test/lcm_cross_test.ts

Survivors in dimos-cli/test/:
- dimos_integration.py
…mplate.json)

The cleanup pass on antim/sim-authoring-js removed:
- dimos-cli/mod.ts (JSR ./mod re-export, no callers)
- dimos-cli/setup.ts (registry downloader for the legacy scene install flow)
- the `dimsim scene install/list/remove` cli commands those backed

scenes.template.json was the manifest for that same registry flow — dead
in the vendored layout where scenes ship in misc/DimSim/scenes/ directly.

dimsim-check still gives us the real signal we care about: npm ci +
npm run build + deno check cli.ts.
…moved mod.ts/setup.ts/scenes.template.json (the workflow file itself; the JSON delete landed in the previous commit)
…rce null Rerun click fields

Two bugs that together looked like "the robot is invisible and clicked
goals don't move it":

1. misc/DimSim/src/dimos/dimosBridge.ts — engine.js boots scenes that
   return `embodiment: null` with `agent.group.visible = false` (it's
   the "scene didn't declare its own agent" default).  When dimos sends
   `embodimentConfig` it was reloading the GLB but never re-enabling
   visibility, so the robot stayed hidden even though physics and odom
   were running.

2. dimos/visualization/rerun/websocket_server.py — the click + twist
   handlers used `float(msg.get("z", 0))`, but `dict.get` only falls back
   to the default when the *key* is missing.  Rerun sends 2D-panel
   clicks with `"z": null`, so `float(None)` raised and the click was
   dropped before `clicked_point` was published — nav stack never got a
   goal, robot never moved.  Added a `_num` helper that maps None to 0.

Combined effect: with these two patches, the unitree GLB (loaded from
the fallback /agent-model/robot.glb since unitree_go2.glb is not shipped)
becomes visible the moment dimos's embodimentConfig lands, and clicked
points in Rerun reliably reach the nav stack regardless of which panel
they originate from.
It's not literally a Go2 mesh — it's a generic robot GLB we use as the
visible stub when dimos picks the Go2 embodiment.  unitree_go2.glb has
been the first fallback URL in scene_client.py + engine.js for a while
but never shipped, so every load relied on the second fallback
("robot.glb").  Renaming the file makes the stub's purpose obvious and
collapses the two-URL fallback to a single URL.

LFS tracking is unchanged — *.glb under misc/DimSim/public/agent-model/
matches an existing rule in .gitattributes, and `git mv` preserved the
LFS pointer (same sha256 oid).

Updated 11 references across:
- dimos/simulation/dimsim/scene_client.py (8 embodiment presets +
  3 docstring examples)
- misc/DimSim/src/engine.js (default avatarUrl in createAiAgent)
Companion to the previous rename commit which moved the file but didn't
capture the source-code reference updates (cd'd into misc/DimSim/ for
the build, so the dimos/ path was outside the relative git add).

- dimos/simulation/dimsim/scene_client.py: 8 embodiment-preset
  avatarUrl entries + 1 docstring built-in example, all collapsed from
  the two-URL ["unitree_go2.glb", "robot.glb"] fallback to a single
  "dimsim_unitree_stub.glb" URL.
- misc/DimSim/src/engine.js: createAiAgent default avatarUrl when none
  is passed in.

The "Any URL" docstring example and the local-assets/my-robot.glb
example are deliberately left as-is — they're external/user examples,
not references to the shipped stub.
Root cause of the "robot never spawns" report: paul/feat/dimsim (#1735)
landed DimSimConnection as the dimos↔DimSim transport, but that
connection only shuttles cmd_vel and odom over LCM — it never sends an
`embodimentConfig` WS message.  The visibility/avatar setup therefore
falls through to whatever engine.js does at boot.

What engine.js did: if the scene returned `embodiment: null` (apartment,
warehouse, empty all do — they don't want to dictate the model), the
agent was created with `avatarUrl: []` and `agent.group.visible = false`.
The fallback was that an explicit SceneClient.set_embodiment() call
would later send embodimentConfig and re-show the agent (a feature my
previous commit already wired up).  But the default Connection-based
agentic blueprint never calls SceneClient, so the agent stayed hidden
forever even though server physics + sensors were running fine.

Fix: in dimos mode, drop the empty-avatar / hide-group path.  Let
createAiAgent use its default avatarUrl (the dimsim_unitree_stub.glb we
just renamed), and leave the group visible.  An external SceneClient
call to set_embodiment still works — it swaps the GLB and re-asserts
visibility via the embodimentConfig handler.
Restructure src/ to drop the now-redundant `dimos/` subfolder (DimSim
itself lives inside the dimos repo now, so the prefix is noise):

  src/dimos/dimosBridge.ts → src/bridge.ts
  src/dimos/sceneApi.ts    → src/sceneApi.ts
  src/dimos/sceneEditor.ts → src/sceneEditor.ts
  src/dimos/evalHarness.ts → src/evals/harness.ts
  src/dimos/rubrics.ts     → src/evals/rubrics.ts

Convert evals from JSON+TS to JS-native modules co-located with scenes:

  evals/manifest.json + evals/apt/*.json → deleted
  scenes/apartment/evals/{go-to-couch,go-to-kitchen,go-to-tv}.js

Each workflow file default-exports a `{scene, task, timeoutSec, startPose?,
setup?(ctx), success(ctx)}` shape.  The harness dynamic-imports the
module, runs `setup` once, polls `success` every 250ms until passed or
timeout, replies `{type:'evalResult', ...}` to the runner.
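
For readers who want the loop spelled out, here is a minimal sketch of the setup-then-poll cycle described above. It is not the actual harness code: `runWorkflow`, the injectable `sleep`/`now` options, and the `{ pass, reason }` result shape are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the harness loop; names are illustrative only.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runWorkflow(workflow, ctx, opts = {}) {
  // sleep/now are injectable so the loop can be exercised without real timers.
  const { intervalMs = 250, sleep = defaultSleep, now = Date.now } = opts;
  const startedAt = now();
  const timeoutMs = workflow.timeoutSec * 1000;

  // setup is optional and runs exactly once, before polling starts.
  if (workflow.setup) await workflow.setup(ctx);

  while (true) {
    // Assumed result shape: { pass: boolean, reason: string }.
    const result = await workflow.success(ctx);
    const elapsedMs = now() - startedAt;
    if (result.pass) {
      return { type: 'evalResult', pass: true, reason: result.reason, elapsedMs };
    }
    if (elapsedMs >= timeoutMs) {
      // On timeout, report the last failure reason seen.
      return { type: 'evalResult', pass: false, reason: result.reason, elapsedMs };
    }
    await sleep(intervalMs);
  }
}
```

Reporting the last `reason` on timeout is one plausible way to get a readable failure message out of a rubric; the real harness may format this differently.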

Runner is now a thin Deno script: walk scenes/*/evals/*.js to discover,
open one control WS, send `{type:'runEval', workflowUrl}` per workflow,
collect results.  No JSON parsing, no manifest, no command-DSL.

EvalContext exposes:
  agent, agentPos, sceneState
  setAgentPose(p)
  findAsset(query), dist(a,b)         — low-level helpers
  rubrics.objectDistance({...})       — pre-bound high-level rubrics
  rubrics.radiusContains({...})

So a workflow file looks like:
  export default {
    scene: 'apartment',
    task: 'Go to the couch',
    timeoutSec: 30,
    startPose: { x: 0, y: 0.5, z: 3, yaw: 0 },
    success: (ctx) => ctx.rubrics.objectDistance({ target: 'sectional', thresholdM: 2.0 }),
  };
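
To make the rubric concrete, here is a rough sketch of what a distance rubric built on `findAsset` and `dist` could look like. The real `ctx.rubrics.objectDistance` is pre-bound to the context; this version takes `ctx` explicitly, and the `ctx.assets` list, the `title`/`position` fields, the substring matching, and the 3D distance are all assumptions.

```javascript
// Hypothetical sketch; the shipped rubrics may match assets and measure distance differently.
function dist(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Case-insensitive substring match on an assumed `title` field.
function findAsset(assets, query) {
  const q = query.toLowerCase();
  return assets.find((asset) => asset.title.toLowerCase().includes(q)) ?? null;
}

function objectDistance(ctx, { target, thresholdM }) {
  const asset = findAsset(ctx.assets, target);
  if (!asset) {
    return { pass: false, reason: `no asset matching "${target}"` };
  }
  const d = dist(ctx.agentPos, asset.position);
  return {
    pass: d <= thresholdM,
    reason: `${d.toFixed(3)}m to "${asset.title}" (threshold ${thresholdM}m)`,
  };
}
```

Returning a human-readable `reason` alongside the boolean keeps failure output useful without the workflow file having to format anything itself.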

`dimsim eval list` and `dimsim eval [--connect] [--scene] [--workflow]`
both work against the new layout.

television.json (a duplicate of go-to-tv.json with a longer timeout) was
collapsed into go-to-tv.js.
…blobs

Workflow files used to be JS modules with a config-object default export
("storage in JS clothing"); the harness owned the orchestration.  Flip
that around — the workflow file is the program, it imports runEval from
@dimsim/eval and calls it directly.

  // scenes/apartment/evals/go-to-couch.js
  import { runEval } from '@dimsim/eval';

  await runEval({
    scene: 'apartment',
    task: 'Go to the couch',
    timeoutSec: 30,
    startPose: { x: 0, y: 0.5, z: 3, yaw: 0 },
    success: (ctx) => ctx.rubrics.objectDistance({ target: 'sectional', thresholdM: 2.0 }),
  });

Mechanics:

- index.html: importmap maps `@dimsim/eval` → `/_dimsim/eval-api.js`
- public/_dimsim/eval-api.js: tiny ESM facade that awaits a
  `dimsim-eval-ready` window event and delegates to
  `window.__dimsim.eval.runEval`.
- engine.js: after EvalHarness is constructed, sets
  `window.__dimsim.eval = { runEval }` and dispatches the ready event.
- src/evals/harness.ts: public `runEval(workflow)` runs the eval and
  sends `{type:'evalResult'}` itself.  The WS handler is just a
  dynamic-import — the workflow file's top-level await drives the rest.

CLI flow is unchanged from the user's side (`dimsim eval --workflow
go-to-couch`), but the runner now just sends `{type:'runEval',
workflowUrl}`, the harness imports the URL, the workflow's own top-level
await calls `runEval(...)` which finishes and replies over WS.
…argets it directly

Previously workflow files imported `@dimsim/eval`, which the importmap
aliased to a hand-written ESM proxy under public/_dimsim/.  The proxy
existed only to bridge between un-bundled user scripts and the bundled
engine's hash-named harness chunk — it delegated to a window global
(`window.__dimsim.eval.runEval`) after waiting on a custom DOM event.
Fishy.

Cleaner: tell Vite to pin the harness chunk's filename
(`dist/assets/dimsim-eval.js`) and point the importmap straight at it.
Now the workflow file and the engine import the *same module* — module
identity is preserved by the browser's ESM loader — so a module-level
singleton works.

Changes:

- vite.config.js: chunkFileNames pins src/evals/harness.ts → dimsim-eval.js
- src/evals/harness.ts: adds setEvalHarness(h) + module-level runEval(workflow)
                        that delegates to the registered singleton.
- src/engine.js: calls setEvalHarness(evalHarness) after construction;
                  drops the window.__dimsim.eval global + dispatchEvent.
- index.html: importmap now points at /assets/dimsim-eval.js
- public/_dimsim/eval-api.js: deleted, dir gone

Workflow files are unchanged — still `import { runEval } from '@dimsim/eval'`
followed by top-level await.
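
The module-level singleton can be sketched in a few lines. This is an illustration of the pattern, not the contents of src/evals/harness.ts: in the real module both functions would be exported, and the error thrown when no harness is registered is an assumption.

```javascript
// Hypothetical sketch of a module-level singleton; in harness.ts these would be exports.
let registeredHarness = null;

// Called once by the engine after it constructs its harness.
function setEvalHarness(harness) {
  registeredHarness = harness;
}

// Called by workflow files; delegates to whatever the engine registered.
function runEval(workflow) {
  if (registeredHarness === null) {
    // Assumed behavior: fail loudly rather than silently drop the eval.
    throw new Error('runEval called before setEvalHarness');
  }
  return registeredHarness.runEval(workflow);
}
```

This only works because the workflow file and the engine resolve to the same module instance. If the two sides loaded separate copies, `registeredHarness` would still be null on the workflow side.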

Verified end-to-end headless:
  deno run -A --unstable-net dimos-cli/cli.ts \
    eval --headless --scene apartment --workflow go-to-couch
  → loads apartment scene, dynamic-imports go-to-couch.js, runs setup,
    polls success every 250ms, fails on 30s timeout with a clean
    "3.313m to Modern L-shaped sectional (threshold 2m)" reason.
`dimsim eval <workflow>` is now shorthand for
`dimsim eval --workflow <workflow> --connect` — the common dev-loop
case where the sim is already open and you just want to run one eval
against it.

  dimsim eval go-to-couch            # any scene that has the workflow
  dimsim eval apartment/go-to-couch  # scene-qualified

Auto-defaults to --connect because spinning up a fresh headless bridge
for a one-off invocation is rarely the right move during dev — that
mode is for CI and is still reachable as `dimsim eval --headless …`.

The runner / harness wiring is unchanged; this is purely a cli arg
shape change.
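
As a rough illustration of the two accepted argument shapes, the sketch below resolves a bare or scene-qualified name against a list of discovered workflow paths. `discoverFromPaths` and `resolveWorkflowArg` are hypothetical names; the actual cli.ts parsing is not shown in this commit.

```javascript
// Hypothetical sketch; not the actual cli.ts argument handling.
// Turns relative paths into { scene, workflow, path } records, skipping non-eval files.
function discoverFromPaths(paths) {
  const pattern = /^scenes\/([^/]+)\/evals\/([^/]+)\.js$/;
  return paths
    .map((p) => pattern.exec(p))
    .filter(Boolean)
    .map((m) => ({ scene: m[1], workflow: m[2], path: m[0] }));
}

// Accepts "go-to-couch" (any scene) or "apartment/go-to-couch" (scene-qualified).
function resolveWorkflowArg(arg, discovered) {
  const [first, second] = arg.split('/');
  const scene = second === undefined ? null : first;
  const workflow = second === undefined ? first : second;
  return discovered.filter(
    (w) => w.workflow === workflow && (scene === null || w.scene === scene),
  );
}
```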

To install the cli on PATH (one-time):

  cd misc/DimSim/dimos-cli
  deno install -gAf --unstable-net --name=dimsim --config=./deno.json ./cli.ts
The workflow file under scenes/<env>/evals/<name>.js can now be run as
a Deno program, with no shape change to the file itself:

    deno run -A misc/DimSim/scenes/apartment/evals/go-to-couch.js

The same import + same call:

    import { runEval } from '@dimsim/eval';
    await runEval({ scene, task, success, … });

…now resolves differently depending on runtime:

- Browser  → importmap in index.html → /assets/dimsim-eval.js (bundled
            EvalHarness chunk) → runs the eval in-place against the
            real THREE.js scene, agent, Rapier.

- Deno     → scenes/deno.json → dimos-cli/eval/deno-client.ts → opens a
            control WS to ws://localhost:8090, sends
            {type:'runEval', workflowUrl}, awaits evalResult, exits.
            The browser is what actually re-imports the file and
            executes setup/success — Deno is just a dispatcher.

Two new files:

- dimos-cli/eval/deno-client.ts — the Deno runEval; reads Deno.mainModule
  to figure out the workflow URL, connects to whatever bridge is up on
  DIMSIM_PORT (default 8090).
- scenes/deno.json — scoped import map for `@dimsim/eval` so anything
  under scenes/ resolves the bare specifier correctly.
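
The Deno side boils down to turning the main module URL into a workflow URL and a control-socket address. The sketch below shows that derivation as a pure function; `dispatchTarget` is a hypothetical name, and cutting the path at the last `/scenes/` segment is an assumption about how deno-client.ts maps file paths to served URLs.

```javascript
// Hypothetical sketch; deno-client.ts may derive these values differently.
function dispatchTarget(mainModuleUrl, port = 8090) {
  const { pathname } = new URL(mainModuleUrl);
  // Assumption: the browser serves scene files under /scenes/..., so keep that suffix.
  const idx = pathname.lastIndexOf('/scenes/');
  if (idx === -1) {
    throw new Error(`workflow is not under scenes/: ${mainModuleUrl}`);
  }
  const workflowUrl = pathname.slice(idx);
  return {
    workflowUrl,
    wsUrl: `ws://localhost:${port}/?ch=control`,
    message: JSON.stringify({ type: 'runEval', workflowUrl }),
  };
}
```

In a real Deno program the first argument would come from `Deno.mainModule` and the port from the `DIMSIM_PORT` environment variable, as the commit message describes.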

Verified end-to-end against a headless bridge:

    deno run -A scenes/apartment/evals/go-to-couch.js
    [eval] dispatching /scenes/apartment/evals/go-to-couch.js → ws://localhost:8090/?ch=control
    [eval] task: Go to the couch
    [eval] PASS (277ms): 1.693m to "Modern L-shaped sectional" (threshold 2m)

`dimsim eval <workflow>` keeps working — both shortcuts dispatch to the
same EvalHarness, just over different framing.
Two restructures asked for in review:

1. The eval system was split across two folders (src/evals/ for the
   browser-side harness+rubrics, dimos-cli/eval/ for the Deno runner+
   client) which made it hard to find anything eval-related.  Both move
   to a single top-level evals/ folder; filenames make the runtime
   obvious (`harness`+`rubrics` are browser, `runner`+`deno-client` are
   Deno).

2. `dimos-cli/` was a misnomer — DimSim already lives inside dimos, so
   "dimos-cli" inside misc/DimSim is doubled up.  Renamed to cli/.

New tree:
  misc/DimSim/
    src/         engine.js, AiAvatar.js, main.js, style.css,
                 bridge.ts, sceneApi.ts, sceneEditor.ts
    cli/         cli.ts, deno.json, deno.lock, bridge/, headless/,
                 vendor/lcm/, test/, agent.py, README.md
    evals/       harness.ts, rubrics.ts (browser),
                 runner.ts, deno-client.ts (Deno),
                 deno.json (LSP hints)
    scenes/      apartment/{index.js,data/,textures/,evals/},
                 empty/, warehouse/, deno.json
    public/, index.html, package.json, vite.config.js, …

Touched files (paths only — no logic changes):
- src/engine.js: import "../evals/harness.ts"
- cli/cli.ts:    import "../evals/runner.ts"
- scenes/deno.json: @dimsim/eval → "../evals/deno-client.ts"
- evals/deno-client.ts: /// <reference lib="deno.ns" /> for IDE
- evals/deno.json: scoped import map for @std/path
- vite.config.js: chunkFileNames matches "/evals/harness.ts"
- dimos/simulation/dimsim/dimsim_process.py: `dimos-cli` → `cli`
- .github/workflows/dimsim-check.yml: `cd dimos-cli` → `cd cli`

Re-install dimsim global (path of cli.ts changed):

    cd misc/DimSim/cli
    deno install -gAf --unstable-net --name=dimsim --config=./deno.json ./cli.ts

Vite build still emits assets/dimsim-eval.js (pinned), and the headed +
headless + direct-deno-run eval paths all keep working.
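
For context, pinning a chunk name in a Vite (Rollup) build can be done with a function-valued `output.chunkFileNames`. The sketch below is an assumption about how such a rule might look, not a copy of this PR's vite.config.js; the fallback pattern and the use of `moduleIds` to spot the harness are illustrative choices.

```javascript
// Hypothetical sketch; the PR's vite.config.js may detect the harness chunk differently.
function pinnedChunkFileNames(chunkInfo) {
  // Rollup passes a chunk descriptor whose moduleIds lists the modules in the chunk.
  const isHarness = chunkInfo.moduleIds.some((id) => id.endsWith('/evals/harness.ts'));
  // Stable name for the harness chunk, hashed names for everything else.
  return isHarness ? 'assets/dimsim-eval.js' : 'assets/[name]-[hash].js';
}

// Where the function would plug in; a real config file would export this object.
const buildConfig = {
  build: {
    rollupOptions: {
      output: { chunkFileNames: pinnedChunkFileNames },
    },
  },
};
```

A stable filename is what lets the importmap in index.html point at a fixed URL; with a hashed name the importmap would need regenerating on every build.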
Drop:
- server.js (legacy Express+OpenAI VLM proxy — paired with the in-browser
  VLM stack we already deleted)
- update-sims.sh (regenerated a manifest for public/sims/, which is gone)
- scripts/{apt_to_single_glb,decompose_apt,decompose_objects_to_glb}.py
  (old GLB-decomposition flow, superseded by extract_apt_to_js.py)
- scripts/package-release.sh (binary-release packaging — not used in the
  vendored-in-dimos model)
- deno.lock at top level (cli/ has its own deno.lock for the CLI;
  top-level was stale)
- docs/sdk-design.md (described an early design — no longer accurate)
- cli/README.md (described JSR install + `dimsim setup` flow — dead)

package.json:
- Drop dead scripts (server / sync / parity:check / update-sims) and the
  dimos:* entries that pointed at the old dimos-cli/ path.
- Drop runtime deps that only server.js used (express, cors, openai).
- Now: vite + three + rapier + spark — that's it.
- npm install needs --legacy-peer-deps because spark@latest expects
  three@^0.180 but the engine pins three@0.168.  Stable enough for
  daily work; we'll bump three together with spark in a follow-up.

README.md: rewritten for the current layout (was still describing the
standalone Spark/SimStudio era with VLM backend on :8000).

New docs/:
- getting-started.md   5-minute tour + the two run modes + cheatsheet
- scenes.md            authoring scenes (Three.js dev cycle, api args,
                        physics colliders, interactivity limitation)
- evals.md             authoring eval workflows + the three ways to run
                        + the dual-runtime `@dimsim/eval` story
- architecture.md      full file-tree + dimos↔bridge↔browser data flow
                        + key contracts (WS channels, LCM topics,
                        scene/eval module shapes)

Vite build still emits dist/assets/dimsim-eval.js (pinned), all the
chunks ship as expected.
- scenes.md: refocused on the "create/edit a scene" loop.  Concrete
  recipe up top, `api` table, common patterns (loops, GLBs, lighting,
  shadow cam tuning), copy-from-empty tip.  No backstory.
- getting-started.md: dropped the apt.json regeneration aside —
  not relevant for someone arriving fresh.
- evals.md: trimmed the runtime-resolution explainer; now purely
  "here's how to author an eval".
- architecture.md: deleted.  Anyone interested in the internals can
  read the code.
- README.md: docs index trimmed to three entries.

No code touched.
Adds the validation examples from dimos#1691 inline in
scenes/apartment/index.js — each section maps to one of Lesh's asks:

  1. loadLevel(...) the authored apartment data (unchanged)
  2. THREE.SphereGeometry + MeshPhysicalMaterial + staticCollider
     → "Standard threejs API" + "Optional physics" + "New elements"
  3. THREE.BoxGeometry crate + box collider
  4. loadGLTF + physics.staticCollider(prop, 'trimesh')
     → "Model importing" + "Optional physics for models"
  5. THREE.PointLight accent light
     → "add a light, reload"

The editing-flow validation is the file itself: edit a `ballPos.x`, a
material color, a light intensity — save — browser HMRs.

All Three.js + physics interfaces in use are the same ones any new
scene gets via the build() api argument; nothing apartment-specific.
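
A stripped-down sketch of what such a scene section might look like follows. The Three.js constructors are standard, and `physics.staticCollider(obj, 'trimesh')` mirrors the call named above, but the `api` field names (`THREE`, `scene`, `physics`), the positions, and the colors are assumptions; a real scene file would default-export `build` rather than declare it as a plain function.

```javascript
// Hypothetical sketch of a scene build(); api field names are assumed, not confirmed.
async function build(api) {
  const { THREE, scene, physics } = api;

  // A sphere with a physically based material: edit ballPos.x and reload to move it.
  const ballPos = { x: 1, y: 0.5, z: 2 };
  const ball = new THREE.Mesh(
    new THREE.SphereGeometry(0.25, 32, 16),
    new THREE.MeshPhysicalMaterial({ color: 0xff5533, roughness: 0.4 }),
  );
  ball.position.set(ballPos.x, ballPos.y, ballPos.z);
  scene.add(ball);

  // Optional physics: register the mesh as a static collider.
  physics.staticCollider(ball, 'trimesh');

  // An accent light: color, intensity, distance.
  const accent = new THREE.PointLight(0xffddaa, 2.0, 10);
  accent.position.set(0, 2.5, 0);
  scene.add(accent);

  // Leave the robot model up to dimos instead of dictating it from the scene.
  return { embodiment: null };
}
```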
HMR was structurally awkward — at every save the bridge's filesystem
watcher (Deno.watchFs) broadcast {type:"reload"}, sceneEditor.ts
dispatched to a __dimsimHmr handler in engine.js, that called
sceneApi._revertToBaseline() and re-imported the scene.  The agent
was created *after* the baseline snapshot, so every reload erased it
and a fresh build had no agent-creation code to re-add it.  That's
the "ball + robot disappear after editing" we just saw.

Cleaner to drop the whole machinery and let users hard-refresh the
browser to pick up scene edits:

- cli/bridge/server.ts: removed the Deno.watchFs watcher + the
  {type:"reload"} broadcast loop.
- src/engine.js: dropped sceneApi._captureBaseline() and the
  window.__dimsimHmr handler.
- src/sceneEditor.ts: dropped the reload-message branch in the WS
  patcher.
- src/sceneApi.ts: removed _captureBaseline / _revertToBaseline and
  their baseline state.
- cli/bridge/physics.ts: refreshed a stale "hot-reloading" comment.

While in there:
- src/engine.js: added support for an optional `afterBuild(api)`
  scene export so scenes that use loadLevel can append imperative
  THREE.js code after the level is built.
- src/sceneApi.ts: loadLevel + loadJson are now idempotent — calling
  either twice with the same data / URL is a no-op.
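
One way to get this kind of idempotence is a small wrapper that remembers what it has already loaded. The sketch below is an assumption about the approach, not the sceneApi.ts code: `makeIdempotent` is a hypothetical helper, and "same data" is taken to mean the same object reference.

```javascript
// Hypothetical sketch; sceneApi.ts may track loaded levels differently.
function makeIdempotent(load) {
  const seenUrls = new Set();        // string inputs (URLs)
  const seenObjects = new WeakSet(); // object inputs (data blobs), compared by identity

  return function loadOnce(input) {
    const seen = typeof input === 'string' ? seenUrls : seenObjects;
    if (seen.has(input)) return false; // already loaded: no-op
    load(input);
    seen.add(input); // mark only after a successful load
    return true;
  };
}
```

Usage would look like `const loadLevel = makeIdempotent(rawLoadLevel)`, where `rawLoadLevel` stands in for whatever actually builds the level.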

importLevelFromJSON itself was already clean — its rebuildAssets()
and rebuildAllPrimitives() pre-clear assetsGroup / primitivesGroup
before re-adding, so no engine-side patch was needed (the (A) bullet
turned out to be redundant — the actual culprit was _revertToBaseline,
which is gone now).
Lesh's #1691 "Robot definition API" — code a simple drone / holonomic /
ground robot from inside a scene file.

The bridge already handled embodimentConfig over WS (stored as
chState.embodiment, calls ServerPhysics.reconfigure + ServerLidar.
reconfigure live).  Adds the missing JS-side surface:

  // sceneApi.ts
  export function setEmbodiment(config): void

  // scene file
  setEmbodiment({
    embodimentType: 'drone',
    avatarUrl:    '/agent-model/dimsim_unitree_stub.glb',
    radius: 0.3, halfHeight: 0.1, gravity: 0,
    maxSpeed: 3.0, turnRate: 2.0, maxAltitude: 8,
  });

setEmbodiment does two things at once:
  1. Calls window.__dimosBridge._handleEmbodimentConfig(config) so the
     browser swaps the avatar GLB + un-hides the agent group.
  2. Sends {type:'embodimentConfig', ...config} over the control WS so
     the bridge reconfigures server physics + lidar mount.

cli/bridge/physics.ts:368 is the 6DoF flight branch; cli/bridge/lidar.ts
reads the mount-height fields.  No engine.js or bridge changes needed
— just exposed what was already there.

Wired into scenes/warehouse/index.js as a drone, documented in
docs/scenes.md (new "Robot embodiment" section + a row in the api
table).
Viswa4599 and others added 3 commits May 19, 2026 18:43
setEmbodiment called from scene build() was a silent no-op because
engine.js sets window.__dimosBridge ~360 lines AFTER it runs the scene's
build(); both the local _handleEmbodimentConfig hop and the WS send
through _sendPhysics quietly resolved to undefined.

Queue the config in sceneApi when the bridge isn't ready, and have
engine.js call sceneApi._flushPendingEmbodiment() right after the
window.__dimosBridge = bridge line.  After that the bridge actually
receives {type:'embodimentConfig', ...}, ServerPhysics.reconfigure
fires, and the warehouse drone correctly hovers with gravity=0.
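
The queue-then-flush fix can be sketched as below. This is an illustration of the pattern under assumed names: `createEmbodimentApi`, `attachBridge`, and the bridge's `handleEmbodimentConfig`/`send` methods are hypothetical stand-ins for sceneApi's internals, `window.__dimosBridge`, and `_flushPendingEmbodiment()`.

```javascript
// Hypothetical sketch of queueing a config until the bridge exists.
function createEmbodimentApi() {
  let bridge = null;
  let pending = null;

  function apply(config) {
    // Local hop: swap the avatar and un-hide the agent in the browser.
    bridge.handleEmbodimentConfig(config);
    // Remote hop: let the bridge reconfigure server physics and lidar.
    bridge.send({ type: 'embodimentConfig', ...config });
  }

  return {
    setEmbodiment(config) {
      if (bridge === null) {
        pending = config; // bridge not ready yet: remember the latest config
        return;
      }
      apply(config);
    },
    // Stands in for the engine assigning the bridge and then flushing the queue.
    attachBridge(b) {
      bridge = b;
      if (pending !== null) {
        const config = pending;
        pending = null;
        apply(config);
      }
    },
  };
}
```

Keeping only the latest pending config means a scene that calls `setEmbodiment` twice during build ends up with the last one, which seems like the least surprising behavior.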
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
except AttributeError:
pass
sock.bind(("", MCAST_PORT))
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
sock.bind(("", LCM_PORT))
const cacheBust = `?t=${Date.now()}`;
await import(/* @vite-ignore */ workflowUrl + cacheBust);
} catch (e: any) {
console.error(`[eval] failed to import ${workflowUrl}:`, e);
_npcClock: any = null; // THREE.Clock

async _execCode(code: string, id?: string): Promise<void> {
console.log(`[sceneEditor] exec${id ? ` (${id})` : ""}:`, code.slice(0, 100));
}

async _loadScript(url: string, id?: string): Promise<void> {
console.log(`[sceneEditor] loadScript${id ? ` (${id})` : ""}:`, url);
try {
const resp = await fetch(url);
@greptile-apps
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR brings DimSim in-tree under misc/DimSim/, replaces JSON-based scene authoring with JS modules (Three.js + Rapier), ports the apartment scene, introduces a dimsim CLI, and wires DimSim as a first-class simulation backend driven directly from the dimos repo.

  • Python side: dimsim_process.py drops the external git-clone in favor of the vendored copy; a DIMSIM_LOCAL override is validated before use. websocket_server.py fixes a TypeError: float(None) crash on 2D-panel clicks.
  • CLI/bridge: A Deno CLI (cli.ts) orchestrates dev, eval, and agent subcommands; bridge/server.ts multiplexes control vs sensor WebSocket streams with per-channel LCM isolation for parallel evals.
  • Eval runner: runner.ts discovers workflows by filesystem walk and runs them sequentially (or in parallel via runEvalsMultiPage); a routing bug causes parallel-eval sockets to land on the sensor path where text frames are dropped, silently breaking the --parallel N path.

Confidence Score: 4/5

Safe to merge for single-page eval and dev workflows; the parallel eval path (--parallel N) is broken and will produce no results.

The Python integration, single-page eval flow, and bridge LCM relay are all solid. The one concrete defect is in runEvalsMultiPage: it connects sockets with ?ch=page-0 (the sensor routing param) instead of ?channel=page-0&ch=control, so every parallel-eval socket ends up in the sensor handler where text frames are silently discarded. Any CI run using --headless --parallel N (N > 1) will hang until timeout with zero results.

misc/DimSim/evals/runner.ts — the runEvalsMultiPage socket URL construction on line 133.

Important Files Changed

| Filename | Overview |
| --- | --- |
| misc/DimSim/evals/runner.ts | Multi-page eval orchestrator; runEvalsMultiPage opens sockets with ?ch=page-0 (wrong param) instead of ?channel=page-0&ch=control, routing each connection to the sensor path where text frames are dropped, so parallel evals silently produce no results. Single-page path is correct. |
| misc/DimSim/cli/bridge/server.ts | Bridge server handling WebSocket routing, LCM relay, and Rapier snapshot ingestion; single-page control/sensor separation is correct, but the multi-page channel routing relies on the channel query param, which the runner currently doesn't supply correctly. |
| misc/DimSim/cli/cli.ts | Main CLI entry point; dev/eval/agent subcommands wired correctly; existing known issue with stale apt default scene in the dev subcommand (already tracked). |
| dimos/simulation/dimsim/dimsim_process.py | Replaces external git-clone approach with vendored misc/DimSim/; _resolve_dimsim_dir correctly handles the DIMSIM_LOCAL override and validates the resolved path before use. |
| dimos/visualization/rerun/websocket_server.py | Bug fix: replaces float(msg.get(x, 0)) pattern with a _num() helper that handles explicit None values (e.g., from 2D-panel clicks) that would previously crash with TypeError: float(None). |
| .github/workflows/dimsim-check.yml | Adds a CI check for the DimSim subtree: npm build, Deno type-check. Path-filtered correctly so it only runs on misc/DimSim/** changes. |
| dimos/simulation/dimsim/scene_client.py | Updates all robot embodiment presets to point at the new stub GLB (dimsim_unitree_stub.glb) that ships inside the repo, removing the dependency on external GLB files. |

Sequence Diagram

sequenceDiagram
    participant PY as dimsim_process.py
    participant CLI as cli.ts (Deno)
    participant BRG as bridge/server.ts
    participant PW as Playwright (headless)
    participant ENG as engine.js (browser)
    participant RUN as runner.ts

    PY->>CLI: spawn deno run cli.ts --scene apartment --port 8090 --headless
    CLI->>BRG: "startBridgeServer({port, scene, headless})"
    BRG-->>BRG: open HTTP+WS server on :8090
    CLI->>PW: launchHeadless(url)
    PW->>BRG: GET / → injected HTML (window.__dimosScene)
    PW->>ENG: dynamic import /scenes/apartment/index.js
    ENG->>BRG: "WS connect ?ch=control (control socket)"
    ENG->>BRG: "WS connect ?ch=sensor (sensor socket)"
    ENG->>BRG: send Rapier snapshot (DSS2 binary)
    BRG-->>BRG: initServerSystems(snapshot)
    BRG->>ENG: pose updates (JSON)
    RUN->>BRG: "WS connect ?ch=control"
    RUN->>BRG: "send {type:runEval, workflowUrl}"
    BRG->>ENG: relay runEval message
    ENG-->>ENG: dynamic import workflow, poll success()
    ENG->>BRG: "send {type:evalResult, ...}"
    BRG->>RUN: relay evalResult
    RUN-->>CLI: EvalResult[]

Reviews (2): Last reviewed commit: "Update misc/DimSim/cli/bridge/server.ts"

Comment thread misc/DimSim/cli/cli.ts
// ── Dev ─────────────────────────────────────────────────────────────
if (subcommand === "dev") {
const distDir = await resolveDistDir();
const scene = (opts.scene as string) || "apt";

P1 Stale default scene name breaks dimsim dev

The fallback is "apt" but the apartment scene was renamed to "apartment" (its directory is scenes/apartment/). Any developer running dimsim dev without --scene will load /scenes/apt/index.js, which returns 404, resulting in a blank scene. The Python-driven path always passes --scene apartment explicitly, so only standalone CLI usage is affected — but that's the primary path documented in docs/getting-started.md.

Suggested change
- const scene = (opts.scene as string) || "apt";
+ const scene = (opts.scene as string) || "apartment";

Comment thread misc/DimSim/cli/bridge/server.ts Outdated
const ratesJs = sensorRates ? `window.__dimosSensorRates=${JSON.stringify(sensorRates)};` : "";
const enableJs = sensorEnable ? `window.__dimosSensorEnable=${JSON.stringify(sensorEnable)};` : "";
const fovJs = cameraFov ? `window.__dimosCameraFov=${cameraFov};` : "";
const inject = `<script>window.__dimosMode=true;window.__dimosScene="${activeSceneName}";${headless ? "window.__dimosHeadless=true;" : ""}${ratesJs}${enableJs}${fovJs}</script>`;

P2 security Unescaped scene name injected into HTML script tag

activeSceneName is interpolated directly into the <script> block without HTML/JS escaping. If a user passes --scene 'foo";window.open("http://evil")//' (or a specially crafted DIMSIM_SCENE env var), the quote terminates the string literal and injects arbitrary JS. Because this is a local dev server the blast radius is limited, but sanitising the value (e.g. allowing only [a-zA-Z0-9_-]) would prevent the class of mistake entirely.
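
A minimal sketch of the allow-list suggested above, assuming the scene name is validated once before it reaches the HTML template. The helper names `assertSafeSceneName` and `sceneInjectSnippet` are hypothetical and do not exist in the PR; only `window.__dimosScene` and the `[a-zA-Z0-9_-]` character set come from the comment and the quoted code.

```typescript
// Hypothetical helpers, not part of the PR. The pattern accepts only
// letters, digits, underscore, and hyphen, as the review suggests.
const SCENE_NAME_RE = /^[a-zA-Z0-9_-]+$/;

function assertSafeSceneName(name: string): string {
  if (!SCENE_NAME_RE.test(name)) {
    throw new Error(`invalid scene name: ${JSON.stringify(name)}`);
  }
  return name;
}

// Defence in depth: build the string literal with JSON.stringify rather
// than hand-written quotes, and escape "<" so a "</script>" sequence
// could not close the surrounding tag even if validation were relaxed.
function sceneInjectSnippet(name: string): string {
  const literal = JSON.stringify(assertSafeSceneName(name)).replace(/</g, "\\u003c");
  return `window.__dimosScene=${literal};`;
}
```

Rejecting the value outright, rather than trying to escape it, keeps the failure visible at startup instead of producing a scene URL that quietly 404s.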

Comment on lines +192 to +214
function _runOne(ws: WebSocket, wf: WorkflowEntry): Promise<EvalResult> {
return new Promise((resolve) => {
const onMessage = (event: MessageEvent) => {
if (typeof event.data !== "string") return;
let msg: any;
try { msg = JSON.parse(event.data); } catch { return; }
if (msg.type !== "evalResult") return;
if (msg.workflowUrl && msg.workflowUrl !== wf.url) return;
ws.removeEventListener("message", onMessage);
resolve({
scene: wf.scene,
workflow: wf.workflow,
workflowUrl: wf.url,
task: msg.task ?? "",
passed: !!msg.passed,
reason: msg.reason ?? (msg.passed ? "ok" : "fail"),
score: typeof msg.score === "number" ? msg.score : null,
durationMs: msg.durationMs ?? 0,
});
};
ws.addEventListener("message", onMessage);
ws.send(JSON.stringify({ type: "runEval", workflowUrl: wf.url }));
});

P2 _runOne hangs indefinitely on WebSocket close/error

The returned Promise resolves only on a matching evalResult message. If the bridge crashes, the socket closes, or the eval workflow import fails silently on the browser side, the Promise never settles and the entire runEvals call stalls. Adding an onerror/onclose handler that rejects (or resolves with a failure result) would prevent the runner from hanging in CI.
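
One possible shape for that guard is sketched below, under stated assumptions: `SocketLike` and `Outcome` are cut-down stand-ins for the runner's `WebSocket`, `WorkflowEntry`, and `EvalResult`, and `timeoutMs` is an assumed parameter rather than an existing runner option. The sketch resolves with a failure result instead of rejecting, so a sequential loop over workflows can keep going.

```typescript
// Minimal structural types so the sketch stands alone; the real runner
// uses WebSocket and its own WorkflowEntry / EvalResult shapes.
type Listener = (ev: { data?: unknown }) => void;
interface SocketLike {
  addEventListener(type: string, fn: Listener): void;
  removeEventListener(type: string, fn: Listener): void;
  send(data: string): void;
}
interface Outcome { workflowUrl: string; passed: boolean; reason: string; }

// timeoutMs is an assumption of this sketch, not an existing option.
function runOneGuarded(ws: SocketLike, workflowUrl: string, timeoutMs: number): Promise<Outcome> {
  return new Promise((resolve) => {
    // Single exit point: clears the timer and detaches every listener,
    // so the promise settles exactly once whichever event arrives first.
    const finish = (passed: boolean, reason: string) => {
      clearTimeout(timer);
      ws.removeEventListener("message", onMessage);
      ws.removeEventListener("close", onClose);
      ws.removeEventListener("error", onError);
      resolve({ workflowUrl, passed, reason });
    };
    const onMessage: Listener = (ev) => {
      if (typeof ev.data !== "string") return;
      let msg: any;
      try { msg = JSON.parse(ev.data); } catch { return; }
      if (msg.type !== "evalResult") return;
      if (msg.workflowUrl && msg.workflowUrl !== workflowUrl) return;
      finish(!!msg.passed, msg.reason ?? (msg.passed ? "ok" : "fail"));
    };
    const onClose: Listener = () => finish(false, "socket closed before evalResult");
    const onError: Listener = () => finish(false, "socket error before evalResult");
    const timer = setTimeout(() => finish(false, `timed out after ${timeoutMs} ms`), timeoutMs);

    ws.addEventListener("message", onMessage);
    ws.addEventListener("close", onClose);
    ws.addEventListener("error", onError);
    ws.send(JSON.stringify({ type: "runEval", workflowUrl }));
  });
}
```

The timeout covers the case where the socket stays open but the browser never replies, for example when the workflow import fails without sending an `evalResult`.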

@codecov

codecov Bot commented May 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.


Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Comment on lines +132 to +134
const sockets = await Promise.all(
options.channels.map((ch) => _connect(`${options.wsUrl}/?ch=${encodeURIComponent(ch)}`)),
);

P1 Multi-page eval sockets routed to sensor path, not control

?ch=page-0 sets the ch query param, which the bridge uses to distinguish control vs sensor sockets (isSensor = ch !== "control"). A channel name like "page-0" is not "control", so every socket opened here lands in the sensor onmessage handler — which immediately drops text frames with if (!(event.data instanceof ArrayBuffer) …) return. The runEval JSON command is a text frame, so it is silently discarded. All parallel eval workflows will hang indefinitely waiting for an evalResult reply that never arrives.

The multi-channel routing key is channel (not ch). The URL should be ?channel=${ch}&ch=control so the socket is both directed to the right channel state and treated as a control socket.

Suggested change
- const sockets = await Promise.all(
- options.channels.map((ch) => _connect(`${options.wsUrl}/?ch=${encodeURIComponent(ch)}`)),
- );
+ const sockets = await Promise.all(
+ options.channels.map((ch) => _connect(`${options.wsUrl}/?channel=${encodeURIComponent(ch)}&ch=control`)),
+ );
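
To make the difference between the two URLs concrete, here is a small sketch of the routing rule as the comment describes it (`isSensor = ch !== "control"`, with `channel` as the multi-page key). `classifySocket`, `controlUrl`, and `SocketRoute` are illustrative names, not functions from bridge/server.ts, and returning `null` for a missing `channel` is an assumption; the bridge may apply a default instead.

```typescript
// Illustrative model of the routing rule quoted in the review; the real
// bridge code may structure this differently.
interface SocketRoute {
  channel: string | null; // multi-page routing key, null when absent
  isSensor: boolean;      // anything other than ch=control is a sensor socket
}

function classifySocket(rawUrl: string): SocketRoute {
  const params = new URL(rawUrl).searchParams;
  return {
    channel: params.get("channel"),
    isSensor: params.get("ch") !== "control",
  };
}

// Builds the URL form proposed in the suggested change.
function controlUrl(wsUrl: string, channel: string): string {
  return `${wsUrl}/?channel=${encodeURIComponent(channel)}&ch=control`;
}

// Old form: "page-0" is read as the ch value, so the socket is a sensor socket.
const oldRoute = classifySocket("ws://localhost:8090/?ch=page-0");
// New form: the socket is a control socket bound to channel "page-0".
const newRoute = classifySocket(controlUrl("ws://localhost:8090", "page-0"));
console.log(oldRoute, newRoute);
```

Under this model the old URL yields a sensor socket with no channel, which is why the text-frame `runEval` command is never seen by a control handler.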
