Nav pt4: Fix tranform frames by jeff-hykin · Pull Request #2112 · dimensionalOS/dimos

jeff-hykin · 2026-05-16T11:41:48Z

(Will fill out later)

Problem

Closes DIM-XXX

Solution

How to Test

Contributor License Agreement

I have read and approved the CLA.

… rrb Native module (cpp/main.cpp) now publishes two new streams on every keyframe: GraphNodes3D for keyframe optimized poses, LineSegments3D for odometry (traversability=1.0) and loop-closure (0.4) edges. Both wire through SimplePGO::keyPoses() + historyPairs() — no changes needed to simple_pgo.{h,cpp} since the accessors already exist. Native binary rebuilt cleanly via nix build .#default --no-write-lock-file. Python (pgo.py) declares matching pgo_graph_nodes / pgo_graph_edges Out streams so the rerun bridge auto-discovers and logs them. nav_stack_rerun_config() now picks _agentic_debug_rerun_blueprint when agentic_debug=True — an rrb.Horizontal layout with a 3D pane and a dedicated top-down pane (both Spatial3DView over origin="world", named "3D" and "top_down" so dimos-viewer persists camera state separately). demo_better_pgo_viz.py composes the cross-wall sim blueprint with agentic_debug=True so the new layout + pose graph render together. Used for manual screenshot validation.

Adds visual_override entries for world/pgo_graph_nodes and world/pgo_graph_edges that mirror the existing FAR pattern: when agentic_debug=True, the PGO pose graph renders at z=_AGENTIC_DEBUG_LIFT (3.0m) instead of the default 1.7m, with slightly larger node radii (0.15) and edge thickness (0.06) so the green keyframe trajectory stands out clearly above the terrain cloud in the top-down pane. Verified visually via demo_better_pgo_viz with the cross-wall sim — green keyframe nodes + edges are now plainly identifiable above terrain in both the 3D and top_down rerun panels.

rerun's Spatial3DView doesn't have a top-down camera API, so the "top_down" pane introduced in a7a9be9 was just a duplicate 3D view. Drop _agentic_debug_rerun_blueprint and use _default_rerun_blueprint unconditionally — the agentic_debug lift on visual_override is what actually makes the pose graph and nav markers readable from any angle.

C++ side (main.cpp): when searchForLoopPairs sets m_cache_pairs (i.e. this keyframe will be incorporated into iSAM2 with a loop factor), snapshot the current global poses before smoothAndUpdate. After the update, build a nav_msgs::Path-encoded LoopClosureDeltas message: position = post.t - r_delta * pre.t, orientation = quaternion(post.R * pre.R^T). Publish on the new pgo_loop_closure topic. Stderr logs the event count for live observability. Python side (pgo.py): declare pgo_loop_closure: Out[NavPath] so the new topic is registered alongside corrected_odometry/pgo_tf/etc. Slow test (test_pgo_loop_closure.py): replays og_nav_60s through the native binary with permissive thresholds (loop_time_thresh=5s, min_loop_detect_duration=1s, loop_search_radius=2m, loop_score_thresh=0.5) so the recording reliably triggers loop closures. Subscribes to pgo_loop_closure, logs each event the moment it arrives (event #, poses_length, frame_id, first delta), and after the run validates each event has >0 poses, finite translations (<100m), and unit-norm quaternions (drift <0.05). Stdout from a run shows 19 events, sizes 10..35, max |t|=0.0013m, max |q|-1|=1e-6 — exactly the small-nudge profile expected from a self-consistent recording.

Replaces the kdtree-on-keyframe-positions loop search with a Scan Context (Kim & Kim 2018) descriptor-based pipeline: 1. addKeyPose now also caches a polar-binned (20 rings × 60 sectors) max-z descriptor + the per-row mean "ring key" for each keyframe. The descriptor is appearance-based and pose-independent, so it keeps working even when odometry has drifted enough that the new keyframe is no longer "near" its old neighbours in pose-space. 2. searchForLoopPairs first asks Scan Context for a candidate: ring-key L2 distance ranks all past keyframes, top-K are scored by column-shifted cosine distance on the full descriptor, the best below the threshold (default 0.4) is the candidate. The winning column shift is also converted to a yaw rotation and used to seed ICP, which dramatically improves convergence on revisits that arrive at a different heading from the original pass. 3. Position-based search is retained as a fallback when SC is disabled or finds nothing, so existing behaviour is preserved. Replaces ~50 lines of position-search with ~30 lines of SC retrieval in searchForLoopPairs; new scan_context.{h,cpp} (~150 lines, MIT attribution to upstream irapkaist/scancontext concepts but no source copied) implements the descriptor + distance. Side-effect: this makes on-start relocalization a small follow-up addition — descriptors + ring-keys + poses are now per-keyframe state that can be serialised, and the SC search path already does "appearance-based pose recovery without an initial pose guess." Verified via test_pgo_loop_closure.py: 17 loop-closure events fired across the og_nav_60s rosbag (was 19 with naive position search; SC is more selective and rejects two borderline-position matches that weren't actually visual revisits). All events have valid shape + tiny quaternion/translation deltas as expected for a self-consistent bag.

…n search misses Adds CLI args to expose Scan Context config on the native binary (--use_scan_context, --sc_n_rings, --sc_n_sectors, --sc_max_range_m, --sc_top_k, --sc_match_threshold). New slow test test_pgo_synthetic_drift.py: - Synthesises a 4-wall point-cloud room with two distinctive interior columns (so the scene isn't rotationally symmetric). - Generates an out-and-back trajectory: drives east 8m then returns to the origin, heading unchanged. - Injects DRIFT_AT_REVISIT_M = 5m of additive y-drift into the reported odometry, ramped linearly with travelled distance. The body-frame scan stays byte-identical between first and second visit (same true sensor view of the same scene); the odom pose at revisit is 5m offset. - Runs the native PGO binary twice over the same input: * use_scan_context=true → expect ≥1 loop event * use_scan_context=false → expect 0 loop events (drift >> 1m radius) - Dumps PGO stderr after each run for diagnostics. Result: SC fires 10 loop closure events on the synthetic trajectory; position-based search fires 0 — exactly the demonstration of why we swapped to appearance-based place recognition. Both assertions pass. Verifies the core SC value prop: appearance-based place recognition doesn't depend on the (drifted) pose, so it keeps working when the odometry has wandered far enough that the kdtree-on-positions search no longer finds neighbours.

Test files now use setup_logger() / logger.info(...) per the fix_nits rule "no print() calls in tests; use logging if diagnostics are genuinely needed." Matches the existing test_pgo_rosbag.py convention. Also drops the now-unused sys import. Also clears a stale docstring on demo_better_pgo_viz.py: it claimed the demo enabled a "horizontal 3D + top-down panes" layout, which was reverted in 1801759 — rerun's Spatial3DView didn't support an initial camera angle (rrb.EyeControls3D existed at the time but wasn't used). The remaining value of agentic_debug=True is the visual override lift, which the new docstring describes accurately. No behavioural change. Tests still pass.

Sweep over names introduced by the better_pgo work that hit fix_nits "expand mod -> module" rule: - scan_context: cfg -> config (param + 12 call-sites); d (return val) -> descriptor in make_descriptor/make_ring_key/make_sector_key; pt -> point in the descriptor build loop; zf -> point_z (float cast); q_col/c_col -> query_column/candidate_column; q_norm/c_norm -> query_norm/ candidate_norm; cj -> shifted_j; d (in best_distance return loop) -> distance with min_distance for the running best. - simple_pgo: desc -> descriptor on the per-keyframe cache; k -> top_k_count for the partial-sort bound; structured-binding `auto [d, shift]` -> `auto [distance, shift]`. - main.cpp: kp -> keyframe; ps -> pose_stamped (build_graph_nodes and build_loop_closure_deltas); a/b -> start/end and p1/p2 -> start_pose/end_pose in append_segment; n -> count for the loop bound; lc_msg -> loop_closure_msg at the publish site. - tests: ps -> pose in the validate loop (test_pgo_loop_closure); c,s -> cos_yaw,sin_yaw in _yaw_rotation (test_pgo_synthetic_drift). Names that intentionally stay short are the math-convention ones: r/t for SE(3) rotation+translation, q for quaternion, i/j as loop indices, idx as keyframe index, ts as timestamp, dt for time delta, tx/ty/tz/qx/qy/qz/qw for component decomposition. The fix_nits rule calls out mod/lc as the target pattern; expanding the math-notation names would make the code less readable, not more. Also drops one section-label comment ("# Log each event the moment it arrives.") whose adjacent function name already conveys the same and one in-loop "# node_type 1 = odom/robot" that repeats info already stated in the function-level docstring. Native binary rebuilt + slow test still passes (17 events, all valid).

Drops in the wiring for evaluating the PGO native module on KITTI-360. Cannot run end-to-end yet — the dataset is gated behind a registered login at cvlibs.net so the data download is a manual user step. What's here: - kitti360_loader.py: parses the KITTI-360 directory layout (data_3d_raw + data_poses + calibration); composes per-frame lidar→world pose by chaining cam0_to_world ⊕ inv(velo_to_cam). Exposes a frame iterator + scan_xyz(frame_id). - loop_groundtruth.py: LCDNet/KITTI-convention groundtruth (≥50 frame gap, ≤4m radius), order-agnostic scoring of detected pairs. - run_kitti360_benchmark.py: argparse CLI, spawns the native binary on private LCM topics, plays (registered_scan, odometry) from disk, subscribes to pgo_graph_edges to extract loop-closure pairs (via traversability ≈ 0.4 segments) and pgo_loop_closure for delta event counts. Writes JSON. - README.md: download instructions for the official "Test SLAM 3D" 12 GB package, published SOTA reference numbers from LCDNet + ISC papers (LCDNet 0.91-0.93 AP, Scan Context 0.62-0.78 AP), expected ballpark for our minimal SC port.

… into jeff/clean/nav2

Source-of-truth wire layout for `nav_msgs.GraphNodes3D` — a bare list of node3d entries with no edges. Useful when a producer only wants to publish poses (keyframe positions, frontier points, navpoints) without graph connectivity. Same `node3d` wire layout as Graph3D's nodes, so a GraphNodes3D message is a Graph3D with `edge_count = 0`. Kaitai-Struct format lets us generate parsers in any language (Python/C++/Rust/etc) without hand-keeping each in sync. The Python counterpart already exists in GraphNodes3D.py; this commit just documents the wire layout.

Previously ``LineSegments3D`` stored its data on private attrs (``_segments``, ``_traversability``) with no accessors. Promote them to public class fields (``segments``, ``traversability``) so consumers can read them after decode, and add ``segment_timestamps`` — a list of ``(start_ts, end_ts)`` tuples preserved from each ``PoseStamped``'s header during ``lcm_decode``. The endpoint timestamps are useful when the segment endpoints map back to source events (e.g. pose-graph SLAM producers stamp each endpoint with its keyframe creation time, so a scorer can correlate edges back to input frame ids). Wire format unchanged — additive change in the Python wrapper only. Validated round-trip on a synthesized ``nav_msgs/Path`` that matches what the far_planner C++ binary publishes.

Header-only ``dimos::LineSegments3D`` wrapper matching the Python ``nav_msgs.LineSegments3D``. Lets a C++ native module publish line segments via ``add(x1,y1,z1, x2,y2,z2, traversability)`` and ``publish(lcm, channel)`` instead of hand-rolling ``nav_msgs::Path`` pose pairs with magic ``orientation.w`` values for the traversability tag. Lives next to the Python sibling so any future schema-bound changes keep both implementations side-by-side. Not used by anything yet — modules opt in by including ``msgs/LineSegments3D.hpp``. Wire format: ``nav_msgs/Path`` with consecutive pose pairs forming segments; ``orientation.w`` on the first pose carries traversability. Matches the Python ``lcm_decode`` byte-for-byte.

New nav_msgs.Graph3D for pose-graph / visibility-graph data with proper typed nodes and edges: Graph3D { u8 edge_count; u8 node_count; f8 timestamp; Node3D nodes[]; Edge edges[] } Node3D { PoseStamped pose; u8 id; u8 metadata_id } Edge { u8 start_id; u8 end_id; f8 timestamp; u8 metadata_id } Schema source-of-truth is Graph3D.ksy (Kaitai Struct, big-endian). Python encoder/decoder is in Graph3D.py (round-trip tested), matching C++ encoder is Graph3D.hpp (header-only, raw lcm.publish bytes). Both implementations are hand-written to match the ksy byte-for-byte. Replaces the convention of using nav_msgs/Path with orientation.w overloaded as node-type or traversability — which lost rotation, had no stable ids for edges to reference, and required custom decode per consumer. Graph3D edges reference nodes by id (not list index) so producers can reorder/re-emit nodes between snapshots; each node carries a full SE(3) PoseStamped; metadata_id is a caller-defined enum (e.g. PGO edges: 0=odometry / 1=loop_closure). Tests pin the wire layout (header order, pose-first inside node, edges decoding regardless of node table presence) so future drift between the ksy and the implementations is caught immediately. No consumers in this commit — modules opt in by importing Graph3D and publishing/subscribing to Out[Graph3D] / In[Graph3D].

…raph3D] Bumps the upstream pin to dimos-module-far-planner v0.7.0, which collapses the two-stream (GraphNodes3D + LineSegments3D) visibility-graph output into a single graph: Out[Graph3D] stream. Node metadata_id matches the previous GraphNodes3D node-type enum (0=normal, 1=odom, 2=goal, 3=frontier, 4=navpoint). Edge metadata_id encodes traversability tier (0=non-traversable, 1=partial, 2=both endpoints traversable) instead of the float orientation.w convention. nav_boundary stays on LineSegments3D — it's a polyline (collision boundaries with no graph topology), not a graph. Validated end-to-end with test_far_planner_rosbag.py against v0.7.0.

…p GraphNodes3D Collapse PGO's two-stream nav_msgs/Path encoding into a single pose_graph: Out[Graph3D] snapshot per keyframe cycle. Each node carries a full SE(3) PoseStamped (previously rotation was zeroed), a stable uint64 id (= keyframe index), and metadata_id = NODE_KEYFRAME. Edges reference nodes by id and carry metadata_id = EDGE_ODOMETRY (0) or EDGE_LOOP_CLOSURE (1) — replaces the orientation.w == 0.4 traversability tag previously used to mark loop closures. Touches: * pgo.py / specs.py: Out[NavPath] x 2 → Out[Graph3D] x 1. * cpp/main.cpp: build_graph_nodes + build_graph_edges + append_segment helpers collapse into build_pose_graph() returning dimos::Graph3D. * cpp/msgs/GraphNodes3D.hpp deleted (vendored copy no longer needed). * scoring.py: filter loop edges via metadata_id == EDGE_LOOP_CLOSURE instead of orientation.w ≈ 0.4. Endpoint frame_id lookup goes through node_id → node.pose.ts → timestamp_to_frame, removing the loop_closure_traversability / traversability_tolerance config. * nav_stack/main.py: _pose_graph_nodes_colors_debug + _pose_graph_edges_colors_debug → _pose_graph_colors_debug, calling Graph3D.to_rerun_multi to render nodes + edges under separate rerun sub-paths. * runner.py / README.md docstring updates. Deletes dimos/msgs/nav_msgs/GraphNodes3D.py — no longer has consumers after this commit and the matching far_planner switch. Validated by: * test_Graph3D.py round-trip + wire-layout pin tests pass. * test_far_planner_rosbag.py (uses the same Graph3D Python type for far_planner's graph stream) passes. * benchmark_kitti360_smoke deterministically produces keyframes across 5 consecutive runs at publish_interval_sec=0.02 (50 Hz), including publish_interval_sec=0.0 (unbounded).

The smoke test isn't a demo (it runs PGO end-to-end with KITTI-360 playback and reports per-topic message counts as a liveness check), and it's specific to KITTI-360 — the new name reflects both. Also matches the sibling benchmark module names under ``dimos/navigation/nav_stack/benchmarks/pose_graph_kitti360/``. Pure rename + topic count updates: the TopicCounterModule subscribes to pose_graph (Graph3D) instead of pose_graph_nodes + pose_graph_edges (matching the PGO output collapse in the previous commit), and the verdict messages collapse accordingly.

Earlier rename to test_pgo_loop_correction_delta.py (29dfafe) was out-of-sync with main, where the file went the opposite direction. Restore the upstream filename to avoid a manual rename-direction conflict on the next merge. Internal symbols (LoopCorrectionDeltaRecorderModule, the loop_correction_delta stream attribute, etc.) stay — those describe what the stream actually carries (per-keyframe SE(3) deltas, not loop closures themselves).

… into jeff/feat/better_pgo

`topic-counter-module` path updated to reflect the demo_benchmark_kitti_smoke → benchmark_kitti360_smoke rename (commit f97e1bd).

New nav_msgs.GraphDelta3D: two aligned arrays — a list of nodes (reusing Graph3D.Node3D byte-for-byte) and a list of SE(3) Transforms, one per node. ``transforms[i]`` is the delta about to be applied to ``nodes[i].pose``: ``post = transforms[i] * nodes[i].pose`` (left-multiply). Layout (big-endian, mirrors Graph3D conventions): u8 node_count f8 timestamp Node3D nodes[node_count] // same wire bytes as Graph3D nodes Transform transforms[node_count] // 7×f8 (translation xyz + quat xyzw) Test pin (test_node_layout_matches_graph3d) asserts that a GraphDelta3D.nodes[i] is bit-identical to a Graph3D.nodes[i] for the same Node3D, so consumers can correlate ids across the two messages. PGO wiring: * pgo.py / specs.py: ``loop_correction_delta: Out[NavPath]`` → ``loop_closure_event: Out[GraphDelta3D]``. Renamed too — the stream carries a graph-mutation event, not just a delta payload. * cpp/main.cpp: build_loop_correction_delta (returned nav_msgs::Path with position-encodes-delta hack) → build_loop_closure_event returning dimos::GraphDelta3D. Each (node, transform) pair carries the pre-smooth keyframe (id, pose, ts) and the iSAM2-applied delta. * cpp/msgs/GraphDelta3D.hpp vendored. Downstream consumers updated to subscribe ``loop_closure_event: In[GraphDelta3D]``: * PoseGraphScoringModule * test_pgo_loop_closure recorder (full payload validation rewrite — unit-quaternion + finite-translation checks now read each transform.rotation / transform.translation instead of pose.orientation / pose.position) * test_pgo_synthetic_drift counter * benchmark_kitti360_smoke TopicCounter * runner.py docstring 11/11 unit tests pass (5 GraphDelta3D layout, 5 Graph3D layout, 1 blueprints registry). PGO native rebuilt — new topic ``loop_closure_event`` confirmed in binary symbols.

# Conflicts: # dimos/core/native_module.py # dimos/msgs/nav_msgs/LineSegments3D.py # dimos/navigation/nav_stack/benchmarks/pose_graph_kitti360/scoring.py # dimos/navigation/nav_stack/modules/far_planner/far_planner.py # dimos/navigation/nav_stack/modules/pgo/benchmark_kitti360_smoke.py # dimos/navigation/nav_stack/modules/pgo/cpp/main.cpp # dimos/navigation/nav_stack/modules/pgo/pgo.py # dimos/navigation/nav_stack/specs.py

# Conflicts: # dimos/navigation/nav_stack/modules/pgo/cpp/main.cpp

- main.py: _graph_colors_debug body already conveys what it does; matches sibling _pose_graph_colors_debug shape. - test_pgo_rosbag.py: drop "# -- Analysis --" section header and the multi-line docstring explaining why corrected_tf isn't subscribed (provenance belongs in PR description, not the test class).

PGO's pose_graph was rendering as nodes-only outside agentic_debug mode: with no visual_override registered, the bridge fell back to Graph3D.to_rerun() which intentionally returns just rr.Points3D of the nodes (to_rerun_multi is the path that emits edges too). Bridge fix: in final_convert, prefer to_rerun_multi(base_path=...) when the message exposes it. The bridge already knows the entity_path it would log to, so pass it as base_path. Graph3D consumers now get nodes + edges by default, no per-blueprint visual_override required. Existing visual_overrides still win because they run before final_convert in the pipe. Only Graph3D currently defines to_rerun_multi, so this is effectively a Graph3D-rendering fix; other RerunConvertible types (TFMessage, etc.) fall through to to_rerun() unchanged.

jeff-hykin added 30 commits May 12, 2026 07:48

merge main into jeff/feat/better_pgo

e45b228

make model_path optional

1cf0045

add flowbase robot

c9a426a

add alfred

272833c

add blueprint

0863ad1

naming

9020415

name fix

6ea0251

mypy

8e9c445

mypy

277cd29

-

f6010b0

-

4f43322

fixup vis_throttle

c0f6bb9

-

45f57a5

Merge branch 'jeff/feat/better_pgo' of github.com:dimensionalOS/dimos…

d1d484f

… into jeff/clean/nav2

add recording topics

726808e

-

84bfe56

-

36a7454

Merge branch 'jeff/clean/nav2' into jeff/clean/nav3

fe3c43d

-

8fe1f54

cleanup

d780dac

Merge branch 'jeff/clean/nav2' into jeff/clean/nav3

559a0c1

jeff-hykin added 30 commits May 16, 2026 20:44

fixup on macos

5697835

fixup LoopClosure spec

547205b

-

ab133a0

fix timeout

df16d46

Merge remote-tracking branch 'origin/main' into jeff/feat/better_pgo

c138f66

Merge remote-tracking branch 'origin' into jeff/clean/nav3

b9df4fa

-

29dfafe

Merge remote-tracking branch 'origin/main' into HEAD

cf6772e

Merge remote-tracking branch 'origin/main' into HEAD

5769d48

build native modules in the build step

124c4e3

remove .ksy

be9f2ca

Merge remote-tracking branch 'origin/main' into jeff/feat/better_pgo

7311034

Merge branch 'jeff/feat/better_pgo' of github.com:dimensionalOS/dimos…

bf3c570

… into jeff/feat/better_pgo

chore(blueprints): regenerate all_blueprints.py for smoke-test rename

1c59f23

`topic-counter-module` path updated to reflect the demo_benchmark_kitti_smoke → benchmark_kitti360_smoke rename (commit f97e1bd).

misc

6c87dcc

clean comments

446fa16

Merge branch 'jeff/feat/better_pgo' into jeff/clean/nav3

444b0ad

# Conflicts: # dimos/navigation/nav_stack/modules/pgo/cpp/main.cpp

clean

e0002e4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nav pt4: Fix tranform frames#2112

Nav pt4: Fix tranform frames#2112
jeff-hykin wants to merge 142 commits into
mainfrom
jeff/clean/nav3

jeff-hykin commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jeff-hykin commented May 16, 2026

Problem

Solution

How to Test

Contributor License Agreement

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants