Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
167 changes: 167 additions & 0 deletions .github/workflows/collision-monitor-integration.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
# Layer-2 "deploy to customer" integration: run the LICENSED, pre-bundled
# collision_monitor node in parallel with the lab_sim MoveIt Pro integration
# test.
#
# IP boundary (this repo is PUBLIC; the monitor is PROPRIETARY):
# * This workflow PULLS a pre-built, license-gated monitor bundle image and
# uses it as the build base. It NEVER clones or builds collision_monitor
# source. The only collision-monitor-specific things in this public repo
# are the lab_sim_collision_monitor usage example (launch/params/test).
# * The monitor's engine constructor is license-gated, so the example only
# runs with a valid MOVEIT_LICENSE_KEY (supplied here from the
# STUDIO_CI_LICENSE_KEY secret). Without it the monitor node dies on init
# and the Layer-2 test fails loudly.
#
# Dispatch contract consumed (workstream C / upstream must match EXACTLY):
# repository_dispatch event_type: collision-monitor-integration
# client_payload:
# bundle_image (REQUIRED) container image ref of the licensed,
# pre-bundled monitor (monitor + licensing + coal +
# runtime deps, layered on the MoveIt Pro base).
# collision_monitor_ref (optional) upstream sha/branch, logged only.
# upstream_run_id (optional) upstream run id, logged only.
name: Collision Monitor Integration (Layer 2)

on:
repository_dispatch:
types: [collision-monitor-integration]
workflow_dispatch:
inputs:
bundle_image:
description: >-
Licensed, pre-bundled collision_monitor image to use as the build
base (monitor + licensing + coal + runtime deps).
required: true
type: string
collision_monitor_ref:
description: 'Upstream collision_monitor ref (logged only).'
required: false
default: ''
type: string

concurrency:
group: collision-monitor-integration-${{ github.ref }}
cancel-in-progress: true

jobs:
layer2-parallel-monitor:
name: lab_sim + licensed monitor (parallel)
runs-on: picknik-16-amd64
steps:
- name: Resolve dispatch inputs
id: inputs
env:
DISPATCH_BUNDLE: ${{ github.event.client_payload.bundle_image }}
DISPATCH_REF: ${{ github.event.client_payload.collision_monitor_ref }}
DISPATCH_RUN_ID: ${{ github.event.client_payload.upstream_run_id }}
MANUAL_BUNDLE: ${{ inputs.bundle_image }}
MANUAL_REF: ${{ inputs.collision_monitor_ref }}
run: |
set -euo pipefail
bundle="${DISPATCH_BUNDLE:-$MANUAL_BUNDLE}"
ref="${DISPATCH_REF:-$MANUAL_REF}"
run_id="${DISPATCH_RUN_ID:-n/a}"
if [ -z "${bundle}" ]; then
echo "::error::No bundle_image provided in client_payload or workflow_dispatch inputs." >&2
echo "The licensed monitor bundle image is required; this workflow does NOT build monitor source." >&2
exit 1
fi
echo "Using licensed monitor bundle image: ${bundle}"
echo "Upstream collision_monitor ref (informational): ${ref:-n/a}"
echo "Upstream run id (informational): ${run_id}"
echo "bundle_image=${bundle}" >> "${GITHUB_OUTPUT}"

- name: Verify license secret present (fail loudly if not)
env:
MOVEIT_LICENSE_KEY: ${{ secrets.STUDIO_CI_LICENSE_KEY }}
run: |
set -euo pipefail
if [ -z "${MOVEIT_LICENSE_KEY}" ]; then
echo "::error::STUDIO_CI_LICENSE_KEY secret is empty. The collision_monitor engine is" >&2
echo "license-gated and the Layer-2 example cannot run without a valid key." >&2
exit 1
fi
echo "License key present (value masked)."

- name: Checkout example_ws (usage example only — NO monitor source)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
submodules: recursive

Comment on lines +86 to +90

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/collision-monitor-integration.yaml

Repository: PickNikRobotics/moveit_pro_example_ws

Length of output: 8394


Disable persisted Git credentials for checkout.

This workflow does not perform any git operations after checkout (no push, pull, or clone), so persisting credentials is unnecessary security exposure.

Proposed fix
       - name: Checkout example_ws (usage example only — NO monitor source)
         uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
         with:
           submodules: recursive
+          persist-credentials: false
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Checkout example_ws (usage example only — NO monitor source)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
submodules: recursive
- name: Checkout example_ws (usage example only — NO monitor source)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
submodules: recursive
persist-credentials: false
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 86-89: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/collision-monitor-integration.yaml around lines 86 - 90,
Disable credential persistence in the checkout step named "Checkout example_ws
(usage example only — NO monitor source)" by adding the persist-credentials
parameter set to false in the with block of the actions/checkout action. This
prevents unnecessary retention of Git credentials since no git operations occur
after this checkout step.

Source: Linters/SAST tools

- name: Detect bundle registry credentials
id: regcreds
env:
REG_USER: ${{ secrets.BUNDLE_REGISTRY_USERNAME }}
run: |
set -euo pipefail
if [ -n "${REG_USER}" ]; then
echo "have_creds=true" >> "${GITHUB_OUTPUT}"
else
echo "have_creds=false" >> "${GITHUB_OUTPUT}"
fi

- name: Log in to the bundle image registry
if: ${{ steps.regcreds.outputs.have_creds == 'true' }}
uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772 # v3.4.0
with:
registry: ${{ secrets.BUNDLE_REGISTRY }}
username: ${{ secrets.BUNDLE_REGISTRY_USERNAME }}
password: ${{ secrets.BUNDLE_REGISTRY_PASSWORD }}

- name: Pull the licensed monitor bundle image
env:
BUNDLE_IMAGE: ${{ steps.inputs.outputs.bundle_image }}
run: |
set -euo pipefail
echo "Pulling pre-bundled licensed monitor image (no source build): ${BUNDLE_IMAGE}"
docker pull "${BUNDLE_IMAGE}"

- name: Build the example_ws overlay on top of the bundle image
env:
BUNDLE_IMAGE: ${{ steps.inputs.outputs.bundle_image }}
run: |
set -euo pipefail
# Layer the public example workspace (lab_sim + lab_sim_collision_monitor)
# on top of the licensed bundle. The bundle already contains the built,
# licensed collision_monitor overlay; we only build the example packages.
# NOTE: monitor source is never present here — only the prebuilt bundle.
docker build \
--build-arg "MOVEIT_PRO_BASE_IMAGE=${BUNDLE_IMAGE}" \
-f ./Dockerfile \
-t lab-sim-with-monitor:ci \
.

- name: Run the Layer-2 parallel-monitor test
env:
BUNDLE_IMAGE: lab-sim-with-monitor:ci
MOVEIT_LICENSE_KEY: ${{ secrets.STUDIO_CI_LICENSE_KEY }}
run: |
set -euo pipefail
# Headless, license-keyed run of lab_sim + the licensed monitor.
# Coarsen the MuJoCo timestep on CI (matches ci.yaml) so the heavier
# 3.6.0 solver stays at-or-under realtime on the runner.
docker run --rm \
-e MOVEIT_CONFIG_PACKAGE=lab_sim \
-e MOVEIT_LICENSE_KEY="${MOVEIT_LICENSE_KEY}" \
-e MUJOCO_CI_TIMESTEP=0.003 \
"${BUNDLE_IMAGE}" \
bash -lc '
set -euo pipefail
source /opt/ros/humble/setup.bash
source "${USER_WS}/install/setup.bash"
colcon test \
--packages-select lab_sim_collision_monitor \
--executor sequential \
--event-handlers console_direct+
colcon test-result --verbose
'
Comment on lines +143 to +157

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -name "collision-monitor-integration.yaml" -type f 2>/dev/null | head -5

Repository: PickNikRobotics/moveit_pro_example_ws

Length of output: 63


🏁 Script executed:

# First, let's locate the workflow file in the repository
git ls-files | grep -i collision-monitor-integration

Repository: PickNikRobotics/moveit_pro_example_ws

Length of output: 134


🏁 Script executed:

# Get the workflow file and examine it around the specified lines
cat -n .github/workflows/collision-monitor-integration.yaml | sed -n '130,175p'

Repository: PickNikRobotics/moveit_pro_example_ws

Length of output: 1842


Add volume mount and copy test results from container to host filesystem.

Test results written inside the container during colcon test are lost when docker run --rm exits, so the artifact upload step has nothing to collect. Bind mount the host workspace and copy results out after test completion.

Proposed fix
       - name: Run the Layer-2 parallel-monitor test
         env:
           BUNDLE_IMAGE: lab-sim-with-monitor:ci
           MOVEIT_LICENSE_KEY: ${{ secrets.STUDIO_CI_LICENSE_KEY }}
         run: |
           set -euo pipefail
+          mkdir -p build/lab_sim_collision_monitor/test_results log/latest_test
           # Headless, license-keyed run of lab_sim + the licensed monitor.
           # Coarsen the MuJoCo timestep on CI (matches ci.yaml) so the heavier
           # 3.6.0 solver stays at-or-under realtime on the runner.
           docker run --rm \
+            -v "${PWD}:/host_ws" \
             -e MOVEIT_CONFIG_PACKAGE=lab_sim \
             -e MOVEIT_LICENSE_KEY="${MOVEIT_LICENSE_KEY}" \
             -e MUJOCO_CI_TIMESTEP=0.003 \
             "${BUNDLE_IMAGE}" \
             bash -lc '
               set -euo pipefail
               source /opt/ros/humble/setup.bash
               source "${USER_WS}/install/setup.bash"
               colcon test \
                 --packages-select lab_sim_collision_monitor \
                 --executor sequential \
                 --event-handlers console_direct+
               colcon test-result --verbose
+              cp -r "${USER_WS}/build/lab_sim_collision_monitor/test_results/." \
+                /host_ws/build/lab_sim_collision_monitor/test_results/ || true
+              cp -r "${USER_WS}/log/latest_test/." /host_ws/log/latest_test/ || true
             '

Also applies to: 159-167

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/collision-monitor-integration.yaml around lines 143 - 157,
The docker run command in the colcon test execution is losing test results
because the container is being removed (--rm flag) after it exits, making
artifacts unavailable for the subsequent upload step. Add a volume mount flag to
bind the host workspace directory to the container filesystem so that test
results are accessible from the host after the container exits. Additionally,
ensure that the colcon test results are being written to a location within the
mounted volume so they persist after the container terminates and can be
collected by the artifact upload step.


- name: Upload test results
if: ${{ always() }}
uses: actions/upload-artifact@018cc2cf5baa6db3ef3c5f8a56943fffe632ef53 # v4.6.2
with:
name: collision-monitor-integration-results
path: |
build/lab_sim_collision_monitor/test_results/**
log/latest_test/**
if-no-files-found: ignore
108 changes: 80 additions & 28 deletions src/hangar_sim/objectives/ml_move_boxes_to_loading_zone.xml
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
<?xml version="1.0" encoding="UTF-8" ?>
<root BTCPP_format="4" main_tree_to_execute="ML Move Boxes to Loading Zone">
<!--//////////-->
<BehaviorTree
ID="ML Move Boxes to Loading Zone"
_description="Move all of the boxes into the loading zone using ML perception."
Expand All @@ -14,33 +13,85 @@
orientation_xyzw="0;1.0;0;0"
position_xyz="-4.5;8.4;0.5"
/>
<Decorator ID="RetryUntilSuccessful" num_attempts="1000">
<Decorator ID="ForceFailure">
<!--Initialize the drained signal and the per-iteration pick flags.

`drained := false` is the load-bearing one — if the very first iteration of
the loop below fails before the in-loop `Script(drained := ...)` runs, the
post-loop `ScriptCondition(drained)` would otherwise read an undefined value.

`picked_at_wp1 := false` / `picked_at_wp2 := false` are defensive. Each pick
waypoint Subtree resets its `picked_box` output port at the top of its own
body and again before wp2 below, so the in-loop reads are always fresh; this
outer init only matters if the Subtree fails before either of those resets
runs.

Compatibility note: this Objective relies on three runtime contracts that the
loaded `moveit_pro` build must satisfy — `ScriptCondition` (built-in BT.CPP 4),
`GetMasks2DFromExemplar` (SAM3 multimodal segmenter, 9.3+) which exposes the
`mask_count` output port the inner Subtree uses, `RepeatUnlessFailureEachTick`
(9.3+), and BT.CPP 4's SKIPPED-status propagation through `_skipIf`-decorated
nodes and parent Sequences. Older runtimes will either fail to load the tree
or behave incorrectly on the `_skipIf` paths.-->
<Action
ID="Script"
code="picked_at_wp1 := false; picked_at_wp2 := false; drained := false"
/>
<!--Loop the pick cycle until the scene is drained (drained=true), and surface
real pick errors as Objective FAILURE.

Each iteration: visit `View Boxes`, then `View Boxes 2` (only if the first
did not pick — saves a redundant scan once the first waypoint is the
productive one). Each Subtree returns SUCCESS with `picked_box` set true
or false on the happy path, and FAILURE only on real pick errors (the
inner Subtree's `Inverter > ForEach > Inverter` retains the existing "try
each mask, succeed if any" resilience — see that Subtree for the per-mask
FAILURE caveat).

The Script step computes whether anything was picked this cycle; the inner
`ScriptCondition(!drained)` converts a "nothing picked" cycle into FAILURE
that breaks the `RepeatUnlessFailureEachTick(-1)` loop.

The outer Fallback then distinguishes the two FAILURE causes of the loop:
- drained=true → expected exit, the post-loop ScriptCondition succeeds → SUCCESS.
- drained=false → real pick error propagated from a Subtree, the post-loop
ScriptCondition also fails → Objective FAILURE.-->
<Control ID="Fallback">
<Decorator ID="RepeatUnlessFailureEachTick" num_cycles="-1">
<Control ID="Sequence">
<Control ID="Fallback">
<SubTree
ID="Move Boxes to Loading Zone Start from Waypoint"
waypoint_name="View Boxes"
_collapsed="true"
threshold="0.25"
prompt="a small brown box"
negative_prompts="yellow ladder;"
prompts="small boxes"
place_pose="{place_pose}"
erosion_size="10"
/>
<SubTree
ID="Move Boxes to Loading Zone Start from Waypoint"
waypoint_name="View Boxes 2"
place_pose="{place_pose}"
threshold="0.4"
_collapsed="true"
prompts="small boxes"
negative_prompts="gray;gray;rails"
erosion_size="2"
prompt="{prompt}"
/>
</Control>
<SubTree
ID="Move Boxes to Loading Zone Start from Waypoint"
_collapsed="true"
waypoint_name="View Boxes"
confidence_threshold="0.25"
text_prompt="a single small orange cube box"
place_pose="{place_pose}"
picked_box="{picked_at_wp1}"
/>
<!--Reset picked_at_wp2 before the (possibly-skipped) wp2 SubTree so the
"did wp2 contribute this iteration?" reading is never stale from a prior
iteration. Without this, `_skipIf="picked_at_wp1"` would leave the wp2
SubTree's output port unwritten and the next Script would OR against a
stale value. The drained computation happens to be correct today (wp1
picked, so the OR short-circuits true), but any future read of
picked_at_wp2 — telemetry, an extra waypoint, etc. — would see stale
data and the invariant "picked_at_wpN iff wpN picked in this iteration"
would silently break.-->
<Action ID="Script" code="picked_at_wp2 := false" />
<SubTree
ID="Move Boxes to Loading Zone Start from Waypoint"
_collapsed="true"
_skipIf="picked_at_wp1"
waypoint_name="View Boxes 2"
confidence_threshold="0.25"
text_prompt="a single small orange cube box"
place_pose="{place_pose}"
picked_box="{picked_at_wp2}"
/>
<Action
ID="Script"
code="drained := !(picked_at_wp1 || picked_at_wp2)"
/>
<Action ID="ScriptCondition" code="!drained" />
<Action
ID="TransformPose"
input_pose="{place_pose}"
Expand All @@ -49,7 +100,8 @@
/>
</Control>
</Decorator>
</Decorator>
<Action ID="ScriptCondition" code="drained" />
</Control>
</Control>
</BehaviorTree>
<TreeNodesModel>
Expand Down
20 changes: 18 additions & 2 deletions src/hangar_sim/objectives/move_boxes_looping.xml
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,27 @@
_skipIf="false"
ID="Sequence"
>
<!--Loop the drain-and-reset cycle and surface real pick errors as Objective
FAILURE.

Each iteration: run `ML Move Boxes to Loading Zone` (returns SUCCESS once
the scene is drained, FAILURE on a real pick error), then `Reset MuJoCo Sim`
to repopulate boxes for the next iteration.

`RepeatUnlessFailureEachTick` (vs. the previous `KeepRunningUntilFailure`)
propagates a child FAILURE up as Objective FAILURE instead of converting it
to SUCCESS. The old decorator made a real pick error indistinguishable from
a clean stop, which masked regressions in soak runs; this Objective is meant
to be a torture test, so a real failure should fail loudly.

Compatibility note: `RepeatUnlessFailureEachTick` requires moveit_pro 9.3+;
older runtimes will fail to load this Objective.-->
<Decorator
ID="KeepRunningUntilFailure"
name="KeepRunningUntilFailure"
ID="RepeatUnlessFailureEachTick"
name="RepeatUnlessFailureEachTick"
_collapsed="false"
_skipIf="false"
num_cycles="-1"
>
<Control
ID="Sequence"
Expand Down
Loading