feat: single bringup launch for gateway + fault_manager + bridges #424

@mfaferek93

Today each medkit node ships its own single-node launch file and nothing composes them: a user must run the gateway, the fault_manager, and each bridge as separate processes and know the topic/service wiring between them. No launch file uses IncludeLaunchDescription, and fault_manager.launch.py passes no parameters, so the snapshot config is never loaded and rosbag/black-box capture is off on the default path.

Add a single bringup.launch.py (in ros2_medkit_gateway or a dedicated bringup package) that composes the full local stack in one command.

Scope

  • Compose the existing per-package launch files via IncludeLaunchDescription: gateway, fault_manager, and the bridges (diagnostic / log / action_status).
  • One shared params file that wires the fault_manager.namespace the gateway resolves, the snapshot + rosbag capture config, and sane fault defaults.
  • Launch args to toggle each component and to set bind host / port.
  • Quickstart docs in Install / Run / Verify form (Verify = induce a fault and see it).
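
A minimal sketch of what such a bringup.launch.py could look like, assuming each per-package launch file is extended to accept params_file, host and port arguments (fault_manager.launch.py takes none today). Apart from ros2_medkit_gateway and fault_manager.launch.py, the package names, launch file names, argument names, default host/port and the config/bringup_params.yaml path are placeholders, not existing names.

```python
from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument, IncludeLaunchDescription
from launch.conditions import IfCondition
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch.substitutions import LaunchConfiguration, PathJoinSubstitution
from launch_ros.substitutions import FindPackageShare

# (toggle argument, package, launch file).
# Hypothetical names, except ros2_medkit_gateway and fault_manager.launch.py.
COMPONENTS = [
    ("enable_gateway", "ros2_medkit_gateway", "gateway.launch.py"),
    ("enable_fault_manager", "ros2_medkit_fault_manager", "fault_manager.launch.py"),
    ("enable_diagnostic_bridge", "ros2_medkit_diagnostic_bridge", "diagnostic_bridge.launch.py"),
    ("enable_log_bridge", "ros2_medkit_log_bridge", "log_bridge.launch.py"),
    ("enable_action_status_bridge", "ros2_medkit_action_status_bridge", "action_status_bridge.launch.py"),
]


def generate_launch_description():
    # Shared arguments forwarded to every included launch file.
    # Default values here are assumptions for illustration only.
    actions = [
        DeclareLaunchArgument(
            "params_file",
            default_value=PathJoinSubstitution(
                [FindPackageShare("ros2_medkit_gateway"), "config", "bringup_params.yaml"]
            ),
            description="Shared params file for the whole stack",
        ),
        DeclareLaunchArgument("host", default_value="127.0.0.1"),
        DeclareLaunchArgument("port", default_value="8080"),
    ]

    shared_arguments = {
        "params_file": LaunchConfiguration("params_file"),
        "host": LaunchConfiguration("host"),
        "port": LaunchConfiguration("port"),
    }

    for toggle, package, launch_file in COMPONENTS:
        # One boolean launch argument per component, on by default.
        actions.append(DeclareLaunchArgument(toggle, default_value="true"))
        actions.append(
            IncludeLaunchDescription(
                PythonLaunchDescriptionSource(
                    # Resolved lazily, so a disabled component's package
                    # does not need to be installed.
                    PathJoinSubstitution([FindPackageShare(package), "launch", launch_file])
                ),
                launch_arguments=shared_arguments.items(),
                condition=IfCondition(LaunchConfiguration(toggle)),
            )
        )

    return LaunchDescription(actions)
```

With this shape, a component would be switched off with something like `ros2 launch <pkg> bringup.launch.py enable_log_bridge:=false`.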

Defaults shipped by the params file

The per-node defaults are conservative and leave the headline features (fault healing and black-box capture) disabled in a drop-in stack. The shared params file enables them:

  • healing_enabled:=true so recoveries clear faults (action SUCCEEDED -> HEALED); off by default leaves the heal path inert.
  • snapshots.rosbag.enabled:=true for black-box capture (crash-safe per #425, "fix(fault_manager): make rosbag capture enablement crash-safe").
  • Sane fault_reporter.local_filtering.* and severity_floor (ERROR confirms, WARN debounced), documented as tunable. entity_thresholds.config_file is available for per-entity overrides.
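
To illustrate how shipped defaults and user overrides could combine, here is a small sketch in plain Python. It covers only the two parameters whose values are stated above. The issue does not say which node owns each parameter, so no node names appear, and the helper names (SHARED_DEFAULTS, nest, effective_params) are made up for this example.

```python
# Defaults with explicit values in the list above, as flat dotted names.
SHARED_DEFAULTS = {
    "healing_enabled": True,
    "snapshots.rosbag.enabled": True,
}


def nest(flat):
    """Expand dotted parameter names into a nested mapping."""
    nested = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


def effective_params(user_overrides):
    """User values win over the shipped defaults, key by key."""
    return nest({**SHARED_DEFAULTS, **user_overrides})


if __name__ == "__main__":
    # A user who wants the black-box but not auto-healing.
    print(effective_params({"healing_enabled": False}))
```

In a real launch file the same effect would normally come from passing the shared params file first and the user's file second, since later parameter sources override earlier ones.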

Known limit (not a blocker here): log faults have no recovery signal, so LOG_* faults do not auto-clear on a long run even with healing on.

Multi-domain note: a robot spanning multiple ROS_DOMAIN_ID / RMW_IMPLEMENTATION runs one bringup per domain, federated by the gateway's peer aggregation (aggregation.enabled + peer_urls or mDNS) into one tree / API / UI. Document this topology in the quickstart.
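
The one-bringup-per-domain topology could be scripted roughly as follows. This is a sketch under assumptions: the port launch argument name and the choice of ros2_medkit_gateway as the hosting package are not settled by this issue, and bringup_invocations is a hypothetical helper.

```python
import os


def bringup_invocations(domain_ports, package="ros2_medkit_gateway"):
    """Return one (env, argv) pair per ROS_DOMAIN_ID.

    domain_ports maps a domain id to the gateway port for that domain.
    """
    invocations = []
    for domain_id, port in sorted(domain_ports.items()):
        # Each bringup gets its own domain through the environment.
        env = dict(os.environ, ROS_DOMAIN_ID=str(domain_id))
        # "port:=" is an assumed launch argument name.
        argv = ["ros2", "launch", package, "bringup.launch.py", f"port:={port}"]
        invocations.append((env, argv))
    return invocations


if __name__ == "__main__":
    for env, argv in bringup_invocations({0: 8080, 1: 8081}):
        print(env["ROS_DOMAIN_ID"], " ".join(argv))
```

Each pair can be handed to subprocess.Popen(argv, env=env). Every bringup would still need aggregation.enabled and peer_urls (or mDNS) set in its params file for the gateways to federate.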

Acceptance

  • ros2 launch <pkg> bringup.launch.py starts gateway + fault_manager + bridges as one process group.
  • A fault induced on a connected stack is visible via the gateway API with a snapshot, with no further manual wiring.
  • Default params produce a confirmed fault plus a non-empty black-box on the happy path, and every value is overridable from a user params file.

Depends on the bridge packages (#420, #421).
