Today the rosbag black-box default (snapshots.rosbag.topics: "config") resolves topics from snapshots.yaml; with no config file it subscribes to nothing, so a confirmed fault produces an empty bag. The existing "all" mode records the whole graph but is blunt (not fault-relevant). It also subscribes with a hardcoded SensorDataQoS (BEST_EFFORT / volatile, rosbag_capture.cpp:297), so RELIABLE / transient_local topics are captured unfaithfully. Finally, the ring buffer is time-bounded only (prune_buffer, rosbag_capture.cpp:403) with no memory cap, so recording high-rate topics grows RAM without bound.
Make the black-box give fault-relevant value out of the box with zero per-stack config, while keeping full manual control.
Scope
- New default topic-selection mode (entity-scoped): buffer continuously, and on fault confirmation write only the topics published/subscribed by the faulting entity (resolved from the fault source_id node FQN via get_publisher_names_and_types_by_node / get_subscription_names_and_types_by_node) plus always-on context (/tf, /tf_static). Keeps the pre-fault ring while keeping the on-disk bag small and relevant.
- QoS fidelity: subscribe with each topic's offered QoS instead of a fixed SensorDataQoS, matching reliability/durability/depth per topic (the pattern already exists in snapshot_capture.cpp:199 via get_publishers_info_by_topic). Define a policy for topics whose publishers offer differing QoS.
- Bound the ring buffer by memory: add a byte/size cap on top of the time window, and ship sensible default exclude_topics patterns for high-bandwidth topics (image / compressed / points / depth).
- Preserve all existing modes (config, explicit, all, comma-separated list) and the include_topics / exclude_topics overrides, which keep applying on top of any mode. Manual modes stay literal (no entity flush-filter). Existing topics: config + snapshots.yaml behaviour is unchanged.
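To make the entity-scoped flush filter concrete, here is a minimal standard-library sketch of the selection step. It is an illustration, not the proposed implementation: select_flush_topics and matches_any are hypothetical names, the faulting entity's topics are passed in as plain strings (in the real node they would come from the graph queries named above), and both the substring matching and the rule that exclude_topics wins over include_topics are assumptions.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: true if `topic` contains any pattern as a substring.
// The real exclude_topics matching rules (glob, regex, prefix) may differ.
inline bool matches_any(
  const std::string & topic, const std::vector<std::string> & patterns)
{
  return std::any_of(
    patterns.begin(), patterns.end(),
    [&topic](const std::string & p) {return topic.find(p) != std::string::npos;});
}

// Hypothetical selection step run on fault confirmation.
// entity_topics: topics published/subscribed by the faulting node.
// include_topics / exclude_patterns: the user overrides applied on top.
inline std::set<std::string> select_flush_topics(
  const std::vector<std::string> & entity_topics,
  const std::vector<std::string> & include_topics,
  const std::vector<std::string> & exclude_patterns)
{
  // Always-on context comes first.
  std::set<std::string> selected = {"/tf", "/tf_static"};
  selected.insert(entity_topics.begin(), entity_topics.end());
  selected.insert(include_topics.begin(), include_topics.end());

  // Assumption: excludes are applied last, so they win over includes.
  for (auto it = selected.begin(); it != selected.end(); ) {
    if (matches_any(*it, exclude_patterns)) {
      it = selected.erase(it);
    } else {
      ++it;
    }
  }
  return selected;
}
```

For example, with entity topics /scan and /camera/image_raw, an include of /odom and an exclude pattern of image, the sketch selects /tf, /tf_static, /scan and /odom. The ring buffer would still hold every subscribed topic; only the write to disk is filtered.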
Acceptance
- On a stack with no snapshots.yaml, a confirmed fault produces a non-empty bag containing the faulting entity's topics plus /tf, with no manual configuration.
- RELIABLE and transient_local topics are captured with matching QoS (verified on a stack that uses them, e.g. Nav2 /tf, action status).
- Ring buffer memory stays bounded while recording a high-rate topic.
- topics: config with an existing snapshots.yaml behaves exactly as before; include_topics / exclude_topics still add/remove on top of the default mode.
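The bounded-memory criterion can be illustrated with a small standard-library sketch of a ring that prunes on both the time window and a byte cap. BufferedMessage, BoundedRing and their members are hypothetical names for illustration, not the types in rosbag_capture.cpp. The sketch counts payload bytes only, ignoring per-message overhead, and treats the newest message's stamp as "now".

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one buffered serialized message.
struct BufferedMessage
{
  std::string topic;
  std::int64_t stamp_ns;              // receive time in nanoseconds
  std::vector<std::uint8_t> payload;  // serialized bytes
};

// Hypothetical ring bounded by both a time window and a byte cap.
class BoundedRing
{
public:
  BoundedRing(std::int64_t window_ns, std::size_t max_bytes)
  : window_ns_(window_ns), max_bytes_(max_bytes) {}

  void push(BufferedMessage msg)
  {
    const std::int64_t now_ns = msg.stamp_ns;
    bytes_ += msg.payload.size();
    buffer_.push_back(std::move(msg));
    prune(now_ns);
  }

  std::size_t bytes() const {return bytes_;}
  std::size_t size() const {return buffer_.size();}

private:
  // Drop the oldest messages while either bound is violated.
  // Note: a single message larger than max_bytes_ is dropped immediately.
  void prune(std::int64_t now_ns)
  {
    while (!buffer_.empty() &&
      (now_ns - buffer_.front().stamp_ns > window_ns_ || bytes_ > max_bytes_))
    {
      bytes_ -= buffer_.front().payload.size();
      buffer_.pop_front();
    }
  }

  std::int64_t window_ns_;
  std::size_t max_bytes_;
  std::size_t bytes_ = 0;
  std::deque<BufferedMessage> buffer_;
};
```

With a 100-byte cap and 30-byte messages, the ring settles at three messages (90 bytes) however many are pushed, which is the behaviour the acceptance check would assert for a high-rate topic. When the cap is hit the effective pre-fault window shrinks below the configured duration, so the default cap needs to be chosen with that trade-off in mind.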
Relates to #424.