Summary
Today every gateway test pulls in rclcpp and the ament environment because the manager classes (DataAccessManager, OperationManager, ConfigurationManager, FaultManager facade, LogManager, TriggerManager) take rclcpp::Node * directly and route I/O inline. That coupling makes unit tests heavy (multi-second startup), forces every test fixture through a MultiThreadedExecutor lifecycle, and makes test failures timing-sensitive: flaky "process spawned but graph not yet seen" and "subscription destroyed mid-callback" patterns appear repeatedly in CI history.
This issue tracks the next pass after #391: each manager keeps its routing and business logic but gains a register_provider(...) injection point, with the ROS-specific I/O extracted into Ros2*Provider classes living in the adapter layer. After the change, manager unit tests compose with mock providers and link only against gateway_core (the neutral library introduced in #391), which removes rclcpp from the unit-test link line and eliminates the executor-lifecycle setup that is the root cause of intermittent failures.
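The injection point described above could look roughly like the following. This is a minimal sketch, not the gateway's actual API: the DataProvider interface, the read() signature, and MockDataProvider are hypothetical names chosen for illustration, and only register_provider and DataAccessManager come from this issue. It assumes a single default provider per manager and uses nothing beyond the C++17 standard library, which is the property the unit tests are meant to gain.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical provider interface: the only I/O surface the manager sees.
class DataProvider {
public:
  virtual ~DataProvider() = default;
  virtual std::optional<std::string> read(const std::string & entity_id) = 0;
};

// Hypothetical core-side manager: routing logic only, no rclcpp types.
class DataAccessManager {
public:
  // Injection point: the adapter layer (or a test) supplies the I/O backend.
  void register_provider(std::shared_ptr<DataProvider> provider) {
    provider_ = std::move(provider);
  }

  // Validation and routing stay in the manager; I/O is delegated.
  std::optional<std::string> read(const std::string & entity_id) const {
    if (!provider_ || entity_id.empty()) {
      return std::nullopt;
    }
    return provider_->read(entity_id);
  }

private:
  std::shared_ptr<DataProvider> provider_;
};

// Test double: answers from an in-memory map, with no executor or ROS graph.
class MockDataProvider : public DataProvider {
public:
  std::map<std::string, std::string> values;
  int calls = 0;

  std::optional<std::string> read(const std::string & entity_id) override {
    ++calls;
    auto it = values.find(entity_id);
    if (it == values.end()) {
      return std::nullopt;
    }
    return it->second;
  }
};
```

A test would construct the manager, register a MockDataProvider seeded with a few values, and assert on the results directly. Because no executor is spun, there is nothing to wait for and no sleep to tune.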
Proposed solution (optional)
Concretely:
- Six managers gain provider injection: Data, Operation, Configuration, Fault, Log, Trigger. Each manager moves to core/managers/. The ROS-specific default behaviour is extracted into Ros2*Provider implementations under src/ros2/providers/ (or equivalent path in the adapter layer), registered statically by gateway_node at startup. Plugins continue to register per-entity providers via PluginManager exactly as today.
- RuntimeDiscoveryStrategy becomes Ros2RuntimeIntrospection : IntrospectionProvider. The discovery framework in core/ then routes through the existing IntrospectionProvider chain and stops referencing rclcpp::Node. HybridDiscoveryStrategy collapses into the standard MergePipeline configuration.
- TypeIntrospection moves to ros2_medkit_serialization, where rosidl_typesupport_cpp and rosidl_typesupport_introspection_cpp already live. The gateway then depends on the serializer for type schemas instead of duplicating the rosidl bridge inside the gateway tree.
- Runtime discovery trigger switches from cyclic wall_timer(refresh_interval_ms_) polling to event-driven graph_event->check_and_clear(). The polling timer remains as a low-frequency safety backstop. This reduces idle CPU when the graph is stable and improves new-node detection latency.
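The event-driven trigger with a polling backstop from the last bullet can be sketched as follows. To stay self-contained, the sketch uses a small standard-library stand-in (GraphEvent) that only mimics check-and-clear semantics; it is not the rclcpp event class, and wait_for_refresh and RefreshReason are hypothetical names. In the real adapter the wait would be driven by the node's graph event rather than a condition variable.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Stand-in for a graph-change event with check-and-clear semantics.
// Assumption: this only imitates the behaviour needed for the sketch.
class GraphEvent {
public:
  // Called by the producer side when the graph changes.
  void set() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      flag_ = true;
    }
    cv_.notify_all();
  }

  // Returns true if the event was set, and resets it in the same step.
  bool check_and_clear() {
    std::lock_guard<std::mutex> lock(mutex_);
    const bool was_set = flag_;
    flag_ = false;
    return was_set;
  }

  // Blocks until the event is set or the timeout elapses; does not clear.
  bool wait_for(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cv_.wait_for(lock, timeout, [this] { return flag_; });
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool flag_ = false;
};

enum class RefreshReason { GraphChanged, Backstop };

// One iteration of the discovery loop: sleep until the graph changes,
// but never longer than the low-frequency backstop interval.
inline RefreshReason wait_for_refresh(
  GraphEvent & event, std::chrono::milliseconds backstop)
{
  event.wait_for(backstop);
  return event.check_and_clear() ? RefreshReason::GraphChanged
                                 : RefreshReason::Backstop;
}
```

On a stable graph the loop wakes only once per backstop interval, which is where the idle-CPU saving would come from. When the graph does change, the refresh runs as soon as the event is signalled instead of waiting for the next timer tick.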
Out of scope (intentionally deferred): folding core/ into a separate colcon package - that is a layout change and does not affect the test-link-line goal.
Additional context (optional)
Outcome metrics expected after the work:
- Manager unit tests link only gateway_core + GTest (no ament_target_dependencies), proven by extending test_gateway_core_smoke to include a manager header and by having a representative manager test compile against the mock-provider variant.
- Idle gateway process CPU usage drops measurably once the discovery refresh stops firing on a stable graph (sample with top before/after on the demo nodes scenario).
- Existing test suite (~2500 unit, ~3200 integration, ~2600 clang-tidy) stays green without timing-sensitive sleeps in tests we touch.
- New-node detection latency (time from ros2 run to gateway /apps listing the new entity) drops below 500 ms in steady-state operation.