Fix bug with topic statistics for intra process comms#2913
Fix bug with topic statistics for intra process comms#2913roman-y-wu wants to merge 3 commits intoros2:rollingfrom
Conversation
517782d to
dd92601
Compare
fujitatomoya
left a comment
There was a problem hiding this comment.
thanks for creating PR. a couple of comments.
- did you try to build with your local environment, it seems build is failing. see https://build.ros2.org/job/Rpr__rclcpp__ubuntu_noble_amd64/718/console
- test would be idea to make sure the statistics works okay. could you also add the test case with intra-process communication for
test_subscription_topic_statistics.cpp?
| if (subscription_topic_statistics_) { | ||
| now = std::chrono::system_clock::now(); | ||
| auto nanos = std::chrono::time_point_cast<std::chrono::nanoseconds>(now); | ||
| msg_info.source_timestamp = nanos.time_since_epoch().count(); |
There was a problem hiding this comment.
as we know, we do not have message info from rmw implementation with intra-process communication.
if we do this, the message age statistics will be always 0 because msg_info.source_timestamp is equal to time.
i understand we can generate the statistics with this change, but is this really useful for user application? i am not even sure about this. instead of generating fake statistics, maybe warning once that tells intra-process communication does not provide age statistics would be better for the user application? or maybe it is always age is zero?
| if (subscription_topic_statistics_) { | ||
| const auto nanos = std::chrono::time_point_cast<std::chrono::nanoseconds>(now); | ||
| const auto time = rclcpp::Time(nanos.time_since_epoch().count()); | ||
| subscription_topic_statistics_->handle_message(msg_info, time); |
There was a problem hiding this comment.
at least for period statistics, this can enables it to calculate intra and inter-process communication messages all together. i think that is the right thing to do.
Signed-off-by: Roman Wu <me@romanwu.com>
Signed-off-by: Roman Wu <me@romanwu.com>
Signed-off-by: Roman Wu <me@romanwu.com>
b1a9c6b to
7417b0a
Compare
|
Hi @romanwu10 You stated in your description that you did not use Generative AI for this PR:
However, it seems extremely similar to this PR I noticed on your private repo that explicitly states that it was generated through ChatGPT. |
|
Hi @fujitatomoya and @roman-y-wu just wondering if we can revive this PR/discussion a bit? I wanted to share some real-world validation of this fix in the hope of reviving it. We've been using this in a production context for about a month and it resolves a critical issue for us. Background/use-case Before this fix, all statistics from our lidar subscriptions when IPC was on reported NaN values, even if the sensor was publishing data and working normally. This made our monitoring tools unable to distinguish a healthy sensor from a dead one. What we did Our kilted port + update with the above is here: botsandus#1 Our Results Re: the concern about
Overall, IMO, thee alternative of reporting NaN is strictly worse for users trying to monitor systems health via the stats topics, and the fix here is more sensible behaviour for IPC use-cases. On the implementation We'd love to see this merged. Is there anything we can do to help move it forward, like additional tests,revised implementation, or anything else you folks might need? we are happy to assist where we can, and happy to validate on our fleets. cc: @tonynajjar |
Description
Fixes #2911
Is this user-facing behavior change?
Yes, this patch updates
SubscriptionIntraProcessto pass along the topic statistics pointer and call the statistics handler during intra-process message deliveries. This ensures that topic statistics—such as message age and publishing period—are still computed whenuse_intra_process_commsis enabled, which is important for users monitoring performance metrics.Did you use Generative AI?
No.
Additional Information
n/a