Conversation
- RateLimiter component added between HostScanner and Output - Metrics added to monitor dropped events - Tests added to confirm rate limiting behavior
ovalenti
left a comment
There was a problem hiding this comment.
Thanks for the swift work on this one 👍
fact/src/config/mod.rs
Outdated
| /// means unlimited (no throttling). | ||
| /// | ||
| /// Default value is 0 (unlimited) | ||
| #[arg(long, short = 't', env = "FACT_RATE_LIMIT")] |
There was a problem hiding this comment.
Is there a particular reason to have an explicit short parameter name ?
There was a problem hiding this comment.
ahh this is left over from my first attempt. I'll remove the explicit name and let clap figure it out.
There was a problem hiding this comment.
having said that, an automatic name would clash with ringbuf_size. Perhaps -l (meaning limit)?
| metrics: EventCounter, | ||
| ) -> anyhow::Result<Self> { | ||
| let limiter = Self::build_limiter(*rate_config.borrow()); | ||
| let (tx, _) = broadcast::channel(100); |
There was a problem hiding this comment.
I am curious to know whether the channel starts blocking when it is full, or if it drops events ?
Is this what triggers the RecvError::Lagged(..) type of messages ?
If we can loose messages this way, maybe we should count those occurrences ?
There was a problem hiding this comment.
Good point. Back pressure will cause events to drop. I'll add a metric for this.
fact/Cargo.toml
Outdated
| env_logger = { workspace = true } | ||
| glob = { workspace = true } | ||
| globset = { workspace = true } | ||
| governor = { workspace = true} |
There was a problem hiding this comment.
| governor = { workspace = true} | |
| governor = { workspace = true } |
tests/test_rate_limit.py
Outdated
| assert total_accounted >= num_files - 5, \ | ||
| f'Too many unaccounted events: received={received_count}, dropped={dropped_count}, created={num_files}' |
There was a problem hiding this comment.
I don't understand the goal of this assertion. Ideally, total_accounted should be as close as possible to num_files, or I am missing something.
There was a problem hiding this comment.
I'm probably being a bit too defensive here. The idea was to have a timing tolerance to avoid flakes but perhaps a precise assertion is clearer
Description
Checklist
Automated testing
If any of these don't apply, please comment below.
Testing Performed
Integration tests cover the most common cases - unlimited rate limiting (send everything) and limited rate, confirming that the limit is restricting the event flow.