feat(shared-runtime)!: SharedRuntime Borrowed & Owned mode#2061
feat(shared-runtime)!: SharedRuntime Borrowed & Owned mode#2061Aaalibaba42 wants to merge 13 commits into
Conversation
b0f4da8 to
50536ba
Compare
📚 Documentation Check Results📦
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 13d5a971de
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
🎉 All green!🧪 All tests passed 🎯 Code Coverage (details) 🔗 Commit SHA: a0c98a9 | Docs | Datadog PR Page | Give us feedback! |
64603e7 to
4e49301
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4e49301fa3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2061 +/- ##
==========================================
+ Coverage 73.47% 73.57% +0.09%
==========================================
Files 475 477 +2
Lines 78999 79219 +220
==========================================
+ Hits 58048 58283 +235
+ Misses 20951 20936 -15
🚀 New features to boost your workflow:
|
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
|
It looks like the description is a bit off mentioning fixes and bugs that seem to be coming from previous commits of the branch and not from the current state of the feature |
I tried letting a cheap AI read the commits and make a description, I don't think it's too valuable and can even be detrimental, as it were. |
|
I rewrote it cleaner and straighter to the point |
There was a problem hiding this comment.
The code LGTM overall, but I'm not entirely sure to fully grasp in which context (sync or async) is this API supposed to be consumed. In particular we use standard sync thread mechanisms (condvars, mutexes) in an async context, which could be problematic (even in a multi-thread context, a condvar will block the current executor thread, which is not ideal).
On a different front, I remember @paullegranddc said he would rather spawn our own shared runtime in a separate thread than hooking in the client's runtime. I don't have a strong opinion myself, but I wonder if this discussion had a conclusion.
df92857 to
9d305e7
Compare
9d305e7 to
6b5b4db
Compare
6b5b4db to
5b21367
Compare
|
Since the implementation was overhauled, I'll just resolve past conversations here, as they don't fit the current implementation. |
5b21367 to
61fbe96
Compare
yannham
left a comment
There was a problem hiding this comment.
Left a few remarks but otherwise looks good (only skimmed through the native implementation, as I suppose it's just the original shared runtime moved)👍
ff3c135 to
eb913a1
Compare
eb913a1 to
ccae7a5
Compare
…afety instead of the memory model. Dropping currentThread "Owned" runtime for new sementics does not need it
|
As it is currently, I don't like how wasm is handled. Sementically it's attached to the ForkSafeRuntime, which is straight up a lie, but changing that would likely require a new implementer of the SharedRuntime trait. I'll sleep on it I'm not sure of the best design |
0517c21 to
9fffdcb
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9fffdcbe44
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
f292e28 to
10554e2
Compare
890b2ff to
a91e5c7
Compare
| pub struct DefaultExport<C, R> | ||
| where | ||
| C: HttpClientCapability + SleepCapability + MaybeSend + Sync + 'static, | ||
| R: SharedRuntime + std::fmt::Debug + Send + Sync + 'static, |
There was a problem hiding this comment.
| R: SharedRuntime + std::fmt::Debug + Send + Sync + 'static, | |
| R: SharedRuntime, |
Is there a reason to not make the SharedRuntime trait require these other trait itself ? Given that all 3 implementors already satisfy std::fmt::Debug + Send + Sync + 'static, I feel like this makes more sense
The trait definition could look like this:
pub trait SharedRuntime: std::fmt::Debug + Send + Sync + 'static {Also applies to HttpClientCapability.
| warn!( | ||
| "restart_on_fork is ignored on BasicRuntime: regular mode is not fork-safe; \ | ||
| use ForkSafeRuntime if you need fork hooks" | ||
| ); |
There was a problem hiding this comment.
Seems the kind of place a panic would be more appropriate for me: it's a developer error not a end-user configuration issue.
It should never happen for a end-user if the issue is spotted by the developer that misused libdatadog, and a panic is harder to miss than a warning.
| //! A shared tokio runtime for running background workers across multiple components. | ||
| //! | ||
| //! This crate provides [`SharedRuntime`], which owns a single tokio runtime and manages | ||
| //! [`PausableWorker`]s on it. Components such as the trace exporter can share one runtime | ||
| //! instead of each creating their own, reducing thread and resource overhead. | ||
| //! This crate provides three implementations of [`SharedRuntime`], distinguished by their | ||
| //! threading model and fork-safety guarantees: | ||
| //! | ||
| //! [`SharedRuntime`] also provides fork-safety hooks (`before_fork`, `after_fork_parent`, | ||
| //! `after_fork_child`) that pause and restart workers around `fork()` calls, preventing | ||
| //! deadlocks in child processes. | ||
| //! - [`ForkSafeRuntime`] *(native only)* — owns a multi-thread tokio runtime and exposes the full | ||
| //! fork protocol ([`ForkSafeRuntime::before_fork`] / [`ForkSafeRuntime::after_fork_parent`] / | ||
| //! [`ForkSafeRuntime::after_fork_child`]) that pauses and restarts workers around `fork()` calls, | ||
| //! preventing deadlocks in child processes. Also provides synchronous | ||
| //! [`ForkSafeRuntime::block_on`] and [`ForkSafeRuntime::shutdown`]. | ||
| //! - [`BasicRuntime`] *(native only)* — the regular (non-fork-safe) variant. Its internal tokio | ||
| //! runtime can be library-built ([`BasicRuntime::new`] / [`BasicRuntime::with_worker_threads`]) | ||
| //! or supplied by the caller as an `Arc<tokio::runtime::Runtime>` | ||
| //! ([`BasicRuntime::from_handle`]). | ||
| //! - [`LocalRuntime`] *(wasm32 only)* — single-threaded local executor; spawns workers via | ||
| //! `wasm_bindgen_futures::spawn_local`. No fork protocol, no `block_on`, async-only. | ||
| //! | ||
| //! Components such as the trace exporter can share one runtime instead of each creating their | ||
| //! own, reducing thread and resource overhead. |
There was a problem hiding this comment.
| //! A shared tokio runtime for running background workers across multiple components. | |
| //! | |
| //! This crate provides [`SharedRuntime`], which owns a single tokio runtime and manages | |
| //! [`PausableWorker`]s on it. Components such as the trace exporter can share one runtime | |
| //! instead of each creating their own, reducing thread and resource overhead. | |
| //! This crate provides three implementations of [`SharedRuntime`], distinguished by their | |
| //! threading model and fork-safety guarantees: | |
| //! | |
| //! [`SharedRuntime`] also provides fork-safety hooks (`before_fork`, `after_fork_parent`, | |
| //! `after_fork_child`) that pause and restart workers around `fork()` calls, preventing | |
| //! deadlocks in child processes. | |
| //! - [`ForkSafeRuntime`] *(native only)* — owns a multi-thread tokio runtime and exposes the full | |
| //! fork protocol ([`ForkSafeRuntime::before_fork`] / [`ForkSafeRuntime::after_fork_parent`] / | |
| //! [`ForkSafeRuntime::after_fork_child`]) that pauses and restarts workers around `fork()` calls, | |
| //! preventing deadlocks in child processes. Also provides synchronous | |
| //! [`ForkSafeRuntime::block_on`] and [`ForkSafeRuntime::shutdown`]. | |
| //! - [`BasicRuntime`] *(native only)* — the regular (non-fork-safe) variant. Its internal tokio | |
| //! runtime can be library-built ([`BasicRuntime::new`] / [`BasicRuntime::with_worker_threads`]) | |
| //! or supplied by the caller as an `Arc<tokio::runtime::Runtime>` | |
| //! ([`BasicRuntime::from_handle`]). | |
| //! - [`LocalRuntime`] *(wasm32 only)* — single-threaded local executor; spawns workers via | |
| //! `wasm_bindgen_futures::spawn_local`. No fork protocol, no `block_on`, async-only. | |
| //! | |
| //! Components such as the trace exporter can share one runtime instead of each creating their | |
| //! own, reducing thread and resource overhead. | |
| //! A shared tokio runtime for running background workers across multiple components. | |
| //! | |
| //! Components such as the trace exporter can share one runtime instead of each creating their | |
| //! own, reducing thread and resource overhead. | |
| //! | |
| //! # Choosing a runtime | |
| //! | |
| //! | Runtime | Target | Threads | Fork-safe | `block_on` | When to use | | |
| //! |---------------------|--------|---------|---------------------|------------|---------------------------------------------------------------------------------------------------------------------------------------------| | |
| //! | [`ForkSafeRuntime`] | native | multi | yes — full protocol | yes | Default for native code that may run in a forking process (e.g. Ruby, Python runtimes). | | |
| //! | [`BasicRuntime`] | native | multi | no | yes | Native code where `fork()` is not a concern; optionally share an existing `Arc<tokio::runtime::Runtime>` via [`BasicRuntime::from_handle`]. | | |
| //! | [`LocalRuntime`] | wasm32 | single | n/a | no | WebAssembly; spawns via `wasm_bindgen_futures::spawn_local`. | | |
| //! | |
| //! ## Fork protocol ([`ForkSafeRuntime`] only) | |
| //! | |
| //! Call these around every `fork()` to prevent deadlocks in child processes: | |
| //! | |
| //! 1. [`ForkSafeRuntime::before_fork`] — pauses workers | |
| //! 2. `fork()` | |
| //! 3. parent: [`ForkSafeRuntime::after_fork_parent`] — resumes workers | |
| //! 4. child: [`ForkSafeRuntime::after_fork_child`] — restarts workers on a fresh runtime | |
Easier to read as a table than a dense list imo, especially once rendered. (IA generated table please double check)
What ?
Split
SharedRuntimeinto a trait with three implementations:ForkSafeRuntime(native only): owns a multi-thread tokio runtime, supports forkhooks (
before_fork/after_fork_parent/after_fork_child) and synchronousshutdown.BasicRuntime(native only): wraps a library-built or caller-providedArc<tokio::runtime::Runtime>, no fork hooks, no synchronous shutdown.LocalRuntime(wasm32 only): single-threaded executor that spawns workers viawasm_bindgen_futures::spawn_local, no fork protocol, async-only.BlockingRuntime(native only) is a sub-trait ofSharedRuntimethat addsblock_on; implemented byForkSafeRuntimeandBasicRuntime. Sync facades on thetrace exporter (
send/shutdown/build) bound their runtime parameter on it.TraceExporteris generic over the runtime —TraceExporter<C, R: SharedRuntime>— so callers pick
ForkSafeRuntime,BasicRuntime, orLocalRuntimeexplicitly. TheFFI pins
R = ForkSafeRuntimefor ABI stability.Why ?
Some callers already have a tokio runtime they want to reuse instead of letting
libdatadog create its own. The previous
SharedRuntimeonly supported thefork-safe owned case. Splitting into distinct types makes the lifecycle and
fork-safety contract explicit and lets callers pick the model that matches their
environment.
The wasm32 target previously shared a file with the native implementation behind
#[cfg]walls; it is now a dedicated module with a clean separation.How ?
SharedRuntimetrait (new,spawn_worker,shutdown_async).BlockingRuntime: SharedRuntimesub-trait withblock_on.ForkSafeRuntime; fork hooks and syncshutdownare inherent methods (not part of the trait).BasicRuntime::with_worker_threads(library-built) andBasicRuntime::from_handle(Arc<Runtime>)(caller-provided).spawn_localpath fromfork_safe.rsinto its ownlocal.rsmodule asLocalRuntime.TraceExporter<C, R>andTraceExporterBuilder<R>generic over the runtime;sync entry points additionally require
R: BlockingRuntime.Defaultfor thebuilder is impl'd only for
R = ForkSafeRuntimeon native soTraceExporterBuilder::default()resolves unambiguously.restart_on_fork = trueis silently ignored (with awarn!) onBasicRuntimeand
LocalRuntimesince they do not implement a fork protocol.R = ForkSafeRuntimein itsTraceExportertype alias.Additional Notes
libdd_shared_runtime:SharedRuntimeisnow a trait; use
ForkSafeRuntime::with_worker_threads(1)(or the trait methodSharedRuntime::new()with the trait in scope) instead ofSharedRuntime::new().SharedRuntimetoForkSafeRuntime.