
Add timeout to debugger captures #4003

Open
bwoebi wants to merge 1 commit into master from bob/debugger-limit

bwoebi commented Jun 22, 2026

Collaborator

Adds a DD_DYNAMIC_INSTRUMENTATION_CAPTURE_TIMEOUT_MS config option to enforce a limit on debugger capture time.

Signed-off-by: Bob Weinand <bob.weinand@datadoghq.com>
bwoebi requested a review from a team as a code owner June 22, 2026 11:17

datadog-datadog-prod-us1-2 Bot commented Jun 22, 2026

⚠️ Warnings

🚦 69 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-php | ASAN test_c: [7.4, arm64]

🧪 3 Tests failed

All test failures are known flaky.

❄️ Known flaky: tmp/build_extension/tests/ext/ffe/remote_config_lifecycle.phpt (FFE Remote Config loads and removes UFC config) from PHP.tmp.build_extension.tests.ext.ffe
002+ AddressSanitizer:DEADLYSIGNAL
003+ =================================================================
004+ ==4123==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0xf5fdbe6b4f08 bp 0xf5fdbd60a810 sp 0xf5fdbd60a6c0 T1)
005+ ==4123==The signal is caused by a READ memory access.
006+ ==4123==Hint: address points to the zero page.
007+ /usr/bin/llvm-symbolizer-20: error: 'linux-vdso.so.1': No such file or directory
002- loaded=true
003- has_config_after_add=true
004- success={"valueJson":"\"blue\"","variant":"blue","allocationKey":"alloc-string","reason":0,"errorCode":0,"doLog":true,"providerState":[],"errorMessage":null,"hasConfig":null,"configVersion":null}
005- removed=true
...

Not introduced in this PR.

❄️ Known flaky: tmp/build_extension/tests/ext/live-debugger/debugger_log_probe.phpt (Installing a live debugger log probe) from PHP.tmp.build_extension.tests.ext.live.debugger
001+ AddressSanitizer:DEADLYSIGNAL
002+ =================================================================
003+ ==5685==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0xefd5362b4f08 bp 0xefd53520a810 sp 0xefd53520a6c0 T1)
004+ ==5685==The signal is caused by a READ memory access.
005+ ==5685==Hint: address points to the zero page.
006+ /usr/bin/llvm-symbolizer-20: error: 'linux-vdso.so.1': No such file or directory
007+     #0 0xefd5362b4f08 in dd_sigvtalarm_handler tmp/build_extension/ext/remote_config.c:77:25
008+     #1 0xf3d540bdd8f4  (linux-vdso.so.1+0x8f4) (BuildId: 62742094c68a16b2584fdf48cdf77f12a7619a73)
009+     #2 0xf3d53c8ce834 in __futex_abstimed_wait_common64 nptl/futex-internal.c:57:12
010+     #3 0xf3d53c8ce834 in __futex_abstimed_wait_common nptl/futex-internal.c:87:9
...

Not introduced in this PR.

DataDog/apm-reliability/dd-trace-php | ASAN test_c with multiple observers: [8.1]

DataDog/apm-reliability/dd-trace-php | ASAN test_c with multiple observers: [8.5]

❄️ 2 New flaky tests detected

tmp/build_extension/tests/ext/live-debugger/debugger_log_probe_capture_timeout.phpt (Live debugger log probe capture timeout with large data structure) from PHP.tmp.build_extension.tests.ext.live.debugger
001+ AddressSanitizer:DEADLYSIGNAL
002+ =================================================================
003+ ==5375==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0xe7ba94ab4f08 bp 0xe7ba93a0a810 sp 0xe7ba93a0a6c0 T1)
001- bool(true)
002- bool(true)
003- bool(true)
004+ ==5375==The signal is caused by a READ memory access.
005+ ==5375==Hint: address points to the zero page.
006+ /usr/bin/llvm-symbolizer-20: error: 'linux-vdso.so.1': No such file or directory
007+     #0 0xe7ba94ab4f08 in dd_sigvtalarm_handler tmp/build_extension/ext/remote_config.c:77:25
...

New test introduced in this PR is flaky.

tmp/build_extension/tests/ext/live-debugger/debugger_log_probe_capture_timeout.phpt (Live debugger log probe capture timeout with large data structure) from php.tmp.build_extension.tests.ext.live.debugger
001+ AddressSanitizer:DEADLYSIGNAL
002+ =================================================================
003+ ==4286==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0xe373aa2b0444 bp 0xe373a91fa7f0 sp 0xe373a91fa6a0 T1)
001- bool(true)
002- bool(true)
003- bool(true)
004+ ==4286==The signal is caused by a READ memory access.
005+ ==4286==Hint: address points to the zero page.
006+ /usr/bin/llvm-symbolizer-20: error: 'linux-vdso.so.1': No such file or directory
007+     #0 0xe373aa2b0444 in dd_sigvtalarm_handler tmp/build_extension/ext/remote_config.c:77:25
...

New test introduced in this PR is flaky.

ℹ️ Info

🔄 Datadog auto-retried 2 jobs - 4 passed on retry

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 54.08% (+0.00%)

🔗 Commit SHA: 793daee

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 793daeec61

Comment thread tracer/live_debugger.c
usec = dd_find_lowest_dealine_timer();
#endif
struct itimerval it = {
.it_value = { .tv_sec = usec / 10000000, .tv_usec = usec % 1000000 },

P2: Use microseconds-per-second for setitimer

On the non-Linux setitimer path, usec is already in microseconds, so tv_sec must divide by 1,000,000 rather than 10,000,000. When DD_DYNAMIC_INSTRUMENTATION_CAPTURE_TIMEOUT_MS is configured above 999 ms on macOS/BSD, values like 1000 ms or 2000 ms produce {0, 0} and disarm the timeout, while other multi-second values fire much too early; the same conversion should be fixed in the stop/re-arm paths as well.
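
As a minimal sketch of the conversion this comment asks for, assuming a helper named `usec_to_itimerval` (hypothetical, not part of the PR), both the division and the modulus use 1,000,000 microseconds per second:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Microseconds in one second; the divisor and the modulus must use the same value. */
#define USEC_PER_SEC 1000000ULL

/* Hypothetical helper (not from the PR): split a microsecond count into a
 * one-shot itimerval suitable for setitimer(). */
static struct itimerval usec_to_itimerval(uint64_t usec) {
    struct itimerval it = {
        /* Zero interval: the timer fires once and does not repeat. */
        .it_interval = { .tv_sec = 0, .tv_usec = 0 },
        .it_value = {
            .tv_sec = (time_t)(usec / USEC_PER_SEC),
            .tv_usec = (suseconds_t)(usec % USEC_PER_SEC),
        },
    };
    /* The sub-second part must always stay below one full second. */
    assert(it.it_value.tv_usec < (suseconds_t)USEC_PER_SEC);
    return it;
}
```

With this split, 1,000,000 µs maps to {1, 0} and 1,500,000 µs to {1, 500000}, so a 1000 ms timeout no longer collapses to the disarming value {0, 0}.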

pr-commenter Bot commented Jun 22, 2026

Benchmarks [ tracer ]

Benchmark execution time: 2026-06-22 12:35:55

Comparing candidate commit 793daee in PR branch bob/debugger-limit with baseline commit ba2b056 in branch master.

Found 2 performance improvements and 0 performance regressions! Performance is the same for 191 metrics, 1 unstable metric.

Explanation

This is an A/B test comparing a candidate commit's performance against that of a baseline commit. Performance changes are noted in the tables below as:

  • 🟩 = significantly better candidate vs. baseline
  • 🟥 = significantly worse candidate vs. baseline

We compute a confidence interval (CI) over the relative difference of means between metrics from the candidate and baseline commits, considering the baseline as the reference.

If the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD), the change is considered significant.

Feel free to reach out to #apm-benchmarking-platform on Slack if you have any questions.

More details about the CI and significant changes

You can imagine this CI as a range of values that is likely to contain the true difference of means between the candidate and baseline commits.

CIs of the difference of means are often centered around 0%, because changes are often small:

---------------------------------(------|---^--------)-------------------------------->
                              -0.6%    0%  0.3%     +1.2%
                                 |          |        |
         lower bound of the CI --'          |        |
sample mean (center of the CI) -------------'        |
         upper bound of the CI ----------------------'

As described above, a change is considered significant if the CI is entirely outside the configured SIGNIFICANT_IMPACT_THRESHOLD (or the deprecated UNCONFIDENCE_THRESHOLD).

For instance, for an execution time metric, this confidence interval indicates a significantly worse performance:

----------------------------------------|---------|---(---------^---------)---------->
                                       0%        1%  1.3%      2.2%      3.1%
                                                  |   |         |         |
       significant impact threshold --------------'   |         |         |
                      lower bound of CI --------------'         |         |
       sample mean (center of the CI) --------------------------'         |
                      upper bound of CI ----------------------------------'
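
The rule above could be sketched as follows. This is only an illustration: the function name `classify_ci`, the enum, and the assumption that the threshold applies symmetrically to improvements are not taken from the benchmarking platform, and the sketch assumes a lower-is-better metric such as execution time.

```c
#include <assert.h>

/* Hypothetical classification of a relative-difference confidence interval. */
typedef enum { CHANGE_NONE, CHANGE_BETTER, CHANGE_WORSE } change_t;

/* All arguments are percentages relative to the baseline mean.
 * Assumes a lower-is-better metric and a symmetric threshold. */
static change_t classify_ci(double lower_pct, double upper_pct, double threshold_pct) {
    assert(lower_pct <= upper_pct);
    assert(threshold_pct >= 0.0);
    /* Whole interval lies above +threshold: candidate is significantly slower. */
    if (lower_pct > threshold_pct) return CHANGE_WORSE;
    /* Whole interval lies below -threshold: candidate is significantly faster. */
    if (upper_pct < -threshold_pct) return CHANGE_BETTER;
    /* The interval overlaps the threshold band: not significant. */
    return CHANGE_NONE;
}
```

Under a 1% threshold, as in the diagram, the interval [1.3%; 3.1%] classifies as worse, while [-0.6%; 1.2%] is not significant.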

scenario:MessagePackSerializationBench/benchMessagePackSerialization

  • 🟩 execution_time [-6.266µs; -3.634µs] or [-5.622%; -3.260%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

  • 🟩 execution_time [-4.560µs; -3.260µs] or [-4.124%; -2.948%]
