Skip to content

Native malloc fixes and cleanup#550

Open
jbachorik wants to merge 2 commits into
mainfrom
fix/malloc-tracer-double-accounting
Open

Native malloc fixes and cleanup#550
jbachorik wants to merge 2 commits into
mainfrom
fix/malloc-tracer-double-accounting

Conversation

@jbachorik
Copy link
Copy Markdown
Collaborator

@jbachorik jbachorik commented May 27, 2026

What does this PR do?:
Fixes reentrancy double-counting in the native malloc tracer, cleans up hook dispatch for musl, and removes dead code.

  • Add CriticalSection guard in maybeRecord to prevent profiler-internal allocations (e.g. JFR buffer growth) from re-entering the hook and being double-counted
  • Replace calloc_hook_dummy / posix_memalign_hook_dummy pass-through hooks with a simple skip: on musl where calloc delegates to malloc and posix_memalign delegates to aligned_alloc internally, we leave those GOT entries unpatched and let the inner hook record the allocation
  • Remove _orig_free / SAVE_IMPORT(free) — free is not tracked for liveness; the resolved pointer was only used for a probe cleanup where plain free() is correct
  • Remove the _calloc_hook_fn / _posix_memalign_hook_fn pre-computed pointer members and the NO_OPTIMIZE macro that existed solely for the dummy hooks
  • Gate MallocTracer::installHooks() in Libraries::refresh() on MallocTracer::running() to avoid unnecessary GOT patching work between profiling sessions

Motivation:
Profiler-internal allocations triggered inside recordMalloc (e.g. sample buffer allocation) re-enter the patched GOT and were being silently double-counted. The dummy hook approach for musl added function-call overhead on every calloc/posix_memalign without benefit.

Additional Notes:
detectNestedMalloc() is kept — it still drives the decision of whether to skip patching calloc/posix_memalign on musl.

How to test the change?:
Covered by existing integration tests:

  • NativememProfilerTest — verifies profiler.Malloc JFR events have positive size, non-zero address, and Java stack trace
  • NativememSampledProfilerTest — verifies weight and sample count from repeated allocations

For Datadog employees:

  • This PR doesn't touch any of that.
  • JIRA: PROF-14807

Unsure? Have a question? Request a review!

@jbachorik jbachorik marked this pull request as ready for review May 27, 2026 20:53
@jbachorik jbachorik requested a review from a team as a code owner May 27, 2026 20:53
@datadog-prod-us1-3
Copy link
Copy Markdown

datadog-prod-us1-3 Bot commented May 27, 2026

Pipelines

Fix all issues with BitsAI

⚠️ Warnings

🚦 4 Pipeline jobs failed

DataDog/java-profiler | gtest-asan-arm64   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). LeakSanitizer detected memory leaks in Arguments::parse(char const*) at /go/src/github.com/DataDog/java-profiler/ddprof-lib/src/main/cpp/arguments.cpp:119:18

DataDog/java-profiler | gtest-tsan-amd64   View in Datadog   GitLab

🔧 Fix in code (Fix with Cursor). Segmentation fault while executing the binary with GTEST_DEATH_TEST_STYLE=threadsafe.

DataDog/java-profiler | gtest-tsan-arm64   View in Datadog   GitLab

🛟 This job is unlikely to succeed on retry. Please review your pipeline configuration. Containers with unready status: [build helper docker].

View all 4 failed jobs.

Useful? React with 👍 / 👎

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 7f7ae89 | Docs | Datadog PR Page | Give us feedback!

@dd-octo-sts
Copy link
Copy Markdown
Contributor

dd-octo-sts Bot commented May 27, 2026

CI Test Results

Run: #26542176878 | Commit: 613cb9b | Duration: 12m 13s (longest job)

All 32 test jobs passed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary: Total: 32 | Passed: 32 | Failed: 0


Updated: 2026-05-27 22:37:23 UTC

…cking

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant