feat(instr): INVOKEDYNAMIC handler isolation with detach-safe dispatch #830

Open
jbachorik wants to merge 24 commits into develop from phase3-invokedynamic-dispatch

Collaborator

@jbachorik jbachorik commented Apr 11, 2026

Summary

Replaces INVOKESTATIC handler copying with INVOKEDYNAMIC dispatch. Probe handler methods live in the probe class — defined in a per-probe ClassLoader on JDK 8/9-14 or as a hidden class on JDK 15+ (no longer pinned to the bootstrap CL) — and are resolved via MethodHandles.publicLookup().findStatic(); no bytecode is copied into target classes. Every dispatch site is a MutableCallSite, so when a probe is unregistered the live call sites can be relinked to a noop — preventing crashes in the instrumented application if a probe handler would otherwise run against torn-down BTraceRuntime state. This restores the safety invariant the older cushion bytecode-rewrite provided, at the dispatch layer instead of via redefineClasses. Works on Java 8+.

Because probe classes now live in per-probe residency, probe detach actually releases the class for unloading (previously every attach/detach cycle leaked one probe class into Metaspace for the JVM lifetime). Handler MethodHandle caching moved to a per-probe instance field that dies with the probe object, and Client.cleanupTransformers wires BTraceRuntimes.removeRuntime so the runtime registry entry is released too.

Cross-classloader wiring

The dispatcher lives in the bootstrap CL (it has to — it's the BSM for INDY sites emitted from anywhere). The repository implementation lives in the agent CL (it needs BTraceRuntime, SLF4J, ASM). A single volatile HandlerRepository repository field on IndyDispatcher bridges the two, populated reflectively from HandlerRepositoryImpl's static init. Probe classes themselves sit in a per-probe CL / hidden class, reachable from the dispatcher only via the MethodHandle returned by resolution — no name-based lookup is required from either loader.
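A minimal sketch of the bridge-field idea, assuming both sides live in one loader for simplicity (in the PR they sit in the bootstrap and agent loaders). `BridgeSketch`, `Dispatcher`, `Repository`, and `RepositoryImpl` are hypothetical stand-ins, not the actual BTrace classes.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodType;
import java.lang.reflect.Field;

final class BridgeSketch {
  /** Stand-in for the interface that is visible to the dispatcher side. */
  interface Repository {
    MethodHandle resolveHandler(String probe, String name, MethodType type);
  }

  /** Stand-in for the dispatcher: it only knows the interface and a bridge field. */
  static final class Dispatcher {
    // Written reflectively by the implementation side, read on every bootstrap.
    public static volatile Repository repository;
  }

  /** Stand-in for the implementation side, which wires itself in from static init. */
  static final class RepositoryImpl {
    static {
      try {
        Field bridge = Dispatcher.class.getField("repository");
        // The implementation is supplied as a lambda; it resolves nothing in this sketch.
        Repository impl = (probe, name, type) -> null;
        bridge.set(null, impl);
      } catch (ReflectiveOperationException e) {
        throw new ExceptionInInitializerError(e);
      }
    }

    /** Calling this forces class initialization and therefore the wiring above. */
    static void init() {}
  }

  static boolean wired() {
    RepositoryImpl.init();
    return Dispatcher.repository != null;
  }
}
```

The reflective write is what lets the implementation side avoid a compile-time dependency in the other direction; the dispatcher never names the implementation class.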

flowchart LR
  subgraph BootCL["Bootstrap ClassLoader"]
    IND["IndyDispatcher<br/><br/>• bootstrap BSM<br/>• invalidateProbe<br/>• LIVE_SITES registry<br/>• repository bridge field"]
    HR["HandlerRepository<br/>(interface)"]
  end
  subgraph AgentCL["Agent ClassLoader"]
    HRI["HandlerRepositoryImpl<br/><br/>• probeMap (by name)<br/>• resolveHandler<br/>• register / unregisterProbe"]
    RT["BTraceRuntime<br/>ASM · SLF4J"]
  end
  HRI -. "implements via method ref" .-> HR
  HRI -- "reflective Field.set in static{}" --> IND
  IND -- "repository.resolveHandler" --> HRI
  HRI --> RT

Dispatch chain — happy path

Hot path after JIT warm-up: the MutableCallSite target is the resolved handle, HotSpot treats it as @Stable, inlines directly.
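The happy path can be sketched in a few lines, assuming a hypothetical public `Probe` class with a static `onEntry` handler. The `link` helper is illustrative and is not the PR's `IndyDispatcher.bootstrap`.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class HappyPathSketch {
  /** Hypothetical probe class; publicLookup needs a public class and a public method. */
  public static final class Probe {
    public static int onEntry(int x) {
      return x + 1;
    }
  }

  /** Resolves the handler and wraps it in a MutableCallSite so it stays relinkable. */
  static CallSite link(Class<?> probeClass, String shortName, MethodType type)
      throws ReflectiveOperationException {
    MethodHandle mh = MethodHandles.publicLookup().findStatic(probeClass, shortName, type);
    MutableCallSite mcs = new MutableCallSite(type);
    mcs.setTarget(mh);
    return mcs;
  }

  /** Links a site for Probe.onEntry and invokes it once through the dynamic invoker. */
  static int demo(int arg) {
    try {
      CallSite site = link(Probe.class, "onEntry", MethodType.methodType(int.class, int.class));
      return (int) site.dynamicInvoker().invokeExact(arg);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

Under these assumptions `HappyPathSketch.demo(41)` goes through the call site and returns the handler's result.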

sequenceDiagram
  autonumber
  participant IM as Instrumented method
  participant JVM as JVM linker
  participant IND as IndyDispatcher (boot CL)
  participant HRI as HandlerRepositoryImpl (agent CL)
  participant P as Probe handler (per-probe CL / hidden)

  IM->>JVM: invokedynamic (first hit)
  JVM->>IND: bootstrap(caller, name, type, probeClassName)
  IND->>HRI: repository.resolveHandler(probe, name, type)
  HRI->>HRI: probeMap.get(probe), read probe.getProbeClass()
  HRI->>P: publicLookup().findStatic(probeClass, short, type)
  HRI-->>IND: MethodHandle
  IND->>IND: mcs = new MutableCallSite(type), setTarget(mh), register weakRef
  IND-->>JVM: MutableCallSite
  Note over IM,P: subsequent invocations — JIT inlines mcs.target directly
  IM->>P: probeHandler(args)

Dispatch chain — unresolved path (self-relinking trampoline)

When bootstrap runs before the probe is registered (or the repository bridge isn't wired yet), we install a trampoline. Each invocation retries resolution; on success the MCS target is updated and the trampoline is bypassed from then on.
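One way to build such a self-relinking trampoline with plain `java.lang.invoke` is to fold a "relink" combiner into an exact invoker. The sketch below is Java 8 compatible; `PROBES`, `tryResolve`, `relink`, and `noop` are hypothetical names chosen for illustration, and the registry is a plain map rather than the PR's repository.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.reflect.Array;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TrampolineSketch {
  public static final class Probe {
    public static int onEntry(int x) {
      return x * 2;
    }
  }

  /** Hypothetical registry: probe name to defined probe class. */
  static final Map<String, Class<?>> PROBES = new ConcurrentHashMap<>();

  static MethodHandle tryResolve(String probe, String name, MethodType type) {
    Class<?> c = PROBES.get(probe);
    if (c == null) return null;
    try {
      return MethodHandles.publicLookup().findStatic(c, name, type);
    } catch (ReflectiveOperationException e) {
      return null;
    }
  }

  /** Type-correct noop: ignores all arguments and returns void, zero, or null. */
  static MethodHandle noop(MethodType type) {
    Class<?> rt = type.returnType();
    MethodHandle base;
    if (rt == void.class) {
      base = MethodHandles.constant(Object.class, null).asType(MethodType.methodType(void.class));
    } else if (rt.isPrimitive()) {
      // Element 0 of a fresh primitive array is the boxed default value.
      base = MethodHandles.constant(rt, Array.get(Array.newInstance(rt, 1), 0));
    } else {
      base = MethodHandles.constant(rt, null);
    }
    return MethodHandles.dropArguments(base, 0, type.parameterList());
  }

  /** Runs on every trampoline invocation until resolution succeeds. */
  static MethodHandle relink(MutableCallSite site, String probe, String name) {
    MethodHandle mh = tryResolve(probe, name, site.type());
    if (mh == null) return noop(site.type());
    site.setTarget(mh); // later invocations bypass the trampoline
    MutableCallSite.syncAll(new MutableCallSite[] {site});
    return mh;
  }

  static MutableCallSite bootstrap(String probe, String name, MethodType type)
      throws ReflectiveOperationException {
    MutableCallSite site = new MutableCallSite(type);
    MethodHandle resolved = tryResolve(probe, name, type);
    if (resolved != null) {
      site.setTarget(resolved);
      return site;
    }
    MethodHandle relink = MethodHandles.lookup().findStatic(
        TrampolineSketch.class, "relink",
        MethodType.methodType(MethodHandle.class, MutableCallSite.class, String.class, String.class));
    MethodHandle bound = MethodHandles.insertArguments(relink, 0, site, probe, name);
    // The combiner ignores the call arguments and yields the handle to invoke for this call.
    MethodHandle combiner = MethodHandles.dropArguments(bound, 0, type.parameterList());
    site.setTarget(MethodHandles.foldArguments(MethodHandles.exactInvoker(type), combiner));
    return site;
  }

  /** Returns {result before registration, result after registration}. */
  static int[] demo() {
    try {
      MethodType type = MethodType.methodType(int.class, int.class);
      MethodHandle invoker = bootstrap("MyTrace", "onEntry", type).dynamicInvoker();
      int before = (int) invoker.invokeExact(21);
      PROBES.put("MyTrace", Probe.class);
      int after = (int) invoker.invokeExact(21);
      return new int[] {before, after};
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

In this sketch the first invocation lands on the noop and the second one, made after the probe is registered, reaches the handler and relinks the site.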

sequenceDiagram
  autonumber
  participant IM as Instrumented method
  participant MCS as MutableCallSite
  participant TR as Trampoline handle
  participant IND as IndyDispatcher.relink
  participant HRI as HandlerRepositoryImpl

  Note over IM,HRI: Invocation N — probe not yet registered
  IM->>MCS: invoke(args)
  MCS->>TR: dispatch to current target
  TR->>IND: relink(mcs, probe, name, type)
  IND->>HRI: tryResolve(...)
  HRI-->>IND: null
  IND-->>TR: noop(type)
  TR->>TR: invoke noop(args)
  TR-->>IM: default value / void

  Note over IM,HRI: registerProbe(probe) happens here

  Note over IM,HRI: Invocation N+1
  IM->>MCS: invoke(args)
  MCS->>TR: dispatch
  TR->>IND: relink(mcs, probe, name, type)
  IND->>HRI: tryResolve(...)
  HRI-->>IND: MethodHandle
  IND->>MCS: setTarget(mh), MutableCallSite.syncAll
  IND-->>TR: mh
  TR->>TR: invoke mh(args)
  TR-->>IM: probe handler result

  Note over IM,MCS: Invocation N+2 onward — JIT re-optimises, inlines through mcs.target

Detach — invalidate-live-sites flow

The safety property: after unregisterProbe, no call site may enter probe handler code. This is what the old cushion bytecode rewrite ensured; we now do it at the dispatch layer.
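A hedged sketch of the invalidate-live-sites step, restricted to void-returning handlers to keep it short. `track`, `voidNoop`, and the `CALLS` counter are hypothetical helpers for illustration; only the registry shape and the `setTarget` plus `syncAll` sequence follow the description.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class InvalidateSketch {
  static final ConcurrentMap<String, ConcurrentLinkedQueue<WeakReference<MutableCallSite>>>
      LIVE_SITES = new ConcurrentHashMap<>();
  static final AtomicInteger CALLS = new AtomicInteger();

  /** Stand-in for a probe handler body. */
  public static void handler() {
    CALLS.incrementAndGet();
  }

  /** Creates a call site and records it weakly under its probe name. */
  static MutableCallSite track(String probe, MethodHandle target) {
    MutableCallSite site = new MutableCallSite(target);
    LIVE_SITES.computeIfAbsent(probe, k -> new ConcurrentLinkedQueue<>())
        .add(new WeakReference<>(site));
    return site;
  }

  /** Noop for void-returning types only (an assumption of this sketch). */
  static MethodHandle voidNoop(MethodType type) {
    MethodHandle base =
        MethodHandles.constant(Object.class, null).asType(MethodType.methodType(void.class));
    return MethodHandles.dropArguments(base, 0, type.parameterList());
  }

  /** Relinks every live site of the probe to a noop and returns how many were relinked. */
  static int invalidateProbe(String probe) {
    Queue<WeakReference<MutableCallSite>> q = LIVE_SITES.get(probe);
    if (q == null) return 0;
    List<MutableCallSite> relinked = new ArrayList<>();
    for (Iterator<WeakReference<MutableCallSite>> it = q.iterator(); it.hasNext(); ) {
      MutableCallSite site = it.next().get();
      if (site == null) {
        it.remove(); // prune collected sites opportunistically
        continue;
      }
      site.setTarget(voidNoop(site.type()));
      relinked.add(site);
    }
    if (!relinked.isEmpty()) {
      MutableCallSite.syncAll(relinked.toArray(new MutableCallSite[0]));
    }
    return relinked.size();
  }

  /** Returns {calls before invalidate, sites relinked, calls after invalidate}. */
  static int[] demo() {
    try {
      MethodHandle h = MethodHandles.lookup()
          .findStatic(InvalidateSketch.class, "handler", MethodType.methodType(void.class));
      MutableCallSite a = track("com/example/MyTrace", h);
      MutableCallSite b = track("com/example/MyTrace", h);
      a.dynamicInvoker().invokeExact();
      b.dynamicInvoker().invokeExact();
      int before = CALLS.get();
      int relinked = invalidateProbe("com/example/MyTrace");
      a.dynamicInvoker().invokeExact(); // lands on the noop now
      b.dynamicInvoker().invokeExact();
      return new int[] {before, relinked, CALLS.get()};
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

The handler counter stops advancing once `invalidateProbe` has run, which is the property the detach path relies on.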

sequenceDiagram
  autonumber
  participant CL as Client (detach)
  participant PN as BTraceProbeNode.unregister
  participant HRI as HandlerRepositoryImpl
  participant IND as IndyDispatcher.invalidateProbe
  participant MCS1 as live MCS #1
  participant MCS2 as live MCS #2

  CL->>PN: unregister()
  PN->>HRI: unregisterProbe(probe)
  HRI->>HRI: probeMap.remove(name)
  HRI->>HRI: handlerCache.removeIf(key startsWith name)
  HRI->>IND: invalidateProbe(name)
  IND->>IND: q = LIVE_SITES[name]
  loop each weakRef in q
    IND->>IND: site = ref.get(), drop if null
    IND->>MCS1: setTarget(noop(type))
    IND->>MCS2: setTarget(noop(type))
  end
  IND->>IND: MutableCallSite.syncAll(collected)
  Note over MCS1,MCS2: next invocation lands on noop — JIT sites deopt on syncAll

Call-site state machine

stateDiagram-v2
  [*] --> Trampoline: bootstrap — probe not yet registered
  [*] --> Resolved: bootstrap — probe registered (setTarget MH)
  Trampoline --> Resolved: registerProbe then invoke (relink + syncAll)
  Trampoline --> Trampoline: invoke while unregistered (noop this call)
  Resolved --> Noop: unregisterProbe (invalidateProbe + syncAll)
  Trampoline --> Noop: unregisterProbe
  Noop --> [*]: call site eligible for GC (weakRef collected)

Data layout

IndyDispatcher.LIVE_SITES (bootstrap CL)

Per-probe weak registry of dispatch sites, used only by invalidateProbe.

ConcurrentMap<String, ConcurrentLinkedQueue<WeakReference<MutableCallSite>>>
    │
    ├── "com/example/MyTrace"      → [weakRef(mcs-A), weakRef(mcs-B), ...]
    ├── "com/example/OtherTrace"   → [weakRef(mcs-C)]
    └── ...
  • Key: probe class internal name (slashes).
  • Value: queue of weak references so unreachable call sites (e.g. unloaded instrumented classes) don't leak. invalidateProbe prunes nulls opportunistically.
  • Entries added once per successful bootstrap(...) call; never removed except by weak-reference collection.

HandlerRepositoryImpl.probeMap + per-probe MH cache (agent CL)

probeMap  : ConcurrentMap<String probeName, BTraceProbe>

# On each BTraceProbeSupport instance:
handlerCache  : ConcurrentMap<HandlerSubKey, MethodHandle>
HandlerSubKey : (String handlerName, MethodType type) with precomputed hash
  • probeMap is the source of truth for "is probe X registered?"
  • The handler MH cache is per-probe — it lives on BTraceProbeSupport and is reached via default interface methods BTraceProbe.getCachedHandler / cacheHandler. Because each cache instance is owned by one probe, unregister is probeMap.remove(name) (no cross-probe scan) and the cache evicts naturally when the probe object is collected. MH references into the defined probe class are also dropped when BTraceProbeSupport.clearProbeClass() runs, so they don't retain the class beyond detach.
  • Cache is positive-only: a null return from resolveHandler means "try again on next invocation". A negative entry would defeat the MutableCallSite self-heal (next bootstrap would get a stale null and install a trampoline that immediately sees the cached null again).
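The positive-only rule can be illustrated with a small generic cache. This is a sketch under the assumption that "unresolved" is signalled by a null result; `PositiveOnlyCache` is a hypothetical name and not the cache on `BTraceProbeSupport`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class PositiveOnlyCache<K, V> {
  private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();

  /** Returns a cached or freshly resolved value, or null if resolution failed this time. */
  V resolve(K key, Function<K, V> resolver) {
    V hit = cache.get(key);
    if (hit != null) return hit;
    V fresh = resolver.apply(key);
    if (fresh == null) return null; // failure is not remembered, so the next caller retries
    V raced = cache.putIfAbsent(key, fresh);
    return raced != null ? raced : fresh;
  }

  int size() {
    return cache.size();
  }
}
```

A failed lookup leaves the map empty, so a later call with a working resolver succeeds and only then is the result stored.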

Bootstrap static arguments (what the JVM passes)

bootstrap(
  MethodHandles.Lookup caller,   // lookup context of the instrumented class
  String               name,     // probe-prefixed handler name, e.g. "MyTrace$onEntry"
  MethodType           type,     // call-site signature
  String               probeClassName  // BSM extra const — probe class internal name
)

The fourth argument is emitted as a CONSTANT_String BSM argument by Instrumentor.invokeBTraceAction. name is stripped to its short form inside resolveHandler ("MyTrace$onEntry" → "onEntry") before the findStatic call.
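The name-stripping step might look like the following. The exact rule used by resolveHandler is not shown in this description, so the sketch assumes the prefix is the probe's simple class name followed by `$`; `HandlerNameSketch.shortName` is a hypothetical helper.

```java
final class HandlerNameSketch {
  /**
   * Strips the probe prefix from a handler name.
   * Assumption: the prefix is the simple name of the probe class plus '$'.
   */
  static String shortName(String probeInternalName, String prefixedName) {
    String simple = probeInternalName.substring(probeInternalName.lastIndexOf('/') + 1);
    String prefix = simple + "$";
    return prefixedName.startsWith(prefix)
        ? prefixedName.substring(prefix.length())
        : prefixedName; // already short, leave untouched
  }
}
```

Deriving the prefix from the probe class name, rather than cutting at the first `$`, keeps handler names that themselves contain `$` intact.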

New files

  • IndyDispatcher — Java 8+ invokedynamic bootstrap class in boot CL. Resolves the probe handler, installs a MutableCallSite, tracks live sites, and supports invalidateProbe(name) for detach-time relink.
  • HandlerRepositoryImpl (rewritten) — per-probe positive-only cache (delegated to BTraceProbeSupport), no sentinel, no eviction-on-register. unregisterProbe drops the probeMap entry and propagates to IndyDispatcher.invalidateProbe; no cross-probe keyset scan.
  • ProbeAnchor — shared helper (JDK 8 source set, visible to both the java9 and java11 source sets) that generates a per-probe anchor class in a fresh unnamed ClassLoader so the probe class lands in its own loader (JDK 9-14 path).
  • ProbeClassUnloadingTest — weak-reachability test covering the new residency. Verifies the probe Class<?> and its CL become weakly reachable after detach + GC, and that the same probe name can be defined twice in one JVM (previously a LinkageError on JDK 8).
  • DispatchBenchmark — JMH benchmark: baseline direct call, ConstantCallSite dispatch, MutableCallSite dispatch.

Deleted

  • Indy.java — JDK 15-only defineHiddenClass bootstrap, no longer needed.
  • CopyingVisitor.java — handler bytecode copying, no longer needed.
  • static/ golden files (~198) — replaced by unified dynamic/ files.

Refactored

  • Instrumentor.invokeBTraceAction — always emits INVOKEDYNAMIC; removed useHiddenClasses dual-mode gate; BSM owner: Indy → IndyDispatcher.
  • Probe lifecycle symmetry: registerProbe moved to BTraceProbeNode/BTraceProbePersisted.register() after defineClass; unregisterProbe added to both unregister() methods; removed premature registration in BTraceProbeFactory and redundant cleanup in Client.onExit. unregisterProbe now also relinks live call sites via IndyDispatcher.invalidateProbe.
  • BTraceRuntime.Impl.defineClass — signature reduced from (byte[], boolean) to (byte[]). The mustBeBootstrap parameter became dead once every impl created a per-probe residency regardless; dropped from the interface and all three impls. Sole production caller BTraceProbeSupport.defineClass updated.
  • BTraceRuntimeImpl_8.defineClass — always uses new ClassLoader(null){} (previously bootstrap for transforming probes).
  • BTraceRuntimeImpl_9.defineClass — per-probe anchor class in a fresh unnamed loader, then privateLookupIn(anchor, …).defineClass(bytes) so the probe lands in the per-probe loader (not Auxiliary's bootstrap loader).
  • BTraceRuntimeImpl_11.defineClass: defineHiddenClass(bytes, true) on JDK 15+ (reflective; no ClassOption, since STRONG would pin lifetime to the defining loader), per-probe anchor on JDK 11-14.
  • BTraceRuntimeImpl_9/_11: StackWalker frames filtered for org.openjdk.btrace.runtime.auxiliary.* in getCallerClassLoader() / getCallerClass().
  • HandlerRepositoryImpl.resolveHandler — reads the probe Class<?> via probe.getProbeClass() (cached by BTraceProbeSupport.defineClass) instead of Class.forName(probeName). Decouples resolution from bootstrap-CL visibility.
  • HandlerRepositoryImpl.handlerCache — removed. Cache is now a per-probe ConcurrentMap<HandlerSubKey, MethodHandle> on BTraceProbeSupport; accessed via BTraceProbe.getCachedHandler / cacheHandler default interface methods. Dies with the probe object.
  • BTraceRuntimes.removeRuntime(String) — new public primitive; Client.cleanupTransformers calls it after probe.unregister() so the static runtime registry no longer leaks a BTraceRuntime.Impl (and its probe class) per attach/detach cycle.
  • Cushion-method infrastructure removed from BTraceClassWriter, BTraceTransformer, and Instrumentor. The crash-safety contract now lives at the dispatch layer.

JDK 8 reflection-inflation cascade fix

testTraceAll() failed on JDK 8 only with StackOverflowError during agent init. Root cause: agent's reflective Method.invoke() crosses the JDK 8 inflation threshold (~15 calls/Method), JVM defines sun/reflect/GeneratedMethodAccessorN via DelegatingClassLoader, fires BTraceTransformer.transform() for the synthetic class, transformer instruments it → ASM frame computation issues more reflective calls → another inflation → another transform callback → recursion → SOE.

Prior commit 4d53e21 added a ThreadLocal re-entrancy guard in isBootstrapClass(); the guard had a documented timing hole and silently mis-reported bootstrap-class status during the inflation window — and the cascade still formed via OTHER reflective calls in ASM's getCommonSuperClass() path. Reverted.

Structural fix: BTraceTransformer.transform() early-exits for JVM-synthesized accessor classes (sun/reflect/Generated*, jdk/internal/reflect/Generated*) regardless of class loader, BEFORE any reflection-driven ASM analysis runs on them. sun/reflect/ also added to ClassFilter.SENSITIVE_CLASSES for defense-in-depth.
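A minimal sketch of a loader-independent early exit, written against the standard `ClassFileTransformer` interface. `EarlyExitTransformerSketch` and its placeholder `instrument` method are hypothetical; the real `BTraceTransformer` does considerably more.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;

final class EarlyExitTransformerSketch implements ClassFileTransformer {
  /** True for JVM-generated reflection accessors, regardless of defining loader. */
  static boolean isGeneratedAccessor(String internalName) {
    return internalName != null
        && (internalName.startsWith("sun/reflect/Generated")
            || internalName.startsWith("jdk/internal/reflect/Generated"));
  }

  @Override
  public byte[] transform(
      ClassLoader loader,
      String className,
      Class<?> classBeingRedefined,
      ProtectionDomain protectionDomain,
      byte[] classfileBuffer) {
    // Returning null tells the JVM to keep the class bytes unchanged.
    if (isGeneratedAccessor(className)) return null;
    return instrument(classfileBuffer);
  }

  /** Placeholder for the real instrumentation pass. */
  private static byte[] instrument(byte[] bytes) {
    return bytes.clone();
  }
}
```

Because the check looks only at the class name, it fires before any loader-gated logic or bytecode analysis has a chance to run.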

Dispatch-cost measurement

From DispatchBenchmark on JDK 21 (local run):

| variant | ns/op | vs CCS |
| --- | --- | --- |
| baseline (direct static call) | 0.40 | 0.11× |
| instrumented (ConstantCallSite) | 3.72 | 1.00× |
| instrumentedMutable (MutableCallSite, stable target) | 3.63 | 0.97× |

MutableCallSite is within noise of ConstantCallSite once the JIT treats the target as stable — dropping CCS has no measurable dispatch-throughput cost, and is what makes detach-time relink-to-noop possible at all.

Test plan

  • ./gradlew :btrace-instr:test — 663 tests pass (incl. new ClassFilterSensitiveTest, BTraceTransformerEarlyExitTest; HandlerRepositoryImplTest expanded from 3 to 10 tests).
  • ./gradlew :btrace-instr:test -PupdateTestData — 382 dynamic golden files regenerated.
  • ./gradlew :benchmarks:runtime-benchmarks:jmh — numbers above.
  • ./gradlew :integration-tests:test -Pintegration — JDK 8 path requires CI (no local JDK 8).

Coverage added to HandlerRepositoryImplTest

| # | Test | What it guards |
| --- | --- | --- |
| 1 | testRegisterAndResolveReturnsNullForUnknownHandler | No crash when probe class isn't loadable |
| 2 | testUnregisterClearsCacheForProbe | Cache is scoped to probe name prefix |
| 3 | testUnregisterDoesNotAffectOtherProbes | Per-probe eviction isolation |
| 4 | testResolvesRealHandlerFromLoadedClass | Happy-path publicLookup().findStatic resolution produces an invocable handle |
| 5 | testBootstrapReturnsMutableCallSiteOnImmediateResolution | Every call site is an MCS, not a CCS (so it remains relinkable) |
| 6 | testBootstrapBeforeRegistrationHealsViaMutableCallSite | Trampoline self-heals once probe is registered |
| 7 | testBootstrapWithNullRepositoryDoesNotCrash | Null-repository path returns a no-crash MCS |
| 8 | testUnregisterRelinksLiveCallSiteToNoop | Core crash-safety: handler does NOT fire after unregister |
| 9 | testUnregisterProbeADoesNotInvalidateProbeB | invalidateProbe is properly scoped |
| 10 | testUnregisterRelinksAllLiveSitesForProbe | Multi-site relink, no off-by-one in the registry iteration |
| 11 | testResolveHandlerUsesProbeClassAccessorNotClassForName | Resolution path goes through probe.getProbeClass(), no Class.forName dependency |
| 12 | testGetProbeClassExposesClassOnStubProbe | Accessor plumbing |

Coverage added in ProbeClassUnloadingTest

| # | Test | What it guards |
| --- | --- | --- |
| 1 | probeClassWeaklyReachableAfterDefine | Probe Class<?> + its ClassLoader are weakly reachable after detach + removeRuntime |
| 2 | sameProbeNameCanBeDefinedTwice | Defining the same internal probe name twice in one JVM no longer throws LinkageError; each attach has its own residency |

Run locally on JDK 8 via JAVA_TEST_HOME=$(sdk home java 8.0.482-librca) ./gradlew :btrace-instr:test and on JDK 21 via the default build.

Acceptance criteria

  • IndyDispatcher works from Java 8+.
  • Every dispatch site is a MutableCallSite (including the immediate-success path) so unregister can relink to noop.
  • IndyDispatcher.invalidateProbe(name) relinks every live call site for that probe to a type-correct noop handle, and calls MutableCallSite.syncAll to drop optimised JIT targets promptly.
  • HandlerRepositoryImpl resolves handlers via publicLookup().findStatic() with a positive-only ConcurrentHashMap cache; negative results are not cached (MCS trampoline handles retry).
  • HandlerRepositoryImpl.unregisterProbe clears the cache for the probe and calls IndyDispatcher.invalidateProbe.
  • ~400 redundant static/ golden files replaced with unified dynamic/ files.
  • HandlerRepositoryImplTest covers the 12 scenarios listed in the table above.
  • DispatchBenchmark covers baseline + ConstantCallSite + MutableCallSite.
  • All instrumentor tests pass.
  • testTraceAll() passes on JDK 8 — pending CI run.
  • Probe classes are defined in a per-probe ClassLoader (JDK 8/9-14) or hidden class (JDK 15+), never bootstrap CL.
  • After probe.unregister() + BTraceRuntimes.removeRuntime, the probe Class<?> and its ClassLoader become weakly reachable (ProbeClassUnloadingTest.probeClassWeaklyReachableAfterDefine).
  • Same probe name can be defined twice in one JVM (ProbeClassUnloadingTest.sameProbeNameCanBeDefinedTwice).
  • HandlerRepositoryImpl holds no module-level handler cache — per-probe only.
  • BTraceRuntime.Impl.defineClass signature is (byte[]); mustBeBootstrap removed everywhere.
  • Client.cleanupTransformers calls BTraceRuntimes.removeRuntime after probe.unregister().
  • Implementation plan committed at docs/plans/2026-04-20-probe-class-unloading.md.

🤖 Generated with Claude Code



jbachorik and others added 2 commits April 11, 2026 16:36
Replace INVOKESTATIC handler copying with INVOKEDYNAMIC dispatch so probe
handler methods stay in the probe class (bootstrap CL) and are called via
ConstantCallSite, eliminating bytecode copying into target classes.

New:
- IndyDispatcher: Java 8+ bootstrap method using publicLookup().findStatic()
  on the probe class; wired from HandlerRepositoryImpl's static initializer
  via reflection on IndyDispatcher.repository (volatile bridge field)
- HandlerRepositoryImpl: rewrites resolveHandler() to return MethodHandle
  instead of byte[]; uses ConcurrentHashMap with UNRESOLVED sentinel for
  failed lookups; registerProbe() now evicts stale UNRESOLVED entries so
  late-resolving probes work after a registration race
- DispatchBenchmark: JMH benchmark for ConstantCallSite dispatch overhead

Deleted:
- Indy.java (Java-15-specific defineHiddenClass bootstrap, no longer needed)
- CopyingVisitor.java (handler bytecode copying, no longer needed)
- static/ golden files (~198 files, replaced by unified dynamic/ files)

Refactored:
- Instrumentor.invokeBTraceAction: always emits INVOKEDYNAMIC; removed
  useHiddenClasses dual-mode gate; bootstrap handle owner: Indy → IndyDispatcher
- Probe lifecycle symmetry: registerProbe() moved to BTraceProbeNode/
  BTraceProbePersisted.register() (after defineClass); unregisterProbe()
  moved to both unregister() methods; removed premature registration in
  BTraceProbeFactory and redundant unregisterProbe in Client.onExit()
- BTraceRuntimeImpl_9/_11: StackWalker frames filtered for
  org.openjdk.btrace.runtime.auxiliary.* to skip IndyDispatcher frames in
  getCallerClassLoader() and getCallerClass()
- HandlerRepositoryImpl: dead probe.getClass() condition simplified to
  always load probe script from bootstrap CL (where it is defined)
- IndyDispatcher: added diagnostic logging for handler resolution failures

Test plan:
  ./gradlew :btrace-instr:test            — all instrumentor tests pass
  ./gradlew :btrace-instr:test -PupdateTestData — 382 dynamic golden files regenerated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the develop-only branch model, build commands, module layout,
and PR checklist so automated review sessions always target the correct
integration branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
jbachorik and others added 6 commits April 11, 2026 17:13
…ilure

IndyDispatcher lives in the bootstrap classloader. Adding SLF4J Logger
initialization to it caused LoggerFactory.getLogger() to run during
bootstrap class init, where no SLF4J provider is available. This made
Class.forName("...IndyDispatcher") throw ExceptionInInitializerError,
which HandlerRepositoryImpl's static initializer caught and swallowed —
leaving IndyDispatcher.repository null and all @OnMethod handlers as
permanent noops.

Handler resolution failures are already logged by HandlerRepositoryImpl
on the agent-classloader side. No logging is needed in the bootstrap
dispatcher itself.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dispatch

getBytecode(true) filtered to only BCP-required methods, which excluded
@OnMethod handlers (isBcpRequired() returns false when om != null).
The bootstrap-CL probe class therefore had no handler methods, causing
IndyDispatcher → HandlerRepositoryImpl.resolveHandler() →
publicLookup().findStatic() to throw NoSuchMethodException, which was
caught and stored as UNRESOLVED, permanently installing a noop
ConstantCallSite for every instrumented call site.

Fix: also include methods where getOnMethod() != null in the
bootstrap-CL class, and include their callees transitively.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…strap class

The INDY call site descriptor (generated by Instrumentor.invokeBTraceAction)
replaces AnyType with Object for JVM stack compatibility. The bootstrap-CL
probe class must have matching method descriptors so that
publicLookup().findStatic(probeClass, name, handlerType) succeeds.

Without this, handlers using @return AnyType or AnyType[] parameters
(oneliners with args, return-value captures, etc.) resolve as UNRESOLVED
and get a permanent noop CallSite, silencing those probe points.

Apply the same AnyType→Object descriptor substitution that copy() uses
for handler bytecode sent to defineHiddenClass (the old Java-15 path).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r lookup

Two fixes:

1. BTraceProbeNode.getBytecode(true): also include @OnProbe handler methods
   (op != null) in the bootstrap-CL probe class. mapOnProbes() maps @OnProbe
   to synthetic @OnMethod entries which generate INDY call sites; the handler
   method bodies must be present in the bootstrap class for resolveHandler()
   to find them via findStatic().

2. BTraceRuntimeImpl_8.getCallerClassLoader/getCallerClass: skip
   org.openjdk.btrace.runtime.auxiliary.* frames when walking the call stack,
   mirroring BTraceRuntimeImpl_9's StackWalker-based skip of those frames.
   Before this fix, the probe handler frame (in bootstrap CL) was counted as
   the application caller, causing Class.forName inside BTraceUtils.field()
   and similar reflection utilities to use the bootstrap CL and fail with
   ClassNotFoundException for application classes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…trapClass

On JDK 8, BTraceRuntimeImpl_8.isBootstrapClass() calls
findBootstrapOrNullMtd.invoke() to check bootstrap classloader membership.
After 15 invocations the JVM inflates the reflective accessor, creating
sun/reflect/GeneratedMethodAccessorN via ClassLoader.defineClass().

This class-definition callback triggers BTraceTransformer.transform() which
calls BTraceClassWriter.getCommonSuperClass() → ClassInfo.inferClassLoader()
→ isBootstrapClass() → invoke() again. At call count > 15 the JVM tries to
inflate ANOTHER accessor (N+1), whose definition triggers another transform()
→ isBootstrapClass() → invoke() → accessor N+2, and so on indefinitely.

The resulting StackOverflowError propagates out of agent initialization,
causing testTraceAll (which instruments all classes via @OnMethod(clazz="/.*/"))
to fail on JDK 8 with "FATAL: Initialization failed: StackOverflowError".

Fix: add a ThreadLocal re-entrancy guard in isBootstrapClass(). While the
guard is set (i.e., we are already inside a bootstrap check that triggered
accessor inflation), any re-entrant call returns false conservatively.
This breaks the infinite recursion; the accessor class is defined once and
subsequent invoke() calls are served directly by the generated accessor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4d53e21

testTraceAll() on JDK 8 fails with `[BTrace Agent] FATAL: Initialization
failed: java.lang.StackOverflowError` because the agent's reflective
Method.invoke() crosses the JDK 8 inflation threshold (~15 calls per Method)
during agent init, and the resulting class-definition callback re-enters the
transformer:

  BTraceTransformer.transform(SomeAppClass)
    -> BTraceClassWriter.instrument()
       -> ASM ClassWriter.getCommonSuperClass(...)
          -> ClassInfo.inferClassLoader(...)
             -> BTraceRuntimeImpl_8.isBootstrapClass(...)
                -> findBootstrapClassOrNull.invoke(...)         <-- crosses threshold
                   -> sun.reflect.MethodAccessorGenerator.generateMethod()
                      -> ClassLoader.defineClass("sun/reflect/GeneratedMethodAccessorN")
                         -> BTraceTransformer.transform("sun/reflect/GeneratedMethodAccessorN")
                            -> BTraceClassWriter.instrument()      <-- recurses
                               -> ASM getCommonSuperClass(...)
                                  -> ... another reflective call -> another GMA -> ...
                                     -> StackOverflowError

The prior fix (commit 4d53e21) added a `ThreadLocal<Boolean>` re-entrancy
guard inside `isBootstrapClass()` to short-circuit the recursive return path.
That guard does not break the cascade for two reasons:

  (1) The ThreadLocal is set/cleared inside a try/finally around invoke().
      The JVM's deferred `defineClass` callback runs while the original
      invoke() is still on the stack — the guard IS technically set on the
      recursive entry, but returning `false` from a re-entrant
      `isBootstrapClass()` only mis-reports the bootstrap status of one
      type lookup; ASM's `getCommonSuperClass()` continues to issue OTHER
      reflective calls (Class.forName, Method lookups) during frame
      computation, each of which can trigger inflation of a DIFFERENT
      Method instance and the next `defineClass` callback.

  (2) Returning `false` for a class that IS in the bootstrap CL silently
      corrupts probe matching for the duration of the inflation window.
      Dead defensive code whose stated invariant doesn't hold is worse
      than no code: it advertises a protection that isn't there.

The structural fix: short-circuit `BTraceTransformer.transform()` for
JVM-synthesized reflective accessor classes BEFORE any reflection-driven
ASM analysis runs on them. These classes are 1:1 trampolines to a target
Method that user probes can already trace directly — there is no tracing
value to instrumenting them.

Three changes:

* btrace-instr/.../BTraceTransformer.java — early-exit at line ~130, after
  the MethodHandleNatives special case but BEFORE the loader-gated
  isSensitiveClass() check. Loader-independent because `sun.reflect.Generated*`
  is loaded by `sun.reflect.DelegatingClassLoader` (neither null nor system),
  so the existing isSensitiveClass branch never fires for it. Matches both
  JDK 8 (`sun/reflect/Generated...`) and JDK 9-16 (`jdk/internal/reflect/
  Generated...`); JDK 17+ uses hidden classes that are never reported by
  name to ClassFileTransformer.

* btrace-instr/.../ClassFilter.java — add `sun/reflect/` to SENSITIVE_CLASSES
  for symmetry with the existing `jdk/internal/`, `sun/invoke/` entries and
  as defense-in-depth: if a future refactor reorders the transformer entry
  path, the sensitive-class list still prevents instrumentation when the
  loader happens to be bootstrap or system.

* btrace-runtime/.../BTraceRuntimeImpl_8.java — REVERT the ThreadLocal guard
  added in 4d53e21. With the structural fix the recursion can't form, so
  the guard is dead code. `isBootstrapClass()` returns to its original
  one-liner, matching `Impl_9` and `Impl_11` which never had a guard and
  never reproduced the SOE — confirming the cascade is JDK-8-specific and
  now structurally cut.

Tests:

* ClassFilterSensitiveTest — pins `isSensitiveClass()` returning true for
  `sun/reflect/Generated{Method,Constructor,SerializationConstructor}Accessor`
  and `jdk/internal/reflect/Generated*`, false for ordinary classes.

* BTraceTransformerEarlyExitTest — pins the load-bearing structural early-exit
  by calling `transform()` directly with a non-null/non-system class loader
  (mock DelegatingClassLoader) and asserting the result is null. Runs on any
  JDK, so a future refactor that moves the early-exit below the loader gate
  would fail loudly on the JDK 11+ build CI even though the SOE itself is
  not reproducible there.

Reviewed via /muse implement chorus (hypnos-augur, rune-augury) — both voices
validated the structural approach over the prior MethodHandle-based plan and
flagged the 4d53e21 ThreadLocal as both incorrect and dead given the
structural fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jbachorik jbachorik force-pushed the phase3-invokedynamic-dispatch branch from 8542c81 to 7b69b4a April 13, 2026 08:29
The structural fix in 7b69b4a (early-exit for sun/reflect/Generated*)
did NOT address the actual failure mode of testTraceAll on JDK 8 —
instrumenting BTraceTransformer.transform() during a reproduction run
shows 737 transform invocations during agent init and ZERO matches
against sun/reflect/Generated*. The claimed reflection-inflation
cascade is not what's happening on the JDK 8u352/482 builds CI runs.

The actual failure is that @OnMethod(clazz="/.*/") matches
JVM-synthesized lambda wrappers (Main$$Lambda$N on JDK 8,
Main$$Lambda$N/0x<hex> on JDK 11+) as well as no-name classes
(JDK 8 Unsafe.defineAnonymousClass outputs, JDK 15+ hidden classes).
BTraceTransformer used to normalize className == null to "<anonymous>"
and let both categories through, which meant:

  Main.handleNewClient's lambda capture
    -> JVM creates synthetic Main$$Lambda$N
    -> transformer instruments its get$Lambda with Assembler.openLinkerCheck
    -> injected `LinkingFlag.get() == 0` prologue + Phase 3 INDY probe dispatch
    -> invokedynamic probe dispatch synthesises another lambda/LambdaForm
    -> that lambda is also instrumented
    -> ... unbounded recursion through the LambdaMetafactory linker
    -> StackOverflowError during agent init

LinkerInstrumentor's guardLinking/reset pair (around
MethodHandleNatives.linkCallSite) is not sufficient: it only covers
linking that happens inside those two methods, not the subsequent
invocation of a lambda whose body has been rewritten to invoke a probe
handler via indy.

Fix: in BTraceTransformer.transform(), early-exit BEFORE the
"<anonymous>" normalization for:
  * className == null (no binary name — JDK 8 host-anonymous classes,
    JDK 15+ hidden classes; never a user-authored tracing target)
  * className matches <owner>$$Lambda$<digit>... (LambdaMetafactory's
    reserved synthetic-lambda naming convention, JDK 8 and JDK 11+)
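
The two skip conditions above can be sketched as a pair of predicates. This is an illustration of the naming rule as described here, not the committed code; `SyntheticLambdaSketch` and `shouldSkip` are hypothetical names.

```java
final class SyntheticLambdaSketch {
  private static final String MARKER = "$$Lambda$";

  /** True for names of the form <owner>$$Lambda$<digit>..., with a non-empty owner. */
  static boolean isSyntheticLambda(String className) {
    if (className == null) return false;
    int i = className.indexOf(MARKER);
    if (i <= 0) return false; // marker missing, or no owner in front of it
    int digitPos = i + MARKER.length();
    return digitPos < className.length() && Character.isDigit(className.charAt(digitPos));
  }

  /** Early-exit decision: skip nameless classes and synthetic lambda wrappers. */
  static boolean shouldSkip(String className) {
    return className == null || isSyntheticLambda(className);
  }
}
```

Requiring a digit right after the marker is what keeps user classes that merely contain `$$` in their names from being skipped.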

Keep the existing sun/reflect/Generated* / jdk/internal/reflect/Generated*
skip as defense-in-depth against the theoretical accessor-inflation
cascade described in 7b69b4a, but soften the comment — that cascade was
not the failing path on JDK 8u352/482.

Tests:
* BTraceTransformerEarlyExitTest gains coverage for null className,
  JDK-8-style Lambda wrappers, JDK-11-style named-hidden Lambda wrappers,
  and the anchored isSyntheticLambda predicate (ensures user classes
  with "$$" in their names are not incorrectly skipped).
* testTraceAll passes on JDK 8.0.352-tem locally (previously failed
  with the 7b69b4a structural fix alone).
* Full :integration-tests:test suite green on JDK 8.0.352-tem (22/22
  passed, 1 Docker-gated ignore).
* :btrace-instr:test green on JDK 17 build toolchain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jbachorik jbachorik marked this pull request as ready for review April 19, 2026 14:09
jbachorik and others added 3 commits April 19, 2026 16:09
Live invokedynamic call sites must stop invoking probe handler bodies
once a probe is unregistered — otherwise the probe method can run
against torn-down BTraceRuntime state and crash the instrumented app.
This restores the crash-safety that the older cushion bytecode-rewrite
mechanism provided, but at the dispatch layer instead of via
Instrumentation.redefineClasses.

- IndyDispatcher.bootstrap() always returns a MutableCallSite (not a
  ConstantCallSite). On immediate resolution the target is the resolved
  MethodHandle; on transient failure it's a self-relinking trampoline
  that retries on each invocation until the probe is registered.
- IndyDispatcher keeps a per-probe weak-reference registry of live
  call sites. New invalidateProbe(probeClassName) sets every target
  for a probe to a type-correct noop handle and calls
  MutableCallSite.syncAll, so JIT-compiled sites drop the optimized
  target promptly.
- HandlerRepositoryImpl.unregisterProbe now calls invalidateProbe
  after clearing probeMap and the resolution cache. Dropped the
  UNRESOLVED sentinel and its eviction path: with a self-healing
  trampoline, a negative cache would just defeat the healing.
- HandlerRepositoryImpl.resolveHandler uses Class.forName(name)
  instead of the loader=null variant; bootstrap-defined probe classes
  resolve via parent delegation on any CL that has bootstrap as
  ancestor, which keeps production behaviour and makes the code
  unit-testable.
- Fixed misleading javadoc on registerProbe that claimed retry on
  re-registration — that was false for ConstantCallSite; it is now
  true for the MutableCallSite path.
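
The invalidation flow in the list above can be sketched as follows. This is a simplified stand-in, not IndyDispatcher itself: `DetachSafeSites`, `link`, `invalidate` and `demo` are hypothetical names, and the noop builder assumes handlers return void.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical sketch of a per-probe registry of live call sites. */
final class DetachSafeSites {
    private static final ConcurrentMap<String, List<WeakReference<MutableCallSite>>> SITES =
            new ConcurrentHashMap<>();

    static int calls; // bumped by the stand-in handler below

    static void handler() {
        calls++;
    }

    /** Creates a mutable site and remembers it (weakly) under its probe name. */
    static MutableCallSite link(String probe, MethodHandle target) {
        MutableCallSite site = new MutableCallSite(target);
        SITES.computeIfAbsent(probe, k -> new CopyOnWriteArrayList<>())
                .add(new WeakReference<>(site));
        return site;
    }

    /** Type-correct noop for a void-returning site type (assumption: handlers are void). */
    static MethodHandle noop(MethodType type) {
        MethodHandle nothing = MethodHandles.constant(Object.class, null)
                .asType(MethodType.methodType(void.class));
        return MethodHandles.dropArguments(nothing, 0, type.parameterList());
    }

    /** Relinks every live site of the probe to a noop, then publishes the change. */
    static void invalidate(String probe) {
        List<WeakReference<MutableCallSite>> refs = SITES.remove(probe);
        if (refs == null) {
            return;
        }
        List<MutableCallSite> live = new ArrayList<>();
        for (WeakReference<MutableCallSite> ref : refs) {
            MutableCallSite site = ref.get();
            if (site != null) {
                site.setTarget(noop(site.type()));
                live.add(site);
            }
        }
        MutableCallSite.syncAll(live.toArray(new MutableCallSite[0]));
    }

    /** Invokes a site before and after invalidation; returns how often the handler ran. */
    static int demo() {
        try {
            calls = 0;
            MethodHandle target = MethodHandles.lookup().findStatic(
                    DetachSafeSites.class, "handler", MethodType.methodType(void.class));
            MutableCallSite site = link("MyProbe", target);
            MethodHandle invoker = site.dynamicInvoker();
            invoker.invokeExact(); // reaches the handler
            invalidate("MyProbe");
            invoker.invokeExact(); // reaches the noop
            return calls;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

In this sketch only the first invocation in `demo()` reaches the handler; after `invalidate` the same invoker silently does nothing.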

DispatchBenchmark gets an instrumentedMutable variant so the
MCS-vs-CCS dispatch cost can be measured directly; a local JDK 21 run
showed 3.63 ns/op (MCS) vs 3.72 ns/op (CCS) — statistically identical,
so dropping ConstantCallSite has no meaningful throughput cost.

HandlerRepositoryImplTest goes from 3 error-path smoke tests to 10:
happy-path resolution, MCS on immediate success, trampoline self-heal
for bootstrap-before-register, null-repository safety, unregister
relinks to noop, unrelated-probe isolation, and multi-site relink.
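
The bootstrap-before-register case can be illustrated with a small self-relinking trampoline. This is a sketch under assumptions, not the code under test: `SelfHealingSite`, its `HANDLERS` map and the key string are hypothetical, and handlers are assumed to return void.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a call site that heals itself once its handler appears. */
final class SelfHealingSite {
    static final Map<String, MethodHandle> HANDLERS = new ConcurrentHashMap<>();
    static int calls;

    private static final MethodHandle RETRY;

    static {
        try {
            RETRY = MethodHandles.lookup().findStatic(SelfHealingSite.class, "retry",
                    MethodType.methodType(void.class,
                            MutableCallSite.class, String.class, Object[].class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void handler(int x) {
        calls += x;
    }

    /** Always returns a MutableCallSite: resolved target if known, trampoline otherwise. */
    static MutableCallSite bootstrap(String key, MethodType type) {
        MutableCallSite site = new MutableCallSite(type);
        MethodHandle resolved = HANDLERS.get(key);
        if (resolved != null) {
            site.setTarget(resolved.asType(type));
        } else {
            MethodHandle trampoline = MethodHandles.insertArguments(RETRY, 0, site, key)
                    .asCollector(Object[].class, type.parameterCount())
                    .asType(type);
            site.setTarget(trampoline);
        }
        return site;
    }

    /** Runs on each call while unresolved; relinks and forwards once the handler exists. */
    static void retry(MutableCallSite site, String key, Object[] args) throws Throwable {
        MethodHandle resolved = HANDLERS.get(key);
        if (resolved == null) {
            return; // still unregistered: behave as a noop and try again next call
        }
        MethodHandle typed = resolved.asType(site.type());
        site.setTarget(typed);
        typed.invokeWithArguments(args);
    }

    /** Calls the site before and after registration; returns the accumulated argument sum. */
    static int demo() {
        try {
            calls = 0;
            HANDLERS.clear();
            MethodType type = MethodType.methodType(void.class, int.class);
            MutableCallSite site = bootstrap("MyProbe#onEntry", type);
            MethodHandle invoker = site.dynamicInvoker();
            invoker.invokeExact(1);   // not registered yet: dropped
            HANDLERS.put("MyProbe#onEntry",
                    MethodHandles.lookup().findStatic(SelfHealingSite.class, "handler", type));
            invoker.invokeExact(10);  // trampoline resolves, relinks and forwards
            invoker.invokeExact(100); // goes straight to the handler
            return calls;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

A negative cache in front of `retry` would stop the second call from ever seeing the late registration, which is the reason given above for dropping the UNRESOLVED sentinel.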

Also cleaned stale CopyingVisitor references from
docs/architecture/BTraceInstrAnalysis.md and removed internal
phase-label terminology from a couple of code comments that shipped
with public builds. StatsdBenchmark updated to match the current
Statsd extension API so the benchmark module compiles again.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The project-advisor file lives under a directory named for the
chorus framework, which is now "muse". Update the directory name
to match. Pure rename, no content change.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@jbachorik jbachorik changed the title feat(instr): Phase 3 — INVOKEDYNAMIC handler isolation feat(instr): NVOKEDYNAMIC handler isolation Apr 19, 2026
@jbachorik jbachorik changed the title feat(instr): NVOKEDYNAMIC handler isolation feat(instr): INVOKEDYNAMIC handler isolation with detach-safe dispatch Apr 19, 2026
…ted string

resolveHandler() is invoked once per linked indy call site (and re-invoked on
retransformations and late class loads), so broad matchers can trigger it
thousands of times. Replace the "probe#handler+descriptor" string key with a
small HandlerKey(probe, handler, MethodType) object: one allocation per lookup
instead of four, no MethodType.toMethodDescriptorString() synthesis, and
register/unregister evict by equality on probe instead of a prefix-string scan.

HandlerKey pre-computes its hash so MethodType.hashCode()'s parameter walk
runs once per key. Java-8 compatible (plain final class, not a record).
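
A key of this shape might look like the sketch below. It follows the description above (plain final class, hash computed once in the constructor), but the field names, the `probe()` accessor and the `evictProbe` helper are assumptions rather than the committed code.

```java
import java.lang.invoke.MethodType;
import java.util.Map;

/** Hypothetical sketch of a (probe, handler, MethodType) cache key; Java 8 compatible. */
final class HandlerKey {
    private final String probe;
    private final String handler;
    private final MethodType type;
    private final int hash; // computed once, so lookups do not re-hash the MethodType

    HandlerKey(String probe, String handler, MethodType type) {
        this.probe = probe;
        this.handler = handler;
        this.type = type;
        int h = probe.hashCode();
        h = 31 * h + handler.hashCode();
        h = 31 * h + type.hashCode();
        this.hash = h;
    }

    String probe() {
        return probe;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof HandlerKey)) {
            return false;
        }
        HandlerKey other = (HandlerKey) o;
        return hash == other.hash
                && probe.equals(other.probe)
                && handler.equals(other.handler)
                && type.equals(other.type);
    }

    /** Evicts all entries of one probe by field equality instead of a string-prefix scan. */
    static void evictProbe(Map<HandlerKey, ?> cache, String probeName) {
        cache.keySet().removeIf(k -> k.probe().equals(probeName));
    }
}
```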

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jbachorik jbachorik force-pushed the phase3-invokedynamic-dispatch branch from 5aec34f to 07842ea Compare April 19, 2026 22:09
jbachorik and others added 11 commits April 20, 2026 00:50
Groundwork for making probe classes unloadable. BTraceProbeSupport caches
the Class<?> returned by defineClass; BTraceProbeNode/BTraceProbePersisted
expose it via getProbeClass() and clear it on unregister(). Subsequent
work will (a) switch HandlerRepositoryImpl.resolveHandler to read the
class via this accessor instead of Class.forName, and (b) move probes off
bootstrap CL so detach actually releases the class for unloading.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…(JDK 8, 9-14, 15+)

Move probe class definition off the bootstrap ClassLoader so that probe
classes can be unloaded when a probe is detached. The indirect dispatch
via INVOKEDYNAMIC (Tasks 1-2) already decoupled the target from the
probe class name, so isolating the probe in its own loader / hidden
class no longer requires the bootstrap CL for linkage.

Three runtime-impl paths, one per source set:

  * JDK 8 (BTraceRuntimeImpl_8): Unsafe.defineClass now always targets
    a fresh new ClassLoader(null){} whose parent is the bootstrap loader,
    rather than passing null (= the bootstrap loader itself). The
    mustBeBootstrap parameter is kept in the signature for source
    compatibility; dropping it is a follow-up.

  * JDK 9-14 (BTraceRuntimeImpl_9 and the <15 branch of
    BTraceRuntimeImpl_11): a tiny per-probe anchor class
    (org.openjdk.btrace.runtime.auxiliary.Anchor$<seq>) is generated
    and defined in a new unnamed ClassLoader. The probe is then defined
    via privateLookupIn(anchor, ...).defineClass(), placing it inside
    the isolated anchor loader. The anchor bytes are hand-assembled to
    avoid pulling ASM onto the runtime module classpath.

  * JDK 15+ (BTraceRuntimeImpl_11, version.feature() >= 15):
    MethodHandles.Lookup.defineHiddenClass(code, true) with no
    ClassOption.STRONG. The hidden class is unloadable as soon as its
    Class<?> mirror is no longer strongly reachable. The call is made
    reflectively because this source set targets JDK 11 bytecode.
    STRONG is explicitly avoided: it would tie the hidden class's
    lifetime to the (shared) Auxiliary loader and defeat unloading.

A new test, ProbeClassUnloadingTest, asserts weak reachability of the
probe Class<?> after callers drop their references. It does NOT assert
Metaspace unload — weak reachability is the reliable precondition; the
actual class unload under System.gc() is JVM-policy-dependent. The
test exercises whichever BTraceRuntimeImpl the host JDK selects.
…dd BTraceRuntimes.removeRuntime

- Extract the hand-assembled per-probe anchor class-file emitter (previously
  duplicated in BTraceRuntimeImpl_9 and _11) into a shared package-private
  ProbeAnchor in the JDK 8 source set, visible to both java9 and java11
  source sets via the main compile output. ANCHOR_SEQ lives there too.
- Drop the unused mustBeBootstrap parameter from BTraceRuntime.Impl.defineClass
  and all three implementations; every impl now creates a fresh loader/hidden
  class regardless. Update the lone caller in BTraceProbeSupport.
- Add BTraceRuntimes.removeRuntime(String) (and package-private
  BTraceRuntimeAccessImpl.removeRuntime) so callers that create a runtime
  through BTraceRuntimes.getRuntime(...) and then abort can release the
  registry's strong GC root on the runtime (and transitively its class,
  loader, and method handles). Not wired into any agent lifecycle here.
- Rename the test-only JDK 9 helper from defineClass(byte[]) to
  defineClassInAuxiliary(byte[]) to avoid a signature collision with the
  instance defineClass(byte[]); update InstrumentorTestBase accordingly.
- ProbeClassUnloadingTest: use BTraceRuntimes.removeRuntime instead of
  reflecting into BTraceRuntimeAccessImpl.runtimes; correct the class-level
  javadoc to describe the actual JDK 15+ path (defineHiddenClass(code, true)
  with no ClassOption); add a "do not inline" banner on defineAndDropProbe.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Client.initialize() registers the probe's BTraceRuntime.Impl in the static
BTraceRuntimes registry keyed by probe.getClassName() (dotted). The detach
path (onExit -> cleanupTransformers) previously unregistered the transformer
but never removed the registry entry. Every attach/detach cycle therefore
leaked one Impl (and transitively the probe Class<?> and its per-probe
ClassLoader), defeating the unloading work shipped earlier on this branch.

Pair the register with a matching BTraceRuntimes.removeRuntime(probe name)
in cleanupTransformers, after probe.unregister(). The key must be the exact
string that was passed to getRuntime() — probe.getClassName() dotted — or
the remove silently no-ops. Keeping both calls inside Client confines
registry-lifecycle management to one owner; adding the remove in
BTraceProbeNode.unregister / BTraceProbePersisted.unregister would fire on
test-harness paths that never went through BTraceRuntimes.
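
The key-matching requirement can be shown with a toy registry. This is only a sketch: `RuntimeRegistrySketch` is hypothetical, and the boolean return value of its `removeRuntime` is an assumption made so the silent no-op is visible; the real BTraceRuntimes API may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a static runtime registry keyed by dotted class name. */
final class RuntimeRegistrySketch {
    private static final Map<String, Object> RUNTIMES = new ConcurrentHashMap<>();

    /** Registers (or returns) the runtime for the given dotted probe class name. */
    static Object getRuntime(String dottedName) {
        return RUNTIMES.computeIfAbsent(dottedName, k -> new Object());
    }

    /** Removes the entry; returns false when the key does not match anything. */
    static boolean removeRuntime(String dottedName) {
        return RUNTIMES.remove(dottedName) != null;
    }

    static int size() {
        return RUNTIMES.size();
    }
}
```

With this sketch, registering `com.acme.MyProbe` and then removing `com/acme/MyProbe` leaves the entry (and everything reachable from it) in place; only the dotted form releases it.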

Verification: covered indirectly by the existing weak-reachability assertions
in ProbeClassUnloadingTest, which exercise the same removeRuntime primitive.
Adding a dedicated agent unit test was skipped because cleanupTransformers
is private on abstract Client and btrace-agent has no existing harness for
this level of integration.
HandlerRepositoryImpl kept a single static ConcurrentHashMap<HandlerKey,
MethodHandle> across all probes. Unregister had to scan its key set to drop
entries for the departing probe, and the map held MethodHandles into every
probe's Class — a static retention path that worked against per-probe class
residency.

Move the cache onto BTraceProbeSupport as an instance field. The 3-tuple key
collapses to (handlerName, MethodType) because the map itself is per-probe.
Expose via two default methods on BTraceProbe (getCachedHandler / cacheHandler)
— BTraceProbeNode and BTraceProbePersisted override to delegate to the support
object; test stubs inherit the no-op defaults. HandlerRepositoryImpl now looks
up the probe first, then consults its cache; on success it writes back via
probe.cacheHandler.

Unregister no longer scans — the per-probe cache is dropped when the probe
object is GC'd. BTraceProbeSupport.clearProbeClass() also clears the cache
explicitly so stale MethodHandles do not keep a just-released probe Class
reachable through the probe-object graph during the detach window.

Visibility choice: default methods on the public BTraceProbe interface. The
alternative (package-private accessor threaded through instanceof at the
repository) would have forced HandlerRepositoryImpl to know about every probe
implementation. Defaults cost nothing on non-production probes (test stubs)
and keep the storage hidden inside BTraceProbeSupport.
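
The default-method arrangement can be sketched like this. The method names getCachedHandler / cacheHandler come from the description above, but their parameter lists, the `ProbeSketch` and `CachingProbe` types, and the map-entry key are assumptions made for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodType;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the probe interface; defaults make caching optional. */
interface ProbeSketch {
    default MethodHandle getCachedHandler(String handlerName, MethodType type) {
        return null; // no cache: callers fall back to resolving the handler
    }

    default void cacheHandler(String handlerName, MethodType type, MethodHandle handle) {
        // no-op for implementations (for example test stubs) that keep no cache
    }
}

/** Hypothetical production-style probe that owns its cache as an instance field. */
final class CachingProbe implements ProbeSketch {
    // Keyed by (handlerName, MethodType); the probe itself is implied by the owner.
    private final Map<Map.Entry<String, MethodType>, MethodHandle> cache =
            new ConcurrentHashMap<>();

    @Override
    public MethodHandle getCachedHandler(String handlerName, MethodType type) {
        return cache.get(new SimpleImmutableEntry<>(handlerName, type));
    }

    @Override
    public void cacheHandler(String handlerName, MethodType type, MethodHandle handle) {
        cache.put(new SimpleImmutableEntry<>(handlerName, type), handle);
    }

    /** Explicit clear for the detach window, before the probe object is collected. */
    void clearCache() {
        cache.clear();
    }
}
```

Because the map hangs off the probe instance, no static structure retains the cached MethodHandles once the probe object becomes unreachable.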
… race

Add synchronized modifier to Client.loadClass to ensure mutual exclusion with
onExit (which is also synchronized). This prevents the race where onExit clears
BTraceRuntime registry entries via removeRuntime before loadClass finishes
probe class definition and <clinit>, which would cause NPE when <clinit> tries
to look up the runtime.

The race window: Thread-1 (command listener started from probe <clinit>) could
fire an ExitCommand and call onExit before Thread-0 finishes defineClass. By
synchronizing loadClass on the same monitor as onExit, Thread-1 blocks until
Thread-0 completes probe registration and class initialization.
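
The mutual exclusion can be demonstrated with a reduced model of the two methods. This is a sketch, not the Client class: `ClientSketch`, its map of runtimes and the sleep that widens the window are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical reduced model of loadClass/onExit sharing one monitor. */
final class ClientSketch {
    private final Map<String, Object> runtimes = new ConcurrentHashMap<>();
    private final CountDownLatch defining = new CountDownLatch(1);
    private volatile boolean initSawRuntime;

    /** Registers the runtime, then simulates class init that looks it up again. */
    synchronized void loadClass(String probe) throws InterruptedException {
        runtimes.put(probe, new Object());
        defining.countDown();   // let the other thread attempt onExit now
        Thread.sleep(50);       // widen the window in which the race could occur
        initSawRuntime = runtimes.get(probe) != null;
    }

    /** Same monitor as loadClass, so it cannot run in the middle of it. */
    synchronized void onExit(String probe) {
        runtimes.remove(probe);
    }

    /** True when init still saw its runtime and the entry was removed afterwards. */
    static boolean demo() {
        try {
            ClientSketch client = new ClientSketch();
            Thread exiter = new Thread(() -> {
                try {
                    client.defining.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                client.onExit("MyProbe");
            });
            exiter.start();
            client.loadClass("MyProbe");
            exiter.join();
            return client.initSawRuntime && client.runtimes.isEmpty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Dropping `synchronized` from either method in this model lets `onExit` remove the entry during the sleep, so the simulated init step would no longer find its runtime.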

Note: This fix prevents a potential race condition but does not address the
separate test failure where probe methods don't execute after the per-probe
ClassLoader change (906d924). That requires separate investigation.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Bisection identified commit 906d924 (per-probe ClassLoader), not the race
condition fix, as the change that broke the integration tests. After it, the
traced app produces no probe output at all.

Candidate root causes to investigate:
1. Visibility issue: the per-probe CL (bootstrap parent only) can't see app CL classes such as BTraceUtils
2. INDY dispatch issue: MethodHandle resolution might fail from the isolated CL
3. Output capture issue: less likely, but the test harness might not capture output from the per-probe CL

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
…eUtils visibility

Updated BTraceRuntimeImpl_8/9/11 to pass app ClassLoader as parent when creating
the per-probe ClassLoader, allowing probes to access BTraceUtils and other agent
classes without breaking isolation guarantees.

Also added debug logging to HandlerRepositoryImpl to help diagnose handler resolution issues.

Issue: testOnMethodLevel and similar tests still fail after this change,
suggesting the problem is not just ClassLoader visibility but possibly related
to INVOKEDYNAMIC dispatch or some other mechanism when probes are in isolated CLs.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The ClassLoader visibility fix (adding the app CL as parent of the per-probe
CL) did not resolve the test failures, which rules out visibility as the root
cause.

New hypothesis: The issue is INVOKEDYNAMIC dispatch or handler resolution
failure when probe is defined in an isolated ClassLoader. The fact that
tests produce ZERO output from probes (not different output) suggests the
instrumented bytecode is not invoking the probes at all, rather than the
probes running but failing internally.

Key evidence: handler cache refactoring (73c3422) was done in parallel with
per-probe CL change (906d924), and might have introduced a bug specific to
isolated CL scenarios.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>