diff --git a/JFR.md b/JFR.md new file mode 100644 index 000000000..3516e18a2 --- /dev/null +++ b/JFR.md @@ -0,0 +1,160 @@ +# JFR Memory Monitoring + +The driver can publish DuckDB memory-usage statistics as a periodic +Java Flight Recorder (JFR) event, so any JFR-aware tool (JMC, `jfr` +CLI, async-profiler, Datadog/New Relic continuous profilers, …) can +ingest DuckDB memory metrics alongside the rest of the JVM signal. + +The feature is **strictly opt-in** per connection and is a silent +no-op on JVMs without JFR support. Nothing is emitted unless the +application both sets the JDBC property on a connection and has an +active JFR recording that enables the event. + +## Enabling emission + +Pass `jdbc_jfr_memory_monitor=` when opening a +connection: + +```java +Properties props = new Properties(); +props.setProperty(DuckDBDriver.JDBC_JFR_MEMORY_MONITOR, "pricing-service"); +Connection conn = DriverManager.getConnection("jdbc:duckdb:/tmp/pricing.db", props); +``` + +…or in the URL: + +``` +jdbc:duckdb:/tmp/pricing.db;jdbc_jfr_memory_monitor=pricing-service +``` + +The `` is an arbitrary label chosen by the application; +it is attached to every event as the `component` field so operators +can attribute memory to logical components in dashboards and queries. + +Rules: + +| Property value | Effect | +| ------------------------- | ---------------------------------------------- | +| absent | no events emitted for this connection | +| empty string | no events emitted (same as absent) | +| non-empty string | events emitted, tagged with the given value | + +The JDBC property is purely an opt-in switch and a label. It does +**not** control whether JFR is actually recording, nor the sampling +period — those are governed by JFR recording settings (see below). + +## Controlling period and enabled state + +Sampling rate and enabled state are JFR-native settings. 
Configure +them in a `.jfc` profile, via JMC, or programmatically: + +```java +try (Recording r = new Recording()) { + r.enable("duckdb.MemoryUsage").withPeriod(Duration.ofSeconds(1)); + r.start(); + // ... application work ... + r.stop(); + r.dump(Path.of("app.jfr")); +} +``` + +Equivalent `.jfc` snippet: + +```xml +<event name="duckdb.MemoryUsage"> + <setting name="enabled">true</setting> + <setting name="period">1 s</setting> +</event> +``` + +When no recording enables the event, the driver performs no sampling work: +the DuckDB `duckdb_memory()` query is never issued. + +## Event schema + +Event name: **`duckdb.MemoryUsage`**, emitted once per memory tag per +JFR tick. + +| Field | Type | Meaning | +| ----------------------- | ------ | ----------------------------------------------------------------- | +| `component` | String | Application-supplied identifier (the JDBC property value). | +| `tag` | String | DuckDB memory tag (e.g. `BASE_TABLE`, `HASH_TABLE`, `ALLOCATOR`). | +| `dbAddress` | long | Native address of the DuckDB instance; a stable per-instance id. | +| `memoryUsageBytes` | long | Bytes currently allocated for this tag. | +| `temporaryStorageBytes` | long | Bytes spilled to temporary storage for this tag. | + +Plus the standard JFR fields `startTime`, `duration`, `eventThread` +(stack traces are disabled for this event). + +## Attribution model + +The monitor is keyed on the **native DuckDB instance address** +(exposed as `dbAddress`), not on the JDBC connection: + +- One sample stream per distinct DuckDB instance, so shared memory is + not double-counted. +- `component` is captured from the first opted-in connection to an + instance; later opted-in connections to the same instance do not + change the label. +- The monitor is created when the first opted-in connection opens and + torn down when the last one closes; a subsequent `getConnection` + starts a fresh monitor. + +### When two `getConnection` calls share an instance + +| URL / operation | Same `dbAddress`? 
| +| ---------------------------------------- | -------------------------------- | +| `jdbc:duckdb:` (unnamed in-memory) | **No** — fresh instance per call | +| `jdbc:duckdb::memory:` | Yes, when `` matches | +| `jdbc:duckdb:/path/to/file.db` | Yes, when the path matches | +| `conn.duplicate()` | Yes, always | + +Connections that share an instance share a `dbAddress` and therefore +a single `component` label (the one supplied by the first opted-in +connection). For per-component attribution, open each component +against a URL that yields a fresh instance — unnamed in-memory URLs +are the simplest choice — and give each connection a unique +`jdbc_jfr_memory_monitor` value. + +## Requirements + +A JFR-capable JVM: + +- OpenJDK/HotSpot 11 and newer: JFR is included. +- Amazon Corretto 8, OpenJDK 8u272+, and several other Java 8 + distributions: JFR backport included (`jdk.jfr` package). +- JVMs without `jdk.jfr` (e.g. some stripped Java 8 builds): the + feature is a silent no-op; the `jdbc_jfr_memory_monitor` property + is ignored and no classes that depend on `jdk.jfr` are loaded. + +No additional JVM flags are required. + +## Inspecting a recording + +With the `jfr` CLI bundled with the JDK: + +``` +jfr summary app.jfr | grep duckdb.MemoryUsage +jfr print --events duckdb.MemoryUsage app.jfr | head -12 +jfr metadata app.jfr | sed -n '/class MemoryUsage/,/^}/p' +``` + +Or open `app.jfr` in JMC for an interactive view. + +## Manual verification + +Two shell scripts reproduce the above end-to-end and are the +recommended way to sanity-check a new build: + +``` +./scripts/verify-jfr.sh # Java >= 9 +./scripts/verify-jfr-java8.sh # Java 8 (covers both JFR and no-JFR paths) +``` + +Switch the active JDK first (for example +`sdk u java 25.0.3-amzn` or `sdk u java 8.0.462-amzn`). 
Each script +expects `make release` to have been run (the Java 8 script also builds +its own jars if missing) and runs the four `test_jfr_memory_event*` unit +tests. The Java ≥ 9 script then captures a live recording and verifies +the event with `jfr summary` and `jfr print`. The Java 8 script also asserts +the JFR-less fallback by running the driver with `jfr.jar` stripped from the bootclasspath. diff --git a/README.md b/README.md index 3f1b60c52..6f06e4353 100644 --- a/README.md +++ b/README.md @@ -22,3 +22,5 @@ java -cp "build/release/duckdb_jdbc_tests.jar:build/release/duckdb_jdbc.jar" or ``` Scalar function usage examples: [UDF.MD](UDF.MD) + +JFR memory monitoring usage: [JFR.md](JFR.md) diff --git a/duckdb_java.def b/duckdb_java.def index 7ae4d081b..4593c41f5 100644 --- a/duckdb_java.def +++ b/duckdb_java.def @@ -26,6 +26,7 @@ Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1connect Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1appender Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1db_1ref Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1destroy_1db_1ref +Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1db_1address Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1extension_1type Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1disconnect Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1execute diff --git a/duckdb_java.exp b/duckdb_java.exp index 1e92c19c7..8ca137161 100644 --- a/duckdb_java.exp +++ b/duckdb_java.exp @@ -23,6 +23,7 @@ _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1connect _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1appender _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1db_1ref _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1destroy_1db_1ref +_Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1db_1address _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1extension_1type _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1disconnect _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1execute diff --git a/duckdb_java.map b/duckdb_java.map index f3b476f8c..c780493dc 100644 --- a/duckdb_java.map +++ b/duckdb_java.map @@ -25,6 +25,7 @@ DUCKDB_JAVA 
{ Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1appender; Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1db_1ref; Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1destroy_1db_1ref; + Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1db_1address; Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1extension_1type; Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1disconnect; Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1execute; diff --git a/scripts/verify-jfr-java8.sh b/scripts/verify-jfr-java8.sh new file mode 100755 index 000000000..0104a0b8a --- /dev/null +++ b/scripts/verify-jfr-java8.sh @@ -0,0 +1,88 @@ +#!/usr/bin/env bash +# +# Manual verification of the JFR memory-monitoring feature on Java 8. +# +# Usage: ./scripts/verify-jfr-java8.sh +# Assumes: `java`, `javac` on PATH point at JDK 8 (e.g. `sdk u java 8.0.462-amzn`). +# `make release` has been run so that the platform-specific native +# library is available; this script rebuilds only the Java jars. + +set -euo pipefail + +REPO="$(cd "$(dirname "$0")/.." && pwd)" +RELEASE="$REPO/build/release" +BUILD="$REPO/build/java8" +WORK="$(mktemp -d -t duckdb-jfr-verify8.XXXXXX)" +trap 'rm -rf "$WORK"' EXIT + +die() { echo "error: $*" >&2; exit 1; } +step() { printf '\n== %s ==\n' "$*"; } + +# Preconditions --------------------------------------------------------------- + +java_major=$(java -version 2>&1 | awk -F\" '/version/ {split($2,a,"."); print (a[1]=="1")?a[2]:a[1]; exit}') +[[ "$java_major" == "8" ]] || die "need Java 8, got $java_major (use verify-jfr.sh for Java >= 9)" +[[ -d "$RELEASE" ]] || die "run 'make release' first (native library not found)" +NATIVE_LIB=$(ls "$RELEASE"/libduckdb_java.so_* 2>/dev/null | head -n1) \ + || die "no libduckdb_java.so_* under $RELEASE" + +# Build Java 8 jars (idempotent) --------------------------------------------- + +if [[ ! 
-f "$BUILD/duckdb_jdbc_tests.jar" ]]; then + step "building Java 8 jars" + mkdir -p "$BUILD" + (cd "$BUILD" \ + && cmake -DCMAKE_BUILD_TYPE=Release "$REPO" >/dev/null \ + && cmake --build . --target duckdb_jdbc_tests >/dev/null) + cp "$BUILD/duckdb_jdbc_nolib.jar" "$BUILD/duckdb_jdbc.jar" + jar uf "$BUILD/duckdb_jdbc.jar" -C "$(dirname "$NATIVE_LIB")" "$(basename "$NATIVE_LIB")" +fi + +JAR="$BUILD/duckdb_jdbc.jar" +TESTS="$BUILD/duckdb_jdbc_tests.jar" + +# Confirm bytecode 52 (Java 8) ------------------------------------------------ + +bc_hex=$(unzip -p "$JAR" org/duckdb/DuckDBMemoryEvent.class | od -An -N8 -tx1 | awk '{print $8}') +[[ "$bc_hex" == "34" ]] || die "expected bytecode 0x34 (Java 8), got 0x$bc_hex" + +# 1. Unit tests on Java 8 + JFR ---------------------------------------------- + +step "running JFR unit tests on Java 8 (jdk.jfr backport present)" +java -cp "$TESTS:$JAR" org/duckdb/TestDuckDBJDBC test_jfr_memory + +# 2. Fallback path: jfr.jar stripped from the bootclasspath ------------------- + +step "verifying the JFR-absent fallback path" +cat > "$WORK/NoJfrDemo.java" <<'EOF' +import java.lang.reflect.Method; +import java.sql.*; +import java.util.Properties; + +public class NoJfrDemo { + public static void main(String[] a) throws Exception { + try { Class.forName("jdk.jfr.FlightRecorder"); throw new AssertionError("JFR present"); } + catch (ClassNotFoundException ok) {} + Class.forName("org.duckdb.DuckDBDriver"); + Properties p = new Properties(); + p.setProperty("jdbc_jfr_memory_monitor", "ignored"); + try (Connection c = DriverManager.getConnection("jdbc:duckdb:", p); + Statement s = c.createStatement(); + ResultSet r = s.executeQuery("SELECT 42")) { r.next(); } + Method f = ClassLoader.class.getDeclaredMethod("findLoadedClass", String.class); + f.setAccessible(true); + ClassLoader cl = ClassLoader.getSystemClassLoader(); + if (f.invoke(cl, "org.duckdb.DuckDBMemoryMonitor") != null + || f.invoke(cl, "org.duckdb.DuckDBMemoryEvent") != null) + throw 
new AssertionError("JFR-dependent class was loaded"); + System.out.println("OK"); + } +} +EOF +javac -d "$WORK" -cp "$JAR" "$WORK/NoJfrDemo.java" + +JRE_LIB="$JAVA_HOME/jre/lib" +BOOT="$JRE_LIB/rt.jar:$JRE_LIB/jsse.jar:$JRE_LIB/jce.jar:$JRE_LIB/charsets.jar" +java -Xbootclasspath:"$BOOT" -cp "$WORK:$JAR" NoJfrDemo + +printf '\nOK\n' diff --git a/scripts/verify-jfr.sh b/scripts/verify-jfr.sh new file mode 100755 index 000000000..3ab6c3df8 --- /dev/null +++ b/scripts/verify-jfr.sh @@ -0,0 +1,74 @@ +#!/usr/bin/env bash +# +# Manual verification of the JFR memory-monitoring feature on Java >= 9. +# +# Usage: ./scripts/verify-jfr.sh +# Assumes: `java`, `javac`, `jfr` on PATH point at Java 9+ (same major version). +# `make release` has been run (artifacts under build/release). + +set -euo pipefail + +REPO="$(cd "$(dirname "$0")/.." && pwd)" +JAR="$REPO/build/release/duckdb_jdbc.jar" +TESTS="$REPO/build/release/duckdb_jdbc_tests.jar" +WORK="$(mktemp -d -t duckdb-jfr-verify.XXXXXX)" +trap 'rm -rf "$WORK"' EXIT + +die() { echo "error: $*" >&2; exit 1; } +step() { printf '\n== %s ==\n' "$*"; } + +# Preconditions --------------------------------------------------------------- + +java_major=$(java -version 2>&1 | awk -F\" '/version/ {split($2,a,"."); print (a[1]=="1")?a[2]:a[1]; exit}') +[[ "$java_major" -ge 9 ]] || die "need Java >= 9, got $java_major (use verify-jfr-java8.sh for Java 8)" +command -v jfr >/dev/null || die "'jfr' CLI not found on PATH" +[[ -f "$JAR" && -f "$TESTS" ]] || die "build artifacts missing; run 'make release' first" + +echo "java $java_major -- $JAR" + +# 1. Unit tests --------------------------------------------------------------- + +step "running JFR unit tests" +java --enable-native-access=ALL-UNNAMED \ + -cp "$TESTS:$JAR" \ + org/duckdb/TestDuckDBJDBC test_jfr_memory + +# 2. 
End-to-end demo + jfr CLI inspection ------------------------------------ + +step "capturing a live recording" +cat > "$WORK/JfrDemo.java" <<'EOF' +import java.nio.file.*; +import java.sql.*; +import java.time.Duration; +import java.util.Properties; +import jdk.jfr.Recording; + +public class JfrDemo { + public static void main(String[] a) throws Exception { + try (Recording r = new Recording()) { + r.enable("duckdb.MemoryUsage").withPeriod(Duration.ofMillis(500)); + r.start(); + Properties p = new Properties(); + p.setProperty("jdbc_jfr_memory_monitor", "verify-jfr"); + try (Connection c = DriverManager.getConnection("jdbc:duckdb:", p); + Statement s = c.createStatement()) { + s.execute("CREATE TABLE t AS SELECT range AS i FROM range(2000000)"); + Thread.sleep(2000); + } + r.stop(); + r.dump(Path.of(a[0])); + } + } +} +EOF +javac -d "$WORK" -cp "$JAR" "$WORK/JfrDemo.java" +java --enable-native-access=ALL-UNNAMED -cp "$WORK:$JAR" JfrDemo "$WORK/demo.jfr" + +step "jfr summary (expect a non-zero count for duckdb.MemoryUsage)" +jfr summary "$WORK/demo.jfr" | grep duckdb.MemoryUsage \ + || die "duckdb.MemoryUsage event not found in recording" + +step "first event (expect component=verify-jfr, non-zero dbAddress)" +jfr print --events duckdb.MemoryUsage "$WORK/demo.jfr" | sed -n '1,12p' + +printf '\nOK\n' diff --git a/src/jni/duckdb_java.cpp b/src/jni/duckdb_java.cpp index b49036531..45d8a047d 100644 --- a/src/jni/duckdb_java.cpp +++ b/src/jni/duckdb_java.cpp @@ -95,6 +95,11 @@ jobject _duckdb_jdbc_create_db_ref(JNIEnv *env, jclass, jobject conn_ref_buf) { return env->NewDirectByteBuffer(db_ref, 0); } +jlong _duckdb_jdbc_db_address(JNIEnv *env, jclass, jobject conn_ref_buf) { + auto conn_ref = get_connection_ref(env, conn_ref_buf); + return (jlong)conn_ref->db.get(); +} + void _duckdb_jdbc_destroy_db_ref(JNIEnv *env, jclass, jobject db_ref_buf) { if (nullptr == db_ref_buf) { return; diff --git a/src/jni/functions.cpp b/src/jni/functions.cpp index 2bff3ee86..160c0f71a 
100644 --- a/src/jni/functions.cpp +++ b/src/jni/functions.cpp @@ -37,6 +37,16 @@ JNIEXPORT jobject JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1db_ } } +JNIEXPORT jlong JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1db_1address(JNIEnv * env, jclass param0, jobject param1) { + try { + return _duckdb_jdbc_db_address(env, param0, param1); + } catch (const std::exception &e) { + duckdb::ErrorData error(e); + ThrowJNI(env, error.Message().c_str()); + return 0; // value is ignored by the JVM while a Java exception is pending + } +} + JNIEXPORT void JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1destroy_1db_1ref(JNIEnv * env, jclass param0, jobject param1) { try { return _duckdb_jdbc_destroy_db_ref(env, param0, param1); diff --git a/src/jni/functions.hpp b/src/jni/functions.hpp index e92e92bfc..d6b2c452f 100644 --- a/src/jni/functions.hpp +++ b/src/jni/functions.hpp @@ -21,6 +21,10 @@ jobject _duckdb_jdbc_create_db_ref(JNIEnv * env, jclass param0, jobject param1); JNIEXPORT jobject JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1create_1db_1ref(JNIEnv * env, jclass param0, jobject param1); +jlong _duckdb_jdbc_db_address(JNIEnv * env, jclass param0, jobject param1); + +JNIEXPORT jlong JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1db_1address(JNIEnv * env, jclass param0, jobject param1); + void _duckdb_jdbc_destroy_db_ref(JNIEnv * env, jclass param0, jobject param1); JNIEXPORT void JNICALL Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1destroy_1db_1ref(JNIEnv * env, jclass param0, jobject param1); diff --git a/src/main/java/org/duckdb/DuckDBConnection.java b/src/main/java/org/duckdb/DuckDBConnection.java index d51c0c00e..0bc04e942 100644 --- a/src/main/java/org/duckdb/DuckDBConnection.java +++ b/src/main/java/org/duckdb/DuckDBConnection.java @@ -48,31 +48,57 @@ public final class DuckDBConnection implements java.sql.Connection { private final boolean readOnly; private final String sessionInitSQL; + /** + * User-supplied identifier for JFR memory monitoring (the value of the + * {@value 
DuckDBDriver#JDBC_JFR_MEMORY_MONITOR} property). Either {@code null} + * (the user did not opt in, or this is the monitor's own internal duplicate + * connection) or a non-empty string; empty or absent property values are + * normalised to {@code null} in the constructor. + */ + final String monitorName; + + /** + * Native address of the underlying DuckDB instance. Captured once at construction so that + * {@link #close()} can notify {@link JfrMemoryMonitor} without an additional JNI call + * and so the JFR event can expose it as a stable per-instance identifier. + */ + final long dbAddress; + public static DuckDBConnection newConnection(String url, boolean readOnly, Properties properties) throws Exception { return newConnection(url, readOnly, null, properties); } public static DuckDBConnection newConnection(String url, boolean readOnly, String sessionInitSQL, Properties properties) throws SQLException { + // Ensure the JFR periodic memory-usage event is registered for callers + // that bypass DuckDBDriver (which also calls this in its static init). + // Idempotent and a no-op on JVMs without JFR. 
+ JfrMemoryMonitor.init(); if (null == properties) { properties = new Properties(); } String dbName = dbNameFromUrl(url); String autoCommitStr = removeOption(properties, JDBC_AUTO_COMMIT); boolean autoCommit = isStringTruish(autoCommitStr, true); + String monitorName = removeOption(properties, DuckDBDriver.JDBC_JFR_MEMORY_MONITOR); ByteBuffer nativeReference = DuckDBNative.duckdb_jdbc_startup(dbName.getBytes(UTF_8), readOnly, properties); - return new DuckDBConnection(nativeReference, url, readOnly, sessionInitSQL, autoCommit); + return new DuckDBConnection(nativeReference, url, readOnly, sessionInitSQL, autoCommit, monitorName); } private DuckDBConnection(ByteBuffer connectionReference, String url, boolean readOnly, String sessionInitSQL, - boolean autoCommit) throws SQLException { + boolean autoCommit, String monitorName) throws SQLException { this.connRef = connectionReference; this.url = url; this.readOnly = readOnly; this.autoCommit = autoCommit; this.sessionInitSQL = sessionInitSQL; + this.monitorName = (monitorName != null && !monitorName.isEmpty()) ? monitorName : null; + this.dbAddress = DuckDBNative.duckdb_jdbc_db_address(connectionReference); // Hardcoded 'true' here is intentional, autocommit is handled in stmt#execute() DuckDBNative.duckdb_jdbc_set_auto_commit(connectionReference, true); + if (this.monitorName != null) { + JfrMemoryMonitor.connectionOpened(this); + } } public Statement createStatement(int resultSetType, int resultSetConcurrency, int resultSetHoldability) @@ -99,12 +125,24 @@ public Statement createStatement() throws SQLException { } public DuckDBConnection duplicate() throws SQLException { + return duplicate(this.monitorName); + } + + /** + * Creates a duplicate connection that is invisible to the JFR memory monitor. + * Used exclusively by the monitor itself to avoid re-entrant lifecycle callbacks. 
+ */ + DuckDBConnection duplicateForMonitor() throws SQLException { + return duplicate(null); + } + + private DuckDBConnection duplicate(String monitorName) throws SQLException { checkOpen(); connRefLock.lock(); try { checkOpen(); ByteBuffer dupRef = DuckDBNative.duckdb_jdbc_connect(connRef); - return new DuckDBConnection(dupRef, url, readOnly, sessionInitSQL, autoCommit); + return new DuckDBConnection(dupRef, url, readOnly, sessionInitSQL, autoCommit, monitorName); } finally { connRefLock.unlock(); } @@ -128,6 +166,10 @@ public void close() throws SQLException { if (isClosed()) { return; } + // Notify the memory monitor only from the call that actually performed + // the disconnect, and after releasing connRefLock so the monitor's own + // native disconnect does not block the caller under our lock. + boolean notifyMonitor = false; connRefLock.lock(); try { if (isClosed()) { @@ -173,9 +215,18 @@ public void close() throws SQLException { DuckDBNative.duckdb_jdbc_disconnect(connRef); connRef = null; + notifyMonitor = (monitorName != null); } finally { connRefLock.unlock(); } + if (notifyMonitor) { + try { + JfrMemoryMonitor.connectionClosed(dbAddress); + } catch (Throwable t) { + // The connection is already disconnected at this point; a failure + // in the monitor teardown must not propagate to close() callers. 
+ } + } } public boolean isClosed() throws SQLException { diff --git a/src/main/java/org/duckdb/DuckDBDriver.java b/src/main/java/org/duckdb/DuckDBDriver.java index 1b606600c..31938bdb6 100644 --- a/src/main/java/org/duckdb/DuckDBDriver.java +++ b/src/main/java/org/duckdb/DuckDBDriver.java @@ -31,6 +31,7 @@ public class DuckDBDriver implements java.sql.Driver { public static final String JDBC_AUTO_COMMIT = "jdbc_auto_commit"; public static final String JDBC_PIN_DB = "jdbc_pin_db"; public static final String JDBC_IGNORE_UNSUPPORTED_OPTIONS = "jdbc_ignore_unsupported_options"; + public static final String JDBC_JFR_MEMORY_MONITOR = "jdbc_jfr_memory_monitor"; static final String DUCKDB_URL_PREFIX = "jdbc:duckdb:"; static final String MEMORY_DB = ":memory:"; @@ -75,6 +76,11 @@ public Thread newThread(Runnable r) { } catch (SQLException e) { throw new RuntimeException(e); } + // Eagerly register the JFR periodic memory-usage event (when JFR is available) + // so that recordings started before any monitored connection is opened see the + // event type and honor its period setting. On JVMs without JFR (e.g. stripped Java 8 builds) + // this is a silent no-op. + JfrMemoryMonitor.init(); } public Connection connect(String url, Properties info) throws SQLException { @@ -165,6 +171,12 @@ public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) throws "Do not close the DB instance after all connections to it are closed")); list.add(createDriverPropInfo(JDBC_IGNORE_UNSUPPORTED_OPTIONS, "", "Silently discard unsupported connection options")); + list.add( + createDriverPropInfo(JDBC_JFR_MEMORY_MONITOR, "", + "User-assigned identifier under which this connection's DuckDB instance is tracked" + + " in the duckdb.MemoryUsage JFR event. Leave empty to disable monitoring." + + " JFR controls the event's enabled state and period via recording settings." 
+ + " Requires a JFR-capable JVM.")); list.sort((o1, o2) -> o1.name.compareToIgnoreCase(o2.name)); return list.toArray(new DriverPropertyInfo[0]); } diff --git a/src/main/java/org/duckdb/DuckDBMemoryEvent.java b/src/main/java/org/duckdb/DuckDBMemoryEvent.java new file mode 100644 index 000000000..44f9750e7 --- /dev/null +++ b/src/main/java/org/duckdb/DuckDBMemoryEvent.java @@ -0,0 +1,64 @@ +package org.duckdb; + +import jdk.jfr.Category; +import jdk.jfr.DataAmount; +import jdk.jfr.Description; +import jdk.jfr.Event; +import jdk.jfr.Label; +import jdk.jfr.Name; +import jdk.jfr.StackTrace; + +/** + * JFR event that records DuckDB memory usage for a single memory tag. + * + *

One event is emitted per memory tag per firing interval. Emission is + * driven by JFR's periodic-event machinery: a single hook is registered via + * {@link jdk.jfr.FlightRecorder#addPeriodicEvent} in + * {@link DuckDBMemoryMonitor}, and JFR invokes it at the period configured on + * the recording. Configure both the enabled state and the period in a JFR + * configuration file or via JMC: + * + *

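<pre>{@code
+ * <event name="duckdb.MemoryUsage">
+ *   <setting name="enabled">true</setting>
+ *   <setting name="period">1 s</setting>
+ * </event>
+ * }</pre>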
+ * + *

Participation is opt-in per connection: set the + * {@link DuckDBDriver#JDBC_JFR_MEMORY_MONITOR} connection property to the + * identifier under which this connection's DuckDB instance should be tracked. + * An absent or empty value disables emission for that connection. The JDBC + * property is a pure enable/label switch; JFR controls whether and how often + * the event fires. + */ +@Name("duckdb.MemoryUsage") +@Label("DuckDB Memory Usage") +@Description("Periodic snapshot of DuckDB internal memory consumption per tag") +@Category("DuckDB") +@StackTrace(false) +final class DuckDBMemoryEvent extends Event { + + @Label("Component") + @Description( + "User-assigned identifier of the DuckDB instance (value of the jdbc_jfr_memory_monitor connection property)") + String component; + + @Label("Tag") + @Description("DuckDB internal memory tag (e.g. \"BASE_TABLE\", \"HASH_TABLE\", \"ALLOCATOR\")") + String tag; + + @Label("Database Address") + @Description("Native address of the underlying DuckDB instance; disambiguates databases when names collide") + long dbAddress; + + @Label("Memory Usage") + @Description("Bytes currently allocated for this tag") + @DataAmount(DataAmount.BYTES) + long memoryUsageBytes; + + @Label("Temporary Storage Usage") + @Description("Bytes spilled to the temporary storage for this tag") + @DataAmount(DataAmount.BYTES) + long temporaryStorageBytes; +} diff --git a/src/main/java/org/duckdb/DuckDBMemoryMonitor.java b/src/main/java/org/duckdb/DuckDBMemoryMonitor.java new file mode 100644 index 000000000..030fc84aa --- /dev/null +++ b/src/main/java/org/duckdb/DuckDBMemoryMonitor.java @@ -0,0 +1,271 @@ +package org.duckdb; + +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.util.concurrent.ConcurrentHashMap; +import java.util.logging.Level; +import java.util.logging.Logger; +import jdk.jfr.FlightRecorder; + +/** + * Manages per-database JFR memory monitors. + * + *

Activation

+ *

Monitoring is opt-in per connection. The caller supplies a user-assigned identifier + * via the {@value DuckDBDriver#JDBC_JFR_MEMORY_MONITOR} connection property; that value + * becomes the {@code component} field of every {@link DuckDBMemoryEvent} emitted for the + * connection's DuckDB instance. A monitor is created for a DuckDB instance when the first + * opted-in connection to it is opened and destroyed when the last such connection is closed. + * + *

Event scheduling

+ *

This class does not own a scheduler. It registers a single + * {@linkplain FlightRecorder#addPeriodicEvent periodic JFR hook} for + * {@link DuckDBMemoryEvent}; JFR invokes the hook at the period configured on + * the recording (e.g. {@code 1 s}) and only + * while at least one active recording has the event enabled. Consequently, the + * JDBC property is a pure enable/label switch and JFR alone governs the + * sampling rate and the enabled state of the event. + * + *

Attribution model

+ *

The monitor registry is keyed on the native DuckDB instance address so that multiple + * connections to the same underlying database share a single sample stream — avoiding + * double-counting of shared memory. The user-supplied component identifier is captured from + * the first opted-in connection and emitted on every event for that monitor. When attributing + * memory to distinct application components, use a distinct DuckDB instance per component and + * give each one a unique {@value DuckDBDriver#JDBC_JFR_MEMORY_MONITOR} value. + * + *

Thread safety

+ *

Lifecycle transitions (start+insert, stop+remove) are performed inside + * {@link ConcurrentHashMap#compute}, which provides per-key mutual exclusion. + * The JFR periodic hook iterates {@link ConcurrentHashMap#values} without + * locking and relies on the intrinsic lock of each {@link PerDbMonitor} for visibility. + */ +final class DuckDBMemoryMonitor { + + private static final Logger logger = Logger.getLogger(DuckDBMemoryMonitor.class.getName()); + + /** Registry: native DuckDB* address -> per-database monitor. */ + private static final ConcurrentHashMap<Long, PerDbMonitor> monitors = new ConcurrentHashMap<>(); + + private static boolean initialized; + + // Non-instantiable + private DuckDBMemoryMonitor() { + } + + /** + * Registers the periodic JFR hook for {@link DuckDBMemoryEvent}. Idempotent + * and called from {@link DuckDBDriver}'s static initializer so that recordings + * started before the first monitored connection still see the event type. + * Iterating an empty monitor map at each tick is cheap, so there is no + * downside to registering unconditionally. + * + *

Any failure from JFR (e.g. {@link SecurityException} under a + * {@code SecurityManager}, or an unexpected {@link Error} from a non-standard + * JFR implementation) is caught and logged: the feature must never prevent + * {@link DuckDBDriver} from loading. + */ + static synchronized void init() { + if (initialized) { + return; + } + initialized = true; + try { + FlightRecorder.addPeriodicEvent(DuckDBMemoryEvent.class, DuckDBMemoryMonitor::firePeriodicEvent); + } catch (Throwable t) { + logger.log(Level.WARNING, "JFR periodic event registration failed; memory monitoring disabled", t); + } + } + + /** + * Called when a new connection is opened with JFR memory monitoring enabled. + * The caller guarantees {@code conn.monitorName} is non-null and that + * {@link #init()} has already been called (which is the contract of + * {@link DuckDBConnection#newConnection}). + * + *

A zero {@code dbAddress} is treated as a "no such instance" sentinel and + * skipped: the native layer is expected to return a non-zero pointer for every + * live DuckDB instance, so a zero value indicates a bug or an unexpected state. + * Registering under key {@code 0} would silently alias every such connection + * into a single monitor entry. + */ + static void connectionOpened(DuckDBConnection conn) { + if (conn.dbAddress == 0L) { + logger.log(Level.FINE, "Skipping JFR memory monitor registration: native DuckDB address is 0"); + return; + } + monitors.compute(conn.dbAddress, (k, existing) -> { + PerDbMonitor m = (existing != null) ? existing : new PerDbMonitor(); + m.open(conn); + return m; + }); + } + + /** + * Called when a monitored connection is closed. Symmetric with + * {@link #connectionOpened(DuckDBConnection)}: a zero {@code dbAddress} means + * no entry was ever registered, so there is nothing to tear down. + * + * @param dbAddress the native db address captured at construction time + */ + static void connectionClosed(long dbAddress) { + if (dbAddress == 0L) { + return; + } + monitors.compute(dbAddress, (k, existing) -> { + if (existing == null) { + return null; + } + return existing.close() ? null : existing; + }); + } + + /** + * JFR-invoked hook. Runs on a JFR thread at the period configured on the + * active recording. Must never throw. + */ + private static void firePeriodicEvent() { + for (PerDbMonitor m : monitors.values()) { + try { + m.sample(); + } catch (Throwable t) { + // Defensive: a failure in one monitor must not prevent emission for others. + logger.log(Level.FINE, "JFR memory sample failed", t); + } + } + } + + /** + * Per-database monitor state. + * + *

Two threads may touch an instance: the user thread closing a monitored + * connection (via {@link #close()}, invoked inside {@link ConcurrentHashMap#compute}) + * and the JFR periodic thread sampling the database (via {@link #sample()}). + * They are mutually exclusive under the instance's intrinsic lock so that + * {@link #close()} can never free a {@link PreparedStatement} that + * {@link #sample()} is executing against. + * + *

{@link #open(DuckDBConnection)} is only invoked inside {@code compute}, which + * already serialises it against itself and {@link #close()} per key, but it still + * synchronises on {@code this} to establish a happens-before with any concurrent + * {@link #sample()}. + */ + static final class PerDbMonitor { + + private static final String QUERY = + "SELECT tag, memory_usage_bytes, temporary_storage_bytes FROM duckdb_memory()"; + + // All fields below are guarded by the instance's intrinsic lock. + private int openConnections = 0; + private DuckDBConnection monitorConn; + private PreparedStatement sampleStmt; + private String component; + private long dbAddress; + + /** + * Opens (or re-attempts opening) the monitor connection and increments + * the ref count. Failure to create the monitor connection or prepare the + * sampling statement is logged and the ref count is still incremented so + * close-balance is preserved; a subsequent {@code open()} will retry + * while {@code monitorConn == null}. + */ + synchronized void open(DuckDBConnection conn) { + if (monitorConn == null) { + DuckDBConnection mc = null; + try { + mc = conn.duplicateForMonitor(); + sampleStmt = mc.prepareStatement(QUERY); + component = conn.monitorName; + dbAddress = conn.dbAddress; + monitorConn = mc; + } catch (SQLException e) { + logger.log(Level.WARNING, "Failed to open JFR memory-monitor connection; will retry on next open()", + e); + sampleStmt = null; + if (mc != null) { + try { + mc.close(); + } catch (SQLException ce) { + // best-effort cleanup on setup failure + } + } + } + } + openConnections++; + } + + /** + * Decrements the ref count; when it reaches zero, releases the cached + * statement and monitor connection, and signals the caller to remove + * the map entry. Blocks any in-flight {@link #sample()} until it + * completes, ensuring the statement is never closed mid-execution. 
+ * + * @return {@code true} when the entry should be removed + */ + synchronized boolean close() { + if (--openConnections > 0) { + return false; + } + PreparedStatement ps = sampleStmt; + DuckDBConnection mc = monitorConn; + sampleStmt = null; + monitorConn = null; + if (ps != null) { + try { + ps.close(); + } catch (SQLException e) { + logger.log(Level.FINE, "Failed to close JFR memory-monitor statement", e); + } + } + if (mc != null) { + try { + mc.close(); + } catch (SQLException e) { + logger.log(Level.FINE, "Failed to close JFR memory-monitor connection", e); + } + } + return true; + } + + /** + * Invoked by the JFR periodic hook. Must not throw. Holds the monitor's + * intrinsic lock for the duration so that {@link #close()} cannot release + * the cached statement mid-execution. The query against {@code duckdb_memory()} + * is a quick system-table scan; any contention with a concurrent connection + * close is bounded by its runtime. + */ + synchronized void sample() { + PreparedStatement ps = sampleStmt; + if (ps == null) { + return; + } + String componentSnap = component; + long addr = dbAddress; + try (ResultSet rs = ps.executeQuery()) { + while (rs.next()) { + String tag = rs.getString(1); + long memoryUsageBytes = rs.getLong(2); + long temporaryStorageBytes = rs.getLong(3); + emitEvent(componentSnap, addr, tag, memoryUsageBytes, temporaryStorageBytes); + } + } catch (Throwable t) { + // Propagating would break JFR's periodic dispatch; log at FINE so + // operators can diagnose silent emission gaps without flooding logs. 
+ logger.log(Level.FINE, "JFR memory sample query failed", t); + } + } + + private static void emitEvent(String component, long addr, String tag, long memoryUsageBytes, + long temporaryStorageBytes) { + DuckDBMemoryEvent event = new DuckDBMemoryEvent(); + event.begin(); + event.component = component; + event.tag = tag; + event.dbAddress = addr; + event.memoryUsageBytes = memoryUsageBytes; + event.temporaryStorageBytes = temporaryStorageBytes; + event.commit(); + } + } +} diff --git a/src/main/java/org/duckdb/DuckDBNative.java b/src/main/java/org/duckdb/DuckDBNative.java index 2267bae40..30b1c0e22 100644 --- a/src/main/java/org/duckdb/DuckDBNative.java +++ b/src/main/java/org/duckdb/DuckDBNative.java @@ -149,6 +149,9 @@ private static void loadFromCurrentJarDir(String libName) throws Exception { static native void duckdb_jdbc_destroy_db_ref(ByteBuffer db_ref) throws SQLException; + /** Returns the native address of the underlying DuckDB instance as a stable identity key. */ + static native long duckdb_jdbc_db_address(ByteBuffer conn_ref) throws SQLException; + static native void duckdb_jdbc_set_auto_commit(ByteBuffer conn_ref, boolean auto_commit) throws SQLException; static native boolean duckdb_jdbc_get_auto_commit(ByteBuffer conn_ref) throws SQLException; diff --git a/src/main/java/org/duckdb/JfrMemoryMonitor.java b/src/main/java/org/duckdb/JfrMemoryMonitor.java new file mode 100644 index 000000000..83e7325d7 --- /dev/null +++ b/src/main/java/org/duckdb/JfrMemoryMonitor.java @@ -0,0 +1,58 @@ +package org.duckdb; + +import java.util.logging.Level; +import java.util.logging.Logger; + +/** + * Indirection over {@link DuckDBMemoryMonitor} that is safe to reference on JVMs without JFR + * support (e.g. stripped Java 8 builds). When {@code jdk.jfr.FlightRecorder} is not available + * at runtime, every method on this class is a silent no-op and {@code DuckDBMemoryMonitor} — + * which imports {@code jdk.jfr.*} — is never resolved. + * + *

<p>This relies on the JVM's lazy class resolution: an {@code invokestatic} against + * {@link DuckDBMemoryMonitor} only triggers resolution of that class when the instruction + * actually executes. The {@link #AVAILABLE} guard therefore prevents the JFR-dependent class + * from ever being loaded on non-JFR JVMs. + * + *

Only this class may reference {@code DuckDBMemoryMonitor} or {@code DuckDBMemoryEvent}; + * any direct reference from the other main classes would risk eager resolution on class load. + */ +final class JfrMemoryMonitor { + + private static final Logger logger = Logger.getLogger(JfrMemoryMonitor.class.getName()); + + private static final boolean AVAILABLE; + + static { + boolean available; + try { + Class.forName("jdk.jfr.FlightRecorder"); + available = true; + } catch (Throwable t) { + available = false; + logger.log(Level.FINE, "JFR memory monitor is not available on this JVM", t); + } + AVAILABLE = available; + } + + private JfrMemoryMonitor() { + } + + static void init() { + if (AVAILABLE) { + DuckDBMemoryMonitor.init(); + } + } + + static void connectionOpened(DuckDBConnection conn) { + if (AVAILABLE) { + DuckDBMemoryMonitor.connectionOpened(conn); + } + } + + static void connectionClosed(long dbAddress) { + if (AVAILABLE) { + DuckDBMemoryMonitor.connectionClosed(dbAddress); + } + } +} diff --git a/src/test/java/org/duckdb/TestDuckDBJDBC.java b/src/test/java/org/duckdb/TestDuckDBJDBC.java index 3d9572acc..9386b7f0f 100644 --- a/src/test/java/org/duckdb/TestDuckDBJDBC.java +++ b/src/test/java/org/duckdb/TestDuckDBJDBC.java @@ -40,6 +40,7 @@ import java.util.List; import java.util.Map; import java.util.Properties; +import java.util.Set; import java.util.TimeZone; import java.util.UUID; import java.util.concurrent.Callable; @@ -50,6 +51,9 @@ import java.util.logging.Logger; import javax.sql.rowset.CachedRowSet; import javax.sql.rowset.RowSetProvider; +import jdk.jfr.Recording; +import jdk.jfr.consumer.RecordedEvent; +import jdk.jfr.consumer.RecordingFile; import org.duckdb.test.TempDirectory; public class TestDuckDBJDBC { @@ -2276,6 +2280,173 @@ public static void DISABLED_test_extension_excel() throws Exception { } } + /** + * Verifies that: + *

<ul> + *
<li>No events are emitted when the {@code jdbc_jfr_memory_monitor} property is absent.</li> + *
<li>Events are emitted when the property is set, for each independent in-memory database, + * tagged with the user-supplied component identifier.</li> + *
<li>Each event carries a non-empty tag and non-negative memory values.</li> + *
</ul>
+ */ + public static void test_jfr_memory_event() throws Exception { + // --- Part 1: no property -> no events --- + try (Recording recOff = new Recording()) { + recOff.enable("duckdb.MemoryUsage").withPeriod(Duration.ofMillis(100)); + recOff.start(); + try (Connection conn = DriverManager.getConnection(JDBC_URL)) { + Thread.sleep(300); + } + recOff.stop(); + List<RecordedEvent> events = dumpEvents(recOff); + assertTrue(events.isEmpty(), "Expected no events when monitor property is not set"); + } + + // --- Part 2: property set -> events emitted under each supplied component --- + Properties propsA = new Properties(); + propsA.setProperty(DuckDBDriver.JDBC_JFR_MEMORY_MONITOR, "component-a"); + Properties propsB = new Properties(); + propsB.setProperty(DuckDBDriver.JDBC_JFR_MEMORY_MONITOR, "component-b"); + + try (Recording rec = new Recording()) { + rec.enable("duckdb.MemoryUsage").withPeriod(Duration.ofMillis(100)); + rec.start(); + + // Two independent in-memory databases, each with its own component name.
+ try (Connection conn1 = DriverManager.getConnection(JDBC_URL, propsA); + Connection conn2 = DriverManager.getConnection(JDBC_URL, propsB)) { + Thread.sleep(400); + } + + rec.stop(); + List<RecordedEvent> events = dumpEvents(rec); + assertFalse(events.isEmpty(), "Expected at least one DuckDBMemory JFR event"); + + Set<String> components = new HashSet<>(); + for (RecordedEvent event : events) { + String component = event.getString("component"); + assertTrue(component != null && !component.isEmpty(), "component field must be non-empty"); + components.add(component); + + String tag = event.getString("tag"); + assertTrue(tag != null && !tag.isEmpty(), "tag field must be non-empty"); + + long dbAddress = event.getLong("dbAddress"); + assertTrue(dbAddress != 0L, "dbAddress must be non-zero"); + + long memUsage = event.getLong("memoryUsageBytes"); + assertTrue(memUsage >= 0, "memoryUsageBytes must be >= 0"); + long tmpStorage = event.getLong("temporaryStorageBytes"); + assertTrue(tmpStorage >= 0, "temporaryStorageBytes must be >= 0"); + } + + // Each component must emit events under its own identifier. + assertTrue(components.contains("component-a") && components.contains("component-b"), + "Expected events for both component identifiers, got: " + components); + } + } + + /** + * After the last monitored connection is closed, the monitor entry must be removed so + * a subsequent recording observes no events. + */ + public static void test_jfr_memory_event_cleanup_after_close() throws Exception { + Properties props = new Properties(); + props.setProperty(DuckDBDriver.JDBC_JFR_MEMORY_MONITOR, "cleanup-test"); + + // Prime: open and close a monitored connection outside of any recording. + try (Connection conn = DriverManager.getConnection(JDBC_URL, props)) { + // no-op + } + + // Start a fresh recording with no monitored connection open.
+ try (Recording rec = new Recording()) { + rec.enable("duckdb.MemoryUsage").withPeriod(Duration.ofMillis(100)); + rec.start(); + Thread.sleep(400); + rec.stop(); + List<RecordedEvent> events = dumpEvents(rec); + assertTrue(events.isEmpty(), + "Expected no events after all monitored connections were closed, got " + events.size()); + } + } + + /** + * For a file-based database, two monitored connections share a single underlying DuckDB + * instance, so the monitor samples it once and events carry a single {@code dbAddress}. + * The monitor must remain active while at least one monitored connection is open. + */ + public static void test_jfr_memory_event_file_db_refcount() throws Exception { + try (TempDirectory dir = new TempDirectory()) { + Path dbFile = dir.path().resolve("refcount.db"); + String url = JDBC_URL + dbFile; + Properties monitored = new Properties(); + monitored.setProperty(DuckDBDriver.JDBC_JFR_MEMORY_MONITOR, "shared-db"); + + try (Recording rec = new Recording()) { + rec.enable("duckdb.MemoryUsage").withPeriod(Duration.ofMillis(100)); + rec.start(); + + Connection conn1 = DriverManager.getConnection(url, monitored); + Connection conn2 = DriverManager.getConnection(url, monitored); + try { + Thread.sleep(250); + // Close one connection; monitor must stay alive via conn2.
+ conn1.close(); + Thread.sleep(250); + } finally { + conn2.close(); + } + + rec.stop(); + List<RecordedEvent> events = dumpEvents(rec); + assertFalse(events.isEmpty(), "Expected events for the shared file-based DB"); + + Set<Long> addresses = new HashSet<>(); + Set<String> components = new HashSet<>(); + for (RecordedEvent e : events) { + addresses.add(e.getLong("dbAddress")); + components.add(e.getString("component")); + } + assertEquals(addresses.size(), 1, + "Two connections to the same file DB must share dbAddress, got " + addresses); + assertEquals(components, new HashSet<>(singletonList("shared-db")), + "All events must be tagged with the supplied component, got: " + components); + } + } + } + + /** + * An empty {@code jdbc_jfr_memory_monitor} value must be treated as "not set" and emit + * no events, mirroring the absent-property behaviour. + */ + public static void test_jfr_memory_event_empty_property_disables() throws Exception { + Properties props = new Properties(); + props.setProperty(DuckDBDriver.JDBC_JFR_MEMORY_MONITOR, ""); + + try (Recording rec = new Recording()) { + rec.enable("duckdb.MemoryUsage").withPeriod(Duration.ofMillis(100)); + rec.start(); + try (Connection conn = DriverManager.getConnection(JDBC_URL, props)) { + Thread.sleep(300); + } + rec.stop(); + List<RecordedEvent> events = dumpEvents(rec); + assertTrue(events.isEmpty(), + "Expected no events when jdbc_jfr_memory_monitor is empty, got " + events.size()); + } + } + + private static List<RecordedEvent> dumpEvents(Recording rec) throws Exception { + Path jfrPath = Files.createTempFile("duckdb-jfr-", ".jfr"); + try { + rec.dump(jfrPath); + return RecordingFile.readAllEvents(jfrPath); + } finally { + Files.deleteIfExists(jfrPath); + } + } + public static void main(String[] args) throws Exception { String arg1 = args.length > 0 ? args[0] : ""; final int statusCode;