Skip to content

feat: Add remote address tracking for JDBC/Avatica SQL queries#19231

Open
dprmfl wants to merge 11 commits into
apache:masterfrom
dprmfl:19230-fix-jdbc-avatica-remote-address
Open

feat: Add remote address tracking for JDBC/Avatica SQL queries#19231
dprmfl wants to merge 11 commits into
apache:masterfrom
dprmfl:19230-fix-jdbc-avatica-remote-address

Conversation

@dprmfl

@dprmfl dprmfl commented Mar 30, 2026

Copy link
Copy Markdown

Fixes #19230.

Description

remoteAddress is always null for JDBC/Avatica queries, even though it's a documented dimension for sqlQuery/time, sqlQuery/bytes, and sqlQuery/planningTimeMs. This PR fixes that.

Problem

See #19230. In short, remoteAddress was hardcoded as null in PreparedStatement.java and SqlStatementFactory.java for the Avatica path, while SqlResource.java (the HTTP path) correctly passes
httpRequest.getRemoteAddr().

Solution

  1. DruidAvaticaJsonHandler — extracts remoteAddr from the HTTP request and stores it in a ThreadLocal. Cleans up in finally to prevent memory leaks.
  2. DruidMeta — reads from ThreadLocal in openConnection() and injects remoteAddress into the connection context map.
  3. SqlStatementFactory & PreparedStatement — reads remoteAddress from context and passes it to DirectStatement.
  4. SqlExecutionReporter — emits remoteAddress as a dimension in metrics and includes it in request logs.

No changes to DruidConnection or any existing data structures —remoteAddress piggybacks on the context map that's already there.

Testing

Added three test methods to DruidAvaticaHandlerTest:

  • testRemoteAddressInLogs — verifies remote address is captured for normal queries
  • testRemoteAddressInFailedQuery — verifies remote address is captured even when queries fail
  • testRemoteAddressInPreparedStatement — verifies remote address is captured for prepared statements

Release note

sqlQuery/time, sqlQuery/bytes, and sqlQuery/planningTimeMs metrics now include remoteAddress for JDBC/Avatica queries, consistent with the HTTP SQL API.


Key changed/added classes in this PR
  • DruidAvaticaJsonHandler — ThreadLocal-based remote address capture
  • DruidMeta — connection context enrichment with remote address
  • SqlStatementFactory — remote address extraction and propagation
  • PreparedStatement — remote address handling for prepared statements
  • SqlExecutionReporter — remote address emission in metrics and logs
image

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

@dprmfl dprmfl changed the title Add remote address tracking for JDBC Avatica SQL queries Add remote address tracking for JDBC/Avatica SQL queries Mar 30, 2026
@dprmfl dprmfl force-pushed the 19230-fix-jdbc-avatica-remote-address branch 2 times, most recently from 17dd93d to 5aac04a Compare March 31, 2026 06:14
test: verify remote address capture in Avatica SQL queries

test: Replace forbidden API calls in DruidAvaticaHandlerTest

test: Update test expectations to include remoteAddress in query context & Update Quidem test files for remote address tracking

test: Fix checkstyle violations
@dprmfl dprmfl force-pushed the 19230-fix-jdbc-avatica-remote-address branch from 1563f92 to fc1d96f Compare March 31, 2026 06:16
public boolean handle(Request request, Response response, Callback callback) throws Exception
{
String requestURI = request.getHttpURI().getPath();
String remoteAddr = Request.getRemoteAddr(request);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Remote address is still missing for protobuf Avatica clients

This only captures the remote address in the JSON handler, but Druid also exposes the protobuf Avatica endpoint and DruidAvaticaProtobufHandlerTest runs the same JDBC coverage with serialization=protobuf. Protobuf requests never set THREAD_LOCAL_REMOTE_ADDRESS, so openDruidConnection leaves remoteAddress absent and JDBC/protobuf queries still emit null remoteAddress, leaving half of the Avatica API unfixed.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 655299d — added remoteAddress collection for the protobuf Avatica endpoint as well.

{
String remoteAddress = THREAD_LOCAL_REMOTE_ADDRESS.get();
if (remoteAddress != null) {
context.put("remoteAddress", remoteAddress);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Internal remoteAddress is treated as a user context key

This adds remoteAddress to the JDBC session context before SqlQueryPlus.withContext(..., sessionContext) derives authContextKeys from sessionContext.keySet(). In clusters with query context authorization enabled, remoteAddress is not in AuthConfig.ALLOWED_CONTEXT_KEYS, so ordinary Avatica queries can now require WRITE permission on QUERY_CONTEXT remoteAddress even though the server, not the user, injected it. Pass this value outside the user context or mark it as an allowed internal key.

@dprmfl dprmfl force-pushed the 19230-fix-jdbc-avatica-remote-address branch from e49e8ea to 655299d Compare May 12, 2026 01:39

@FrankChen021 FrankChen021 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity Findings
P0 0
P1 1
P2 0
P3 0
Total 1

Reviewed 12 of 12 changed files.


This is an automated review by Codex GPT-5.5

{
String remoteAddress = THREAD_LOCAL_REMOTE_ADDRESS.get();
if (remoteAddress != null) {
context.put("remoteAddress", remoteAddress);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Allow the injected remoteAddress key through context auth

This stores the server-injected remoteAddress in the JDBC session context, and later SqlQueryPlus.withContext treats every sessionContext key as an authContextKey. In clusters with authorizeQueryContextParams enabled, remoteAddress is not in AuthConfig.ALLOWED_CONTEXT_KEYS, so every Avatica query opened through this path can require WRITE permission on QUERY_CONTEXT:remoteAddress and fail authorization even though the user did not provide that key. Please either keep remoteAddress out of the user/session query context used for auth, or add this server-injected key to the allowed-context list.

@dprmfl dprmfl May 14, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aefdd64
Thanks for the review. I've addressed this by passing remoteAddress as a dedicated parameter through the call chain instead of injecting it into the session context map.

Before:

  • DruidMeta.openConnection() puts remoteAddress into the session context map
  • SqlQueryPlus.withContext() picks it up as an authContextKey
  • With authorizeQueryContextParams enabled, this triggers a WRITE permission check on QUERY_CONTEXT:remoteAddress

After:

  • remoteAddress is removed from the session context map entirely
  • Passed as a separate constructor/method parameter: DruidMeta → DruidConnection → DruidJdbcStatement / PreparedStatement → SqlStatementFactory → DirectStatement
  • No longer treated as a user-supplied context key, so no unintended auth checks

This keeps it out of authContextKey entirely while still being available for metrics/logging. Verified by building only the druid-sql jar and swapping it into a running
cluster — works as expected.

image

(test version: 37.0.0)

@dprmfl dprmfl requested a review from FrankChen021 May 14, 2026 03:32

@FrankChen021 FrankChen021 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity Findings
P0 0
P1 0
P2 1
P3 0
Total 1

Reviewed 10 of 10 changed files. The earlier reply follow-ups look resolved: remoteAddress is no longer injected into session context, and the protobuf handler now sets the thread-local remote address. I found one remaining issue below.


This is an automated review by Codex GPT-5.5

final String remoteAddress
)
{
super(lifecycleToolbox, queryPlus, null);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Pass remoteAddress into the PreparedStatement reporter

The Avatica path now passes remoteAddress into the PreparedStatement constructor, but the constructor still calls super(lifecycleToolbox, queryPlus, null), so the PreparedStatement own SqlExecutionReporter is created without the address. DruidJdbcPreparedStatement.close() later calls sqlStatement.close(), which emits the prepare-phase SQL log and metrics, so prepared-statement prepare logs still contain an empty remoteAddress even though execution logs have it. Pass remoteAddress to super here and tighten the prepared-statement test to assert the relevant log entries all carry a remote address.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

problem

  • super(..., null) meant the prepare-phase SqlExecutionReporter was built without the address,
  • DruidJdbcPreparedStatement.close() → sqlStatement.close() emitted prepare-phase logs/metrics with an empty remoteAddress.

Fixed by forwarding remoteAddress to super() in PreparedStatement's three-arg constructor
90b6a85

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed, thanks. The current head forwards remoteAddress into the PreparedStatement reporter via super(...), and the prepared-statement test now checks that every emitted prepared-statement log entry has a non-empty remote address, so this thread looks resolved.

Reviewed 10 of 10 changed files.


This is an automated review by Codex GPT-5.5

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 2 CI failures:

  • Added feat: prefix to PR title
  • Added missing ConnectionMetaData mock to DruidAvaticaJsonHandlerTest (needed for remoteAddress collection) 440b7aa

@dprmfl dprmfl requested a review from FrankChen021 May 15, 2026 01:01
@dprmfl dprmfl changed the title Add remote address tracking for JDBC/Avatica SQL queries feat: remote address tracking for JDBC/Avatica SQL queries May 18, 2026
@dprmfl dprmfl changed the title feat: remote address tracking for JDBC/Avatica SQL queries feat: Add remote address tracking for JDBC/Avatica SQL queries May 18, 2026

@gianm gianm left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I had some minor comments. Let me know if you want to do them prior to merge or not.

remoteAddress.contains("localhost") ||
remoteAddress.contains("127.0.0.1") ||
remoteAddress.contains("0:0:0:0:0:0:0:1") ||
(remoteAddress.contains(".") && remoteAddress.length() >= 7)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to make this look specifically for IP addresses, for example by using a regex. It doesn't have to be super strict. ^\d+\.\d+\.\d+\.\d+$ is fine. The idea is just to make sure that extraneous stuff isn't included in the remoteAddress.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed a931b6e

this(lifecycleToolbox, queryPlus, null);
}

public PreparedStatement(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but I'd mark remoteAddress as @Nullable and remove the 2-arg constructor. Generally it's nicer not to have a lot of different constructors.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 8a88750

return new DirectStatement(lifecycleToolbox, sqlRequest, null);
}

public DirectStatement directStatement(final SqlQueryPlus sqlRequest, final String remoteAddress)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but I'd mark remoteAddress as @Nullable and remove the 1-arg form of directStatement and preparedStatement.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed 637e1cb

@dprmfl dprmfl requested a review from gianm May 20, 2026 07:36

@FrankChen021 FrankChen021 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the follow-up on the previous PreparedStatement remoteAddress thread. The current code still forwards remoteAddress into the PreparedStatement reporter and prepared-statement execution path, and the latest reply only covers CI/test-harness fixes, so no additional inline reply is needed.

Reviewed 13 of 13 changed files.


This is an automated review by Codex GPT-5.5

@gianm

gianm commented May 22, 2026

Copy link
Copy Markdown
Contributor

@dprmfl approved, but there are some check errors. Could you please look into it? It looks like a few of them are this:

Error:  Forbidden method invocation: java.lang.String#matches(java.lang.String) [Use startsWith(), endsWith(), contains(), or compile and cache a Pattern explicitly]
Error:    in org.apache.druid.sql.avatica.DruidAvaticaHandlerTest (DruidAvaticaHandlerTest.java:1976)
Error:  Forbidden method invocation: java.lang.String#matches(java.lang.String) [Use startsWith(), endsWith(), contains(), or compile and cache a Pattern explicitly]
Error:    in org.apache.druid.sql.avatica.DruidAvaticaHandlerTest (DruidAvaticaHandlerTest.java:1977)

@dprmfl dprmfl force-pushed the 19230-fix-jdbc-avatica-remote-address branch from 4787c35 to f166fd0 Compare May 26, 2026 00:55
@dprmfl

dprmfl commented May 26, 2026

Copy link
Copy Markdown
Author

@gianm
f166fd0
Fixed — replaced String#matches() with pre-compiled Pattern constants (IPV4_PATTERN, IPV6_PATTERN) to satisfy the forbidden-apis check.
Verified locally with mvn test-compile forbiddenapis:testCheck -pl sql and it passes. Thanks for the quick review!

@dprmfl dprmfl requested a review from gianm May 26, 2026 00:58
@dprmfl dprmfl requested a review from FrankChen021 May 27, 2026 05:44

@FrankChen021 FrankChen021 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.

The latest update replaces forbidden String#matches calls with precompiled Pattern constants in the Avatica remote-address test.

Reviewed 13 of 13 changed files.


This is an automated review by Codex GPT-5.5

@dkhwangbo

Copy link
Copy Markdown
Contributor

@FrankChen021 Hi. #19231 (review) It means, You approved this PR?

@dprmfl

dprmfl commented Jun 11, 2026

Copy link
Copy Markdown
Author

@gianm
Hi, the only change since your last approval is the forbidden-APIs fix. Could you re-approve when you get a chance? Thanks!

@dprmfl

dprmfl commented Jun 16, 2026

Copy link
Copy Markdown
Author

@gianm

❯ 📦 Process test results                                                                                                                                                      
  Preparing 1 report as configured.                                                                                                                                            
  embedded-tests/target/test-classes/org/apache/druid/testing/embedded/indexer/ITHttpInputSourceTest.class:38 | Task[ddimjpfm] failed with error[Failed in phase[segment       
generation]. See task logs for details.] ==> expected: <SUCCESS> but was: <FAILED> 3.1s                                                                                        
  ℹ️ Posting with conclusion 'failure' to https://github.com/apache/druid/pull/19231 (sha: f166fd0e0bea69325a4ab95c4e48d6204304e529)                                           
🚀 Publish results                                                                                                                                                             
  ℹ️ - Test Report - test-jdk25-[I*,A*,U*] - 18602 tests run, 17847 passed, 754 skipped, 1 failed.                                                                             
     🧪 - embedded-tests/target/test-classes/org/apache/druid/testing/embedded/indexer/ITHttpInputSourceTest.class | Task[ddimjpfm] failed with error[Failed in phase[segment  
generation]. See task logs for details.] ==> expected: <SUCCESS> but was: <FAILED>                                                                                             
  Error: Task[ddimjpfm] failed with error[Failed in phase[segment generation]. See task logs for details.] ==> expected: <SUCCESS> but was: <FAILED>                           
  ⏩ Skipped creation of job summary                                                                                                                                           
  Error: ❌ Tests reported 1 failures   

ITHttpInputSourceTest failure seems unrelated — it fetches from an external URL (druid.apache.org) which likely had a network issue in CI. Could a committer re-run the failed job?

cc. @FrankChen021

@FrankChen021 FrankChen021 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Severity Findings
P0 0
P1 0
P2 1
P3 0
Total 1

Reviewed 13 of 13 changed files.


This is an automated review by Codex GPT-5.5

final Map<String, Object> context
)
{
String remoteAddress = THREAD_LOCAL_REMOTE_ADDRESS.get();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Do not freeze the remote address at connection open

The handlers set THREAD_LOCAL_REMOTE_ADDRESS for every Avatica RPC, but only openDruidConnection reads it and then stores the value on DruidConnection. Later createStatement, prepare, prepareAndExecute, and execute paths use that stored address, so a long-lived JDBC connection that reconnects through a different proxy/socket, or any reused Avatica connection id from another remote, has SQL request logs and the remoteAddress metric attributed to the opener instead of the actual requester. Please pass the current request address into statement/execute creation or refresh the connection address per RPC.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4d5460d - fixed. The remote address is no longer frozen on DruidConnection at open time. Since the JSON/Protobuf handlers already set THREAD_LOCAL_REMOTE_ADDRESS on every RPC, createStatement and prepare now read it at call time and pass it down, so request logs and the remoteAddress metric are attributed to the actual caller of each request rather than the connection opener.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow-up. The connection-open freeze is fixed, but I think this is still stale one level later: createStatement and prepare now read THREAD_LOCAL_REMOTE_ADDRESS, then store that value on the statement/prepared statement. Later execution RPCs do not refresh it: DruidMeta.prepareAndExecute calls DruidJdbcStatement.execute(...), which builds the DirectStatement with DruidJdbcStatement.remoteAddress, and DruidMeta.execute calls DruidJdbcPreparedStatement.execute(...), whose PreparedStatement.execute(...) uses the address captured during prepare. So a long-lived Statement/PreparedStatement reused after reconnect or from another remote is still attributed to the create/prepare caller rather than the actual execute caller. Could you pass the current THREAD_LOCAL_REMOTE_ADDRESS.get() through the execute paths as well?

Reviewed 8 of 13 changed files for apache/druid#19231.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sqlQuery metrics always emit null remoteAddress for JDBC/Avatica queries

4 participants