feat: Add remote address tracking for JDBC/Avatica SQL queries#19231
feat: Add remote address tracking for JDBC/Avatica SQL queries#19231dprmfl wants to merge 11 commits into
Conversation
17dd93d to
5aac04a
Compare
test: verify remote address capture in Avatica SQL queries test: Replace forbidden API calls in DruidAvaticaHandlerTest test: Update test expectations to include remoteAddress in query context & Update Quidem test files for remote address tracking test: Fix checkstyle violations
1563f92 to
fc1d96f
Compare
| public boolean handle(Request request, Response response, Callback callback) throws Exception | ||
| { | ||
| String requestURI = request.getHttpURI().getPath(); | ||
| String remoteAddr = Request.getRemoteAddr(request); |
There was a problem hiding this comment.
[P2] Remote address is still missing for protobuf Avatica clients
This only captures the remote address in the JSON handler, but Druid also exposes the protobuf Avatica endpoint and DruidAvaticaProtobufHandlerTest runs the same JDBC coverage with serialization=protobuf. Protobuf requests never set THREAD_LOCAL_REMOTE_ADDRESS, so openDruidConnection leaves remoteAddress absent and JDBC/protobuf queries still emit null remoteAddress, leaving half of the Avatica API unfixed.
There was a problem hiding this comment.
Fixed in 655299d — added remoteAddress collection for the protobuf Avatica endpoint as well.
| { | ||
| String remoteAddress = THREAD_LOCAL_REMOTE_ADDRESS.get(); | ||
| if (remoteAddress != null) { | ||
| context.put("remoteAddress", remoteAddress); |
There was a problem hiding this comment.
[P2] Internal remoteAddress is treated as a user context key
This adds remoteAddress to the JDBC session context before SqlQueryPlus.withContext(..., sessionContext) derives authContextKeys from sessionContext.keySet(). In clusters with query context authorization enabled, remoteAddress is not in AuthConfig.ALLOWED_CONTEXT_KEYS, so ordinary Avatica queries can now require WRITE permission on QUERY_CONTEXT remoteAddress even though the server, not the user, injected it. Pass this value outside the user context or mark it as an allowed internal key.
e49e8ea to
655299d
Compare
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 1 |
| P2 | 0 |
| P3 | 0 |
| Total | 1 |
Reviewed 12 of 12 changed files.
This is an automated review by Codex GPT-5.5
| { | ||
| String remoteAddress = THREAD_LOCAL_REMOTE_ADDRESS.get(); | ||
| if (remoteAddress != null) { | ||
| context.put("remoteAddress", remoteAddress); |
There was a problem hiding this comment.
[P1] Allow the injected remoteAddress key through context auth
This stores the server-injected remoteAddress in the JDBC session context, and later SqlQueryPlus.withContext treats every sessionContext key as an authContextKey. In clusters with authorizeQueryContextParams enabled, remoteAddress is not in AuthConfig.ALLOWED_CONTEXT_KEYS, so every Avatica query opened through this path can require WRITE permission on QUERY_CONTEXT:remoteAddress and fail authorization even though the user did not provide that key. Please either keep remoteAddress out of the user/session query context used for auth, or add this server-injected key to the allowed-context list.
There was a problem hiding this comment.
aefdd64
Thanks for the review. I've addressed this by passing remoteAddress as a dedicated parameter through the call chain instead of injecting it into the session context map.
Before:
- DruidMeta.openConnection() puts remoteAddress into the session context map
- SqlQueryPlus.withContext() picks it up as an authContextKey
- With authorizeQueryContextParams enabled, this triggers a WRITE permission check on QUERY_CONTEXT:remoteAddress
After:
- remoteAddress is removed from the session context map entirely
- Passed as a separate constructor/method parameter: DruidMeta → DruidConnection → DruidJdbcStatement / PreparedStatement → SqlStatementFactory → DirectStatement
- No longer treated as a user-supplied context key, so no unintended auth checks
This keeps it out of authContextKey entirely while still being available for metrics/logging. Verified by building only the druid-sql jar and swapping it into a running
cluster — works as expected.
(test version: 37.0.0)
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
Reviewed 10 of 10 changed files. The earlier reply follow-ups look resolved: remoteAddress is no longer injected into session context, and the protobuf handler now sets the thread-local remote address. I found one remaining issue below.
This is an automated review by Codex GPT-5.5
| final String remoteAddress | ||
| ) | ||
| { | ||
| super(lifecycleToolbox, queryPlus, null); |
There was a problem hiding this comment.
[P2] Pass remoteAddress into the PreparedStatement reporter
The Avatica path now passes remoteAddress into the PreparedStatement constructor, but the constructor still calls super(lifecycleToolbox, queryPlus, null), so the PreparedStatement own SqlExecutionReporter is created without the address. DruidJdbcPreparedStatement.close() later calls sqlStatement.close(), which emits the prepare-phase SQL log and metrics, so prepared-statement prepare logs still contain an empty remoteAddress even though execution logs have it. Pass remoteAddress to super here and tighten the prepared-statement test to assert the relevant log entries all carry a remote address.
There was a problem hiding this comment.
problem
- super(..., null) meant the prepare-phase SqlExecutionReporter was built without the address,
- DruidJdbcPreparedStatement.close() → sqlStatement.close() emitted prepare-phase logs/metrics with an empty remoteAddress.
Fixed by forwarding remoteAddress to super() in PreparedStatement's three-arg constructor
90b6a85
There was a problem hiding this comment.
Confirmed, thanks. The current head forwards remoteAddress into the PreparedStatement reporter via super(...), and the prepared-statement test now checks that every emitted prepared-statement log entry has a non-empty remote address, so this thread looks resolved.
Reviewed 10 of 10 changed files.
This is an automated review by Codex GPT-5.5
There was a problem hiding this comment.
Fixed 2 CI failures:
- Added feat: prefix to PR title
- Added missing ConnectionMetaData mock to DruidAvaticaJsonHandlerTest (needed for remoteAddress collection) 440b7aa
… logs aren't empty
gianm
left a comment
There was a problem hiding this comment.
Looks good to me. I had some minor comments. Let me know if you want to do them prior to merge or not.
| remoteAddress.contains("localhost") || | ||
| remoteAddress.contains("127.0.0.1") || | ||
| remoteAddress.contains("0:0:0:0:0:0:0:1") || | ||
| (remoteAddress.contains(".") && remoteAddress.length() >= 7) |
There was a problem hiding this comment.
It would be better to make this look specifically for IP addresses, for example by using a regex. It doesn't have to be super strict. ^\d+\.\d+\.\d+\.\d+$ is fine. The idea is just to make sure that extraneous stuff isn't included in the remoteAddress.
| this(lifecycleToolbox, queryPlus, null); | ||
| } | ||
|
|
||
| public PreparedStatement( |
There was a problem hiding this comment.
nit, but I'd mark remoteAddress as @Nullable and remove the 2-arg constructor. Generally it's nicer not to have a lot of different constructors.
| return new DirectStatement(lifecycleToolbox, sqlRequest, null); | ||
| } | ||
|
|
||
| public DirectStatement directStatement(final SqlQueryPlus sqlRequest, final String remoteAddress) |
There was a problem hiding this comment.
nit, but I'd mark remoteAddress as @Nullable and remove the 1-arg form of directStatement and preparedStatement.
FrankChen021
left a comment
There was a problem hiding this comment.
I reviewed the follow-up on the previous PreparedStatement remoteAddress thread. The current code still forwards remoteAddress into the PreparedStatement reporter and prepared-statement execution path, and the latest reply only covers CI/test-harness fixes, so no additional inline reply is needed.
Reviewed 13 of 13 changed files.
This is an automated review by Codex GPT-5.5
|
@dprmfl approved, but there are some check errors. Could you please look into it? It looks like a few of them are this: |
…vaticaHandlerTest
4787c35 to
f166fd0
Compare
FrankChen021
left a comment
There was a problem hiding this comment.
I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.
The latest update replaces forbidden String#matches calls with precompiled Pattern constants in the Avatica remote-address test.
Reviewed 13 of 13 changed files.
This is an automated review by Codex GPT-5.5
|
@FrankChen021 Hi. #19231 (review) It means, You approved this PR? |
|
@gianm |
cc. @FrankChen021 |
FrankChen021
left a comment
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
Reviewed 13 of 13 changed files.
This is an automated review by Codex GPT-5.5
| final Map<String, Object> context | ||
| ) | ||
| { | ||
| String remoteAddress = THREAD_LOCAL_REMOTE_ADDRESS.get(); |
There was a problem hiding this comment.
[P2] Do not freeze the remote address at connection open
The handlers set THREAD_LOCAL_REMOTE_ADDRESS for every Avatica RPC, but only openDruidConnection reads it and then stores the value on DruidConnection. Later createStatement, prepare, prepareAndExecute, and execute paths use that stored address, so a long-lived JDBC connection that reconnects through a different proxy/socket, or any reused Avatica connection id from another remote, has SQL request logs and the remoteAddress metric attributed to the opener instead of the actual requester. Please pass the current request address into statement/execute creation or refresh the connection address per RPC.
There was a problem hiding this comment.
4d5460d - fixed. The remote address is no longer frozen on DruidConnection at open time. Since the JSON/Protobuf handlers already set THREAD_LOCAL_REMOTE_ADDRESS on every RPC, createStatement and prepare now read it at call time and pass it down, so request logs and the remoteAddress metric are attributed to the actual caller of each request rather than the connection opener.
There was a problem hiding this comment.
Thanks for the follow-up. The connection-open freeze is fixed, but I think this is still stale one level later: createStatement and prepare now read THREAD_LOCAL_REMOTE_ADDRESS, then store that value on the statement/prepared statement. Later execution RPCs do not refresh it: DruidMeta.prepareAndExecute calls DruidJdbcStatement.execute(...), which builds the DirectStatement with DruidJdbcStatement.remoteAddress, and DruidMeta.execute calls DruidJdbcPreparedStatement.execute(...), whose PreparedStatement.execute(...) uses the address captured during prepare. So a long-lived Statement/PreparedStatement reused after reconnect or from another remote is still attributed to the create/prepare caller rather than the actual execute caller. Could you pass the current THREAD_LOCAL_REMOTE_ADDRESS.get() through the execute paths as well?
Reviewed 8 of 13 changed files for apache/druid#19231.
Fixes #19230.
Description
remoteAddress is always
nullfor JDBC/Avatica queries, even though it's a documented dimension forsqlQuery/time,sqlQuery/bytes, and sqlQuery/planningTimeMs. This PR fixes that.Problem
See #19230. In short, remoteAddress was hardcoded as
nullinPreparedStatement.javaandSqlStatementFactory.javafor the Avatica path, whileSqlResource.java(the HTTP path) correctly passeshttpRequest.getRemoteAddr().
Solution
remoteAddrfrom the HTTP request and stores it in aThreadLocal. Cleans up infinallyto prevent memory leaks.ThreadLocalinopenConnection()and injectsremoteAddressinto the connection context map.remoteAddressfrom context and passes it toDirectStatement.remoteAddressas a dimension in metrics and includes it in request logs.No changes to
DruidConnectionor any existing data structures —remoteAddresspiggybacks on the context map that's already there.Testing
Added three test methods to DruidAvaticaHandlerTest:
Release note
sqlQuery/time,sqlQuery/bytes, and sqlQuery/planningTimeMs metrics now includeremoteAddressfor JDBC/Avatica queries, consistent with the HTTP SQL API.Key changed/added classes in this PR
DruidAvaticaJsonHandler— ThreadLocal-based remote address captureDruidMeta— connection context enrichment with remote addressSqlStatementFactory— remote address extraction and propagationPreparedStatement— remote address handling for prepared statementsSqlExecutionReporter— remote address emission in metrics and logsThis PR has: