[telemetry improvement]: emit sql_operation, auth_type, driver_connection_params - ship even if null#396

Open
samikshya-db wants to merge 3 commits into main from samikshya/telemetry-align-receiver-schema

samikshya-db (Collaborator) commented May 27, 2026

Summary

Although these fields can be recovered by closely monitoring logs, it is better to populate them up front. The producer and aggregator already had the data; the exporter was dropping it, so receiver-side rows were showing up with an empty sql_operation and missing auth_type / driver_connection_params.

  • sql_operation.operation_detail.operation_type — on both connection (CREATE_SESSION / DELETE_SESSION) and statement (EXECUTE_STATEMENT / LIST_*) events. Already aggregated, just not exported.
  • sql_operation.is_compressed — surfaced from the aggregator's existing compressed flag (true-on-any-CloudFetch-chunk).
  • sql_operation.execution_result: mapResultFormatToTelemetryType(undefined) now resolves to FORMAT_UNSPECIFIED (matching its existing default-case fallback) instead of undefined. This makes the existing if (metric.resultFormat || ...) gate fire for DDL/DML and for SELECTs that were closed without iterating rows, so every statement-complete row gets a populated sql_operation block.
  • Top-level auth_type — from DriverConfiguration.authType.
  • New top-level driver_connection_params block — host_info.host_url, http_path, enable_arrow, enable_direct_results, socket_timeout, enable_metric_view_metadata, cloud_fetch_enabled, lz4_enabled, retry_max_attempts, cloud_fetch_concurrent_downloads. Field names mirror the JDBC DriverConnectionParameters schema.
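The FORMAT_UNSPECIFIED fallback described in the list above can be sketched as follows. This is a minimal illustration, not the driver's actual code: only the function name mapResultFormatToTelemetryType and the FORMAT_UNSPECIFIED value come from this description. The input strings, the other enum values, the StatementMetricSketch shape and the buildSqlOperation helper are assumptions made for the example.

```typescript
// Illustrative enum values; only FORMAT_UNSPECIFIED is confirmed by the PR text.
type TelemetryResultFormat =
  | "FORMAT_UNSPECIFIED"
  | "INLINE_ARROW"
  | "EXTERNAL_LINKS"
  | "COLUMNAR_INLINE";

// Sketch of the mapping: anything unrecognised, including undefined,
// falls through to the default case instead of returning undefined.
function mapResultFormatToTelemetryType(format?: string): TelemetryResultFormat {
  switch (format) {
    case "ARROW_BASED_SET": // hypothetical input name
      return "INLINE_ARROW";
    case "URL_BASED_SET": // hypothetical input name
      return "EXTERNAL_LINKS";
    case "COLUMN_BASED_SET": // hypothetical input name
      return "COLUMNAR_INLINE";
    default:
      return "FORMAT_UNSPECIFIED";
  }
}

// Hypothetical aggregated metric for one statement.
interface StatementMetricSketch {
  operationType: string;
  resultFormat?: string;
  compressed?: boolean;
}

// Hypothetical helper: builds the sql_operation block for every statement,
// even when no result-set metadata was ever observed.
function buildSqlOperation(metric: StatementMetricSketch) {
  return {
    execution_result: mapResultFormatToTelemetryType(metric.resultFormat),
    is_compressed: metric.compressed ?? null,
    operation_detail: { operation_type: metric.operationType },
  };
}
```

With this shape, a DDL statement that never reports a result format still yields a sql_operation block whose execution_result is FORMAT_UNSPECIFIED.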

auth_type and driver_connection_params are gated behind the same includeCorrelation (authenticated-export) guard as system_configuration: host_url and http_path are workspace-correlated identifiers and must not ship on the unauthenticated endpoint. The existing test "omits workspace_id, session_id, statement_id from unauth payload" continues to enforce that contract.
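A rough sketch of that guard, assuming a single builder function for connection events. The includeCorrelation flag and the emitted field names come from this description; DriverConfigSketch, ConnectionMetricSketch and buildConnectionEntry are hypothetical names, and emitting null for unset values is an assumption based on the "ship even if null" wording in the title.

```typescript
// Hypothetical config shape; only authType is named in the PR text.
interface DriverConfigSketch {
  authType?: string;
  host?: string;
  httpPath?: string;
  enableArrow?: boolean;
}

// Hypothetical connection metric as it might arrive at the exporter.
interface ConnectionMetricSketch {
  sessionId: string;
  operationType: "CREATE_SESSION" | "DELETE_SESSION";
  driverConfig?: DriverConfigSketch;
}

function buildConnectionEntry(
  metric: ConnectionMetricSketch,
  includeCorrelation: boolean,
): Record<string, unknown> {
  // Safe on both endpoints: carries no workspace-correlated identifiers.
  const entry: Record<string, unknown> = {
    sql_operation: { operation_detail: { operation_type: metric.operationType } },
  };
  if (!includeCorrelation) {
    return entry; // unauthenticated export stops here
  }
  const cfg = metric.driverConfig;
  entry["session_id"] = metric.sessionId;
  // Assumption: keys are always present, with null when the value is unknown.
  entry["auth_type"] = cfg?.authType ?? null;
  entry["driver_connection_params"] = {
    host_info: { host_url: cfg?.host ?? null },
    http_path: cfg?.httpPath ?? null,
    enable_arrow: cfg?.enableArrow ?? null,
  };
  return entry;
}
```

Keeping the early return ahead of every correlated field means a new field added later is excluded from the unauthenticated payload unless someone deliberately moves it above the guard.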

driverConfig only rides on connection metrics today, so the new top-level blocks ride along with CREATE_SESSION / DELETE_SESSION events; receivers can join statement-level rows to the corresponding connection row by session_id.
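The receiver-side join mentioned above could look roughly like this. It is a sketch under assumptions: the row shapes ConnectionRow and StatementRow and the joinBySession helper are hypothetical, and the real receiver may well do this in SQL instead.

```typescript
// Hypothetical receiver-side row shapes.
interface ConnectionRow {
  session_id: string;
  auth_type: string | null;
}

interface StatementRow {
  session_id: string;
  statement_id: string;
  operation_type: string;
}

// Attach the connection-level auth_type to each statement row by session_id.
function joinBySession(statements: StatementRow[], connections: ConnectionRow[]) {
  const bySession = new Map<string, ConnectionRow>();
  for (const conn of connections) {
    // CREATE_SESSION and DELETE_SESSION rows share a session_id; keep the first seen.
    if (!bySession.has(conn.session_id)) {
      bySession.set(conn.session_id, conn);
    }
  }
  return statements.map((stmt) => ({
    ...stmt,
    auth_type: bySession.get(stmt.session_id)?.auth_type ?? null,
  }));
}
```

Statement rows whose session has no matching connection row come back with auth_type set to null rather than being dropped.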

Test plan

  • npm run type-check (build) — clean
  • npx mocha --require ts-node/register tests/unit/telemetry/*.test.ts — 300 passing
  • Re-run the lumberjack query (entry.sql_statement_id is not null) after deploy and confirm sql_operation is non-null on every statement row and carries operation_detail.operation_type
  • Confirm entry.auth_type and entry.driver_connection_params.host_info.host_url populate on CREATE_SESSION rows in the authenticated stream and remain absent in the unauthenticated stream

This pull request and its description were written by Isaac.

…tion_params

Aligns the Node.js telemetry payload with the receiver schema (JDBC parity).
Several fields were being collected by the producer/aggregator but dropped
on the floor in the exporter:

- `sql_operation.operation_detail.operation_type` (CREATE_SESSION /
  DELETE_SESSION / EXECUTE_STATEMENT / LIST_*) on both connection and
  statement events.
- `sql_operation.is_compressed` on statement events when CloudFetch
  compression observability is available.
- `sql_operation.execution_result` now resolves to `FORMAT_UNSPECIFIED`
  when result-set metadata isn't available (DDL/DML, or SELECTs closed
  without fetching), so the `sql_operation` block fires on every
  statement-complete event instead of being skipped.
- Top-level `auth_type` from `DriverConfiguration.authType`.
- New `driver_connection_params` block carrying `host_info.host_url`,
  `http_path`, `enable_arrow`, `enable_direct_results`, `socket_timeout`,
  `enable_metric_view_metadata`, `cloud_fetch_enabled`, `lz4_enabled`,
  `retry_max_attempts`, `cloud_fetch_concurrent_downloads`.

Both `auth_type` and `driver_connection_params` are gated behind the
existing authenticated-export guard (same path as `system_configuration`),
since `host_url` and `http_path` are workspace-correlated identifiers
that must not ship on the unauthenticated endpoint.

Co-authored-by: Isaac
samikshya-db changed the title from "telemetry: emit sql_operation, auth_type, driver_connection_params" to "[telemetry improvement]: emit sql_operation, auth_type, driver_connection_params - ship even if null" May 27, 2026
@github-actions

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Drop four Node-specific fields that aren't in the receiver's
`DriverConnectionParameters` proto schema:

  - cloud_fetch_enabled
  - lz4_enabled
  - retry_max_attempts
  - cloud_fetch_concurrent_downloads

These would be ignored at deserialization on the receiver side anyway.
Remaining block contains only proto-defined fields: `host_info`,
`http_path`, `enable_arrow`, `enable_direct_results`, `socket_timeout`,
`enable_metric_view_metadata`.

Co-authored-by: Isaac
The four Node-specific fields dropped in the previous commit have no
proto equivalent (CloudFetch enablement, LZ4, retry config, concurrent
downloads aren't tracked at the receiver). But two unrelated proto
fields can be filled from existing Node config:

  - `mode = "THRIFT"` — Node.js always uses Thrift transport (no SEA
    client). JDBC's `DatabricksClientType` enum is `SEA | THRIFT`.
  - `use_proxy` — derived from `ConnectionOptions.proxy`. Stored on
    DBSQLClient at `connect()` time alongside the existing host /
    httpPath / authType fields, then surfaced through a new
    `DriverConfiguration.useProxy` field.

Co-authored-by: Isaac
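The derivation of the two fields in the commit message above can be sketched as follows. ConnectionOptions.proxy, the fixed THRIFT mode and the use_proxy field come from the commit message; the ConnectionOptionsSketch shape and the deriveTransportFields helper are hypothetical, and the real code stores the flag on DBSQLClient rather than computing it in a free function.

```typescript
// Hypothetical subset of the connection options; only `proxy` matters here.
interface ConnectionOptionsSketch {
  host: string;
  path: string;
  proxy?: { host: string; port: number };
}

// Hypothetical helper: Node.js always reports THRIFT, and use_proxy reflects
// whether a proxy block was supplied when connect() was called.
function deriveTransportFields(options: ConnectionOptionsSketch) {
  return {
    mode: "THRIFT" as const,
    use_proxy: options.proxy !== undefined,
  };
}
```

For example, options that include a proxy block yield use_proxy: true, and options without one yield use_proxy: false; mode is THRIFT in both cases.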
@samikshya-db samikshya-db deployed to azure-prod May 27, 2026 15:33 — with GitHub Actions Active