feat: add OpenLineage request logger extension#19107
Force-pushed from 690e0ef to 639b081
FrankChen021
left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
This is an automated review by Codex GPT-5
FrankChen021
left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
This is an automated review by Codex GPT-5
FrankChen021
left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 1 |
| P2 | 0 |
| P3 | 0 |
| Total | 1 |
Reviewed 11 of 11 changed files.
This is an automated review by Codex GPT-5.5
FrankChen021
left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
Reviewed 11 of 11 changed files.
This is an automated review by Codex GPT-5.5
FrankChen021
left a comment
Follow-up handled: the MSQ output extraction now uses DruidSqlParser, normalizes schema/catalog-prefixed datasource targets, skips EXTERN exports, and includes regression coverage for the cases raised.
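For readers following the thread, the target normalization described above can be pictured roughly as follows. This is a minimal sketch and not the code in this PR: the class `OutputTargetSketch`, the method `normalizeTarget`, and the convention of passing the already-parsed identifier parts as a list are all hypothetical, and the real change works on the parse tree produced by DruidSqlParser rather than on plain strings.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; names and structure are assumptions, not the PR's code.
final class OutputTargetSketch {
    private OutputTargetSketch() {}

    /**
     * Takes the dot-separated parts of an INSERT/REPLACE target, e.g. ["druid", "wikipedia"],
     * and returns the bare datasource name, or empty when there is no datasource output.
     */
    public static Optional<String> normalizeTarget(List<String> nameParts) {
        if (nameParts == null || nameParts.isEmpty()) {
            return Optional.empty();
        }
        // Assumption: an EXTERN target is an export destination, not a datasource,
        // so it contributes no output dataset to the lineage event.
        if (nameParts.size() == 1 && "EXTERN".equalsIgnoreCase(nameParts.get(0))) {
            return Optional.empty();
        }
        // Drop any catalog/schema prefix and keep only the final identifier.
        return Optional.of(nameParts.get(nameParts.size() - 1));
    }
}
```

Under these assumptions, `druid.wikipedia` and `wikipedia` both resolve to the same output dataset name, which keeps lineage events from reporting one datasource under two names.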
Reviewed 11 of 11 changed files.
This is an automated review by Codex GPT-5.5
FrankChen021
left a comment
I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.
Reviewed 13 of 13 changed files.
This is an automated review by Codex GPT-5.5
Add CTEs, deduplicated, and emits to resolve spellcheck errors
Fixes spellcheck error for openlineage-emitter compound word
Force-pushed from 1bd300d to 05ddb4a
Description
Added `extensions-contrib/openlineage-emitter` as a contrib extension that uses the `RequestLogger` to transform and send lineage information to any OpenLineage-compatible API.

For SQL queries, the SQL text is parsed with the Calcite parser to extract input datasources (FROM clauses, JOINs, CTEs) and output datasources (INSERT INTO). For native queries, table names are read from `DataSource.getTableNames()`. Native sub-queries spawned by a SQL execution are deduplicated against the SQL-level event.

Each event includes standard OpenLineage facets (`processing_engine`, `jobType`, `sql`, `errorMessage`) and custom Druid facets (`druid_query_context` with user identity and query metadata, `druid_query_statistics` with duration and bytes).

Transport is configurable: `CONSOLE` (default) logs JSON to the Druid log; `HTTP` POSTs to an OpenLineage endpoint such as Marquez. Can be combined with other loggers via the `composing` provider.

This PR has: