feat: add INLINE + ARROW_STREAM format support for analytics plugin #256
jamesbroadhead wants to merge 6 commits into main
Some serverless warehouses only support ARROW_STREAM with INLINE disposition, but the analytics plugin only offered JSON_ARRAY (INLINE) and ARROW_STREAM (EXTERNAL_LINKS). This adds a new "ARROW_STREAM" format option that uses INLINE disposition, making the plugin compatible with these warehouses. Fixes #242
Tests verify:
- ARROW_STREAM format passes INLINE disposition + ARROW_STREAM format
- ARROW format passes EXTERNAL_LINKS disposition + ARROW_STREAM format
- Default JSON format does not pass disposition or format overrides
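The mapping those tests describe can be sketched as below. This is a minimal illustration, not the plugin's actual code: the function name `formatParametersFor` and the `FormatParameters` interface are hypothetical, and only the three format/disposition pairings come from the description above.

```typescript
// Hypothetical names for illustration; the plugin's real types may differ.
type AnalyticsFormat = "JSON" | "ARROW" | "ARROW_STREAM";

interface FormatParameters {
  disposition: "INLINE" | "EXTERNAL_LINKS";
  format: "JSON_ARRAY" | "ARROW_STREAM";
}

// Returns the overrides to merge into the statement execution request.
function formatParametersFor(
  format: AnalyticsFormat,
): Partial<FormatParameters> {
  switch (format) {
    case "ARROW_STREAM":
      // New option: Arrow results returned inline in the response.
      return { disposition: "INLINE", format: "ARROW_STREAM" };
    case "ARROW":
      // Existing option: Arrow results fetched via external links.
      return { disposition: "EXTERNAL_LINKS", format: "ARROW_STREAM" };
    case "JSON":
      // Default JSON passes no disposition or format overrides.
      return {};
  }
}
```

Returning an empty object for JSON mirrors the third test case: the request is left to whatever the server treats as its default.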
The server-side ARROW_STREAM format added in the previous commit was not exposed to the frontend or typegen:
- Add "ARROW_STREAM" to AnalyticsFormat in appkit-ui hooks
- Add "arrow_stream" to DataFormat in chart types
- Handle "arrow_stream" in useChartData's resolveFormat()
- Make typegen resilient to ARROW_STREAM-only warehouses by retrying DESCRIBE QUERY without format when JSON_ARRAY is rejected
Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
…compatibility
ARROW_STREAM with INLINE disposition is the only format that works across all warehouse types, including serverless warehouses that reject JSON_ARRAY. Change the default from JSON to ARROW_STREAM throughout:
- Server: defaults.ts, analytics plugin request handler
- Client: useAnalyticsQuery, UseAnalyticsQueryOptions, useChartData
- Tests: update assertions for new default
JSON and ARROW formats remain available via explicit format parameter.
Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
When using the default ARROW_STREAM format, the analytics plugin now automatically falls back through formats if the warehouse rejects one: ARROW_STREAM → JSON → ARROW. This handles warehouses that only support a subset of format/disposition combinations without requiring users to know their warehouse's capabilities. Explicit format requests (JSON, ARROW) are respected without fallback.
Co-authored-by: Isaac
Signed-off-by: James Broadhead <jamesbroadhead@gmail.com>
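A rough sketch of the fallback chain described in this commit follows. The names `queryWithFallback`, `RunQuery`, and `isFormatRejection` are hypothetical, and how a rejected format is detected is an assumption left to the caller. The sketch also assumes fallback applies only when no format is passed at all; the commit message does not say how an explicit ARROW_STREAM request is treated.

```typescript
type AnalyticsFormat = "JSON" | "ARROW" | "ARROW_STREAM";

// Order taken from the commit message: ARROW_STREAM, then JSON, then ARROW.
const FALLBACK_ORDER: readonly AnalyticsFormat[] = [
  "ARROW_STREAM",
  "JSON",
  "ARROW",
];

// Hypothetical callback that executes the query with a given format.
type RunQuery<T> = (format: AnalyticsFormat) => Promise<T>;

async function queryWithFallback<T>(
  run: RunQuery<T>,
  // Assumption: the caller knows how to recognise a format rejection.
  isFormatRejection: (err: unknown) => boolean,
  requested?: AnalyticsFormat,
): Promise<T> {
  // An explicit format is respected as-is, with no fallback.
  if (requested !== undefined) {
    return run(requested);
  }
  let lastError: unknown;
  for (const format of FALLBACK_ORDER) {
    try {
      return await run(format);
    } catch (err) {
      // Unrelated failures (bad SQL, auth, timeouts) are not retried.
      if (!isFormatRejection(err)) throw err;
      lastError = err;
    }
  }
  // Every format was rejected: surface the last rejection.
  throw lastError;
}
```

Only format rejections move on to the next entry, so a failure unrelated to the format is reported once instead of being retried three times.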
/** Supported data formats for analytics queries */
- export type DataFormat = "json" | "arrow" | "auto";
+ export type DataFormat = "json" | "arrow" | "arrow_stream" | "auto";
In theory, arrow is the same as arrow_stream, so I'm not following what the problem is.
/** Format configurations in fallback order. */
private static readonly FORMAT_CONFIGS = {
  ARROW_STREAM: {
    formatParameters: { disposition: "INLINE", format: "ARROW_STREAM" },
From the API docs:
https://docs.databricks.com/api/workspace/statementexecution/executestatement#format
Important: The formats ARROW_STREAM and CSV are supported only with EXTERNAL_LINKS disposition. JSON_ARRAY is supported in INLINE and EXTERNAL_LINKS disposition.
So before any of these changes, the plugin already supported Arrow. Can you share the case where this was failing? I would like to see it.
Seeing that there's a case of Arrow + inline, let's refactor what we had instead of introducing a new format. Let's change the format "ARROW" to "ARROW_STREAM" and allow it to use both "EXTERNAL_LINKS" and "INLINE". Then for now let's keep JSON + inline as the default. This might require some UI hooks changes too.
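One possible shape for that refactor is sketched below. It is an illustration only: `QueryFormatOptions` and `resolveRequest` are hypothetical names, and defaulting ARROW_STREAM to EXTERNAL_LINKS (to match what "ARROW" did before) is an assumption the comment above does not state.

```typescript
type Disposition = "INLINE" | "EXTERNAL_LINKS";

// "ARROW" is renamed to "ARROW_STREAM"; no separate inline-only format.
type AnalyticsFormat = "JSON" | "ARROW_STREAM";

// Hypothetical options shape for illustration.
interface QueryFormatOptions {
  format?: AnalyticsFormat;
  disposition?: Disposition;
}

interface ResolvedRequest {
  format: "JSON_ARRAY" | "ARROW_STREAM";
  disposition: Disposition;
}

function resolveRequest(opts: QueryFormatOptions = {}): ResolvedRequest {
  // JSON + INLINE stays the default, as the review asks.
  const format = opts.format ?? "JSON";
  if (format === "JSON") {
    return { format: "JSON_ARRAY", disposition: opts.disposition ?? "INLINE" };
  }
  // Assumption: without an explicit disposition, ARROW_STREAM keeps the
  // EXTERNAL_LINKS behaviour that "ARROW" had before the rename.
  return {
    format: "ARROW_STREAM",
    disposition: opts.disposition ?? "EXTERNAL_LINKS",
  };
}
```

With this shape, a caller on a warehouse that only accepts inline Arrow would pass `{ format: "ARROW_STREAM", disposition: "INLINE" }`, and existing JSON callers would not need to change anything.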
Summary
- Some serverless warehouses only support ARROW_STREAM with INLINE disposition, but the analytics plugin only offered JSON_ARRAY (INLINE) and ARROW_STREAM (EXTERNAL_LINKS)
- Adds a new "ARROW_STREAM" format option that uses INLINE disposition, making the plugin compatible with these warehouses
- Updates the AnalyticsFormat type to include "ARROW_STREAM"
Test plan
- useAnalyticsQuery with format: "ARROW_STREAM" returns results
- "JSON" and "ARROW" formats are unaffected
Fixes #242
This pull request was AI-assisted by Isaac.