feat(drivers): support dataplane custom driver management#42
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…vers command

- Rewrote Java JDBC probe to auto-detect all 14 DriverDefinition fields: NDV(1) added to approxCountDistinctFunction probe; new rowCountQueryStyle probe (INFORMATION_SCHEMA_ROW_COUNT, INFORMATION_SCHEMA_TABLES_WITH_SIZE, ALL_TABLES); all probes emit to JSON for Python consumption.
- Rewrote _build_yaml() with canonical key ordering (Identity → SQL dialect → Performance → Schema/catalog → Style selectors → Date arithmetic templates → Connectivity → Spark JdbcDialect → URL construction → Connection spec). Applies "non-default keys only" rule: fields equal to DriverDefinition defaults are omitted. Only insertBatchSize is excluded (write-only); maxPartitionParallelism restored as a TODO field.
- Added _derive_url_metadata() returning (port, template, url_components). connectionSpec only marks fields required when actually present in the probe URL, so portless drivers (SQLite, MongoDB) no longer get a spurious required port field.
- Default output path changed to dist/META-INF/jdbc-drivers/<prefix>.yaml. Directory is auto-created. Index file created/updated after every write (idempotent: no duplicates on re-run).
- Added package-drivers top-level command: bundles dist/ into custom-drivers.jar using Python zipfile (no jar tool required).
- Removed Source URL from generated YAML header so probe URL does not confuse the LLM when suggesting jdbcUrlTemplate values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
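The "non-default keys only" rule and the canonical ordering described in this commit can be sketched roughly as follows. This is an illustration, not the code in _build_yaml(): the default values, the shortened key order, the `prefix` key, and the helper name `select_emitted_fields` are all assumptions made for the example.

```python
# Hypothetical defaults; the real DriverDefinition defaults are not shown in this PR.
ASSUMED_DEFAULTS = {
    "approxCountDistinctFunction": "APPROX_COUNT_DISTINCT",
    "rowCountQueryStyle": "INFORMATION_SCHEMA_ROW_COUNT",
}

# Shortened stand-in for the canonical key order (assumption).
ASSUMED_KEY_ORDER = [
    "prefix",
    "approxCountDistinctFunction",
    "rowCountQueryStyle",
    "insertBatchSize",
    "jdbcUrlTemplate",
]

# Keys that are never written to the YAML, per the commit message.
EXCLUDED_KEYS = {"insertBatchSize"}


def select_emitted_fields(probed):
    """Return (key, value) pairs in canonical order, skipping default values."""
    emitted = []
    for key in ASSUMED_KEY_ORDER:
        if key in EXCLUDED_KEYS or key not in probed:
            continue
        if key in ASSUMED_DEFAULTS and probed[key] == ASSUMED_DEFAULTS[key]:
            continue  # equal to the default, so omit it
        emitted.append((key, probed[key]))
    return emitted


probed = {
    "jdbcUrlTemplate": "jdbc:postgresql://{host}:{port}/{database}",
    "rowCountQueryStyle": "ALL_TABLES",
    "approxCountDistinctFunction": "APPROX_COUNT_DISTINCT",
    "insertBatchSize": 500,
    "prefix": "postgresql",
}
fields = select_emitted_fields(probed)
```

Iterating over the key order rather than over the probe output keeps the emitted YAML stable regardless of the order in which the probe reports fields.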
…r and Spark built-ins
…alue with comment

Redshift uses a 32-bit JDBC driver and must use INT_MAX, not LONG_MAX. The generator now auto-selects INT_MAX for redshift/sqlserver/db2 prefixes with a matching comment so value and comment are always in sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
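One way to keep the emitted value and its comment in sync is to return both from a single function, so they cannot drift apart. The sketch below assumes that approach; the function name `select_data_size_limit` is hypothetical, and the comment strings are taken from the review flowchart rather than from the generator source.

```python
# Prefixes the commit message names as 32-bit JDBC drivers.
THIRTY_TWO_BIT_PREFIXES = frozenset({"redshift", "sqlserver", "db2"})


def select_data_size_limit(prefix):
    """Return the (value, comment) pair for dataSizeLimit in one step."""
    if prefix.lower() in THIRTY_TWO_BIT_PREFIXES:
        return "INT_MAX", "older 32-bit driver"
    return "LONG_MAX", "TODO: review"


value, comment = select_data_size_limit("redshift")
yaml_line = f"dataSizeLimit: {value}  # {comment}"
```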
Greptile Summary
This PR fixes a value/comment mismatch in
Confidence Score: 4/5
Safe to merge; the core fix is correct and both issues identified are low-risk.
No real-world JDBC prefix would trigger the substring false-positive today, and the todo_fields inconsistency has no user-visible impact since the display layer uses its own always_todo list. The fix correctly resolves the stated mismatch between emitted value and comment for 32-bit drivers.
qualytics/cli/generate_driver.py: specifically the prefix-detection predicate on line 871
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[generate-driver invoked] --> B[_extract_prefix\nextract jdbc: scheme token]
B --> C{prefix in 32-bit set?\nredshift / sqlserver / db2}
C -- Yes --> D[dataSizeLimit = INT_MAX\ncomment: older 32-bit driver]
C -- No --> E[dataSizeLimit = LONG_MAX\ncomment: TODO review]
D --> F[_build_yaml assembles YAML]
E --> F
F --> G[Write .yaml file to disk]
G --> H[_collect_todo_fields\nscan YAML for TODO: markers]
H --> I{LLM-assist\navailable?}
I -- Yes --> J[LLM fills remaining TODO fields]
I -- No --> K[Prompt user to review manually]
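The substring false positive mentioned in the review concerns the decision node "prefix in 32-bit set?". The sketch below contrasts a substring-style check with exact set membership. The actual predicate in generate_driver.py is not shown in this PR, so both predicates, the prefix extraction, and the made-up prefix `mydb2x` are assumptions for illustration only.

```python
THIRTY_TWO_BIT_PREFIXES = frozenset({"redshift", "sqlserver", "db2"})


def extract_prefix(jdbc_url):
    """Return the scheme token that follows 'jdbc:' (e.g. 'postgresql')."""
    parts = jdbc_url.split(":", 2)
    if len(parts) < 3 or parts[0].lower() != "jdbc" or not parts[1]:
        raise ValueError(f"not a JDBC URL: {jdbc_url!r}")
    return parts[1].lower()


def is_32bit_substring(prefix):
    """Substring-style check: matches any prefix that merely contains a known name."""
    return any(name in prefix for name in THIRTY_TWO_BIT_PREFIXES)


def is_32bit_exact(prefix):
    """Exact membership: only the listed prefixes match."""
    return prefix in THIRTY_TWO_BIT_PREFIXES


prefix = extract_prefix("jdbc:redshift://host:5439/dev")
# A made-up prefix that contains "db2" but is not DB2.
false_positive = is_32bit_substring("mydb2x") and not is_32bit_exact("mydb2x")
```

Exact membership avoids the false positive at the cost of requiring each 32-bit prefix to be listed explicitly.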
Reviews (1): Last reviewed commit: "fix(generate-driver): use INT_MAX for Re..."
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Summary
This PR introduces a complete custom dataplane driver management workflow to the Qualytics CLI, enabling operators to package and deploy third-party JDBC drivers into the Qualytics platform without manual YAML authoring.
New Commands
| Command | Description |
| --- | --- |
| qualytics drivers generate | Generates a DriverDefinition YAML |
| qualytics drivers package | Builds a custom-drivers.jar loadable by the Qualytics dataplane |

How It Works
drivers generate
- Resolves dialectClass via JAR ServiceLoader inspection and a catalog of known Spark built-ins; no manual lookup required
- Selects dataSizeLimit for known 32-bit JDBC drivers (Redshift, SQL Server, DB2 → INT_MAX; all others → LONG_MAX)
- Writes the DriverDefinition YAML to dist/META-INF/jdbc-drivers/<prefix>.yaml and maintains an index file
- … # TODO fields in the YAML before writing

drivers package
- Bundles the dist/ structure into a single custom-drivers.jar

CLI Integration
- The drivers command group is registered under the top-level qualytics command

Changes
| File | Change |
| --- | --- |
| qualytics/cli/generate_driver.py | generate and package commands (~1,600 lines) |
| qualytics/qualytics.py | drivers command group |

Test plan
- qualytics drivers generate --jar <path>.jar --url jdbc:postgresql://... --user ... --password ...: confirm YAML written to dist/META-INF/jdbc-drivers/postgresql.yaml with correct dialectClass and dataSizeLimit
- qualytics drivers generate against a Redshift JAR: confirm dataSizeLimit: INT_MAX with correct comment
- qualytics drivers package after generating: confirm custom-drivers.jar created containing META-INF/jdbc-drivers/ tree
- qualytics drivers --help: confirm both generate and package subcommands are listed
- qualytics --help: confirm drivers appears as a single top-level entry

🤖 Generated with Claude Code
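For the packaging step in the test plan, a JAR is an ordinary ZIP archive, so the bundle can be built and inspected with Python's zipfile module alone. The following is a minimal sketch under that assumption; the function name `package_drivers` and its signature are hypothetical and may differ from the command's implementation.

```python
import zipfile
from pathlib import Path


def package_drivers(dist_dir, jar_path):
    """Zip every file under dist_dir into jar_path, keeping relative paths.

    Keep jar_path outside dist_dir, otherwise the archive would include itself.
    """
    dist = Path(dist_dir)
    entries = []
    with zipfile.ZipFile(jar_path, "w", compression=zipfile.ZIP_DEFLATED) as jar:
        for path in sorted(dist.rglob("*")):
            if path.is_file():
                # Archive names use forward slashes, relative to dist/.
                arcname = path.relative_to(dist).as_posix()
                jar.write(path, arcname)
                entries.append(arcname)
    return entries
```

The resulting archive can be checked with `zipfile.ZipFile("custom-drivers.jar").namelist()`, which should list entries under META-INF/jdbc-drivers/.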