From 3ce0e18e2924aa25dfcd362dec976bd379a501a1 Mon Sep 17 00:00:00 2001
From: Eddie A Tejeda <669988+eddietejeda@users.noreply.github.com>
Date: Mon, 15 Jun 2026 08:56:57 -0700
Subject: [PATCH 1/4] docs(skills): fix stale datasets create flags and add --no-input

- Replace non-existent --label/--table-name flags with --name/--description in hotdata and hotdata-analytics WORKFLOWS.md chain examples
- Fix dataset upload workflow: remove --file/--url (not supported on datasets create; parquet uploads belong in managed databases)
- Document --no-input global flag in hotdata/SKILL.md
---
 skills/hotdata-analytics/references/WORKFLOWS.md |  6 +++---
 skills/hotdata/SKILL.md                          |  2 +-
 skills/hotdata/references/WORKFLOWS.md           | 11 ++++++-----
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/skills/hotdata-analytics/references/WORKFLOWS.md b/skills/hotdata-analytics/references/WORKFLOWS.md
index 4c7a1cd..f1a8ef5 100644
--- a/skills/hotdata-analytics/references/WORKFLOWS.md
+++ b/skills/hotdata-analytics/references/WORKFLOWS.md
@@ -69,8 +69,8 @@ Land a smaller table — pick one:
 **Datasets** (CSV/JSON/URL/SQL snapshot → `datasets..`):
 
 ```bash
-hotdata datasets create --label "chain revenue slice" --sql "SELECT ..." [--table-name chain_revenue_slice]
-hotdata datasets create --label "from saved" --query-id [--table-name ...]
+hotdata datasets create --name chain_revenue_slice [--description "chain revenue slice"] --sql "SELECT ..."
+hotdata datasets create --name chain_from_saved [--description "from saved"] --query-id
 ```
 
 **Managed database** (parquet → `..`):
@@ -95,7 +95,7 @@ hotdata query "SELECT * FROM datasets.main.chain_revenue_slice WHERE ..."
 
 ### Naming and documentation
 
-- Prefer predictable `--table-name` values: `chain__`.
+- Prefer predictable `--name` values: `chain__`.
 - Record long-lived chains in **context:DATAMODEL → Derived tables (Chain)** with the **full** SQL name you use (`datasets.…` or `database.schema.table`).
 - Promote join/grain findings to **context:DATAMODEL** when they should be shared or persisted (**`hotdata`** skill).
 
diff --git a/skills/hotdata/SKILL.md b/skills/hotdata/SKILL.md
index 5413d3f..43ca98d 100644
--- a/skills/hotdata/SKILL.md
+++ b/skills/hotdata/SKILL.md
@@ -92,7 +92,7 @@ Catalog, skill decision tree, epic flows (onboard, chain, retrieval), and datase
 
 Top-level subcommands (each detailed below): **`auth`**, **`datasets`**, **`query`**, **`workspaces`**, **`connections`**, **`databases`**, **`tables`**, **`skills`**, **`results`**, **`jobs`**, **`indexes`**, **`embedding-providers`**, **`search`**, **`queries`**, **`context`**, **`completions`**. Search, indexes (bm25/vector), and embedding providers are documented in **`hotdata-search`**; query history, results, Chain, and OLAP patterns in **`hotdata-analytics`**.
 
-Global CLI options: **`--api-key`**, **`-v` / `--version`**, **`-h` / `--help`**. Hidden developer flag: **`--debug`** (verbose HTTP logs).
+Global CLI options: **`--api-key`**, **`-v` / `--version`**, **`-h` / `--help`**, **`--no-input`** (disable interactive prompts; commands that require input will error instead — useful in CI or non-TTY environments). Hidden developer flag: **`--debug`** (verbose HTTP logs).
 
 ### List Workspaces
 ```
diff --git a/skills/hotdata/references/WORKFLOWS.md b/skills/hotdata/references/WORKFLOWS.md
index fe4cfd5..c0cc481 100644
--- a/skills/hotdata/references/WORKFLOWS.md
+++ b/skills/hotdata/references/WORKFLOWS.md
@@ -51,7 +51,7 @@ End-to-end checklists. Use the linked sections for command detail and guardrails
 
 1. [ ] Run base SQL: `hotdata query "SELECT …"` — poll `hotdata query status ` if async
 2. [ ] Materialize one way:
-   - [ ] **Dataset:** `hotdata datasets create --label "…" --sql "SELECT …" [--table-name …]`
+   - [ ] **Dataset:** `hotdata datasets create --name [--description "…"] --sql "SELECT …"`
    - [ ] **Managed DB:** `hotdata databases create --name … --table …` then `hotdata databases tables load … --file ./….parquet`
 3. [ ] Copy **`full_name`** from create output (or `datasets list` **FULL NAME**)
 4. [ ] Chain: `hotdata query "SELECT … FROM WHERE …"`
@@ -96,14 +96,15 @@ Both land queryable tables in the workspace; the path depends on **format** and
 ### Workflow: dataset upload and query
 
 1. Authenticate and set workspace (`hotdata auth`, `hotdata workspaces set` if needed).
-2. Create the dataset (one source):
+2. Create the dataset — `--name` is the SQL table name (required); `--description` is the display label (optional):
 
    ```bash
-   hotdata datasets create --label "Orders" --file ./orders.csv
-   # or: --url "https://example.com/orders.parquet"
-   # or: --sql "SELECT ..." # materialize from a query
+   hotdata datasets create --name orders --sql "SELECT ..."
+   # or: --query-id
    ```
 
+   For parquet file uploads use **managed databases** instead (see below).
+
 3. Note the printed **`full_name`** (e.g. `datasets.main.orders`) — do not assume `datasets.main`.
 4. Inspect if needed: `hotdata datasets list`, `hotdata datasets `.
 5. Query:

From 7a9ee79ac86c724fa0dd2ccdd3d640f01b427e65 Mon Sep 17 00:00:00 2001
From: Eddie A Tejeda <669988+eddietejeda@users.noreply.github.com>
Date: Mon, 15 Jun 2026 09:20:26 -0700
Subject: [PATCH 2/4] docs(skills): fix datasets description and databases load syntax

- Fix hotdata-analytics WORKFLOWS: datasets source is SQL/query-id only, not CSV/JSON/URL (remove stale description)
- Fix hotdata WORKFLOWS Chain epic: databases create/load now uses --catalog and --table flags, not --name and imprecise load syntax
---
 skills/hotdata-analytics/references/WORKFLOWS.md | 2 +-
 skills/hotdata/references/WORKFLOWS.md           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/skills/hotdata-analytics/references/WORKFLOWS.md b/skills/hotdata-analytics/references/WORKFLOWS.md
index f1a8ef5..8542635 100644
--- a/skills/hotdata-analytics/references/WORKFLOWS.md
+++ b/skills/hotdata-analytics/references/WORKFLOWS.md
@@ -66,7 +66,7 @@ hotdata query "SELECT ..."
 
 Land a smaller table — pick one:
 
-**Datasets** (CSV/JSON/URL/SQL snapshot → `datasets..`):
+**Datasets** (SQL query or saved query → `datasets..`):
 
 ```bash
 hotdata datasets create --name chain_revenue_slice [--description "chain revenue slice"] --sql "SELECT ..."
diff --git a/skills/hotdata/references/WORKFLOWS.md b/skills/hotdata/references/WORKFLOWS.md
index c0cc481..8f7853d 100644
--- a/skills/hotdata/references/WORKFLOWS.md
+++ b/skills/hotdata/references/WORKFLOWS.md
@@ -52,7 +52,7 @@ End-to-end checklists. Use the linked sections for command detail and guardrails
 1. [ ] Run base SQL: `hotdata query "SELECT …"` — poll `hotdata query status ` if async
 2. [ ] Materialize one way:
    - [ ] **Dataset:** `hotdata datasets create --name [--description "…"] --sql "SELECT …"`
-   - [ ] **Managed DB:** `hotdata databases create --name … --table …` then `hotdata databases tables load … --file ./….parquet`
+   - [ ] **Managed DB:** `hotdata databases create --catalog --table ` then `hotdata databases load --catalog --table --file ./….parquet`
 3. [ ] Copy **`full_name`** from create output (or `datasets list` **FULL NAME**)
 4. [ ] Chain: `hotdata query "SELECT … FROM WHERE …"`
 5. [ ] Record stable chains in **context:DATAMODEL** when they should outlive the session

From dc4af79f378443516209e2a41fb1186eb93aab85 Mon Sep 17 00:00:00 2001
From: Eddie A Tejeda <669988+eddietejeda@users.noreply.github.com>
Date: Mon, 15 Jun 2026 09:39:16 -0700
Subject: [PATCH 3/4] docs(skills): fix stale datasets-vs-databases comparison table

- Datasets only support --sql and --query-id (not CSV/JSON/URL/file)
- Correct SQL prefix: .. not 'connection name'
- Correct managed DB CLI: databases load not databases tables load
- Fix rule of thumb: parquet files -> databases, SQL snapshot -> datasets
- Fix skill decision table: remove misleading CSV/JSON/URL/stdin label
---
 skills/hotdata/references/WORKFLOWS.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/skills/hotdata/references/WORKFLOWS.md b/skills/hotdata/references/WORKFLOWS.md
index 8f7853d..2dfb6ed 100644
--- a/skills/hotdata/references/WORKFLOWS.md
+++ b/skills/hotdata/references/WORKFLOWS.md
@@ -11,7 +11,7 @@ Load **`hotdata`** first for auth and workspace setup. Add a sub-skill only when
 | User goal | Skill | Key commands |
 |-----------|--------|----------------|
 | Login, workspaces, connections, tables, context | **`hotdata`** | `auth`, `workspaces`, `connections`, `tables`, `context` |
-| Upload CSV/JSON/URL or SQL-derived tables | **`hotdata`** | `datasets create`, `databases …` (see below) |
+| Load parquet files or materialize SQL tables | **`hotdata`** | `databases create` + `databases load`, `datasets create --sql` |
 | SQL analytics, aggregations, history, Chain | **`hotdata-analytics`** | `query`, `queries`, `results`, `datasets create --sql` |
 | BM25 / vector search, retrieval indexes | **`hotdata-search`** | `search`, `indexes create`, `embedding-providers` |
 | Geospatial / PostGIS-style SQL | **`hotdata-geospatial`** | `query` with `ST_*`, WKB columns |
@@ -84,14 +84,14 @@ Both land queryable tables in the workspace; the path depends on **format** and
 
 | | **Datasets** | **Managed databases** |
 |---|-------------|------------------------|
-| **Best for** | CSV, JSON, URL import, stdin, SQL/query snapshot | Parquet files you own; catalog-style `name.schema.table` |
-| **SQL prefix** | `datasets..` (often `datasets.main.*`) | `..` (database = connection name) |
-| **CLI** | `hotdata datasets create` | `hotdata databases create` + `databases tables load` |
-| **Declare schema up front** | No | Yes — `--table` on create (required before load on current API) |
-| **Parquet** | Yes (`--file`, `--url`, `--upload-id`) | **Only** parquet on `tables load` |
-| **Refresh upstream** | `datasets refresh` (URL/query sources) | Replace via `tables load` again |
-
-**Rule of thumb:** CSV/JSON or “upload a file from a URL” → **datasets**. Parquet catalog you control as **`mydb.public.orders`** → **databases**.
+| **Best for** | SQL or saved-query snapshot | Parquet files you own; catalog-style `alias.schema.table` |
+| **SQL prefix** | `datasets..` (often `datasets.main.*`) | `..` where catalog = `--catalog` alias |
+| **CLI** | `hotdata datasets create --sql “…”` | `hotdata databases create --catalog` + `databases load` |
+| **Declare schema up front** | No | Yes — `--table` on create (auto-declared on first `databases load`) |
+| **Parquet file uploads** | Not supported via CLI | `databases load --file` / `--url` / `--upload-id` |
+| **Refresh** | `datasets refresh` (re-runs source query) | Replace via `databases load` again |
+
+**Rule of thumb:** SQL or saved-query materialization → **datasets**. Parquet files you control as **`mydb.public.orders`** → **databases**.
 
 ### Workflow: dataset upload and query
 

From b560dfcc54e1b546ea6a2e709dfdf5b564b12c56 Mon Sep 17 00:00:00 2001
From: Eddie A Tejeda <669988+eddietejeda@users.noreply.github.com>
Date: Mon, 15 Jun 2026 09:46:04 -0700
Subject: [PATCH 4/4] fix(skills): indexes create has no --connection-id flag

create only accepts --catalog (managed DB) or --dataset-id. list and delete accept --connection-id; create does not. Pre-existing error in hotdata-search/SKILL.md.
---
 skills/hotdata-search/SKILL.md | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/skills/hotdata-search/SKILL.md b/skills/hotdata-search/SKILL.md
index 4309353..57ebfbd 100644
--- a/skills/hotdata-search/SKILL.md
+++ b/skills/hotdata-search/SKILL.md
@@ -42,26 +42,24 @@ hotdata search "" --table [--type vector] [--co
 
 ## Indexes (BM25 and vector)
 
-Indexes attach to a **connection table** (`--connection-id` + `--schema` + `--table`) or a **dataset** (`--dataset-id`). Scopes are mutually exclusive for create/delete.
+Indexes attach to a **managed database table** (`--catalog`) or a **dataset** (`--dataset-id`). Create is not supported on raw connection tables via CLI. `list` and `delete` accept `--connection-id` for connection-scoped operations.
 
 ```bash
-# List — workspace scan on connection tables (filter with -c / --schema / --table)
+# List — workspace scan (filter by connection, schema, table, or dataset)
 hotdata indexes list [--connection-id ] [--schema ] [--table ] [--workspace-id ] [--output table|json|yaml]
 hotdata indexes list --dataset-id [--workspace-id ] [--output table|json|yaml]
 
-# Managed database (catalog alias — uses the active database when the catalog matches)
+# Create — managed database table (catalog alias)
 hotdata indexes create --catalog --schema --table \
   --column --type bm25|vector \
   [--name ] [--metric l2|cosine|dot] [--async] \
   [--embedding-provider-id ] [--dimensions ] [--output-column ] [--description ]
 
-# Connection table (raw connection ID)
-hotdata indexes create --connection-id --schema --table \
-  --column --type bm25|vector [--name ] ...
-hotdata indexes delete --connection-id --schema --table --name
-
-# Dataset
+# Create — dataset
 hotdata indexes create --dataset-id --column --type bm25|vector [--name ] ...
+
+# Delete — connection table or dataset
+hotdata indexes delete --connection-id --schema --table --name
 hotdata indexes delete --dataset-id --name
 ```
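
Taken together, the four patches settle on two materialization paths: `datasets create --name … --sql …` for SQL snapshots, and `databases create --catalog … --table …` followed by `databases load … --file …` for parquet files. The sketch below strings those corrected commands into one script as a rough illustration, not a tested recipe. The `run` helper, the `DRY_RUN` switch, the two function names, and the `./orders.parquet` path are hypothetical and do not come from the patches; the flags are only those shown in the diffs above, and the form of the value passed to `--table` on `databases create` is an assumption.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it for real otherwise. Lets the script run without the CLI installed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Dataset path: --name is the SQL table name, --sql is the source query.
materialize_dataset() {
  run hotdata datasets create --name "$1" --sql "$2"
}

# Managed database path: declare the table on create, then load the parquet file.
# Assumption: --table on create accepts a bare table name.
load_parquet() {
  run hotdata databases create --catalog "$1" --table "$2"
  run hotdata databases load --catalog "$1" --table "$2" --file "$3"
}

materialize_dataset chain_revenue_slice "SELECT ..."
load_parquet mydb orders ./orders.parquet
```

With `DRY_RUN=1` the script only echoes the commands, which is a cheap way to review them before pointing the script at a real workspace with `DRY_RUN=0`.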