109 changes: 109 additions & 0 deletions docs/etcd-backend-schema.md
@@ -0,0 +1,109 @@
# Iceberg REST Catalog — etcd Backend Schema

This document describes the etcd key/value schema used by the Iceberg REST Catalog to track namespaces and tables.

## Key Prefixes

For the `default` catalog, keys use bare prefixes:

| Prefix | Purpose |
|--------|---------|
| `n/` | Namespace entries |
| `t/` | Table entries |

For non-default catalogs, the catalog name is prepended: `<catalogName>/n/` and `<catalogName>/t/`.

Defined in `EtcdCatalog.java`:

```java
private static final String NAMESPACE_PREFIX = "n/";
private static final String TABLE_PREFIX = "t/";
```
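As a rough illustration of the prefix rule above, the following Python sketch builds the namespace and table prefixes for a given catalog. The helper name `key_prefix` and the `DEFAULT_CATALOG` constant are hypothetical; only the two prefix strings and the `<catalogName>/` rule come from this document, and the real logic lives in `EtcdCatalog.java`.

```python
NAMESPACE_PREFIX = "n/"
TABLE_PREFIX = "t/"
DEFAULT_CATALOG = "default"  # assumed name of the catalog that uses bare prefixes


def key_prefix(catalog_name, base_prefix):
    """Return the etcd key prefix for a catalog (hypothetical helper)."""
    if catalog_name == DEFAULT_CATALOG:
        # The default catalog uses the bare prefix, e.g. "n/".
        return base_prefix
    # Any other catalog prepends its name, e.g. "prod/n/".
    return f"{catalog_name}/{base_prefix}"


if __name__ == "__main__":
    print(key_prefix("default", NAMESPACE_PREFIX))
    print(key_prefix("prod", TABLE_PREFIX))
```

Keeping the rule in one function means the namespace and table code paths cannot disagree about how a catalog name is prepended.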

## Schema

### Namespaces (`n/`)

**Key format:** `n/<namespace>`

**Value:** JSON map of namespace properties (may be empty `{}`).

| Field | Description |
|-------|-------------|
| key | `n/` + namespace name (levels joined by `/` for nested namespaces) |
| value | JSON object with namespace properties |

**Example:**

```
key: n/flowers
value: {}
```

```
key: n/nyc
value: {}
```

### Tables (`t/`)

**Key format:** `t/<namespace>/<table_name>`

**Value:** JSON object with table metadata pointers.

| Field | Description |
|-------|-------------|
| `table_type` | Object type (e.g. `ICEBERG`) |
| `metadata_location` | S3/file path to the current metadata JSON file |
| `previous_metadata_location` | S3/file path to the previous metadata JSON file (empty if first version) |

**Example:**

```
key: t/flowers/iris2
value: {
  "table_type": "ICEBERG",
  "metadata_location": "s3://bucket1/flowers/iris2/metadata/00001-5d6604f9-d6f1-4ced-a036-8f20d6c0def2.metadata.json",
  "previous_metadata_location": "s3://bucket1/flowers/iris2/metadata/00000-3f6b12e4-0a85-42bf-8ed0-d7cb443d5830.metadata.json"
}
```
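To make the table entry format concrete, here is a minimal Python sketch that builds a default-catalog table key and decodes a value. The function names are hypothetical, and treating a missing or empty `previous_metadata_location` as "no previous version" is an assumption; the document only says the field is empty for the first version.

```python
import json


def table_key(namespace_levels, table_name):
    """Build a default-catalog key of the form t/<namespace>/<table_name>."""
    return "t/" + "/".join(list(namespace_levels) + [table_name])


def decode_table_value(raw):
    """Parse a table value (str or bytes) into its three documented fields."""
    entry = json.loads(raw)
    if "metadata_location" not in entry:
        raise ValueError("table entry has no metadata_location")
    return {
        "table_type": entry.get("table_type"),
        "metadata_location": entry["metadata_location"],
        # Assumption: missing and empty string both mean "first version".
        "previous_metadata_location": entry.get("previous_metadata_location") or None,
    }


if __name__ == "__main__":
    print(table_key(["flowers"], "iris2"))
```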

## Mapping to SQLite Backend

The SQLite backend stores the same information in two relational tables. Here is how the etcd keys correspond:

```
etcd key                       SQLite table
───────────────────────────    ──────────────────────────────────
n/<namespace>            ->    iceberg_namespace_properties
t/<namespace>/<table>    ->    iceberg_tables
```

### `n/` -> `iceberg_namespace_properties`

| etcd | SQLite column |
|------|---------------|
| remainder of the key after the `n/` prefix | `namespace` |
| (implicit) | `catalog_name` = catalog name (e.g. `default`) |
| each entry in JSON value | one row per property: `property_key` + `property_value` |

In SQLite, each namespace property is stored as a separate row. In etcd, all properties for a namespace are stored as a single JSON blob.
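The difference between the two layouts can be sketched in Python: one etcd entry expands into one SQLite-style row per property. The function name `namespace_rows` is hypothetical, the sketch handles only the bare `n/` prefix of the default catalog, and it copies the key remainder into `namespace` unchanged without converting separators for nested namespaces.

```python
import json


def namespace_rows(key, value, catalog_name="default"):
    """Expand one etcd namespace entry into SQLite-style row tuples.

    Each tuple is (catalog_name, namespace, property_key, property_value).
    """
    if not key.startswith("n/"):
        raise ValueError(f"not a namespace key: {key!r}")
    namespace = key[len("n/"):]
    properties = json.loads(value)
    # Sort by property key so the output order is deterministic.
    return [
        (catalog_name, namespace, prop_key, str(prop_value))
        for prop_key, prop_value in sorted(properties.items())
    ]


if __name__ == "__main__":
    for row in namespace_rows("n/nyc", '{"exists": "true"}'):
        print(row)
```

In this sketch an empty JSON object yields no rows at all.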

### `t/` -> `iceberg_tables`

| etcd | SQLite column |
|------|---------------|
| namespace segment of key | `table_namespace` |
| table segment of key | `table_name` |
| (implicit) | `catalog_name` = catalog name (e.g. `default`) |
| `table_type` in value | `iceberg_type` |
| `metadata_location` in value | `metadata_location` |
| `previous_metadata_location` in value | `previous_metadata_location` |
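
The column mapping in this table can be expressed as a short Python sketch. The function name `table_row` is hypothetical, and the sketch assumes a default-catalog key in which the last `/`-separated segment is the table name and everything before it is the namespace.

```python
import json


def table_row(key, value, catalog_name="default"):
    """Map one etcd table entry onto the iceberg_tables columns."""
    if not key.startswith("t/"):
        raise ValueError(f"not a table key: {key!r}")
    # Split on the last "/" so nested namespaces stay in one piece.
    namespace, _, table_name = key[len("t/"):].rpartition("/")
    if not namespace or not table_name:
        raise ValueError(f"key lacks a namespace or table segment: {key!r}")
    entry = json.loads(value)
    return {
        "catalog_name": catalog_name,
        "table_namespace": namespace,
        "table_name": table_name,
        "iceberg_type": entry.get("table_type"),
        "metadata_location": entry.get("metadata_location"),
        "previous_metadata_location": entry.get("previous_metadata_location"),
    }


if __name__ == "__main__":
    sample = '{"table_type": "ICEBERG", "metadata_location": "s3://bucket1/x.json"}'
    print(table_row("t/flowers/iris2", sample))
```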

## Notes

- The etcd key itself plays the role of the SQLite composite primary key: `t/<namespace>/<table>` is unique per table.
- Tables with an empty `previous_metadata_location` are at their initial version (version 0).
- The metadata version can be inferred from the filename prefix (e.g. `00003-...` = version 3).
- All metadata files follow the pattern: `s3://<bucket>/<namespace>/<table>/metadata/<version>-<uuid>.metadata.json`.
- Namespace levels are joined by `/` in the key (e.g. `n/parent/child` for nested namespace `parent.child`).
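
The version inference described in these notes can be sketched as a small Python function. The name `metadata_version` and the regular expression are illustrative assumptions based on the documented filename pattern `<version>-<uuid>.metadata.json`.

```python
import re

# Matches ".../metadata/<version>-<uuid>.metadata.json" and captures <version>.
_METADATA_FILE = re.compile(r"/metadata/(\d+)-[^/]+\.metadata\.json$")


def metadata_version(location):
    """Return the integer version encoded in a metadata file location."""
    match = _METADATA_FILE.search(location)
    if match is None:
        raise ValueError(f"unrecognized metadata location: {location!r}")
    # int() drops the zero padding, so "00003" becomes 3.
    return int(match.group(1))


if __name__ == "__main__":
    print(metadata_version(
        "s3://bucket1/flowers/iris2/metadata/"
        "00001-5d6604f9-d6f1-4ced-a036-8f20d6c0def2.metadata.json"
    ))
```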
107 changes: 107 additions & 0 deletions docs/etcd-cluster-setup.md
@@ -0,0 +1,107 @@
# ice-rest-catalog with etcd Cluster

This guide walks through setting up ice-rest-catalog backed by a 3-node etcd cluster, inserting data with the ice CLI, verifying replication, and querying from ClickHouse.

## Prerequisites

- Docker Compose
- ice CLI (`ice`)
- `etcdctl` (v3)

## 1. Start the etcd Cluster

Use the provided docker-compose file to bring up etcd, MinIO, ice-rest-catalog, and ClickHouse:

```bash
cd examples/docker-compose
docker compose -f docker-compose-etcd.yaml up -d
```

This starts a single-node etcd by default. For a 3-node cluster, replace the `etcd` service definition with three separate nodes (`etcd1`, `etcd2`, `etcd3`), each with its own client port mapped to the host:

| Node | Client Port |
|-------|-------------|
| etcd1 | 12379 |
| etcd2 | 12479 |
| etcd3 | 12579 |

## 2. Configure ice-rest-catalog

Update the `uri` in your `ice-rest-catalog.yaml` to point at all three etcd endpoints:

```yaml
uri: etcd:http://127.0.0.1:12379,http://127.0.0.1:12479,http://127.0.0.1:12579
warehouse: s3://bucket1

s3:
  endpoint: http://localhost:9000
  pathStyleAccess: true
  accessKeyID: miniouser
  secretAccessKey: miniopassword
  region: minio

bearerTokens:
  - value: foo
```

Start (or restart) ice-rest-catalog so it picks up the new config.

## 3. Insert Data

Use the ice CLI to create a table and insert a Parquet file:

```bash
ice insert flowers.iris file://iris.parquet
```

## 4. Verify Replication with etcdctl

Query the table key across all three etcd endpoints to confirm the data was replicated:

```bash
ETCDCTL_API=3 etcdctl \
--endpoints=http://127.0.0.1:12379,http://127.0.0.1:12479,http://127.0.0.1:12579 \
get t/flowers/iris
```

Expected output:

```
t/flowers/iris
{"table_type":"ICEBERG","metadata_location":"s3://bucket1/flowers/iris/metadata/00002-b2cd8da0-74a7-460d-ac3c-12f85d65225b.metadata.json","previous_metadata_location":"s3://bucket1/flowers/iris/metadata/00001-659c1907-5ac7-4ae1-b8c2-573d2f61b45b.metadata.json"}
```

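This shows the key can be read through the cluster. When given several endpoints, etcdctl sends a `get` to one of them rather than to each, so to confirm that a particular member holds the key, repeat the command with only that member's endpoint.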
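
For a per-member check, one option is to read the key from each endpoint separately and compare the results. The Python sketch below does that by invoking `etcdctl` once per endpoint; the helper names are hypothetical, and it assumes `etcdctl` is on the `PATH`. It uses `--consistency=s` (a serializable read served from the member's local store) and `--print-value-only`, both existing flags of `etcdctl get`.

```python
import os
import subprocess

ENDPOINTS = [
    "http://127.0.0.1:12379",
    "http://127.0.0.1:12479",
    "http://127.0.0.1:12579",
]


def get_command(endpoint, key):
    """Build an etcdctl command that reads `key` from a single member."""
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        "get", key,
        "--consistency=s",      # read the member's local copy, no quorum round trip
        "--print-value-only",   # print just the value, not the key
    ]


def read_from_each(endpoints, key, run=subprocess.run):
    """Return {endpoint: value} by querying every endpoint on its own."""
    env = {**os.environ, "ETCDCTL_API": "3"}
    results = {}
    for endpoint in endpoints:
        completed = run(get_command(endpoint, key),
                        capture_output=True, text=True, check=True, env=env)
        results[endpoint] = completed.stdout.strip()
    return results


def is_replicated(results):
    """True when every member returned the same non-empty value."""
    values = set(results.values())
    return len(values) == 1 and "" not in values


if __name__ == "__main__":
    print(is_replicated(read_from_each(ENDPOINTS, "t/flowers/iris")))
```

The `run` parameter exists only so the subprocess call can be replaced in tests; in normal use the default is fine.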

## 5. Query from ClickHouse

Connect to ClickHouse and create the Iceberg catalog database:

```sql
CREATE DATABASE ice
ENGINE = DataLakeCatalog('http://host.docker.internal:5000')
SETTINGS
    catalog_type = 'rest',
    auth_header = 'Authorization: Bearer foo',
    storage_endpoint = 'http://host.docker.internal:9000',
    warehouse = 's3://bucket1',
    aws_access_key_id = 'miniouser',
    aws_secret_access_key = 'miniopassword';
```

Query the table:

```sql
SELECT * FROM ice.`flowers.iris`;
```

```
┌─sepal.length─┬─sepal.width─┬─petal.length─┬─petal.width─┬─variety────┐
│          5.1 │         3.5 │          1.4 │         0.2 │ Setosa     │
│          4.9 │           3 │          1.4 │         0.2 │ Setosa     │
│          4.7 │         3.2 │          1.3 │         0.2 │ Setosa     │
│          4.6 │         3.1 │          1.5 │         0.2 │ Setosa     │
│            5 │         3.6 │          1.4 │         0.2 │ Setosa     │
│          5.4 │         3.9 │          1.7 │         0.4 │ Setosa     │
└──────────────┴─────────────┴──────────────┴─────────────┴────────────┘
```
98 changes: 98 additions & 0 deletions docs/sqlite-backend-schema.md
@@ -0,0 +1,98 @@
# Iceberg REST Catalog — SQLite Backend Schema

This document describes the SQLite database schema used by the Iceberg REST Catalog to track namespaces and tables.

---

## ERD Diagram

```mermaid
erDiagram
iceberg_namespace_properties {
string catalog_name PK
string namespace PK
string property_key PK
string property_value
}

iceberg_tables {
string catalog_name PK
string table_namespace PK
string table_name PK
string metadata_location
string previous_metadata_location
string iceberg_type
}

iceberg_namespace_properties ||--o{ iceberg_tables : "catalog_name, namespace → table_namespace"
```

> **Relationship:** `iceberg_namespace_properties.catalog_name` + `namespace` logically maps to `iceberg_tables.catalog_name` + `table_namespace`. A namespace can contain zero or more tables.

---

## Schema

### `iceberg_tables`

Stores metadata references for every registered Iceberg table.

| Column | Key | Description |
|---|---|---|
| `catalog_name` | PK | Catalog identifier (e.g. `default`) |
| `table_namespace` | PK | Dot-separated namespace the table belongs to |
| `table_name` | PK | Name of the table |
| `metadata_location` | | S3 path to the **current** metadata JSON file |
| `previous_metadata_location` | | S3 path to the **previous** metadata JSON file (empty if first version) |
| `iceberg_type` | | Object type — always `TABLE` |

### `iceberg_namespace_properties`

Stores key/value properties for each namespace.

| Column | Key | Description |
|---|---|---|
| `catalog_name` | PK | Catalog identifier |
| `namespace` | PK | Namespace name |
| `property_key` | PK | Property key |
| `property_value` | | Property value |
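
To experiment with this layout, the two tables can be recreated in an in-memory SQLite database. The Python sketch below is an approximation: the column names and composite primary keys follow the tables above, but the `TEXT` column types, the `NOT NULL` constraints, and the helper names are assumptions rather than the catalog's actual DDL.

```python
import sqlite3

DDL = """
CREATE TABLE iceberg_tables (
    catalog_name               TEXT NOT NULL,
    table_namespace            TEXT NOT NULL,
    table_name                 TEXT NOT NULL,
    metadata_location          TEXT,
    previous_metadata_location TEXT,
    iceberg_type               TEXT,
    PRIMARY KEY (catalog_name, table_namespace, table_name)
);
CREATE TABLE iceberg_namespace_properties (
    catalog_name   TEXT NOT NULL,
    namespace      TEXT NOT NULL,
    property_key   TEXT NOT NULL,
    property_value TEXT,
    PRIMARY KEY (catalog_name, namespace, property_key)
);
"""


def open_catalog_db(path=":memory:"):
    """Create a database containing the two (assumed) catalog tables."""
    conn = sqlite3.connect(path)
    conn.executescript(DDL)
    return conn


def tables_in_namespace(conn, catalog_name, namespace):
    """List the table names registered under one namespace, sorted."""
    rows = conn.execute(
        "SELECT table_name FROM iceberg_tables "
        "WHERE catalog_name = ? AND table_namespace = ? "
        "ORDER BY table_name",
        (catalog_name, namespace),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_catalog_db()
    conn.execute(
        "INSERT INTO iceberg_tables VALUES (?, ?, ?, ?, ?, ?)",
        ("default", "nyc", "simple_table", "s3://bucket1/example.json", None, "TABLE"),
    )
    print(tables_in_namespace(conn, "default", "nyc"))
```

With the composite primary key in place, inserting a second row for the same catalog, namespace, and table name raises `sqlite3.IntegrityError`.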

---

## Sample Data

### `iceberg_tables`

| catalog_name | table_namespace | table_name | metadata_location | previous_metadata_location | iceberg_type |
|---|---|---|---|---|---|
| default | test | ll2 | `s3://bucket1/test/ll2/metadata/00003-bb602d70-...json` | `s3://bucket1/test/ll2/metadata/00002-1debf62b-...json` | TABLE |
| default | test | ll2_p | `s3://bucket1/test/ll2_p/metadata/00001-690b44b8-...json` | `s3://bucket1/test/ll2_p/metadata/00000-d87bf260-...json` | TABLE |
| default | test | ll3_ | `s3://bucket1/test/ll3_/metadata/00001-005c7284-...json` | `s3://bucket1/test/ll3_/metadata/00000-08751e01-...json` | TABLE |
| default | test | type_test | `s3://bucket1/test/type_test/metadata/00003-688d3504-...json` | `s3://bucket1/test/type_test/metadata/00002-2aab973f-...json` | TABLE |
| default | test | type_test2 | `s3://bucket1/test/type_test2/metadata/00000-8573b865-...json` | *(none)* | TABLE |
| default | test | type_test_no_copy | `s3://bucket1/test/type_test_no_copy/metadata/00001-e2ecf4d0-...json` | `s3://bucket1/test/type_test_no_copy/metadata/00000-2e38b5ba-...json` | TABLE |
| default | test | type_test_no_copy_2 | `s3://bucket1/test/type_test_no_copy_2/metadata/00000-90ca965f-...json` | *(none)* | TABLE |
| default | test | type_test_no_copy_3 | `s3://bucket1/test/type_test_no_copy_3/metadata/00001-05050286-...json` | `s3://bucket1/test/type_test_no_copy_3/metadata/00000-a9a4f859-...json` | TABLE |
| default | test | type_test_no_copy_44 | `s3://bucket1/test/type_test_no_copy_44/metadata/00001-5ae8dcb9-...json` | `s3://bucket1/test/type_test_no_copy_44/metadata/00000-01ad9f2d-...json` | TABLE |
| default | test | type_test_no_copy_0 | `s3://bucket1/test/type_test_no_copy_0/metadata/00001-c3aa9443-...json` | `s3://bucket1/test/type_test_no_copy_0/metadata/00000-b64bfb12-...json` | TABLE |
| default | test | type_test_no_copy_02 | `s3://bucket1/test/type_test_no_copy_02/metadata/00000-6c2832c0-...json` | *(none)* | TABLE |
| default | nyc | taxis_p_by_day | `s3://bucket1/nyc/taxis_p_by_day/metadata/00001-dd00d7f3-...json` | `s3://bucket1/nyc/taxis_p_by_day/metadata/00000-afe27a5c-...json` | TABLE |
| default | nyc | simple_table | `s3://bucket1/nyc/simple_table/metadata/00000-0fc867be-...json` | *(none)* | TABLE |
| default | test | simple_table | `s3://bucket1/test/simple_table/metadata/00000-fe34018d-...json` | *(none)* | TABLE |

### `iceberg_namespace_properties`

| catalog_name | namespace | property_key | property_value |
|---|---|---|---|
| default | test.ll2 | exists | true |
| default | nyc | exists | true |

---

## Notes

- The composite primary key for `iceberg_tables` is (`catalog_name`, `table_namespace`, `table_name`).
- The composite primary key for `iceberg_namespace_properties` is (`catalog_name`, `namespace`, `property_key`).
- Tables with an empty `previous_metadata_location` are at their initial version (version 0).
- The metadata version can be inferred from the filename prefix (e.g. `00003-...` = version 3, meaning 3 commits since creation).
- All metadata files are stored in S3 under the pattern: `s3://<bucket>/<namespace>/<table>/metadata/<version>-<uuid>.metadata.json`.