joaoh82 · joaoh82 · May 16, 2026 · May 12, 2026
diff --git a/examples/README.md b/examples/README.md
@@ -16,6 +16,14 @@ Phase 5 lands these incrementally — each sub-phase fills in one language. The
 
 See [docs/roadmap.md](../docs/roadmap.md) for what each sub-phase delivers.
 
+## End-to-end example apps
+
+Beyond the per-SDK quick-start tours above, the [SQLR-38 umbrella](../docs/roadmap.md) tracks longer, opinionated example apps that exercise SQLRite end-to-end in real-world shapes:
+
+| App | Language / SDK | What it shows | Directory |
+|---|---|---|---|
+| LLM agent with persistent memory | Python | Vector + lexical recall, fact extraction, summaries — all in one `.sqlrite` file | [`python-agent/`](python-agent/) |
+
 ## Running the Rust quickstart
 
 ```bash
@@ -63,6 +71,17 @@ python examples/python/hello.py
 
 Mirrors the Rust quickstart shape via the DB-API: `sqlrite.connect(":memory:")` → `cursor.execute` → iterate tuples, plus a BEGIN/ROLLBACK block. See [`python/hello.py`](python/hello.py) and [`sdk/python/README.md`](../sdk/python/README.md) for the full API tour.
 
+## Running the Python LLM agent (SQLR-39)
+
+```bash
+cd examples/python-agent
+python3 -m venv .venv && source .venv/bin/activate
+pip install -e .
+python -m sqlrite_agent          # works offline; no API key required
+```
+
+A full CLI chat agent whose long-term memory is one `.sqlrite` file. Embeds each turn, hybrid-searches over past messages and a structured `facts` table on every recall, and survives process restarts. Read [`python-agent/README.md`](python-agent/README.md) for the demo script and architecture diagram.
+
 ## Running the Node.js sample
 
 ```bash

diff --git a/examples/python-agent/.gitignore b/examples/python-agent/.gitignore
@@ -0,0 +1,12 @@
+__pycache__/
+*.py[cod]
+*.egg-info/
+.pytest_cache/
+.coverage
+build/
+dist/
+
+# Local agent state.
+*.sqlrite
+*.sqlrite-wal
+*.sqlrite-lock
diff --git a/examples/python-agent/README.md b/examples/python-agent/README.md
@@ -0,0 +1,275 @@
+# sqlrite-agent — a Python LLM agent with persistent memory in SQLRite
+
+A small CLI chat agent whose entire long-term memory lives in **one
+`.sqlrite` file on disk**. Every turn the agent:
+
+1. Embeds the user's message.
+2. Hybrid-searches its memory: top-k vector KNN (HNSW) over past
+   messages and summaries, plus keyword recall over the structured
+   `facts` table.
+3. Injects the recalled context into the system prompt.
+4. Sends prompt + recent turns to an LLM and prints the reply.
+5. Writes the new turn back to the same SQLRite database.
+
+Close the process, reopen it days later — your assistant still knows
+your dog's name. No Postgres, no Pinecone, no Redis. One file.
+
+> **Why this example?** Single-file embedded storage is the *right*
+> architecture for a local agent, and SQLRite's HNSW vector index +
+> structured SQL gives you semantic *and* deterministic recall from one
+> store. This is a place where the database genuinely fits the
+> workload, not just a demo.
+
+## Architecture
+
+```mermaid
+flowchart LR
+    User[/"User input"/] --> Embed["Embedder<br/>(hash / OpenAI / local)"]
+    Embed --> Recall["Memory.recall()"]
+    Recall -->|vector KNN| Msgs[("messages<br/>HNSW(embedding)")]
+    Recall -->|vector KNN| Sums[("summaries<br/>HNSW(embedding)")]
+    Recall -->|keyword| Facts[("facts<br/>SQL")]
+    Recall --> Prompt["Prompt assembly<br/>(system + recent turns)"]
+    Prompt --> LLM["LLM<br/>(Anthropic / echo)"]
+    LLM --> Reply[/"Assistant reply"/]
+    Reply --> Writeback["Memory.log_message()"]
+    Writeback --> Msgs
+    User -.-> Writeback
+```
+
+Three tables, one file:
+
+| Table       | Purpose                                                     | Indexed how                              |
+|-------------|-------------------------------------------------------------|------------------------------------------|
+| `messages`  | Every user / assistant turn, plus its 384-dim embedding.    | HNSW on `embedding`                      |
+| `summaries` | Periodic rollups for old context that's too long to inline. | HNSW on `embedding`                      |
+| `facts`     | Structured `(subject, predicate, object)` triples extracted from user turns. | Plain SQL — keyword recall  |
+
+## Install
+
+```bash
+# 1. Clone the rust_sqlite repo (this example ships inside it).
+git clone https://github.com/joaoh82/rust_sqlite
+cd rust_sqlite/examples/python-agent
+
+# 2. Create a virtualenv and install the example.
+python3 -m venv .venv && source .venv/bin/activate
+pip install -e .
+
+# 3. (Optional) Install an LLM provider extra. Without one you get
+#    the offline "echo" agent — the recall pipeline still runs.
+pip install 'sqlrite-agent[anthropic]'         # default LLM
+# pip install 'sqlrite-agent[openai]'           # use OpenAI embeddings too
+# pip install 'sqlrite-agent[local-embeddings]' # 384-dim sentence-transformer
+```
+
+The `sqlrite` Python wheel comes from PyPI automatically (pinned to the
+0.10.x release that introduced `VECTOR(N)` + HNSW indexes).
+
+## Run
+
+```bash
+# Zero config — runs against a fresh in-memory hash embedder and the
+# offline echo "LLM". You see the recall pipeline work end-to-end
+# without an API key.
+python -m sqlrite_agent
+
+# With Anthropic — set ANTHROPIC_API_KEY and run.
+export ANTHROPIC_API_KEY=sk-ant-...
+python -m sqlrite_agent
+
+# Pick where the DB lives (default: ~/.sqlrite-agent.sqlrite).
+python -m sqlrite_agent --db ./my-agent.sqlrite
+
+# Multiple parallel conversations in one DB.
+python -m sqlrite_agent --conversation work
+python -m sqlrite_agent --conversation personal
+
+# Force a specific embedder.
+python -m sqlrite_agent --embedder local       # sentence-transformers
+python -m sqlrite_agent --embedder openai      # text-embedding-3-small
+```
+
+## CLI commands
+
+While the REPL is running, anything starting with `/` is a command:
+
+| Command            | What it does                                                   |
+|--------------------|----------------------------------------------------------------|
+| `/help`            | Show all commands.                                             |
+| `/stats`           | Counts of messages, summaries, facts.                          |
+| `/facts`           | List every extracted fact.                                     |
+| `/recent`          | Last 10 turns in chronological order.                          |
+| `/recall <query>`  | Show what *would* be recalled for a query, without replying.   |
+| `/summarize`       | Summarize the last 20 turns into a single `summaries` row.     |
+| `/quit`            | Exit. `Ctrl-D` also works.                                     |
+
+## 60-second demo script
+
+Run this top-to-bottom to see persistent memory survive a process
+restart. Uses the zero-key default (`hash` embedder + `echo` chat).
+
+```bash
+# Session 1 — drop some facts, then quit.
+$ python -m sqlrite_agent --db agent.sqlrite
+sqlrite-agent 0.1.0 — db=agent.sqlrite, ...
+  loaded memory: 0 messages, 0 summaries, 0 facts.
+
+you> My dog's name is Mochi.
+agent> [echo agent ...]
+
+you> Mochi loves carrots more than treats.
+agent> [echo agent ...]
+
+you> I live in Lisbon, Portugal.
+agent> [echo agent ...]
+
+you> /facts
+  user.dog.name = Mochi
+  user.location = Lisbon, Portugal
+
+you> /quit
+
+# Session 2 — fresh process, same DB.
+$ python -m sqlrite_agent --db agent.sqlrite
+sqlrite-agent 0.1.0 — db=agent.sqlrite, ...
+  loaded memory: 6 messages, 0 summaries, 2 facts.
+
+you> What's my dog's name?
+  [recalled: 1 facts, 0 summaries, 4 messages]
+agent> [echo agent ... — but the recall block above includes
+                          user.dog.name = Mochi]
+```
+
+With `ANTHROPIC_API_KEY` set, the second turn answers "Mochi" instead
+of the canned echo because the LLM sees the recalled fact in its
+system prompt.
+
+## Open the DB yourself with the SQLRite REPL
+
+The memory file is plain SQLRite — open it from anywhere:
+
+```bash
+$ cargo install sqlrite-engine   # or grab a binary from GitHub Releases
+$ sqlrite agent.sqlrite
+SQLRite v0.10.0
+sqlrite> SELECT role, content FROM messages ORDER BY id LIMIT 5;
+sqlrite> SELECT subject, predicate, object FROM facts;
+sqlrite> SELECT id, content
+   ...>   FROM messages
+   ...>   ORDER BY vec_distance_cosine(embedding, (SELECT embedding FROM messages WHERE id = 1))
+   ...>   LIMIT 3;
+```
+
+This is the demo's whole point: **the agent's memory is just SQL**.
+You can query it, back it up, copy it between machines, or load it
+into the Node / Go / WASM SDK without converting anything.
+
+## How recall works
+
+`Memory.recall(query)` runs three searches in parallel and merges the
+results. Pseudocode:
+
+```python
+embedding = embedder.embed(query)
+keywords  = query_keywords(query)        # filtered to content words
+
+vector_messages   = SELECT ... FROM messages
+                     ORDER BY vec_distance_cosine(embedding, ?)
+                     LIMIT k_messages
+
+vector_summaries  = SELECT ... FROM summaries
+                     ORDER BY vec_distance_cosine(embedding, ?)
+                     LIMIT k_summaries
+
+lexical_messages  = SELECT ... FROM messages
+                     WHERE fts_match(content, ?)
+                     ORDER BY bm25_score(content, ?) DESC
+                     LIMIT k_messages           -- Phase 8, BM25 over the inverted index
+
+facts             = SELECT * FROM facts
+                     WHERE subject LIKE ...
+                        OR predicate LIKE ...
+                        OR object LIKE ...
+                     LIMIT k_facts
+```
+
+The vector and lexical message lists are merged in Python (dedupe by
+`id`, vector ranking primary) — that's the simplest correct shape for
+hybrid retrieval: vector finds conceptual neighbors even with zero
+lexical overlap, and BM25 surfaces exact-term matches the vector
+embedding might rank too low. See [`docs/fts.md`](../../docs/fts.md)
+for the BM25 surface and [`examples/hybrid-retrieval/`](../hybrid-retrieval/)
+for an example that fuses both into a single `ORDER BY` arithmetic.
+
+## Embedding-provider tradeoffs
+
+| Provider       | Dependencies          | API key | Real semantic recall | First-run friction |
+|----------------|-----------------------|---------|----------------------|--------------------|
+| `hash` (default) | None — stdlib only  | No      | Bag-of-words approximation only. Good enough for the demo, mediocre for real RAG. | Zero. |
+| `openai`       | `openai` package      | `OPENAI_API_KEY` | Excellent. `text-embedding-3-small` constrained to 384 dims. | ~30s install. |
+| `local`        | `sentence-transformers` (~500 MB with torch) | No | Excellent. `all-MiniLM-L6-v2`, fully offline. | ~5 min install. |
+
+Swap with `--embedder hash | openai | local`. The dimension is fixed
+at 384 to match `VECTOR(384)` in the schema; if you need a different
+dim, change `DEFAULT_DIM` in `sqlrite_agent/db.py` and start with a
+fresh DB.
+
+## Known simplifications
+
+This is an *example*, not a production agent. Things v1 punts on:
+
+- **Memory eviction.** No automatic rolling-window or summarize-and-evict
+  loop yet — run `/summarize` manually when the conversation grows
+  unwieldy.
+- **Fact extraction.** Six hand-written regex patterns. A real agent
+  would call the LLM to extract facts so it catches phrasings the
+  regex misses. Easy upgrade: wrap an LLM call in `facts.extract_facts`.
+- **Single-query hybrid.** The agent merges vector hits and BM25 hits
+  in Python. The engine also supports a single SQL query that fuses
+  both into one `ORDER BY` arithmetic (`0.5 * bm25_score(...) + 0.5 *
+  (1.0 - vec_distance_cosine(...))`) — see [`examples/hybrid-retrieval/`](../hybrid-retrieval/).
+  The merge approach handles conceptual queries with no token overlap;
+  the single-query approach is tighter when you always want BM25 to
+  pre-filter. Pick per workload.
+- **Concurrency.** The agent assumes single-user single-process. The
+  engine supports concurrent reads + a single writer via fs2 advisory
+  locks; running two `sqlrite-agent` processes against the same DB
+  works, but they won't see each other's in-flight writes until commit.
+
+## Development
+
+```bash
+# Tests run fully offline with the hash embedder and echo chat — no
+# API keys, no network.
+pip install -e '.[dev]'
+pytest
+```
+
+## Layout
+
+```
+examples/python-agent/
+├── pyproject.toml             # package metadata + pinned sqlrite dep
+├── README.md                  # this file
+├── sqlrite_agent/
+│   ├── __init__.py
+│   ├── __main__.py            # python -m sqlrite_agent → cli.main()
+│   ├── agent.py               # turn loop, prompt assembly, summarization
+│   ├── chat.py                # LLM provider abstraction (Anthropic / Echo)
+│   ├── cli.py                 # interactive REPL + slash commands
+│   ├── db.py                  # schema, migrations, all SQL
+│   ├── embeddings.py          # Embedder abstraction (hash / OpenAI / local)
+│   ├── facts.py               # regex-based fact extractor
+│   ├── memory.py              # hybrid recall over messages + summaries + facts
+│   └── sqlutil.py             # safe SQL-literal inlining
+└── tests/                     # offline pytest suite (31 tests)
+```
+
+The agent binds only to the SQLRite Python SDK's documented public
+surface (`sqlrite.connect`, `Connection`, `Cursor`). It does not reach
+into internals.
+
+## License
+
+MIT — same as the rest of the rust_sqlite repo.
diff --git a/examples/python-agent/pyproject.toml b/examples/python-agent/pyproject.toml
@@ -0,0 +1,55 @@
+[build-system]
+requires = ["setuptools>=68", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "sqlrite-agent"
+version = "0.1.0"
+description = "Example: a Python CLI chat agent with persistent long-term memory backed entirely by SQLRite (vector + lexical recall, periodic summarization, structured fact extraction)."
+readme = "README.md"
+requires-python = ">=3.11"
+license = { text = "MIT" }
+authors = [{ name = "SQLRite contributors" }]
+keywords = ["sqlrite", "llm", "agent", "rag", "memory", "vector-search"]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "License :: OSI Approved :: MIT License",
+    "Topic :: Database",
+    "Topic :: Scientific/Engineering :: Artificial Intelligence",
+]
+
+# The agent binds only to the public SQLRite Python SDK surface
+# (`sqlrite.connect`, `Connection`, `Cursor`). Pinned to the 0.10.x
+# release that introduced VECTOR + HNSW + the `ask` feature.
+dependencies = [
+    "sqlrite>=0.10.0,<0.11.0",
+]
+
+[project.optional-dependencies]
+# Cloud LLM + embedding providers — install whichever you want.
+anthropic = ["anthropic>=0.40"]
+openai = ["openai>=1.40"]
+# Local embedding model — heavy install (~500 MB with torch), but
+# means no API key is needed for the embedding half of recall.
+local-embeddings = ["sentence-transformers>=2.7"]
+# Everything at once for a full demo.
+all = ["anthropic>=0.40", "openai>=1.40", "sentence-transformers>=2.7"]
+dev = ["pytest>=7", "pytest-cov"]
+
+[project.scripts]
+sqlrite-agent = "sqlrite_agent.cli:main"
+
+[project.urls]
+Homepage = "https://sqlritedb.com"
+Repository = "https://github.com/joaoh82/rust_sqlite"
+"Bug Tracker" = "https://github.com/joaoh82/rust_sqlite/issues"
+
+[tool.setuptools.packages.find]
+where = ["."]
+include = ["sqlrite_agent*"]
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+addopts = "-ra --tb=short"
diff --git a/examples/python-agent/sqlrite_agent/__init__.py b/examples/python-agent/sqlrite_agent/__init__.py
@@ -0,0 +1,4 @@
+"""sqlrite_agent — an example LLM chat agent with persistent memory in SQLRite."""
+
+__version__ = "0.1.0"
+__all__ = ["__version__"]
diff --git a/examples/python-agent/sqlrite_agent/__main__.py b/examples/python-agent/sqlrite_agent/__main__.py
@@ -0,0 +1,4 @@
+from sqlrite_agent.cli import main
+
+if __name__ == "__main__":
+    raise SystemExit(main())