A Node.js CLI that ingests a folder of markdown notes (Obsidian
vault, Notion export, plain ~/Documents/notes) into a SQLRite
database, then exposes the database to Claude Desktop / any MCP
client through the engine's first-party MCP server.
End-user effect: drop your notes folder in, paste one block into
Claude Desktop's config, and ask Claude "what did I write about
CRDTs last month?" — it answers using your local notes. No cloud
sync, no third-party indexer, the entire memory is one .sqlrite
file on disk you can open in the REPL.
Why this example? Other "chat with your notes" demos build a custom RAG pipeline and bolt it onto a model. This one shows that when the database itself speaks the agent protocol, you don't need a pipeline — Claude drives the database directly via
sqlrite-mcp. The Node.js side is just the ingest + glue.
flowchart LR
Notes[/"~/notes/*.md<br/>(markdown)"/] -->|sqlrite-notes init / refresh| Ingest
Ingest["Ingest pipeline<br/>(walk → chunk → embed → store)"] --> DB[("notes.sqlrite<br/>documents · chunks<br/>HNSW + FTS indexes")]
DB -->|"sqlrite-mcp --read-only<br/>stdio JSON-RPC"| Claude["Claude Desktop<br/>(or any MCP client)"]
Claude -->|"vector_search · bm25_search · query · ask"| DB
The whole stack: Node.js for the write side (ingest pipeline,
chunking, embeddings), SQLRite for storage + retrieval primitives
(HNSW vector index, BM25 inverted index, raw SQL), and sqlrite-mcp
for the read side that Claude actually talks to. The Node CLI
never touches the database while Claude is connected — that's what
--read-only is for.
| Table | Purpose | Indexes |
|---|---|---|
documents |
One row per .md file — path, title, mtime, full body, content hash. |
UNIQUE on path. FTS on content (BM25 over whole docs). |
chunks |
One row per ~400-token slice of a document, plus a VECTOR(384) embedding. |
HNSW on embedding (semantic KNN). FTS on content (passage BM25). |
Hybrid retrieval queries chunks and fuses BM25 + vector cosine in a
single ORDER BY (see docs/fts.md for the
SQL pattern; the executor's try_fts_probe hook serves the top-k
straight from the inverted index).
npx sqlrite-notes init ~/Documents/notesThat's it for the Node side — npx downloads sqlrite-notes, which
pulls @joaoh82/sqlrite with prebuilt napi-rs binaries for
macOS-arm64, Linux x64/arm64, and Windows x64. No clone, no
npm install, no Rust toolchain required.
sqlrite-mcp is a separate Rust binary used by the serve
subcommand. Install it once, anywhere on your PATH:
# from crates.io (~30s):
cargo install sqlrite-mcp
# or grab a prebuilt binary from GitHub Releases:
# https://github.com/joaoh82/rust_sqlite/releasesIf you don't want to install globally, set SQLRITE_MCP_BIN to its
absolute path — sqlrite-notes serve will pick it up.
# 1. Ingest a folder of markdown into a notes.sqlrite database.
npx sqlrite-notes init ~/Documents/notes
# 2. Confirm it works locally — same retrieval shape Claude will see.
npx sqlrite-notes search "what did I learn about CRDTs?"
# 3. Wire up Claude Desktop using the snippet printed by `init`
# (also available any time via `npx sqlrite-notes config`).
# 4. Open Claude Desktop. The sqlrite-mcp tools appear in the
# tool picker — `bm25_search`, `vector_search`, `query`, `ask`,
# plus `list_tables` / `describe_table` / `schema_dump`.Pinning a version.
npx sqlrite-notespulls the latest; passnpx sqlrite-notes@0.10.1 …(or whichever tag) to lock to a specific release. The package ships in lockstep with the SQLRite engine, so the version number matches@joaoh82/sqlrite's.
Once you've added the snippet to claude_desktop_config.json and
restarted Claude Desktop, run a chat like:
"Summarize what I've written about Postgres over the last month."
Claude will call bm25_search (and/or vector_search) against the
chunks table, get back the matching passages, and answer with
inline citations to the file path and chunk number.
The default embedder is a deterministic hash-based bag-of-words
embedder that runs in pure JavaScript. No API key, no network,
nothing to install — sqlrite-notes init ~/Documents/notes works
on a fresh laptop.
Hybrid retrieval still beats either signal alone because BM25 is already doing exact-term ranking; the hash embedder mostly carries its weight via the long-tail of co-occurring tokens.
For real semantic recall, switch to OpenAI:
export OPENAI_API_KEY=sk-...
node bin/sqlrite-notes.mjs init ~/Documents/notes --embedder openaiUses text-embedding-3-small with the dimensions: 384 override
so it matches the schema. Override the model with
--openai-model text-embedding-3-large (and bump --dim if you
want full-fat dimensionality).
sqlrite-notes init <dir> Ingest <dir> into the notes DB (replaces the index).
sqlrite-notes refresh <dir> Re-ingest only files whose mtime/hash changed.
sqlrite-notes search "<query>" Hybrid retrieval against the index (debug).
sqlrite-notes serve Spawn sqlrite-mcp --read-only against the DB.
sqlrite-notes stats Row counts.
sqlrite-notes config Print the Claude Desktop config snippet again.
Common flags:
--db <path> Path to the SQLRite database file. Default: ~/.sqlrite-notes/notes.sqlrite
--embedder hash|openai Embedding provider. Default: hash (offline).
--dim <N> Vector dimension. Default: 384.
--openai-model <name> OpenAI embedding model. Default: text-embedding-3-small.
--chunk-tokens <N> Target chunk size in tokens. Default: 400.
--chunk-overlap <N> Chunk overlap in tokens. Default: 60.
sqlrite-notes init prints this block; you can also regenerate it
any time with sqlrite-notes config --db <path>:
{
"mcpServers": {
"sqlrite-notes": {
"command": "sqlrite-notes",
"args": ["serve", "--db", "/Users/you/.sqlrite-notes/notes.sqlrite"]
}
}
}Where it lives:
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Linux | ${XDG_CONFIG_HOME:-~/.config}/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
Merge the snippet into any existing mcpServers block; don't
overwrite the file wholesale.
If you don't have Claude Desktop, the same MCP server works with any MCP client — Cursor, Codex, your own. The fastest way to verify the wiring is Anthropic's inspector:
npx @modelcontextprotocol/inspector sqlrite-notes serve --db ./notes.sqlriteOpen the URL it prints, click through the tools, type JSON args. Saves a lot of "did Claude restart correctly?" debugging.
The notes index is plain SQLRite — open it in the REPL whenever:
$ cargo install sqlrite-engine # or grab a release binary
$ sqlrite ./notes.sqlrite
SQLRite v0.10.0
sqlrite> SELECT path, title FROM documents ORDER BY mtime DESC LIMIT 5;
sqlrite> SELECT path, ord, substr(content, 1, 80)
...> FROM chunks
...> JOIN documents ON chunks.document_id = documents.id
...> WHERE fts_match(chunks.content, 'rust embedded')
...> ORDER BY bm25_score(chunks.content, 'rust embedded') DESC
...> LIMIT 5;This is the demo's whole point: the notes index is just SQL. You can query it, back it up, copy it between machines, or feed the same file into the Python / Go / WASM SDKs without converting anything.
The search command (and Claude, indirectly, when it composes
query calls) runs the canonical hybrid shape from
docs/fts.md:
SELECT id, document_id, ord, content FROM chunks
WHERE fts_match(content, 'collaborative editing CRDT')
ORDER BY 0.5 * bm25_score(content, 'collaborative editing CRDT')
+ 0.5 * (1.0 - vec_distance_cosine(embedding, [/* query embedding */]))
DESC LIMIT 5;Two things worth noting:
vec_distance_cosinereturns a distance (1 - cos(a, b)). Hybrid scoring wants "higher is better", so we invert it.fts_matchpre-filters before scoring. Paraphrases with zero shared tokens never get scored — a deliberate tradeoff. If a query produces no FTS tokens at all (e.g. a single non-ASCII word),db.hybridSearch()falls back to vector-only so Claude always has something to ground on.
The -w 0..1 flag on sqlrite-notes search tunes the BM25 / vector
balance. At -w 1 you get pure BM25; at -w 0 pure vector cosine.
Default is 0.5.
-
Concurrent
refreshwhileserveis running. SQLRite shippedBEGIN CONCURRENTwrites in v0.10 (the SQLR-22 / Phase 11 work), so reading fromsqlrite-mcp --read-onlywhilesqlrite-notes refreshwrites is supported in principle. In practice we recommend stopping Claude Desktop's MCP connection during a refresh — Claude Desktop caches the tool schemas at server-spawn time and won't notice newly-ingested files until you reload it anyway, so the gain from coexisting isn't worth the surprise. -
Aggregates limited. The engine supports
COUNT(*),SUM/AVG/MIN/MAXbut not arbitrary expressions inSELECTprojection beyond aggregates (seedocs/supported-sql.md). Thestatscommand sticks toCOUNT(*). -
No parameter binding yet. Engine SDKs don't yet accept
?placeholders (Phase 5a.2 follow-up); all SQL strings inline literals via theq()helper insrc/sqlutil.mjs. That helper handles quoting + vector-literal encoding for the four value types we use (string / int / vector / null). -
Re-ingest persistence.
refreshrequires you to pass the source directory again (we don't yet store it inside the DB). If you forget the path, runsqlrite-notes stats --db <path>to confirm the index exists, then re-runinitto rebuild from scratch — it's idempotent. -
--watchmode — not in v1. Re-runrefreshmanually after adding notes. -
Authentication. None. This is a single-user local tool; the MCP server inherits the spawner's filesystem privileges. Don't point it at a database you wouldn't share with whoever can read your home directory.
Hacking on the example itself — running tests, editing the CLI, iterating against a local engine binding — uses the clone path:
git clone https://github.com/joaoh82/rust_sqlite
cd rust_sqlite/examples/nodejs-notes
npm install
npm test # offline; runs all 40 unit + integration tests
# Iterate against the local sources:
node bin/sqlrite-notes.mjs init ~/Documents/notesThe test suite uses node:test and exercises:
chunker(frontmatter, title derivation, overlap windowing)sqlutil(every value type the engine accepts as a SQL literal)embeddings(hash determinism + a mocked OpenAI HTTP call)db(schema migration, upsert/delete/cascade, hybrid retrieval)ingest(end-to-end against the markdown fixtures intest/fixtures/)serve+claude-config(binary lookup, snippet shape)
A few tests need the @joaoh82/sqlrite engine binding installed;
they skip cleanly (with a message) if it isn't.
examples/nodejs-notes/
├── package.json # @joaoh82/sqlrite pinned in lockstep with the engine release
├── README.md # this file
├── bin/
│ └── sqlrite-notes.mjs # entry — calls src/cli.mjs
├── src/
│ ├── cli.mjs # argv parsing + command dispatch
│ ├── config.mjs # defaults + path resolution
│ ├── sqlutil.mjs # q() + ident() — SQL literal helpers
│ ├── db.mjs # schema, migrations, every SQL string
│ ├── chunker.mjs # frontmatter + heading-aware chunking
│ ├── embeddings.mjs # hash + OpenAI embedders
│ ├── ingest.mjs # plan → delete → insert
│ ├── search.mjs # hybrid retrieval driver
│ ├── serve.mjs # spawn sqlrite-mcp --read-only
│ └── claude-config.mjs # Claude Desktop snippet renderer
└── test/ # node --test suite
├── fixtures/ # 3 short markdown notes
├── chunker.test.mjs
├── claude-config.test.mjs
├── db.test.mjs # integration — needs @joaoh82/sqlrite
├── embeddings.test.mjs
├── ingest.test.mjs # integration — needs @joaoh82/sqlrite
├── serve.test.mjs
└── sqlutil.test.mjs
The example binds only to the documented public surfaces:
@joaoh82/sqlrite (Database, Statement)
and sqlrite-mcp as a child process. It does
not reach into engine internals.
MIT — same as the rest of the rust_sqlite repo.