High-performance Ruby driver for Stoolap embedded SQL database. Built with Magnus + rb-sys for direct Rust bindings with no FFI overhead.
- Native Ruby extension gem (not an FFI wrapper).
- Sub-microsecond point queries on in-memory databases.
- MVCC transactions with snapshot isolation, parallel query execution, columnar cold volumes with zone maps, bloom filters, LZ4 compression.
- Full SQL: JOINs, window functions, CTEs, subqueries, GROUP BY / ROLLUP / CUBE / GROUPING SETS, HNSW vector search.
Add to your Gemfile:
gem "stoolap"Or install directly:
gem install stoolapRequires Ruby >= 3.3 and a stable Rust toolchain at install time (the native extension is built from source on your machine). Rust: https://rustup.rs.
require "stoolap"
# In-memory database
db = Stoolap::Database.open(":memory:")
# exec runs one or more DDL/DML statements (no parameters)
db.exec(<<~SQL)
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT
);
CREATE INDEX idx_users_name ON users(name);
SQL
# execute runs a single statement with parameters, returns rows affected
db.execute(
"INSERT INTO users (id, name, email) VALUES ($1, $2, $3)",
[1, "Alice", "alice@example.com"]
)
# Named parameters (:key) with either Symbol or String keys
db.execute(
"INSERT INTO users (id, name, email) VALUES (:id, :name, :email)",
{ id: 2, name: "Bob", email: "bob@example.com" }
)
# query returns an Array of Hashes with String keys
users = db.query("SELECT * FROM users ORDER BY id")
# => [{"id" => 1, "name" => "Alice", ...}, ...]
# query_one returns a single Hash or nil
user = db.query_one("SELECT * FROM users WHERE id = $1", [1])
# => {"id" => 1, "name" => "Alice", "email" => "alice@example.com"}
# query_raw returns columnar format (skip the per-row Hash allocation)
raw = db.query_raw("SELECT id, name FROM users ORDER BY id")
# => {"columns" => ["id", "name"], "rows" => [[1, "Alice"], [2, "Bob"]]}
db.closeStoolap::Database.open(":memory:") do |db|
db.exec("CREATE TABLE t (id INTEGER PRIMARY KEY)")
db.execute("INSERT INTO t VALUES ($1)", [1])
end
# db.close already ran, even if the block raised# In-memory (three equivalent forms)
Stoolap::Database.open(":memory:")
Stoolap::Database.open("")
Stoolap::Database.open("memory://")
# File-backed (data persists across restarts)
Stoolap::Database.open("./mydata")
Stoolap::Database.open("file:///absolute/path/to/db")| Method | Returns | Description |
|---|---|---|
execute(sql, params = nil) |
Integer |
Execute DML, return rows affected |
exec(sql) |
nil |
Execute one or more statements (no parameters) |
query(sql, params = nil) |
Array<Hash> |
All rows as Array of Hashes |
query_one(sql, params = nil) |
Hash, nil |
First row as Hash, or nil |
query_raw(sql, params = nil) |
Hash |
{"columns" => [...], "rows" => [[...], ...]} |
execute_batch(sql, params_list) |
Integer |
Same SQL, many param sets, auto-tx |
prepare(sql) |
PreparedStatement |
Cache a parsed + planned statement |
begin_transaction |
Transaction |
Start a manual transaction |
transaction { |tx| ... } |
block return | Auto-commit or rollback on raise |
close |
nil |
Close the database |
All parameter-accepting methods take either an Array (positional, $1, $2, ...) or a Hash (named, :key):
# Positional
db.query("SELECT * FROM users WHERE id = $1 AND name = $2", [1, "Alice"])
# Named with Symbol keys
db.query("SELECT * FROM users WHERE id = :id AND name = :name",
{ id: 1, name: "Alice" })
# Named with String keys
db.query("SELECT * FROM users WHERE id = :id", { "id" => 1 })Named-parameter keys can optionally carry a :, @, or $ sigil. The driver strips it:
db.query("SELECT * FROM users WHERE id = :id", { ":id" => 1 })
db.query("SELECT * FROM users WHERE id = :id", { "@id" => 1 })
db.query("SELECT * FROM users WHERE id = :id", { "$id" => 1 })query_raw returns {"columns" => [...], "rows" => [[...], ...]} instead of an Array of Hashes. Use it when you are streaming large results and do not need named access to each row.
raw = db.query_raw("SELECT id, name, email FROM users ORDER BY id")
raw["columns"] # => ["id", "name", "email"]
raw["rows"] # => [[1, "Alice", "alice@example.com"], ...]Execute the same statement with many parameter sets in one call. Automatically wrapped in a transaction:
changes = db.execute_batch(
"INSERT INTO users (id, name, email) VALUES ($1, $2, $3)",
[
[1, "Alice", "alice@example.com"],
[2, "Bob", "bob@example.com"],
[3, "Charlie", "charlie@example.com"]
]
)
# changes => 3execute_batch only supports positional parameters (Arrays). Hash parameter sets raise Stoolap::Error.
A prepared statement parses SQL once and caches the execution plan. Column-key Strings are also cached on the first query so repeated calls do not re-allocate them.
insert = db.prepare("INSERT INTO users (id, name) VALUES ($1, $2)")
insert.execute([1, "Alice"])
insert.execute([2, "Bob"])
# Batch on a prepared statement (reuses the cached plan)
insert.execute_batch([
[3, "Charlie"],
[4, "Diana"]
])
# Prepared reads
lookup = db.prepare("SELECT * FROM users WHERE id = $1")
lookup.query_one([1]) # => {"id" => 1, "name" => "Alice"}
lookup.query([1]) # => [{"id" => 1, "name" => "Alice"}]
lookup.query_raw([1]) # => {"columns" => [...], "rows" => [[...]]}
# Named parameters work the same
lookup_named = db.prepare("SELECT * FROM users WHERE id = :id")
lookup_named.query_one({ id: 1 })| Method | Returns | Description |
|---|---|---|
execute(params = nil) |
Integer |
Execute DML |
query(params = nil) |
Array<Hash> |
All rows |
query_one(params = nil) |
Hash, nil |
First row or nil |
query_raw(params = nil) |
Hash |
Columnar format |
execute_batch(params_list) |
Integer |
Many param sets, auto-tx |
sql |
String |
The SQL text this statement was built from |
db.transaction do |tx|
tx.execute("INSERT INTO users (id, name) VALUES ($1, $2)", [1, "Alice"])
tx.execute("INSERT INTO users (id, name) VALUES ($1, $2)", [2, "Bob"])
# Reads within the tx see its own uncommitted writes
rows = tx.query("SELECT * FROM users")
one = tx.query_one("SELECT * FROM users WHERE id = $1", [1])
raw = tx.query_raw("SELECT id, name FROM users")
end
# commit on clean exit, rollback on any raiseThe block helper returns whatever the block returns:
total = db.transaction do |tx|
tx.query_one("SELECT COUNT(*) AS c FROM users")["c"]
endtx = db.begin_transaction
begin
tx.execute("INSERT INTO users (id, name) VALUES ($1, $2)", [1, "Alice"])
tx.commit
rescue StandardError
tx.rollback
raise
endOr the equivalent Transaction#with_rollback helper:
tx = db.begin_transaction
tx.with_rollback do |t|
t.execute("INSERT INTO users (id, name) VALUES ($1, $2)", [1, "Alice"])
end
# commit on success, rollback on raise| Method | Returns | Description |
|---|---|---|
execute(sql, params = nil) |
Integer |
Execute DML |
query(sql, params = nil) |
Array<Hash> |
All rows |
query_one(sql, params = nil) |
Hash, nil |
First row or nil |
query_raw(sql, params = nil) |
Hash |
Columnar format |
execute_batch(sql, params_list) |
Integer |
Many param sets |
execute_prepared(stmt, params = nil) |
Integer |
Run a PreparedStatement inside this tx |
query_prepared(stmt, params = nil) |
Array<Hash> |
Query a PreparedStatement inside this tx |
query_one_prepared(stmt, params = nil) |
Hash, nil |
First row from a prepared query |
query_raw_prepared(stmt, params = nil) |
Hash |
Columnar result from a prepared query |
commit |
nil |
Commit the transaction |
rollback |
nil |
Roll back the transaction |
with_rollback { |tx| ... } |
block return | Commit on clean exit, rollback on raise |
Calling execute / query / commit / rollback on a committed or rolled-back transaction raises Stoolap::Error. DDL statements (CREATE TABLE, etc.) are not allowed inside explicit transactions; run them outside.
| Ruby | Stoolap | Notes |
|---|---|---|
Integer |
INTEGER |
64-bit signed (full i64::MIN..=i64::MAX) |
Float |
FLOAT |
64-bit double |
String |
TEXT |
UTF-8, returned as UTF-8-encoded String |
true / false |
BOOLEAN |
|
nil |
NULL |
Any column type |
Time |
TIMESTAMP |
Nanosecond precision; local times converted to UTC; returned as UTC Time |
Symbol |
TEXT |
Converted to its string name |
Hash |
JSON |
Serialized via JSON.dump (requires require "json") |
Array |
JSON |
Serialized via JSON.dump |
Stoolap::Vector |
VECTOR(N) |
See below |
Any other type passed as a parameter raises TypeError.
Stoolap has native VECTOR(N) columns and HNSW indexes. Wrap a Ruby numeric array in Stoolap::Vector to avoid the JSON encoding path:
db.exec(<<~SQL)
CREATE TABLE documents (
id INTEGER PRIMARY KEY,
title TEXT,
embedding VECTOR(3)
);
CREATE INDEX idx_emb ON documents(embedding) USING HNSW WITH (metric = 'cosine');
SQL
db.execute(
"INSERT INTO documents VALUES ($1, $2, $3)",
[1, "Hello world", Stoolap::Vector.new([0.1, 0.2, 0.3])]
)
db.execute(
"INSERT INTO documents VALUES ($1, $2, $3)",
[2, "Goodbye world", Stoolap::Vector.new([0.9, 0.1, 0.0])]
)
# k-NN search: 5 nearest neighbours by cosine distance
results = db.query(<<~SQL)
SELECT id, title,
VEC_DISTANCE_COSINE(embedding, '[0.1, 0.2, 0.3]') AS dist
FROM documents
ORDER BY dist
LIMIT 5
SQL
# Read vectors back as Array<Float>
row = db.query_one("SELECT embedding FROM documents WHERE id = 1")
row["embedding"] # => [0.1, 0.2, 0.3]| Method | Returns | Description |
|---|---|---|
Stoolap::Vector.new(array) |
Vector |
Build from an Array of numbers (raises TypeError on non-numeric) |
#to_a |
Array<Float> |
Copy to a plain Array |
#length / #size |
Integer |
Dimension count |
#inspect / #to_s |
String |
#<Stoolap::Vector [0.1, 0.2, 0.3]> |
| Function | Description |
|---|---|
VEC_DISTANCE_L2(a, b) |
Euclidean distance |
VEC_DISTANCE_COSINE(a, b) |
Cosine distance (1 minus cosine similarity) |
VEC_DISTANCE_IP(a, b) |
Negative inner product |
CREATE INDEX idx ON documents(embedding) USING HNSW WITH (metric = 'cosine');Supported metrics: l2 (default), cosine, ip.
File-backed databases persist data via Write-Ahead Logging and immutable columnar cold volumes. A background checkpoint cycle seals hot rows into volume files, compacts them, and truncates the WAL. Data survives process restarts.
Stoolap::Database.open("./mydata") do |db|
db.exec("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
db.execute("INSERT INTO kv VALUES ($1, $2)", ["hello", "world"])
end
# Reopen: data is still there
Stoolap::Database.open("./mydata") do |db|
db.query_one("SELECT value FROM kv WHERE key = $1", ["hello"])
# => {"value" => "world"}
endAppend options to the path as query parameters:
# Maximum durability (fsync on every WAL write)
Stoolap::Database.open("./mydata?sync=full")
# High throughput (no fsync, larger buffers)
Stoolap::Database.open("./mydata?sync=none&wal_buffer_size=131072")
# Custom checkpoint interval with compression
Stoolap::Database.open("./mydata?checkpoint_interval=60&compression=on")| Parameter | Default | Description |
|---|---|---|
sync |
normal |
Sync mode: none, normal, full |
checkpoint_interval |
60 |
Seconds between automatic checkpoint cycles |
compact_threshold |
4 |
Sub-target volumes per table before merging |
keep_snapshots |
3 |
Backup snapshots retained per table |
wal_flush_trigger |
32768 |
WAL flush trigger in bytes (32 KB) |
wal_buffer_size |
65536 |
WAL buffer size in bytes (64 KB) |
wal_max_size |
67108864 |
Max WAL file size before rotation (64 MB) |
commit_batch_size |
100 |
Commits batched before syncing (normal mode) |
sync_interval_ms |
1000 |
Minimum ms between syncs (normal mode) |
wal_compression |
on |
LZ4 compression for WAL entries |
volume_compression |
on |
LZ4 compression for cold volume files |
compression |
-- | Alias that sets both wal_compression and volume_compression |
compression_threshold |
64 |
Minimum bytes before compressing an entry |
checkpoint_on_close |
on |
Seal all hot rows to volumes on clean shutdown |
target_volume_rows |
1048576 |
Target rows per cold volume (min 65536) |
| Mode | Value | Description |
|---|---|---|
none |
sync=none |
No fsync. Fastest, data may be lost on crash |
normal |
sync=normal |
Fsync in commit batches (default) |
full |
sync=full |
Fsync on every WAL write. Slowest, maximum durability |
Opening the same DSN twice in one process returns the same underlying engine. Closing one handle closes the engine for all handles. This prevents corruption and is why you should open a database once per process and pass the instance around.
All database errors raise Stoolap::Error, which inherits from StandardError:
begin
db.execute("SELEXT * FROM nothing")
rescue Stoolap::Error => e
warn "database error: #{e.message}"
endInvalid parameter types raise TypeError:
begin
db.execute("INSERT INTO t VALUES ($1)", [Object.new])
rescue TypeError => e
warn e.message
endStoolap is a full-featured embedded SQL database:
- MVCC transactions with snapshot isolation.
- Cost-based query optimizer with adaptive execution.
- Parallel execution for filter, join, sort, and distinct operators.
- JOINs: INNER, LEFT, RIGHT, FULL OUTER, CROSS, NATURAL.
- Subqueries: scalar,
EXISTS,IN,NOT IN,ANY/ALL, correlated. - Window functions:
ROW_NUMBER,RANK,DENSE_RANK,LAG,LEAD,NTILE, plus frame specs. - CTEs:
WITHandWITH RECURSIVE. - Aggregations:
GROUP BY,HAVING,ROLLUP,CUBE,GROUPING SETS. - Vector similarity search with HNSW indexes over
l2,cosine,ip. - Indexes: B-tree, hash, bitmap (auto-selected), HNSW, multi-column composite.
- 131 built-in functions: string, math, date/time, JSON, vector, aggregate.
- Immutable volume storage with columnar format, zone maps, bloom filters, LZ4 compression.
- WAL + checkpoint cycles for crash recovery.
- Aggregation pushdown to cold volume statistics (
COUNT,SUM,MIN,MAX). - Semantic query caching with predicate subsumption.
See the Stoolap documentation for SQL reference.
Requires:
- Rust (stable)
- Ruby
>= 3.3 - A C toolchain for the Ruby header files
git clone https://github.com/stoolap/stoolap-ruby.git
cd stoolap-ruby
bundle install
bundle exec rake compile
bundle exec rake testThe compile task invokes rb_sys/mkmf which builds the Rust extension into lib/stoolap/stoolap.<ext> so the installed gem can require it.
bundle exec rake testThe suite runs against lib/stoolap.rb with 100% line coverage (SimpleCov). lib/stoolap/version.rb is filtered from the metric because Bundler loads it via the gemspec before SimpleCov can instrument it.
Apache-2.0. See LICENSE.