Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
tests/testdata/** filter=lfs diff=lfs merge=lfs -text
AGENTS.md merge=union
build.rs merge=union
130 changes: 100 additions & 30 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,18 +14,19 @@ pathrex/
├── Cargo.toml # Crate manifest (edition 2024)
├── build.rs # Links LAGraph + LAGraphX; optionally regenerates FFI bindings
├── src/
│ ├── lib.rs # Public modules: formats, graph, sparql, lagraph_sys, utils
│ ├── lib.rs # Modules: formats, graph, rpq, sparql, utils, lagraph_sys
│ ├── main.rs # Binary entry point (placeholder)
│ ├── lagraph_sys.rs # FFI module — includes generated bindings
│ ├── lagraph_sys_generated.rs# Bindgen output (checked in, regenerated in CI)
│ ├── utils.rs # Internal helpers: CountingBuilder, CountOutput, VecSource,
│ │ # grb_ok! and la_ok! macros
│ ├── utils.rs # Public helpers: CountingBuilder, CountOutput, VecSource,
│ │ # grb_ok! and la_ok! macros, build_graph
│ ├── graph/
│ │ ├── mod.rs # Core traits (GraphBuilder, GraphDecomposition, GraphSource,
│ │ │ # Backend, Graph<B>), error types, RAII wrappers, GrB init
│ │ └── inmemory.rs # InMemory marker, InMemoryBuilder, InMemoryGraph
│ ├── rpq/
│ │ ├── mod.rs # RpqEvaluator (assoc. Result), RpqQuery, Endpoint, PathExpr, RpqError
│ │ ├── nfarpq.rs # NfaRpqEvaluator (LAGraph_RegularPathQuery)
│ │ └── rpqmatrix.rs # Matrix-plan RPQ evaluator
│ ├── sparql/
│ │ └── mod.rs # parse_rpq / extract_rpq → RpqQuery (spargebra)
Expand All @@ -37,6 +38,7 @@ pathrex/
├── tests/
│ ├── inmemory_tests.rs # Integration tests for InMemoryBuilder / InMemoryGraph
│ ├── mm_tests.rs # Integration tests for MatrixMarket format
│ ├── nfarpq_tests.rs # Integration tests for NfaRpqEvaluator
│ └── rpqmatrix_tests.rs # Integration tests for matrix-plan RPQ evaluator
├── deps/
│ └── LAGraph/ # Git submodule (SparseLinearAlgebra/LAGraph)
Expand Down Expand Up @@ -131,10 +133,10 @@ builders: `{ source: String, target: String, label: String }`.

### GraphSource trait

[`GraphSource<B>`](src/graph/mod.rs:164) is implemented by any data source that knows how to
[`GraphSource<B>`](src/graph/mod.rs:168) is implemented by any data source that knows how to
feed itself into a specific [`GraphBuilder`]:

- [`apply_to(self, builder: B) -> Result<B, B::Error>`](src/graph/mod.rs:165) — consumes the
- [`apply_to(self, builder: B) -> Result<B, B::Error>`](src/graph/mod.rs:169) — consumes the
source and returns the populated builder.

[`Csv<R>`](src/formats/csv.rs), [`MatrixMarket`](src/formats/mm.rs), and [`NTriples<R>`](src/formats/nt.rs)
Expand All @@ -143,24 +145,24 @@ can be passed to [`GraphBuilder::load`] and [`Graph::try_from`].

### GraphBuilder trait

[`GraphBuilder`](src/graph/mod.rs:169) accumulates edges and produces a
[`GraphDecomposition`](src/graph/mod.rs:188):
[`GraphBuilder`](src/graph/mod.rs:173) accumulates edges and produces a
[`GraphDecomposition`](src/graph/mod.rs:193):

- [`load<S: GraphSource<Self>>(self, source: S)`](src/graph/mod.rs:179) — primary entry point;
- [`load<S: GraphSource<Self>>(self, source: S)`](src/graph/mod.rs:183) — primary entry point;
delegates to `GraphSource::apply_to`.
- [`build(self)`](src/graph/mod.rs:184) — finalise into an immutable graph.
- [`build(self)`](src/graph/mod.rs:188) — finalise into an immutable graph.

`InMemoryBuilder` also exposes lower-level helpers outside the trait:

- [`push_edge(&mut self, edge: Edge)`](src/graph/inmemory.rs:62) — ingest one edge.
- [`with_stream<I, E>(self, stream: I)`](src/graph/inmemory.rs:72) — consume an
- [`push_edge(&mut self, edge: Edge)`](src/graph/inmemory.rs:83) — ingest one edge.
- [`with_stream<I, E>(self, stream: I)`](src/graph/inmemory.rs:93) — consume an
`IntoIterator<Item = Result<Edge, E>>`.
- [`push_grb_matrix(&mut self, label, matrix: GrB_Matrix)`](src/graph/inmemory.rs:85) — accept
- [`push_grb_matrix(&mut self, label, matrix: GrB_Matrix)`](src/graph/inmemory.rs:106) — accept
a pre-built `GrB_Matrix` for a label, wrapping it in an `LAGraph_Graph` immediately.

### Backend trait & Graph\<B\> handle

[`Backend`](src/graph/mod.rs:217) associates a marker type with a concrete builder/graph pair:
[`Backend`](src/graph/mod.rs:221) associates a marker type with a concrete builder/graph pair:

```rust
pub trait Backend {
Expand All @@ -169,28 +171,28 @@ pub trait Backend {
}
```

[`Graph<B>`](src/graph/mod.rs:229) is a zero-sized handle parameterised by a `Backend`:
[`Graph<B>`](src/graph/mod.rs:233) is a zero-sized handle parameterised by a `Backend`:

- [`Graph::<InMemory>::builder()`](src/graph/mod.rs:234) — returns a fresh `InMemoryBuilder`.
- [`Graph::<InMemory>::try_from(source)`](src/graph/mod.rs:238) — builds a graph from a single
- [`Graph::<InMemory>::builder()`](src/graph/mod.rs:238) — returns a fresh `InMemoryBuilder`.
- [`Graph::<InMemory>::try_from(source)`](src/graph/mod.rs:242) — builds a graph from a single
source in one call.

[`InMemory`](src/graph/inmemory.rs:26) is the concrete backend marker type.
[`InMemory`](src/graph/inmemory.rs:27) is the concrete backend marker type.

### GraphDecomposition trait

[`GraphDecomposition`](src/graph/mod.rs:188) is the read-only query interface:
[`GraphDecomposition`](src/graph/mod.rs:193) is the read-only query interface:

- [`get_graph(label)`](src/graph/mod.rs:192) — returns `Arc<LagraphGraph>` for a given edge label.
- [`get_node_id(string_id)`](src/graph/mod.rs:195) / [`get_node_name(mapped_id)`](src/graph/mod.rs:198) — bidirectional string ↔ integer dictionary.
- [`num_nodes()`](src/graph/mod.rs:199) — total unique nodes.
- [`get_graph(label)`](src/graph/mod.rs:197) — returns `Arc<LagraphGraph>` for a given edge label.
- [`get_node_id(string_id)`](src/graph/mod.rs:200) / [`get_node_name(mapped_id)`](src/graph/mod.rs:203) — bidirectional string ↔ integer dictionary.
- [`num_nodes()`](src/graph/mod.rs:204) — total unique nodes.

### InMemoryBuilder / InMemoryGraph

[`InMemoryBuilder`](src/graph/inmemory.rs:35) is the primary `GraphBuilder` implementation.
[`InMemoryBuilder`](src/graph/inmemory.rs:36) is the primary `GraphBuilder` implementation.
It collects edges in RAM, then [`build()`](src/graph/inmemory.rs:131) calls
GraphBLAS to create one `GrB_Matrix` per label via COO format, wraps each in an
`LAGraph_Graph`, and returns an [`InMemoryGraph`](src/graph/inmemory.rs:173).
`LAGraph_Graph`, and returns an [`InMemoryGraph`](src/graph/inmemory.rs:174).

Multiple CSV sources can be chained with repeated `.load()` calls; all edges are merged
into a single graph.
Expand All @@ -205,6 +207,7 @@ which is used by the MatrixMarket loader.
Three built-in parsers are available, each yielding
`Iterator<Item = Result<Edge, FormatError>>` and pluggable into
`GraphBuilder::load()` via `GraphSource<InMemoryBuilder>` (see [`src/graph/inmemory.rs`](src/graph/inmemory.rs)).
CSV and MatrixMarket edge loaders are available:

#### `Csv<R>`

Expand All @@ -227,7 +230,7 @@ Name-based lookup requires `has_header: true`.

[`MatrixMarket`](src/formats/mm.rs) loads an edge-labeled graph from a directory with:

- `vertices.txt` — one line per node: `<node_name> <1-based-index>` on disk; [`get_node_id`](src/graph/mod.rs:199) returns the matching **0-based** matrix index
- `vertices.txt` — one line per node: `<node_name> <1-based-index>` on disk; [`get_node_id`](src/graph/mod.rs:200) returns the matching **0-based** matrix index
- `edges.txt` — one line per label: `<label_name> <1-based-index>` (selects `n.txt`)
- `<n>.txt` — MatrixMarket adjacency matrix for label with index `n`

Expand Down Expand Up @@ -349,6 +352,63 @@ RDF files using `oxttl` and `oxrdf`. Each triple `(subject, predicate, object)`
Constructor:

- [`NTriples::new(reader)`](src/formats/nt.rs:56) — parses the stream; each predicate IRI is copied verbatim to the edge label.
### SPARQL parsing (`src/sparql/mod.rs`)

The [`rpq`](src/rpq/mod.rs) module provides an abstraction for evaluating
Regular Path Queries (RPQs) over edge-labeled graphs using GraphBLAS/LAGraph.

Key public items:

- [`Endpoint`](src/rpq/mod.rs) — `Variable(String)` or `Named(String)` (IRI string).
- [`PathExpr`](src/rpq/mod.rs) — `Label`, `Sequence`, `Alternative`, `ZeroOrMore`,
`OneOrMore`, `ZeroOrOne`.
- [`RpqQuery`](src/rpq/mod.rs) — `{ subject, path, object }` using the types above;
[`strip_base(&mut self, base)`](src/rpq/mod.rs) removes a shared IRI prefix from
named endpoints and labels.
- [`RpqEvaluator`](src/rpq/mod.rs) — trait with associated type `Result` and
[`evaluate(query, graph)`](src/rpq/mod.rs) taking `&RpqQuery` and
[`GraphDecomposition`], returning `Result<Self::Result, RpqError>`.
Each concrete evaluator exposes its own output type (see below).
- [`RpqError`](src/rpq/mod.rs) — unified error type for RPQ parsing and evaluation:
`Parse` (SPARQL syntax), `Extract` (query extraction), `UnsupportedPath`,
`VertexNotFound`, and `Graph` (wraps [`GraphError`](src/graph/mod.rs) for
label-not-found and GraphBLAS/LAGraph failures).

#### `NfaRpqEvaluator` (`src/rpq/nfarpq.rs`)

[`NfaRpqEvaluator`](src/rpq/nfarpq.rs) implements [`RpqEvaluator`] by:

1. Converting a [`PathExpr`] into an [`Nfa`](src/rpq/nfarpq.rs) via Thompson's
construction ([`Nfa::from_path_expr()`](src/rpq/nfarpq.rs)).
2. Eliminating ε-transitions via epsilon closure ([`NfaBuilder::epsilon_closure()`](src/rpq/nfarpq.rs)).
3. Building one `LAGraph_Graph` per NFA label transition
([`Nfa::build_lagraph_matrices()`](src/rpq/nfarpq.rs)).
4. Calling [`LAGraph_RegularPathQuery`] with the NFA matrices, data-graph
matrices, start/final states, and source vertices.

`type Result = NfaRpqResult` ([`GraphblasVector`] of reachable targets).

Supported path operators match [`PathExpr`] variants above. `Reverse` and
`NegatedPropertySet` from SPARQL map to [`RpqError::UnsupportedPath`] when they
appear in extracted paths.

Subject/object resolution: [`Endpoint::Variable`] means "all vertices";
[`Endpoint::Named`] resolves to a single vertex via
[`GraphDecomposition::get_node_id()`](src/graph/mod.rs:200).

[`NfaRpqResult`](src/rpq/nfarpq.rs) wraps a [`GraphblasVector`] of reachable **target**
vertices. When the subject is a variable, every vertex is used as a source and
`LAGraph_RegularPathQuery` returns the union of targets — individual `(source, target)`
pairs are not preserved.

#### `RpqMatrixEvaluator` (`src/rpq/rpqmatrix.rs`)

[`RpqMatrixEvaluator`](src/rpq/rpqmatrix.rs) compiles [`PathExpr`] into a Boolean matrix plan
over label adjacency matrices and runs [`LAGraph_RPQMatrix`]. It returns
[`RpqMatrixResult`](src/rpq/rpqmatrix.rs): the path-relation `nnz` plus a
[`GraphblasMatrix`] duplicate of the result matrix (full reachability relation for the path).
Subject/object do not filter the matrix; a named subject is only validated to exist.
Bound objects are not supported yet ([`RpqError::UnsupportedPath`]).

### FFI layer

Expand All @@ -363,7 +423,7 @@ LAGraph. Safe Rust wrappers live in [`graph::mod`](src/graph/mod.rs):
- [`GraphblasMatrix`](src/graph/mod.rs) — RAII wrapper around `GrB_Matrix` (`dup` + `free` on drop).
- [`ensure_grb_init()`](src/graph/mod.rs:39) — one-time `LAGraph_Init` via `std::sync::Once`.

### Macros (`src/utils.rs`)
### Macros & helpers (`src/utils.rs`)

Two `#[macro_export]` macros handle FFI error mapping:

Expand All @@ -373,6 +433,12 @@ Two `#[macro_export]` macros handle FFI error mapping:
appending the required `*mut i8` message buffer, and maps failure to
`GraphError::LAGraph(info, msg)`.

A convenience function is also provided:

- [`build_graph(edges)`](src/utils.rs:184) — builds an `InMemoryGraph` from a
slice of `(&str, &str, &str)` triples (source, target, label). Used by
integration tests.

## Coding Conventions

- **Rust edition 2024**.
Expand All @@ -384,14 +450,13 @@ Two `#[macro_export]` macros handle FFI error mapping:
- `GraphError` converts into `RpqError` via `#[from] GraphError` on the
`RpqError::Graph` variant, enabling `?` propagation in evaluators.
- Unsafe FFI calls are confined to `lagraph_sys`, `graph/mod.rs`,
`graph/inmemory.rs`, `rpq/nfarpq.rs`, and `rpq/rpqmatrix.rs`. All raw pointers are wrapped in
`graph/inmemory.rs`, `rpq/nfarpq.rs`. All raw pointers are wrapped in
RAII types that free resources on drop.
- `unsafe impl Send + Sync` is provided for `LagraphGraph`,
`GraphblasVector`, and `GraphblasMatrix` because GraphBLAS handles are thread-safe after init.
- Unit tests live in `#[cfg(test)] mod tests` blocks inside each module.
Integration tests that need GraphBLAS live in [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs),
[`tests/mm_tests.rs`](tests/mm_tests.rs), [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs),
and [`tests/rpqmatrix_tests.rs`](tests/rpqmatrix_tests.rs).
[`tests/mm_tests.rs`](tests/mm_tests.rs), [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs).

## Testing

Expand All @@ -411,8 +476,13 @@ Tests in `src/formats/csv.rs` and `src/formats/nt.rs` are pure Rust and need no

Tests in `src/sparql/mod.rs` are pure Rust and need no native dependencies.

Tests in `src/graph/inmemory.rs` and [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs)
call real GraphBLAS/LAGraph and require the native libraries to be present.
Tests in `src/rpq/nfarpq.rs` (NFA construction unit tests) are pure Rust and need no
native dependencies.

Tests in `src/graph/inmemory.rs`, [`tests/inmemory_tests.rs`](tests/inmemory_tests.rs),
[`tests/mm_tests.rs`](tests/mm_tests.rs), [`tests/nfarpq_tests.rs`](tests/nfarpq_tests.rs),
and [`tests/rpqmatrix_tests.rs`](tests/rpqmatrix_tests.rs) call real GraphBLAS/LAGraph and
require the native libraries to be present.

## CI

Expand Down
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ egg = "0.10.0"
libc = "0.2"
oxrdf = "0.3.3"
oxttl = "0.2.3"
rustfst = "1.2"
spargebra = "0.4.6"
thiserror = "1.0"

Expand Down
29 changes: 29 additions & 0 deletions src/graph/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -141,6 +141,35 @@ impl GraphblasVector {
grb_ok!(GrB_Vector_new(&mut v, GrB_BOOL, n))?;
Ok(Self { inner: v })
}

/// Returns the number of stored values in this vector.
pub fn nvals(&self) -> Result<GrB_Index, GraphError> {
let mut nvals: GrB_Index = 0;
grb_ok!(GrB_Vector_nvals(&mut nvals, self.inner))?;
Ok(nvals)
}

/// Extracts all stored indices from boolean vector.
pub fn indices(&self) -> Result<Vec<GrB_Index>, GraphError> {
let nvals = self.nvals()?;
if nvals == 0 {
return Ok(Vec::new());
}

let mut indices = vec![0u64; nvals as usize];
let mut values = vec![false; nvals as usize];
let mut actual_nvals = nvals;

grb_ok!(GrB_Vector_extractTuples_BOOL(
indices.as_mut_ptr(),
values.as_mut_ptr(),
&mut actual_nvals,
self.inner,
))?;

indices.truncate(actual_nvals as usize);
Ok(indices)
}
}

impl Drop for GraphblasVector {
Expand Down
18 changes: 15 additions & 3 deletions src/lagraph_sys_generated.rs
Original file line number Diff line number Diff line change
Expand Up @@ -155,9 +155,6 @@ unsafe extern "C" {
ncols: GrB_Index,
) -> GrB_Info;
}
unsafe extern "C" {
pub fn GrB_Matrix_dup(C: *mut GrB_Matrix, A: GrB_Matrix) -> GrB_Info;
}
unsafe extern "C" {
pub fn GrB_Matrix_nvals(nvals: *mut GrB_Index, A: GrB_Matrix) -> GrB_Info;
}
Expand Down Expand Up @@ -272,6 +269,21 @@ unsafe extern "C" {
msg: *mut ::std::os::raw::c_char,
) -> ::std::os::raw::c_int;
}
unsafe extern "C" {
pub fn LAGraph_RegularPathQuery(
reachable: *mut GrB_Vector,
R: *mut LAGraph_Graph,
nl: usize,
QS: *const GrB_Index,
nqs: usize,
QF: *const GrB_Index,
nqf: usize,
G: *mut LAGraph_Graph,
S: *const GrB_Index,
ns: usize,
msg: *mut ::std::os::raw::c_char,
) -> ::std::os::raw::c_int;
}
#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum RPQMatrixOp {
Expand Down
1 change: 1 addition & 0 deletions src/rpq/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
//! let result: NfaRpqResult = NfaRpqEvaluator.evaluate(&query, &graph)?;
//! ```

pub mod nfarpq;
pub mod rpqmatrix;

use crate::graph::{GraphDecomposition, GraphError};
Expand Down
Loading
Loading