Merged
18 changes: 18 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,23 @@
# Changelog

## v1.6.0 — Onboarding + Identity (2026-04-10)

### Added
- **Interactive onboarding** (`engraph init`) — polished CLI with welcome banner, vault scan checkmarks, identity prompts via dialoguer, progress bars, actionable next steps
- **Agent onboarding** — `engraph init --detect --json` for vault inspection, `--json` for non-interactive apply. Two-phase detect → apply flow for AI agents.
- **`identity` MCP tool + CLI + HTTP** — returns compact L0/L1 identity block (~170 tokens) for AI session context
- **`setup` MCP tool + HTTP** — first-time setup from inside an MCP session (detect/apply modes)
- **`identity_facts` table** — SQLite storage for L0 (static identity) and L1 (dynamic context) facts
- **L1 auto-extraction** — active projects, key people, current focus, OOO status, blocking items extracted during `engraph index`
- **`engraph identity --refresh`** — re-extract L1 facts without full reindex
- **`[identity]` config section** — name, role, vault_purpose in config.toml
- **`[memory]` config section** — feature flags for identity/timeline/mining
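
The L0/L1 split listed above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `Tier` enum, the `Fact` struct, the `render_identity_block` function and the output layout are hypothetical, and do not reproduce engraph's actual `identity_facts` schema or the output of `format_identity_block()`.

```rust
/// Hypothetical fact tiers: L0 is static identity, L1 is dynamic context.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    L0,
    L1,
}

/// Hypothetical in-memory stand-in for one stored identity fact.
#[derive(Debug, Clone)]
struct Fact {
    tier: Tier,
    key: String,
    value: String,
}

/// Render facts as a compact text block: one line per tier, L0 before L1.
/// Tiers with no facts are skipped.
fn render_identity_block(facts: &[Fact]) -> String {
    let mut out = String::new();
    for (tier, label) in [(Tier::L0, "L0"), (Tier::L1, "L1")] {
        let lines: Vec<String> = facts
            .iter()
            .filter(|f| f.tier == tier)
            .map(|f| format!("{}: {}", f.key, f.value))
            .collect();
        if !lines.is_empty() {
            out.push_str(&format!("[{label}] {}\n", lines.join("; ")));
        }
    }
    out
}

fn main() {
    let facts = vec![
        Fact { tier: Tier::L1, key: "current_focus".into(), value: "v1.6 release".into() },
        Fact { tier: Tier::L0, key: "name".into(), value: "Test User".into() },
        Fact { tier: Tier::L0, key: "role".into(), value: "Developer".into() },
    ];
    print!("{}", render_identity_block(&facts));
}
```

Grouping by tier keeps the static facts in a stable position, so only the L1 line changes when dynamic context is refreshed.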

### Changed
- MCP tools: 23 → 25
- HTTP endpoints: 24 → 26
- Dependencies: +dialoguer 0.12, +console 0.16, +regex 1

## v1.5.5 — Housekeeping (2026-04-10)

### Added
2 changes: 2 additions & 0 deletions CLAUDE.md
@@ -32,6 +32,8 @@ Single binary with 26 modules behind a lib crate:
- `indexer.rs` — orchestrates vault walking (via `ignore` crate for `.gitignore` support), diffing, chunking, embedding, writes to store + sqlite-vec + FTS5, vault graph edge building (wikilinks + people detection), and folder centroid computation. Exposes `index_file`, `remove_file`, `rename_file` as public per-file functions. `run_index_shared` accepts external store/embedder for watcher FullRescan. Dimension migration on model change.
- `temporal.rs` — temporal search lane. Extracts note dates from frontmatter `date:` field or `YYYY-MM-DD` filename patterns. Heuristic date parsing for natural language ("today", "yesterday", "last week", "this month", "recent", month names, ISO dates, date ranges). Smooth decay scoring for files near but outside target date range. Provides `extract_note_date()` for indexing and `score_temporal()` + `parse_date_range_heuristic()` for search
- `search.rs` — hybrid search orchestrator. `search_with_intelligence()` runs the full pipeline: orchestrate (intent + expansions) → 5-lane RRF retrieval (semantic + FTS5 + graph + reranker + temporal) per expansion → two-pass RRF fusion. `search_internal()` is a thin wrapper without intelligence models. Adaptive lane weights per query intent including temporal (1.5 weight for time-aware queries). Results display normalized confidence percentages (0-100%) instead of raw RRF scores.
- `identity.rs` — L1 extraction engine: active projects, key people, current focus, OOO, blocking. `format_identity_block()` for compact session context. `extract_l1_facts()` called after indexing.
- `onboarding.rs` — Interactive CLI UX: welcome banner, vault scan, identity prompts (dialoguer), agent mode (--detect --json, --json). `run_interactive()`, `run_detect_json()`, `run_apply_json()`.
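
The `YYYY-MM-DD` filename extraction mentioned for `temporal.rs` above can be sketched with the standard library alone. This is an illustrative assumption, not the project's `extract_note_date()`: the function name `date_from_filename` and its tuple return type are hypothetical, and the real module may use a date library or the `regex` crate instead.

```rust
/// Find the first `YYYY-MM-DD` pattern in a filename and return (year, month, day).
/// Only the shape and basic ranges are checked, not the number of days per month.
fn date_from_filename(name: &str) -> Option<(u32, u32, u32)> {
    let bytes = name.as_bytes();
    if bytes.len() < 10 {
        return None;
    }
    for start in 0..=bytes.len() - 10 {
        let window = &bytes[start..start + 10];
        // Positions 4 and 7 must be '-', every other position an ASCII digit.
        let shape_ok = window.iter().enumerate().all(|(i, b)| match i {
            4 | 7 => *b == b'-',
            _ => b.is_ascii_digit(),
        });
        if !shape_ok {
            continue;
        }
        // The window is pure ASCII, so these byte offsets are valid char boundaries.
        let s = &name[start..start + 10];
        let year: u32 = s[0..4].parse().ok()?;
        let month: u32 = s[5..7].parse().ok()?;
        let day: u32 = s[8..10].parse().ok()?;
        if (1..=12).contains(&month) && (1..=31).contains(&day) {
            return Some((year, month, day));
        }
    }
    None
}

fn main() {
    for name in ["2026-04-10 standup.md", "notes-2026-13-01.md", "meeting.md"] {
        println!("{name:?} -> {:?}", date_from_filename(name));
    }
}
```

A candidate with an out-of-range month or day is skipped rather than treated as an error, so a later valid date in the same filename can still match.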

`main.rs` is a thin clap CLI (async via `#[tokio::main]`). Subcommands: `index` (with progress bar), `search` (with `--explain`, loads intelligence models when enabled), `status` (shows intelligence state + date coverage stats), `clear`, `init` (intelligence onboarding prompt, detects Obsidian CLI + AI agents), `configure` (`--enable-intelligence`, `--disable-intelligence`, `--model`, `--obsidian-cli`, `--no-obsidian-cli`, `--agent`, `--add-api-key`, `--list-api-keys`, `--revoke-api-key`, `--setup-chatgpt`), `models`, `graph` (show/stats), `context` (read/list/vault-map/who/project/topic), `write` (create/append/update-metadata/move/edit/rewrite/edit-frontmatter/delete), `migrate` (para with `--preview`/`--apply`/`--undo` for PARA vault restructuring), `serve` (MCP stdio server with file watcher + intelligence + optional `--http`/`--port`/`--host`/`--no-auth` for HTTP REST API).

37 changes: 35 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 4 additions & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "engraph"
-version = "1.5.5"
+version = "1.6.0"
edition = "2024"
description = "Local knowledge graph for AI agents. Hybrid search + MCP server for Obsidian vaults."
license = "MIT"
@@ -25,6 +25,9 @@ tokenizers = { version = "0.22", default-features = false, features = ["fancy-re
sha2 = "0.10"
ureq = "2.12"
indicatif = "0.17"
dialoguer = "0.12"
console = "0.16"
regex = "1"
sqlite-vec = "0.1.8-alpha.1"
zerocopy = { version = "0.7", features = ["derive"] }
rayon = "1"
19 changes: 11 additions & 8 deletions README.md
@@ -21,8 +21,8 @@ engraph turns your markdown vault into a searchable knowledge graph that any AI
Plain vector search treats your notes as isolated documents. But knowledge isn't flat — your notes link to each other, share tags, reference the same people and projects. engraph understands these connections.

- **5-lane hybrid search** — semantic embeddings + BM25 full-text + graph expansion + cross-encoder reranking + temporal scoring, fused via [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf). An LLM orchestrator classifies queries and adapts lane weights per intent. Time-aware queries like "what happened last week" or "March 2026 notes" activate the temporal lane automatically.
-- **MCP server for AI agents** — `engraph serve` exposes 22 tools (search, read, section-level editing, frontmatter mutations, vault health, context bundles, note creation, PARA migration) that Claude, Cursor, or any MCP client can call directly.
-- **HTTP REST API** — `engraph serve --http` adds an axum-based HTTP server alongside MCP with 23 REST endpoints, API key authentication, rate limiting, and CORS. Web-based agents and scripts can query your vault with simple `curl` calls.
+- **MCP server for AI agents** — `engraph serve` exposes 25 tools (search, read, section-level editing, frontmatter mutations, vault health, context bundles, note creation, PARA migration, identity) that Claude, Cursor, or any MCP client can call directly.
+- **HTTP REST API** — `engraph serve --http` adds an axum-based HTTP server alongside MCP with 26 REST endpoints, API key authentication, rate limiting, and CORS. Web-based agents and scripts can query your vault with simple `curl` calls.
- **Section-level editing** — AI agents can read, replace, prepend, or append to specific sections by heading. Full note rewriting with frontmatter preservation. Granular frontmatter mutations (set/remove fields, add/remove tags and aliases).
- **Vault health diagnostics** — detect orphan notes, broken wikilinks, stale content, and tag hygiene issues. Available as MCP tool and CLI command.
- **Obsidian CLI integration** — auto-detects running Obsidian and delegates compatible operations. Circuit breaker (Closed/Degraded/Open) ensures graceful fallback.
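
The Reciprocal Rank Fusion step named in the first bullet above can be sketched as follows. This is a minimal illustration under stated assumptions, not engraph's `search.rs`: the function `rrf_fuse`, the lane weights and the constant `k = 60` (the value commonly used with RRF) are all chosen here for the example.

```rust
use std::collections::HashMap;

/// Weighted Reciprocal Rank Fusion.
/// Each lane contributes `weight / (k + rank)` to a document's score,
/// where `rank` starts at 1 for the lane's top result.
fn rrf_fuse(lanes: &[(f64, Vec<&str>)], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (weight, ranked) in lanes {
        for (i, doc) in ranked.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry((*doc).to_string()).or_insert(0.0) += weight / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Sort by descending score; break ties by name for deterministic output.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Hypothetical lane outputs; the 1.5 weight mirrors a boosted temporal lane.
    let lanes = vec![
        (1.0, vec!["a.md", "b.md", "c.md"]),
        (1.0, vec!["b.md", "a.md"]),
        (1.5, vec!["c.md"]),
    ];
    for (doc, score) in rrf_fuse(&lanes, 60.0) {
        println!("{doc}: {score:.4}");
    }
}
```

Because RRF works on ranks rather than raw scores, lanes with incomparable score scales (cosine similarity, BM25, reranker logits) can be fused without normalization, and a per-lane weight is enough to favor one lane for a given query intent.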
@@ -61,7 +61,7 @@ Your vault (markdown files)
│ Search: Orchestrator → 4-lane retrieval │
│ → Reranker → Two-pass RRF fusion │
│ │
-22 MCP tools + 23 REST endpoints │
+25 MCP tools + 26 REST endpoints │
└─────────────────────────────────────────────┘
@@ -268,7 +268,7 @@ Returns orphan notes (no links in or out), broken wikilinks, stale notes, and ta

`engraph serve --http` adds a full REST API alongside the MCP server, exposing the same capabilities over HTTP for web agents, scripts, and integrations.

-**24 endpoints:**
+**26 endpoints:**

| Method | Endpoint | Permission | Description |
|--------|----------|------------|-------------|
@@ -292,6 +292,8 @@ Returns orphan notes (no links in or out), broken wikilinks, stale notes, and ta
| POST | `/api/unarchive` | write | Restore archived note |
| POST | `/api/update-metadata` | write | Update note metadata |
| POST | `/api/delete` | write | Delete note (soft or hard) |
| GET | `/api/identity` | read | User identity (L0) and current context (L1) |
| POST | `/api/setup` | write | First-time onboarding setup (detect/apply modes) |
| POST | `/api/reindex-file` | write | Re-index a single file after external edits |
| POST | `/api/migrate/preview` | write | Preview PARA migration (classify + suggest moves) |
| POST | `/api/migrate/apply` | write | Apply PARA migration (move files) |
@@ -526,7 +528,7 @@ STYLE:
| Search method | 5-lane RRF (semantic + BM25 + graph + reranker + temporal) | Vector similarity only | Keyword only |
| Query understanding | LLM orchestrator classifies intent, adapts weights | None | None |
| Understands note links | Yes (wikilink graph traversal) | No | Limited (backlinks panel) |
-| AI agent access | MCP server (22 tools) + HTTP REST API (23 endpoints) | Custom API needed | No |
+| AI agent access | MCP server (25 tools) + HTTP REST API (26 endpoints) | Custom API needed | No |
| Write capability | Create/edit/rewrite/delete with smart filing | No | Manual |
| Vault health | Orphans, broken links, stale notes, tag hygiene | No | Limited |
| Real-time sync | File watcher, 2s debounce | Manual re-index | N/A |
@@ -543,8 +545,9 @@ engraph is not a replacement for Obsidian — it's the intelligence layer that s
- LLM research orchestrator: query intent classification + query expansion + adaptive lane weights
- llama.cpp inference via Rust bindings (GGUF models, Metal GPU on macOS, CUDA on Linux)
- Intelligence opt-in: heuristic fallback when disabled, LLM-powered when enabled
-- MCP server with 23 tools (8 read, 10 write, 1 index, 1 diagnostic, 3 migrate) via stdio
-- HTTP REST API with 24 endpoints, API key auth (`eg_` prefix), rate limiting, CORS — enabled via `engraph serve --http`
+- MCP server with 25 tools (8 read, 10 write, 2 identity, 1 index, 1 diagnostic, 3 migrate) via stdio
+- HTTP REST API with 26 endpoints, API key auth (`eg_` prefix), rate limiting, CORS — enabled via `engraph serve --http`
- User identity with L0/L1 tiered context for AI agent session starts
- Section-level reading and editing: target specific headings with replace/prepend/append modes
- Full note rewriting with automatic frontmatter preservation
- Granular frontmatter mutations: set/remove fields, add/remove tags and aliases
@@ -573,7 +576,7 @@ engraph is not a replacement for Obsidian — it's the intelligence layer that s
- [x] ~~HTTP/REST API — complement MCP with a standard web API~~ (v1.3)
- [x] ~~PARA migration — AI-assisted vault restructuring with preview/apply/undo~~ (v1.4)
- [x] ~~ChatGPT Actions — OpenAPI 3.1.0 spec + plugin manifest + `--setup-chatgpt` helper~~ (v1.5)
-- [ ] Identity — user context at session start, enhanced onboarding (v1.6)
+- [x] ~~Identity — user context at session start, enhanced onboarding~~ (v1.6)
- [ ] Timeline — temporal knowledge graph with point-in-time queries (v1.7)
- [ ] Mining — automatic fact extraction from vault notes (v1.8)

67 changes: 67 additions & 0 deletions src/config.rs
@@ -43,6 +43,38 @@ pub struct PluginConfig {
    pub public_url: Option<String>,
}

/// User identity for AI agent context.
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
#[serde(default)]
pub struct IdentityConfig {
    pub name: Option<String>,
    pub role: Option<String>,
    pub vault_purpose: Option<String>,
}

/// Memory layer feature flags.
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(default)]
pub struct MemoryConfig {
    pub identity_enabled: bool,
    pub timeline_enabled: bool,
    pub mining_enabled: bool,
    pub mining_strategy: String,
    pub mining_on_index: bool,
}

impl Default for MemoryConfig {
    fn default() -> Self {
        Self {
            identity_enabled: true,
            timeline_enabled: true,
            mining_enabled: true,
            mining_strategy: "auto".into(),
            mining_on_index: true,
        }
    }
}
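
`MemoryConfig` above uses a hand-written `Default` rather than `#[derive(Default)]` because a derived impl would set every flag to `false` and the strategy to an empty string. With container-level `#[serde(default)]`, serde fills any field missing from the TOML from that same `Default` impl. The sketch below mirrors the struct without the serde derives (an assumption made so it compiles with the standard library alone) and shows how a single flag can be overridden while the rest keep their defaults.

```rust
/// Mirror of the `MemoryConfig` fields from the diff, minus the serde derives.
#[derive(Debug, Clone, PartialEq)]
struct MemoryConfig {
    identity_enabled: bool,
    timeline_enabled: bool,
    mining_enabled: bool,
    mining_strategy: String,
    mining_on_index: bool,
}

impl Default for MemoryConfig {
    fn default() -> Self {
        Self {
            identity_enabled: true,
            timeline_enabled: true,
            mining_enabled: true,
            mining_strategy: "auto".into(),
            mining_on_index: true,
        }
    }
}

fn main() {
    // Struct update syntax: override one flag, inherit the rest from Default.
    let cfg = MemoryConfig {
        mining_enabled: false,
        ..Default::default()
    };
    println!("{cfg:?}");
}
```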

/// HTTP REST API configuration.
#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(default)]
@@ -104,6 +136,10 @@ pub struct Config {
    /// HTTP REST API settings.
    #[serde(default)]
    pub http: HttpConfig,
    #[serde(default)]
    pub identity: IdentityConfig,
    #[serde(default)]
    pub memory: MemoryConfig,
}

impl Default for Config {
@@ -118,6 +154,8 @@ impl Default for Config {
            obsidian: ObsidianConfig::default(),
            agents: AgentsConfig::default(),
            http: HttpConfig::default(),
            identity: IdentityConfig::default(),
            memory: MemoryConfig::default(),
        }
    }
}
@@ -379,4 +417,33 @@ public_url = "https://vault.example.com"
        let config: Config = toml::from_str(toml).unwrap();
        assert_eq!(config.http.plugin.name.as_deref(), Some("my-vault"));
    }

    #[test]
    fn test_identity_config_deserializes() {
        let toml_str = r#"
[identity]
name = "Test User"
role = "Developer"
vault_purpose = "notes"
"#;
        let config: Config = toml::from_str(toml_str).unwrap();
        assert_eq!(config.identity.name, Some("Test User".into()));
        assert_eq!(config.identity.role, Some("Developer".into()));
        assert_eq!(config.identity.vault_purpose, Some("notes".into()));
    }

    #[test]
    fn test_identity_config_defaults_to_empty() {
        let config = Config::default();
        assert!(config.identity.name.is_none());
        assert!(config.identity.role.is_none());
    }

    #[test]
    fn test_memory_config_defaults() {
        let config = Config::default();
        assert!(config.memory.identity_enabled);
        assert!(config.memory.timeline_enabled);
        assert!(config.memory.mining_enabled);
    }
}
61 changes: 61 additions & 0 deletions src/http.rs
@@ -332,6 +332,14 @@ struct ReindexFileBody {
    file: String,
}

#[derive(Debug, Deserialize)]
struct SetupBody {
    mode: String,
    name: Option<String>,
    role: Option<String>,
    purpose: Option<String>,
}

// ---------------------------------------------------------------------------
// CORS
// ---------------------------------------------------------------------------
@@ -388,6 +396,9 @@ pub fn build_router(state: ApiState) -> Router {
        .route("/api/delete", post(handle_delete))
        // Index maintenance
        .route("/api/reindex-file", post(handle_reindex_file))
        // Identity endpoints
        .route("/api/identity", get(handle_identity))
        .route("/api/setup", post(handle_setup))
        // Migration endpoints
        .route("/api/migrate/preview", post(handle_migrate_preview))
        .route("/api/migrate/apply", post(handle_migrate_apply))
@@ -1066,6 +1077,56 @@ async fn handle_reindex_file(
    })))
}

// ---------------------------------------------------------------------------
// Identity / setup endpoint handlers
// ---------------------------------------------------------------------------

async fn handle_identity(
    State(state): State<ApiState>,
    headers: HeaderMap,
) -> Result<impl IntoResponse, ApiError> {
    authorize(&headers, &state, false)?;
    let store = state.store.lock().await;
    let config = crate::config::Config::load().unwrap_or_default();
    let block = crate::identity::format_identity_block(&config, &store)
        .map_err(|e| ApiError::internal(&format!("{e:#}")))?;
    Ok(Json(serde_json::json!({ "identity": block })))
}

async fn handle_setup(
    State(state): State<ApiState>,
    headers: HeaderMap,
    Json(body): Json<SetupBody>,
) -> Result<impl IntoResponse, ApiError> {
    authorize(&headers, &state, true)?;
    match body.mode.as_str() {
        "detect" => {
            let result = crate::onboarding::run_detect_json(&state.vault_path)
                .map_err(|e| ApiError::internal(&format!("{e:#}")))?;
            Ok(Json(result))
        }
        "apply" => {
            let mut config = crate::config::Config::load().unwrap_or_default();
            let data_dir = crate::config::Config::data_dir()
                .map_err(|e| ApiError::internal(&format!("{e:#}")))?;
            let flags = crate::onboarding::ApplyFlags {
                name: body.name,
                role: body.role,
                purpose: body.purpose,
                identity_only: false,
                reindex_only: false,
            };
            let result =
                crate::onboarding::run_apply_json(&state.vault_path, &mut config, &data_dir, flags)
                    .map_err(|e| ApiError::internal(&format!("{e:#}")))?;
            Ok(Json(result))
        }
        other => Err(ApiError::bad_request(&format!(
            "Unknown mode: {other}. Use 'detect' or 'apply'."
        ))),
    }
}
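
`handle_setup` above dispatches on `body.mode` as a raw string. One possible alternative, sketched here as an assumption and not as what this PR does, is to parse the mode into an enum once so the match on it is exhaustive. The `SetupMode` type is hypothetical; the error text copies the handler's message.

```rust
use std::str::FromStr;

/// Hypothetical typed form of the `mode` field accepted by `/api/setup`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SetupMode {
    Detect,
    Apply,
}

impl FromStr for SetupMode {
    type Err = String;

    /// Accepts exactly "detect" or "apply"; anything else yields the same
    /// message the handler returns as a bad-request error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "detect" => Ok(SetupMode::Detect),
            "apply" => Ok(SetupMode::Apply),
            other => Err(format!("Unknown mode: {other}. Use 'detect' or 'apply'.")),
        }
    }
}

fn main() {
    for raw in ["detect", "apply", "reset"] {
        match raw.parse::<SetupMode>() {
            Ok(mode) => println!("{raw} -> {mode:?}"),
            Err(msg) => println!("{raw} -> error: {msg}"),
        }
    }
}
```

With an enum, adding a third mode later produces a compile error at every `match` that does not handle it, which a string match with a catch-all arm cannot do.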

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------