diff --git a/README.md b/README.md
index 00062e3..0101ca6 100644
--- a/README.md
+++ b/README.md
@@ -149,6 +149,8 @@ We benchmark quality and speed across all methods on ~1,250 queries over 63 repo
| CodeRankEmbed | 0.765 | 57 s | 16 ms |
| ColGREP | 0.693 | 5.8 s | 124 ms |
| BM25 | 0.673 | 263 ms | 0.02 ms |
+| grepai | 0.561 | 35 s | 48 ms |
+| probe | 0.387 | — | 207 ms |
| ripgrep | 0.126 | — | 12 ms |
Semble achieves 99% of the performance of the 137M-parameter [CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed) Hybrid, while indexing 218x faster and answering queries 11x faster. See [benchmarks](benchmarks/README.md) for per-language results, ablations, and methodology.
diff --git a/assets/images/speed_vs_ndcg_cold.png b/assets/images/speed_vs_ndcg_cold.png
index 739fba3..2056b96 100644
Binary files a/assets/images/speed_vs_ndcg_cold.png and b/assets/images/speed_vs_ndcg_cold.png differ
diff --git a/assets/images/speed_vs_ndcg_warm.png b/assets/images/speed_vs_ndcg_warm.png
index 5444582..4d97d23 100644
Binary files a/assets/images/speed_vs_ndcg_warm.png and b/assets/images/speed_vs_ndcg_warm.png differ
diff --git a/benchmarks/README.md b/benchmarks/README.md
index 3ba8566..b0011e3 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -7,6 +7,7 @@ Quality and speed benchmarks for `semble`.
- [Ablations](#ablations)
- [Dataset](#dataset)
- [Methods](#methods)
+- [Excluded methods](#excluded-methods)
- [Running the benchmarks](#running-the-benchmarks)
## Main results
@@ -20,6 +21,8 @@ Quality and speed across all methods.
| CodeRankEmbed | 0.765 | 57 s | 16 ms |
| ColGREP | 0.693 | 5.8 s | 124 ms |
| BM25 | 0.673 | 263 ms | 0.02 ms |
+| grepai | 0.561 | 35 s | 48 ms |
+| probe | 0.387 | — | 207 ms |
| ripgrep | 0.126 | — | 12 ms |
@@ -34,28 +37,28 @@ NDCG@10 is averaged across all queries. Speed numbers use one repo per language,
NDCG@10 per language, sorted by CodeRankEmbed Hybrid (CRE in the table). Best score per row is bolded.
-| Language | semble | CRE Hybrid | CRE | ColGREP | ripgrep |
-|---|---:|---:|---:|---:|---:|
-| scala | 0.909 | **0.922** | 0.845 | 0.765 | 0.180 |
-| cpp | **0.915** | 0.913 | 0.846 | 0.626 | 0.126 |
-| ruby | **0.909** | **0.909** | 0.769 | 0.708 | 0.230 |
-| elixir | 0.894 | **0.905** | 0.869 | 0.808 | 0.134 |
-| javascript | 0.917 | 0.903 | **0.920** | 0.823 | 0.176 |
-| zig | **0.913** | 0.901 | 0.807 | 0.474 | 0.000 |
-| csharp | 0.885 | **0.889** | 0.743 | 0.614 | 0.117 |
-| go | **0.895** | 0.884 | 0.676 | 0.785 | 0.133 |
-| python | 0.867 | **0.880** | 0.794 | 0.777 | 0.202 |
-| php | 0.858 | **0.874** | 0.758 | 0.663 | 0.123 |
-| swift | 0.860 | **0.873** | 0.721 | 0.710 | 0.160 |
-| bash | 0.825 | 0.852 | **0.892** | 0.706 | 0.000 |
-| lua | 0.823 | **0.847** | 0.803 | 0.798 | 0.000 |
-| java | **0.849** | 0.841 | 0.706 | 0.641 | 0.198 |
-| kotlin | 0.821 | **0.830** | 0.670 | 0.637 | 0.166 |
-| rust | **0.856** | 0.827 | 0.627 | 0.662 | 0.162 |
-| c | 0.741 | **0.806** | 0.706 | 0.676 | 0.000 |
-| haskell | 0.765 | 0.771 | **0.776** | 0.683 | 0.000 |
-| typescript | 0.706 | **0.708** | 0.545 | 0.430 | 0.128 |
-| **overall** | **0.854** | **0.862** | **0.765** | **0.693** | **0.126** |
+| Language | semble | CRE Hybrid | CRE | ColGREP | grepai | probe | ripgrep |
+|---|---:|---:|---:|---:|---:|---:|---:|
+| scala | 0.909 | **0.922** | 0.845 | 0.765 | 0.330 | 0.392 | 0.180 |
+| cpp | **0.915** | 0.913 | 0.846 | 0.626 | 0.731 | 0.375 | 0.126 |
+| ruby | **0.909** | **0.909** | 0.769 | 0.708 | 0.643 | 0.382 | 0.230 |
+| elixir | 0.894 | **0.905** | 0.869 | 0.808 | 0.669 | 0.412 | 0.134 |
+| javascript | 0.917 | 0.903 | **0.920** | 0.823 | 0.675 | 0.588 | 0.176 |
+| zig | **0.913** | 0.901 | 0.807 | 0.474 | 0.755 | 0.369 | 0.000 |
+| csharp | 0.885 | **0.889** | 0.743 | 0.614 | 0.277 | 0.392 | 0.117 |
+| go | **0.895** | 0.884 | 0.676 | 0.785 | 0.722 | 0.410 | 0.133 |
+| python | 0.867 | **0.880** | 0.794 | 0.777 | 0.634 | 0.488 | 0.202 |
+| php | 0.858 | **0.874** | 0.758 | 0.663 | 0.402 | 0.340 | 0.123 |
+| swift | 0.860 | **0.873** | 0.721 | 0.710 | 0.429 | 0.280 | 0.160 |
+| bash | 0.825 | 0.852 | **0.892** | 0.706 | 0.723 | 0.226 | 0.000 |
+| lua | 0.823 | **0.847** | 0.803 | 0.798 | 0.699 | 0.336 | 0.000 |
+| java | **0.849** | 0.841 | 0.706 | 0.641 | 0.386 | 0.536 | 0.198 |
+| kotlin | 0.821 | **0.830** | 0.670 | 0.637 | 0.478 | 0.335 | 0.166 |
+| rust | **0.856** | 0.827 | 0.627 | 0.662 | 0.519 | 0.242 | 0.162 |
+| c | 0.741 | **0.806** | 0.706 | 0.676 | 0.555 | 0.384 | 0.000 |
+| haskell | 0.765 | 0.771 | **0.776** | 0.683 | 0.483 | 0.313 | 0.000 |
+| typescript | 0.706 | **0.708** | 0.545 | 0.430 | 0.394 | 0.354 | 0.128 |
+| **overall** | **0.854** | **0.862** | **0.765** | **0.693** | **0.561** | **0.387** | **0.126** |
## Ablations
@@ -102,10 +105,19 @@ NDCG@10 per language, sorted by CodeRankEmbed Hybrid (CRE in the table). Best sc
## Methods
- **[ripgrep](https://github.com/BurntSushi/ripgrep)**: fast regex search over files, included as a raw keyword-match baseline.
+- **[probe](https://github.com/buger/probe)**: BM25 keyword ranking backed by tree-sitter parse trees. No persistent index; scans on the fly.
- **[ColGREP](https://github.com/lightonai/next-plaid/tree/main/colgrep)**: late-interaction code retrieval built on next-plaid with the [LateOn-Code-edge](https://huggingface.co/lightonai/LateOn-Code-edge) model.
+- **[grepai](https://github.com/nicholasgasior/grepai)**: semantic search using [nomic-embed-text](https://huggingface.co/nomic-ai/nomic-embed-text-v1) (137M params) via a local Ollama daemon.
- **[CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed)**: 137M-param transformer embedding model for code retrieval. *CodeRankEmbed Hybrid* fuses its dense scores with BM25.
- **[semble](https://github.com/your-repo/semble)**: this library. [potion-code-16M](https://huggingface.co/minishlab/potion-code-16M) static embeddings + BM25 + the semble reranking stack.
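All methods are scored with NDCG@10 over the ranked file lists they return. As a rough reference, here is a minimal sketch of binary-relevance NDCG@k; the canonical implementation lives in `benchmarks/metrics.py` (its signature matches how the baseline scripts call it), and this standalone version is an illustrative assumption that may differ in detail:

```python
import math


def ndcg_at_k(relevant_ranks: list[int], n_relevant: int, k: int) -> float:
    """Binary-relevance NDCG@k; `relevant_ranks` are 1-based positions in the result list."""
    # Discounted gain of every relevant file that made it into the top k.
    dcg = sum(1.0 / math.log2(rank + 1) for rank in relevant_ranks if rank <= k)
    # Ideal ordering: all relevant files ranked first, capped at k.
    ideal = sum(1.0 / math.log2(i + 1) for i in range(1, min(n_relevant, k) + 1))
    return dcg / ideal if ideal else 0.0
```

A single relevant file at rank 1 scores 1.0; the same file outside the top 10 scores 0.0, which is why tools that miss the target file entirely average so low.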
+## Excluded methods
+
+Two tools were considered but not included in the benchmark:
+
+- **[codanna](https://codanna.io)**: symbol-level semantic search with fastembed. Excluded because it does not support Haskell, Bash, Zig, Scala, Elixir, or Ruby: 6 of the 19 benchmark languages, covering 20 of the 63 repos (~38% of tasks).
+- **[claude-context](https://github.com/zilliztech/claude-context)**: retrieval-augmented code search using OpenAI embeddings and a vector database. Excluded because it requires a paid OpenAI API key and a running vector-DB service.
+
## Running the benchmarks
Repos are pinned in `repos.json` and cloned into `~/.cache/semble-bench`:
@@ -152,6 +164,42 @@ uv run python -m benchmarks.baselines.ablations --mode semble-semantic
+
+probe
+
+Needs `probe` on `$PATH` (`npm install -g @buger/probe`).
+
+```bash
+uv run python -m benchmarks.baselines.probe
+uv run python -m benchmarks.baselines.probe --repo fastapi --repo axios
+```
+
+
+
+grepai
+
+Needs `grepai` on `$PATH` and Ollama running with `nomic-embed-text` pulled:
+
+```bash
+ollama pull nomic-embed-text
+```
+
+```bash
+uv run python -m benchmarks.baselines.grepai
+uv run python -m benchmarks.baselines.grepai --repo fastapi --repo axios
+```
+
+Large repos take several minutes to index. Use `--timeout` (default: 120 seconds) for repos with many files:
+
+```bash
+uv run python -m benchmarks.baselines.grepai --timeout 1800 --output results.json
+```
+
+The `--output` flag enables resume mode: already-completed repos are skipped on restart.
+
+
+
ripgrep
diff --git a/benchmarks/baselines/grepai.py b/benchmarks/baselines/grepai.py
new file mode 100644
index 0000000..0eef5be
--- /dev/null
+++ b/benchmarks/baselines/grepai.py
@@ -0,0 +1,331 @@
+import argparse
+import json
+import os
+import shutil
+import signal
+import subprocess
+import sys
+import tempfile
+import time
+from dataclasses import dataclass
+from pathlib import Path
+
+from benchmarks.data import (
+ RepoSpec,
+ Task,
+ apply_task_filters,
+ available_repo_specs,
+ grouped_tasks,
+ load_tasks,
+ save_results,
+)
+from benchmarks.metrics import file_rank, ndcg_at_k
+
+_GREPAI = "grepai"
+_TOP_K = 10
+_LATENCY_RUNS = 1 # Ollama embedding calls are slow; single run is sufficient
+_SEARCH_TIMEOUT = 60
+_WATCH_READY_TIMEOUT = 120 # overridden by --timeout
+
+
+@dataclass(frozen=True)
+class RepoResult:
+ """Per-repo benchmark result."""
+
+ repo: str
+ language: str
+ ndcg10: float
+ p50_ms: float
+ index_ms: float
+
+
+def _cleanup_index(benchmark_dir: Path) -> None:
+ d = benchmark_dir / ".grepai"
+ if d.exists():
+ shutil.rmtree(d, ignore_errors=True)
+
+
+def _build_index(benchmark_dir: Path, *, watch_ready_timeout: int = _WATCH_READY_TIMEOUT) -> tuple[bool, float]:
+ """Init and index a repo with grepai; return (success, elapsed_ms)."""
+ _cleanup_index(benchmark_dir)
+
+ init_proc = subprocess.run(
+ [_GREPAI, "init", "--provider", "ollama", "--yes"],
+ capture_output=True,
+ text=True,
+ cwd=benchmark_dir,
+ timeout=30,
+ )
+ if init_proc.returncode != 0:
+ print(f" WARNING: grepai init failed: {init_proc.stderr.strip()}", file=sys.stderr)
+ return False, 0.0
+
+ # grepai writes progress bars with \r (no \n), so readline() blocks forever.
+ # Write stdout to a temp file and poll for the sentinel string instead.
+ # "Initial scan complete" appears after file scanning but BEFORE embeddings
+ # finish. Wait for 3 s of output silence after that sentinel to ensure all
+ # embeddings have been flushed to disk before killing watch.
+ started = time.perf_counter()
+ watch_proc: subprocess.Popen[bytes] | None = None
+    with tempfile.TemporaryFile() as log_f:
+        watch_proc = subprocess.Popen(
+            [_GREPAI, "watch"],
+            stdout=log_f,
+            stderr=subprocess.STDOUT,
+            cwd=benchmark_dir,
+            start_new_session=True,  # Own process group so killpg doesn't hit us
+        )
+        try:
+            deadline = time.perf_counter() + watch_ready_timeout
+            scan_complete = False
+            last_size = 0
+            idle_since: float | None = None
+            _IDLE_SETTLE = 3.0  # seconds of silence after scan_complete → embeddings done
+
+            while time.perf_counter() < deadline:
+                time.sleep(0.3)
+                log_f.seek(0)
+                content = log_f.read()
+                if not scan_complete and b"Initial scan complete" in content:
+                    scan_complete = True
+                    idle_since = time.perf_counter()
+                if scan_complete:
+                    if len(content) != last_size:
+                        idle_since = time.perf_counter()
+                        last_size = len(content)
+                    elif idle_since is not None and (time.perf_counter() - idle_since) >= _IDLE_SETTLE:
+                        return True, (time.perf_counter() - started) * 1000
+                if watch_proc.poll() is not None:
+                    if scan_complete:
+                        return True, (time.perf_counter() - started) * 1000
+                    break
+            print(
+                f"  WARNING: grepai watch exited early or timed out after {watch_ready_timeout}s",
+                file=sys.stderr,
+            )
+            return False, (time.perf_counter() - started) * 1000
+        finally:
+            # The try/finally must nest inside the `with` so log_f stays open while we poll it.
+            try:
+                os.killpg(os.getpgid(watch_proc.pid), signal.SIGTERM)
+            except (ProcessLookupError, PermissionError):
+                pass
+            try:
+                watch_proc.wait(timeout=5)
+            except subprocess.TimeoutExpired:
+                watch_proc.kill()
+                watch_proc.wait()
+
+
+def _run_search(query: str, benchmark_dir: Path, *, top_k: int) -> list[str]:
+ """Return absolute file paths from grepai JSON search output."""
+ cmd = [_GREPAI, "search", query, "--json", "-n", str(top_k)]
+ try:
+ proc = subprocess.run(
+ cmd,
+ capture_output=True,
+ text=True,
+ timeout=_SEARCH_TIMEOUT,
+ cwd=benchmark_dir,
+ )
+ except subprocess.TimeoutExpired:
+ return []
+ if proc.returncode != 0:
+ return []
+ try:
+ items = json.loads(proc.stdout)
+ except json.JSONDecodeError:
+ return []
+ # grepai returns relative paths; make them absolute.
+ seen: dict[str, None] = {}
+ for item in items:
+ rel = item.get("file_path", "")
+ if rel:
+ abs_path = str((benchmark_dir / rel).resolve())
+ seen[abs_path] = None
+ return list(seen)[:top_k]
+
+
+def _evaluate_repo(
+ tasks: list[Task],
+ benchmark_dir: Path,
+ *,
+ verbose: bool = False,
+) -> tuple[float, float]:
+ """Return (mean ndcg@10, p50 latency ms) for a list of tasks."""
+ ndcg10_sum = 0.0
+ latencies: list[float] = []
+
+ for task in tasks:
+ query_latencies: list[float] = []
+ file_paths: list[str] = []
+ for _ in range(_LATENCY_RUNS):
+ started = time.perf_counter()
+ file_paths = _run_search(task.query, benchmark_dir, top_k=_TOP_K)
+ query_latencies.append((time.perf_counter() - started) * 1000)
+ latencies.append(sorted(query_latencies)[_LATENCY_RUNS // 2])
+
+ relevant_ranks = [rank for t in task.all_relevant if (rank := file_rank(file_paths, t.path)) is not None]
+ q_ndcg10 = ndcg_at_k(relevant_ranks, len(task.all_relevant), _TOP_K)
+ ndcg10_sum += q_ndcg10
+
+ if verbose:
+ print(
+ f" ndcg@10={q_ndcg10:.3f} ranks={relevant_ranks} n_rel={len(task.all_relevant)} q={task.query!r}",
+ file=sys.stderr,
+ )
+ print(f" targets: {', '.join(t.path for t in task.all_relevant)}", file=sys.stderr)
+ print(f" top-5: {[Path(fp).name for fp in file_paths[:5]]}", file=sys.stderr)
+
+ latencies.sort()
+ return ndcg10_sum / len(tasks), latencies[len(latencies) // 2]
+
+
+def _run_repo(
+ spec: RepoSpec,
+ tasks: list[Task],
+ *,
+ verbose: bool,
+ watch_ready_timeout: int = _WATCH_READY_TIMEOUT,
+) -> RepoResult | None:
+ """Index, evaluate, and clean up a single repo."""
+ benchmark_dir = spec.benchmark_dir
+ ok, index_ms = _build_index(benchmark_dir, watch_ready_timeout=watch_ready_timeout)
+ if not ok:
+ print(f" SKIP: {spec.name} — grepai indexing failed", file=sys.stderr)
+ return None
+
+ try:
+ ndcg10, p50_ms = _evaluate_repo(tasks, benchmark_dir, verbose=verbose)
+ finally:
+ _cleanup_index(benchmark_dir)
+
+ return RepoResult(repo=spec.name, language=spec.language, ndcg10=ndcg10, p50_ms=p50_ms, index_ms=index_ms)
+
+
+def _build_summary(results: list[RepoResult]) -> dict:
+ avg_ndcg10 = sum(r.ndcg10 for r in results) / len(results)
+ avg_p50 = sum(r.p50_ms for r in results) / len(results)
+ avg_index = sum(r.index_ms for r in results) / len(results)
+ return {
+ "tool": "grepai",
+ "note": "nomic-embed-text via Ollama (137 M params, ~8× larger than semble's potion-code-16M)",
+ "repos": [
+ {
+ "repo": r.repo,
+ "language": r.language,
+ "ndcg10": round(r.ndcg10, 4),
+ "p50_ms": round(r.p50_ms, 1),
+ "index_ms": round(r.index_ms, 0),
+ }
+ for r in results
+ ],
+ "avg_ndcg10": round(avg_ndcg10, 4),
+ "avg_p50_ms": round(avg_p50, 1),
+ "avg_index_ms": round(avg_index, 0),
+ }
+
+
+def _write_results(results: list[RepoResult], path: Path) -> None:
+ path.parent.mkdir(parents=True, exist_ok=True)
+ path.write_text(json.dumps(_build_summary(results), indent=2))
+
+
+def _parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Benchmark grepai on the semble benchmark suite.")
+ parser.add_argument("--repo", action="append", default=[], help="Limit to one or more repo names.")
+ parser.add_argument("--language", action="append", default=[], help="Limit to one or more languages.")
+ parser.add_argument("--verbose", action="store_true", help="Print per-query results.")
+ parser.add_argument(
+ "--output",
+ metavar="FILE",
+ help="JSON file to write results to; if it already exists, repos already present are skipped (resume mode).",
+ )
+ parser.add_argument(
+ "--timeout",
+ type=int,
+ default=_WATCH_READY_TIMEOUT,
+ metavar="SECONDS",
+ help=f"Seconds to wait for embeddings to finish (default: {_WATCH_READY_TIMEOUT}). "
+ "Increase for large repos (e.g. --timeout 1800).",
+ )
+ return parser.parse_args()
+
+
+def _load_existing(output_path: Path | None) -> dict[str, dict]:
+ """Load already-completed repos from a prior run's output file."""
+ if output_path is None or not output_path.exists():
+ return {}
+ try:
+ existing_data = json.loads(output_path.read_text())
+ existing = {r["repo"]: r for r in existing_data.get("repos", [])}
+ print(f"Resuming: {len(existing)} repos already done, will skip them.", file=sys.stderr)
+ return existing
+ except (json.JSONDecodeError, KeyError):
+ return {}
+
+
+def main() -> None:
+ """Run the grepai baseline benchmark."""
+ args = _parse_args()
+ repo_specs = available_repo_specs()
+ tasks = apply_task_filters(
+ load_tasks(repo_specs=repo_specs), repos=args.repo or None, languages=args.language or None
+ )
+
+ output_path = Path(args.output) if args.output else None
+ existing = _load_existing(output_path)
+
+ print("grepai (ollama/nomic-embed-text, 137M params)", file=sys.stderr)
+ print(f"{'Repo':<22} {'Language':<12} {'Index':>9} {'NDCG@10':>8} {'p50':>8}", file=sys.stderr)
+ print(f"{'-' * 22} {'-' * 12} {'-' * 9} {'-' * 8} {'-' * 8}", file=sys.stderr)
+
+ results: list[RepoResult] = []
+ for repo, repo_task_list in sorted(grouped_tasks(tasks).items()):
+ spec = repo_specs[repo]
+ if repo in existing:
+ r = existing[repo]
+ results.append(
+ RepoResult(
+ repo=r["repo"],
+ language=r["language"],
+ ndcg10=r["ndcg10"],
+ p50_ms=r["p50_ms"],
+ index_ms=r["index_ms"],
+ )
+ )
+ print(f"{repo:<22} {'(skipped — already done)':<12}", file=sys.stderr)
+ continue
+ if args.verbose:
+ print(f"\n--- {repo} ---", file=sys.stderr)
+ result = _run_repo(spec, repo_task_list, verbose=args.verbose, watch_ready_timeout=args.timeout)
+ if result is None:
+ continue
+        results.append(result)
+        if output_path:
+            # Checkpoint after each repo so resume mode can pick up after a crash.
+            _write_results(results, output_path)
+ print(
+ f"{repo:<22} {spec.language:<12} {result.index_ms:>8.0f}ms {result.ndcg10:>8.3f} {result.p50_ms:>7.1f}ms",
+ file=sys.stderr,
+ )
+
+    if not results:
+        return
+
+ avg_ndcg10 = sum(r.ndcg10 for r in results) / len(results)
+ avg_p50 = sum(r.p50_ms for r in results) / len(results)
+ avg_index = sum(r.index_ms for r in results) / len(results)
+ print(f"{'-' * 22} {'-' * 12} {'-' * 9} {'-' * 8} {'-' * 8}", file=sys.stderr)
+ avg_label = f"Average ({len(results)})"
+ print(
+ f"{avg_label:<22} {'':<12} {avg_index:>8.0f}ms {avg_ndcg10:>8.3f} {avg_p50:>7.1f}ms",
+ file=sys.stderr,
+ )
+
+ summary = _build_summary(results)
+ if output_path:
+ _write_results(results, output_path)
+ else:
+ save_results("grepai", summary)
+ print(json.dumps(summary, indent=2))
+
+
+if __name__ == "__main__":
+ main()
diff --git a/benchmarks/baselines/probe.py b/benchmarks/baselines/probe.py
new file mode 100644
index 0000000..6f7699f
--- /dev/null
+++ b/benchmarks/baselines/probe.py
@@ -0,0 +1,160 @@
+import argparse
+import json
+import subprocess
+import sys
+import time
+from dataclasses import dataclass
+from pathlib import Path
+
+from benchmarks.data import (
+ Task,
+ apply_task_filters,
+ available_repo_specs,
+ grouped_tasks,
+ load_tasks,
+ save_results,
+)
+from benchmarks.metrics import file_rank, ndcg_at_k
+
+_TOP_K = 10
+_LATENCY_RUNS = 3
+
+
+@dataclass(frozen=True)
+class RepoResult:
+ """Per-repo benchmark result."""
+
+ repo: str
+ language: str
+ ndcg10: float
+ p50_ms: float
+
+
+def _run_probe(query: str, benchmark_dir: Path, *, top_k: int, timeout: int = 30) -> list[str]:
+ """Return file paths from probe JSON output, deduplicated and capped at top_k."""
+ cmd = [
+ "probe",
+ "search",
+ query,
+ str(benchmark_dir),
+ "--format",
+ "json",
+ "--max-results",
+ str(top_k * 3), # probe returns chunk-level results; over-fetch and dedup
+ ]
+ try:
+ proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
+ except subprocess.TimeoutExpired:
+ return []
+ if proc.returncode != 0:
+ return []
+ # probe prefixes stdout with non-JSON header lines ("Pattern: ...\nPath: ...\n")
+ # before the JSON object; skip to the first '{'.
+ json_start = proc.stdout.find("{")
+ if json_start < 0:
+ return []
+ try:
+ data = json.loads(proc.stdout[json_start:])
+ except json.JSONDecodeError:
+ return []
+ seen: dict[str, None] = {}
+ for item in data.get("results", []):
+ fp = item.get("file", "")
+ if fp:
+ seen[fp] = None
+ return list(seen)[:top_k]
+
+
+def _evaluate_repo(
+ tasks: list[Task],
+ benchmark_dir: Path,
+ *,
+ verbose: bool = False,
+) -> tuple[float, float]:
+ """Return (mean ndcg@10, p50 latency ms) for a list of tasks."""
+ ndcg10_sum = 0.0
+ latencies: list[float] = []
+
+ for task in tasks:
+ query_latencies: list[float] = []
+ file_paths: list[str] = []
+ for _ in range(_LATENCY_RUNS):
+ started = time.perf_counter()
+ file_paths = _run_probe(task.query, benchmark_dir, top_k=_TOP_K)
+ query_latencies.append((time.perf_counter() - started) * 1000)
+ latencies.append(sorted(query_latencies)[_LATENCY_RUNS // 2])
+
+ relevant_ranks = [rank for t in task.all_relevant if (rank := file_rank(file_paths, t.path)) is not None]
+ q_ndcg10 = ndcg_at_k(relevant_ranks, len(task.all_relevant), _TOP_K)
+ ndcg10_sum += q_ndcg10
+
+ if verbose:
+ print(
+ f" ndcg@10={q_ndcg10:.3f} ranks={relevant_ranks} n_rel={len(task.all_relevant)} q={task.query!r}",
+ file=sys.stderr,
+ )
+ print(f" targets: {', '.join(t.path for t in task.all_relevant)}", file=sys.stderr)
+ print(f" top-5: {[Path(fp).name for fp in file_paths[:5]]}", file=sys.stderr)
+
+ latencies.sort()
+ return ndcg10_sum / len(tasks), latencies[len(latencies) // 2]
+
+
+def _parse_args() -> argparse.Namespace:
+ parser = argparse.ArgumentParser(description="Benchmark probe on the semble benchmark suite.")
+ parser.add_argument("--repo", action="append", default=[], help="Limit to one or more repo names.")
+ parser.add_argument("--language", action="append", default=[], help="Limit to one or more languages.")
+ parser.add_argument("--verbose", action="store_true", help="Print per-query results.")
+ return parser.parse_args()
+
+
+def main() -> None:
+ """Run the probe baseline benchmark."""
+ args = _parse_args()
+ repo_specs = available_repo_specs()
+ tasks = apply_task_filters(
+ load_tasks(repo_specs=repo_specs), repos=args.repo or None, languages=args.language or None
+ )
+
+ print("probe (bm25, tree-sitter)", file=sys.stderr)
+ print("NOTE: probe uses keyword ranking; natural-language queries disadvantage it.", file=sys.stderr)
+ print(f"{'Repo':<22} {'Language':<12} {'NDCG@10':>8} {'p50':>8}", file=sys.stderr)
+ print(f"{'-' * 22} {'-' * 12} {'-' * 8} {'-' * 8}", file=sys.stderr)
+
+ results: list[RepoResult] = []
+ for repo, repo_task_list in sorted(grouped_tasks(tasks).items()):
+ spec = repo_specs[repo]
+ if args.verbose:
+ print(f"\n--- {repo} ---", file=sys.stderr)
+ ndcg10, p50_ms = _evaluate_repo(repo_task_list, spec.benchmark_dir, verbose=args.verbose)
+ results.append(RepoResult(repo=repo, language=spec.language, ndcg10=ndcg10, p50_ms=p50_ms))
+ print(f"{repo:<22} {spec.language:<12} {ndcg10:>8.3f} {p50_ms:>7.1f}ms", file=sys.stderr)
+
+ if not results:
+ return
+
+ avg_ndcg10 = sum(r.ndcg10 for r in results) / len(results)
+ avg_p50 = sum(r.p50_ms for r in results) / len(results)
+ print(f"{'-' * 22} {'-' * 12} {'-' * 8} {'-' * 8}", file=sys.stderr)
+ avg_label = f"Average ({len(results)})"
+ print(
+ f"{avg_label:<22} {'':<12} {avg_ndcg10:>8.3f} {avg_p50:>7.1f}ms",
+ file=sys.stderr,
+ )
+
+ summary = {
+ "tool": "probe",
+ "note": "BM25 + tree-sitter; no embedding model, no persistent index; natural-language queries disadvantage it",
+ "repos": [
+ {"repo": r.repo, "language": r.language, "ndcg10": round(r.ndcg10, 4), "p50_ms": round(r.p50_ms, 1)}
+ for r in results
+ ],
+ "avg_ndcg10": round(avg_ndcg10, 4),
+ "avg_p50_ms": round(avg_p50, 1),
+ }
+ save_results("probe", summary)
+ print(json.dumps(summary, indent=2))
+
+
+if __name__ == "__main__":
+ main()
diff --git a/benchmarks/plot.py b/benchmarks/plot.py
index db37f55..517f627 100644
--- a/benchmarks/plot.py
+++ b/benchmarks/plot.py
@@ -24,11 +24,19 @@ class _Method(TypedDict):
{
"name": "ripgrep",
"ndcg10": 0.126,
- "index_ms": 0.0,
+ "index_ms": 0.0, # no persistent index; scans on the fly
"query_p50_ms": 12.08,
"color": "#606060",
"params_m": 0,
},
+ {
+ "name": "probe",
+ "ndcg10": 0.387,
+ "index_ms": 0.0, # no persistent index; scans on the fly
+ "query_p50_ms": 207.1,
+ "color": "#9b7bb0",
+ "params_m": 0,
+ },
{
"name": "BM25",
"ndcg10": 0.673,
@@ -45,6 +53,14 @@ class _Method(TypedDict):
"color": "#e8a838",
"params_m": 16,
},
+ {
+ "name": "grepai",
+ "ndcg10": 0.561,
+ "index_ms": 34955.0,
+ "query_p50_ms": 47.7,
+ "color": "#c0724a",
+ "params_m": 137,
+ },
{
"name": "CodeRankEmbed",
"ndcg10": 0.7648,
@@ -170,7 +186,17 @@ def _make_plot(out_path: Path, *, warm: bool = False) -> None:
)
x_label = (x ** (1 / 3) + cbrt_label_delta) ** 3
- ax.text(x_label, y, m["name"], fontsize=8.5, color=m["color"], ha="left", va="center", zorder=4)
+ ax.text(
+ x_label,
+ y,
+ m["name"],
+ fontsize=8.5,
+ fontweight="bold" if m["name"] == "semble" else "normal",
+ color=m["color"],
+ ha="left",
+ va="center",
+ zorder=4,
+ )
ax.set_xscale("function", functions=(_cbrt_forward, _cbrt_inverse))
ax.set_ylabel("NDCG@10", fontsize=10, color="#444444")
diff --git a/benchmarks/results/grepai-715563a812c3.json b/benchmarks/results/grepai-715563a812c3.json
new file mode 100644
index 0000000..b65f4a7
--- /dev/null
+++ b/benchmarks/results/grepai-715563a812c3.json
@@ -0,0 +1,449 @@
+{
+ "tool": "grepai",
+ "repos": [
+ {
+ "repo": "abseil-cpp",
+ "language": "cpp",
+ "ndcg10": 0.5955,
+ "p50_ms": 147.9,
+ "index_ms": 226627.0
+ },
+ {
+ "repo": "aeson",
+ "language": "haskell",
+ "ndcg10": 0.6627,
+ "p50_ms": 30.2,
+ "index_ms": 10019.0
+ },
+ {
+ "repo": "aiohttp",
+ "language": "python",
+ "ndcg10": 0.6469,
+ "p50_ms": 35.9,
+ "index_ms": 15180.0
+ },
+ {
+ "repo": "alamofire",
+ "language": "swift",
+ "ndcg10": 0.5664,
+ "p50_ms": 35.9,
+ "index_ms": 11261.0
+ },
+ {
+ "repo": "axios",
+ "language": "javascript",
+ "ndcg10": 0.4167,
+ "p50_ms": 26.4,
+ "index_ms": 3941.0
+ },
+ {
+ "repo": "axum",
+ "language": "rust",
+ "ndcg10": 0.5127,
+ "p50_ms": 34.1,
+ "index_ms": 14836.0
+ },
+ {
+ "repo": "bash-it",
+ "language": "bash",
+ "ndcg10": 0.7448,
+ "p50_ms": 47.0,
+ "index_ms": 21867.0
+ },
+ {
+ "repo": "bats-core",
+ "language": "bash",
+ "ndcg10": 0.425,
+ "p50_ms": 24.2,
+ "index_ms": 1518.0
+ },
+ {
+ "repo": "cats",
+ "language": "scala",
+ "ndcg10": 0.2283,
+ "p50_ms": 48.4,
+ "index_ms": 31562.0
+ },
+ {
+ "repo": "chi",
+ "language": "go",
+ "ndcg10": 0.7807,
+ "p50_ms": 33.3,
+ "index_ms": 8811.0
+ },
+ {
+ "repo": "circe",
+ "language": "scala",
+ "ndcg10": 0.4538,
+ "p50_ms": 26.7,
+ "index_ms": 4250.0
+ },
+ {
+ "repo": "click",
+ "language": "python",
+ "ndcg10": 0.9217,
+ "p50_ms": 28.6,
+ "index_ms": 6980.0
+ },
+ {
+ "repo": "cobra",
+ "language": "go",
+ "ndcg10": 0.7778,
+ "p50_ms": 32.7,
+ "index_ms": 12440.0
+ },
+ {
+ "repo": "commons-lang",
+ "language": "java",
+ "ndcg10": 0.4406,
+ "p50_ms": 74.7,
+ "index_ms": 64942.0
+ },
+ {
+ "repo": "curl",
+ "language": "c",
+ "ndcg10": 0.5632,
+ "p50_ms": 91.8,
+ "index_ms": 114986.0
+ },
+ {
+ "repo": "dapper",
+ "language": "csharp",
+ "ndcg10": 0.3767,
+ "p50_ms": 32.1,
+ "index_ms": 8783.0
+ },
+ {
+ "repo": "ecto",
+ "language": "elixir",
+ "ndcg10": 0.7364,
+ "p50_ms": 39.2,
+ "index_ms": 22146.0
+ },
+ {
+ "repo": "exposed",
+ "language": "kotlin",
+ "ndcg10": 0.4851,
+ "p50_ms": 38.1,
+ "index_ms": 15134.0
+ },
+ {
+ "repo": "express",
+ "language": "javascript",
+ "ndcg10": 0.9622,
+ "p50_ms": 23.5,
+ "index_ms": 1519.0
+ },
+ {
+ "repo": "fastapi",
+ "language": "python",
+ "ndcg10": 0.4914,
+ "p50_ms": 33.7,
+ "index_ms": 11810.0
+ },
+ {
+ "repo": "flask",
+ "language": "python",
+ "ndcg10": 0.6361,
+ "p50_ms": 30.0,
+ "index_ms": 6681.0
+ },
+ {
+ "repo": "fmtlib",
+ "language": "cpp",
+ "ndcg10": 0.8105,
+ "p50_ms": 30.5,
+ "index_ms": 14265.0
+ },
+ {
+ "repo": "gin",
+ "language": "go",
+ "ndcg10": 0.607,
+ "p50_ms": 38.0,
+ "index_ms": 20015.0
+ },
+ {
+ "repo": "gson",
+ "language": "java",
+ "ndcg10": 0.5272,
+ "p50_ms": 52.1,
+ "index_ms": 30942.0
+ },
+ {
+ "repo": "guzzle",
+ "language": "php",
+ "ndcg10": 0.5859,
+ "p50_ms": 27.7,
+ "index_ms": 4546.0
+ },
+ {
+ "repo": "http4s",
+ "language": "scala",
+ "ndcg10": 0.3079,
+ "p50_ms": 46.0,
+ "index_ms": 24269.0
+ },
+ {
+ "repo": "httpx",
+ "language": "python",
+ "ndcg10": 0.6149,
+ "p50_ms": 27.1,
+ "index_ms": 5459.0
+ },
+ {
+ "repo": "jackson-databind",
+ "language": "java",
+ "ndcg10": 0.1903,
+ "p50_ms": 98.4,
+ "index_ms": 92140.0
+ },
+ {
+ "repo": "kotlinx-coroutines",
+ "language": "kotlin",
+ "ndcg10": 0.4977,
+ "p50_ms": 43.8,
+ "index_ms": 19105.0
+ },
+ {
+ "repo": "ktor",
+ "language": "kotlin",
+ "ndcg10": 0.4514,
+ "p50_ms": 31.5,
+ "index_ms": 9397.0
+ },
+ {
+ "repo": "laravel-framework",
+ "language": "php",
+ "ndcg10": 0.297,
+ "p50_ms": 134.1,
+ "index_ms": 147392.0
+ },
+ {
+ "repo": "lazy.nvim",
+ "language": "lua",
+ "ndcg10": 0.5555,
+ "p50_ms": 32.8,
+ "index_ms": 12146.0
+ },
+ {
+ "repo": "libuv",
+ "language": "c",
+ "ndcg10": 0.502,
+ "p50_ms": 49.3,
+ "index_ms": 38233.0
+ },
+ {
+ "repo": "messagepack-csharp",
+ "language": "csharp",
+ "ndcg10": 0.3051,
+ "p50_ms": 45.2,
+ "index_ms": 23631.0
+ },
+ {
+ "repo": "mini.nvim",
+ "language": "lua",
+ "ndcg10": 1.0,
+ "p50_ms": 61.9,
+ "index_ms": 65813.0
+ },
+ {
+ "repo": "model2vec",
+ "language": "python",
+ "ndcg10": 0.4896,
+ "p50_ms": 28.7,
+ "index_ms": 6061.0
+ },
+ {
+ "repo": "monolog",
+ "language": "php",
+ "ndcg10": 0.3217,
+ "p50_ms": 33.9,
+ "index_ms": 12425.0
+ },
+ {
+ "repo": "newtonsoft-json",
+ "language": "csharp",
+ "ndcg10": 0.1495,
+ "p50_ms": 63.5,
+ "index_ms": 44247.0
+ },
+ {
+ "repo": "nlohmann-json",
+ "language": "cpp",
+ "ndcg10": 0.7863,
+ "p50_ms": 41.2,
+ "index_ms": 23650.0
+ },
+ {
+ "repo": "nvm",
+ "language": "bash",
+ "ndcg10": 1.0,
+ "p50_ms": 35.4,
+ "index_ms": 13357.0
+ },
+ {
+ "repo": "pandoc",
+ "language": "haskell",
+ "ndcg10": 0.1382,
+ "p50_ms": 72.9,
+ "index_ms": 66106.0
+ },
+ {
+ "repo": "phoenix",
+ "language": "elixir",
+ "ndcg10": 0.6589,
+ "p50_ms": 34.7,
+ "index_ms": 18193.0
+ },
+ {
+ "repo": "plug",
+ "language": "elixir",
+ "ndcg10": 0.6127,
+ "p50_ms": 34.5,
+ "index_ms": 10328.0
+ },
+ {
+ "repo": "pydantic",
+ "language": "python",
+ "ndcg10": 0.4918,
+ "p50_ms": 51.7,
+ "index_ms": 36380.0
+ },
+ {
+ "repo": "rack",
+ "language": "ruby",
+ "ndcg10": 0.5663,
+ "p50_ms": 30.4,
+ "index_ms": 9106.0
+ },
+ {
+ "repo": "rails",
+ "language": "ruby",
+ "ndcg10": 0.5675,
+ "p50_ms": 38.4,
+ "index_ms": 15155.0
+ },
+ {
+ "repo": "redis",
+ "language": "c",
+ "ndcg10": 0.5988,
+ "p50_ms": 124.7,
+ "index_ms": 167157.0
+ },
+ {
+ "repo": "redux",
+ "language": "javascript",
+ "ndcg10": 0.645,
+ "p50_ms": 22.8,
+ "index_ms": 4560.0
+ },
+ {
+ "repo": "requests",
+ "language": "python",
+ "ndcg10": 0.7508,
+ "p50_ms": 27.8,
+ "index_ms": 6695.0
+ },
+ {
+ "repo": "serde",
+ "language": "rust",
+ "ndcg10": 0.5056,
+ "p50_ms": 49.4,
+ "index_ms": 30975.0
+ },
+ {
+ "repo": "sinatra",
+ "language": "ruby",
+ "ndcg10": 0.7964,
+ "p50_ms": 26.1,
+ "index_ms": 4863.0
+ },
+ {
+ "repo": "snapkit",
+ "language": "swift",
+ "ndcg10": 0.4189,
+ "p50_ms": 26.0,
+ "index_ms": 5456.0
+ },
+ {
+ "repo": "starlette",
+ "language": "python",
+ "ndcg10": 0.6606,
+ "p50_ms": 30.5,
+ "index_ms": 7899.0
+ },
+ {
+ "repo": "telescope.nvim",
+ "language": "lua",
+ "ndcg10": 0.5419,
+ "p50_ms": 37.3,
+ "index_ms": 16376.0
+ },
+ {
+ "repo": "tokio",
+ "language": "rust",
+ "ndcg10": 0.5391,
+ "p50_ms": 69.9,
+ "index_ms": 76184.0
+ },
+ {
+ "repo": "trpc",
+ "language": "typescript",
+ "ndcg10": 0.4809,
+ "p50_ms": 34.8,
+ "index_ms": 10920.0
+ },
+ {
+ "repo": "vapor",
+ "language": "swift",
+ "ndcg10": 0.3022,
+ "p50_ms": 43.7,
+ "index_ms": 19099.0
+ },
+ {
+ "repo": "vitest",
+ "language": "typescript",
+ "ndcg10": 0.4334,
+ "p50_ms": 46.1,
+ "index_ms": 26371.0
+ },
+ {
+ "repo": "xmonad",
+ "language": "haskell",
+ "ndcg10": 0.6487,
+ "p50_ms": 32.2,
+ "index_ms": 6080.0
+ },
+ {
+ "repo": "zig",
+ "language": "zig",
+ "ndcg10": 0.7124,
+ "p50_ms": 199.4,
+ "index_ms": 350606.0
+ },
+ {
+ "repo": "zig-clap",
+ "language": "zig",
+ "ndcg10": 0.8083,
+ "p50_ms": 33.5,
+ "index_ms": 6076.0
+ },
+ {
+ "repo": "zls",
+ "language": "zig",
+ "ndcg10": 0.7444,
+ "p50_ms": 47.3,
+ "index_ms": 32747.0
+ },
+ {
+ "repo": "zod",
+ "language": "typescript",
+ "ndcg10": 0.2684,
+ "p50_ms": 56.6,
+ "index_ms": 52460.0
+ }
+ ],
+ "avg_ndcg10": 0.5606,
+ "avg_p50_ms": 47.7,
+ "avg_index_ms": 34955.0
+}
diff --git a/benchmarks/results/probe-715563a812c3.json b/benchmarks/results/probe-715563a812c3.json
new file mode 100644
index 0000000..f9b725b
--- /dev/null
+++ b/benchmarks/results/probe-715563a812c3.json
@@ -0,0 +1,385 @@
+{
+ "tool": "probe-bm25",
+ "repos": [
+ {
+ "repo": "abseil-cpp",
+ "language": "cpp",
+ "ndcg10": 0.1244,
+ "p50_ms": 111.9
+ },
+ {
+ "repo": "aeson",
+ "language": "haskell",
+ "ndcg10": 0.2037,
+ "p50_ms": 139.1
+ },
+ {
+ "repo": "aiohttp",
+ "language": "python",
+ "ndcg10": 0.4316,
+ "p50_ms": 140.9
+ },
+ {
+ "repo": "alamofire",
+ "language": "swift",
+ "ndcg10": 0.3013,
+ "p50_ms": 141.0
+ },
+ {
+ "repo": "axios",
+ "language": "javascript",
+ "ndcg10": 0.4722,
+ "p50_ms": 90.2
+ },
+ {
+ "repo": "axum",
+ "language": "rust",
+ "ndcg10": 0.28,
+ "p50_ms": 793.9
+ },
+ {
+ "repo": "bash-it",
+ "language": "bash",
+ "ndcg10": 0.1601,
+ "p50_ms": 86.4
+ },
+ {
+ "repo": "bats-core",
+ "language": "bash",
+ "ndcg10": 0.4114,
+ "p50_ms": 69.1
+ },
+ {
+ "repo": "cats",
+ "language": "scala",
+ "ndcg10": 0.3496,
+ "p50_ms": 101.0
+ },
+ {
+ "repo": "chi",
+ "language": "go",
+ "ndcg10": 0.28,
+ "p50_ms": 104.8
+ },
+ {
+ "repo": "circe",
+ "language": "scala",
+ "ndcg10": 0.4489,
+ "p50_ms": 101.6
+ },
+ {
+ "repo": "click",
+ "language": "python",
+ "ndcg10": 0.6472,
+ "p50_ms": 126.3
+ },
+ {
+ "repo": "cobra",
+ "language": "go",
+ "ndcg10": 0.532,
+ "p50_ms": 151.7
+ },
+ {
+ "repo": "commons-lang",
+ "language": "java",
+ "ndcg10": 0.5891,
+ "p50_ms": 263.7
+ },
+ {
+ "repo": "curl",
+ "language": "c",
+ "ndcg10": 0.2494,
+ "p50_ms": 412.0
+ },
+ {
+ "repo": "dapper",
+ "language": "csharp",
+ "ndcg10": 0.4014,
+ "p50_ms": 235.2
+ },
+ {
+ "repo": "ecto",
+ "language": "elixir",
+ "ndcg10": 0.3956,
+ "p50_ms": 124.9
+ },
+ {
+ "repo": "exposed",
+ "language": "kotlin",
+ "ndcg10": 0.3478,
+ "p50_ms": 113.6
+ },
+ {
+ "repo": "express",
+ "language": "javascript",
+ "ndcg10": 0.7438,
+ "p50_ms": 72.5
+ },
+ {
+ "repo": "fastapi",
+ "language": "python",
+ "ndcg10": 0.4201,
+ "p50_ms": 152.3
+ },
+ {
+ "repo": "flask",
+ "language": "python",
+ "ndcg10": 0.5163,
+ "p50_ms": 97.9
+ },
+ {
+ "repo": "fmtlib",
+ "language": "cpp",
+ "ndcg10": 0.4674,
+ "p50_ms": 369.8
+ },
+ {
+ "repo": "gin",
+ "language": "go",
+ "ndcg10": 0.4167,
+ "p50_ms": 123.8
+ },
+ {
+ "repo": "gson",
+ "language": "java",
+ "ndcg10": 0.4908,
+ "p50_ms": 127.5
+ },
+ {
+ "repo": "guzzle",
+ "language": "php",
+ "ndcg10": 0.397,
+ "p50_ms": 110.2
+ },
+ {
+ "repo": "http4s",
+ "language": "scala",
+ "ndcg10": 0.3769,
+ "p50_ms": 94.8
+ },
+ {
+ "repo": "httpx",
+ "language": "python",
+ "ndcg10": 0.5374,
+ "p50_ms": 95.7
+ },
+ {
+ "repo": "jackson-databind",
+ "language": "java",
+ "ndcg10": 0.5278,
+ "p50_ms": 217.0
+ },
+ {
+ "repo": "kotlinx-coroutines",
+ "language": "kotlin",
+ "ndcg10": 0.3092,
+ "p50_ms": 111.2
+ },
+ {
+ "repo": "ktor",
+ "language": "kotlin",
+ "ndcg10": 0.3482,
+ "p50_ms": 107.4
+ },
+ {
+ "repo": "laravel-framework",
+ "language": "php",
+ "ndcg10": 0.306,
+ "p50_ms": 188.6
+ },
+ {
+ "repo": "lazy.nvim",
+ "language": "lua",
+ "ndcg10": 0.3382,
+ "p50_ms": 86.8
+ },
+ {
+ "repo": "libuv",
+ "language": "c",
+ "ndcg10": 0.5078,
+ "p50_ms": 213.2
+ },
+ {
+ "repo": "messagepack-csharp",
+ "language": "csharp",
+ "ndcg10": 0.3902,
+ "p50_ms": 258.3
+ },
+ {
+ "repo": "mini.nvim",
+ "language": "lua",
+ "ndcg10": 0.4623,
+ "p50_ms": 269.9
+ },
+ {
+ "repo": "model2vec",
+ "language": "python",
+ "ndcg10": 0.4623,
+ "p50_ms": 88.1
+ },
+ {
+ "repo": "monolog",
+ "language": "php",
+ "ndcg10": 0.3177,
+ "p50_ms": 103.2
+ },
+ {
+ "repo": "newtonsoft-json",
+ "language": "csharp",
+ "ndcg10": 0.3832,
+ "p50_ms": 360.5
+ },
+ {
+ "repo": "nlohmann-json",
+ "language": "cpp",
+ "ndcg10": 0.5336,
+ "p50_ms": 391.6
+ },
+ {
+ "repo": "nvm",
+ "language": "bash",
+ "ndcg10": 0.1067,
+ "p50_ms": 137.5
+ },
+ {
+ "repo": "pandoc",
+ "language": "haskell",
+ "ndcg10": 0.2581,
+ "p50_ms": 162.0
+ },
+ {
+ "repo": "phoenix",
+ "language": "elixir",
+ "ndcg10": 0.3504,
+ "p50_ms": 97.5
+ },
+ {
+ "repo": "plug",
+ "language": "elixir",
+ "ndcg10": 0.4895,
+ "p50_ms": 89.5
+ },
+ {
+ "repo": "pydantic",
+ "language": "python",
+ "ndcg10": 0.3377,
+ "p50_ms": 263.7
+ },
+ {
+ "repo": "rack",
+ "language": "ruby",
+ "ndcg10": 0.3986,
+ "p50_ms": 135.1
+ },
+ {
+ "repo": "rails",
+ "language": "ruby",
+ "ndcg10": 0.2155,
+ "p50_ms": 256.4
+ },
+ {
+ "repo": "redis",
+ "language": "c",
+ "ndcg10": 0.3943,
+ "p50_ms": 1040.8
+ },
+ {
+ "repo": "redux",
+ "language": "javascript",
+ "ndcg10": 0.5492,
+ "p50_ms": 73.3
+ },
+ {
+ "repo": "requests",
+ "language": "python",
+ "ndcg10": 0.5147,
+ "p50_ms": 83.0
+ },
+ {
+ "repo": "serde",
+ "language": "rust",
+ "ndcg10": 0.1772,
+ "p50_ms": 873.0
+ },
+ {
+ "repo": "sinatra",
+ "language": "ruby",
+ "ndcg10": 0.5327,
+ "p50_ms": 68.4
+ },
+ {
+ "repo": "snapkit",
+ "language": "swift",
+ "ndcg10": 0.325,
+ "p50_ms": 67.0
+ },
+ {
+ "repo": "starlette",
+ "language": "python",
+ "ndcg10": 0.5292,
+ "p50_ms": 86.6
+ },
+ {
+ "repo": "telescope.nvim",
+ "language": "lua",
+ "ndcg10": 0.2082,
+ "p50_ms": 113.4
+ },
+ {
+ "repo": "tokio",
+ "language": "rust",
+ "ndcg10": 0.2686,
+ "p50_ms": 1235.1
+ },
+ {
+ "repo": "trpc",
+ "language": "typescript",
+ "ndcg10": 0.342,
+ "p50_ms": 115.1
+ },
+ {
+ "repo": "vapor",
+ "language": "swift",
+ "ndcg10": 0.2145,
+ "p50_ms": 85.4
+ },
+ {
+ "repo": "vitest",
+ "language": "typescript",
+ "ndcg10": 0.373,
+ "p50_ms": 168.5
+ },
+ {
+ "repo": "xmonad",
+ "language": "haskell",
+ "ndcg10": 0.4766,
+ "p50_ms": 90.0
+ },
+ {
+ "repo": "zig",
+ "language": "zig",
+ "ndcg10": 0.2973,
+ "p50_ms": 254.9
+ },
+ {
+ "repo": "zig-clap",
+ "language": "zig",
+ "ndcg10": 0.4813,
+ "p50_ms": 83.2
+ },
+ {
+ "repo": "zls",
+ "language": "zig",
+ "ndcg10": 0.3273,
+ "p50_ms": 125.0
+ },
+ {
+ "repo": "zod",
+ "language": "typescript",
+ "ndcg10": 0.3473,
+ "p50_ms": 396.7
+ }
+ ],
+ "avg_ndcg10": 0.3872,
+ "avg_p50_ms": 207.1
+}