feat: add POST /search endpoint to HTTP server by tradewithmeai · Pull Request #96 · elara-labs/code-context-engine

tradewithmeai · 2026-05-24T22:19:44Z

What this does

Adds a POST /search endpoint to the HTTP server (cce serve --http), exposing the hybrid retrieval pipeline for custom agent integrations.

The gap

cce serve --http currently only exposes /ingest and /health — there's no way to query an indexed project over HTTP. The cce search CLI works, but requires subprocess management and produces human-readable text output rather than structured data.

Custom Python agents (not Claude Code) that want to use CCE for semantic code search currently have two bad options: parse CLI text output, or re-implement the retrieval pipeline themselves.

The change

One file changed (~35 lines). Adds handle_search to ContextEngineHTTP as a thin wrapper around the existing HybridRetriever pipeline — the same path the context_search MCP tool already uses internally.

POST /search
Content-Type: application/json

{"query": "cost tracking", "top_k": 5}

Response:

{
  "results": [
    {
      "id": "b1294739d28245a1",
      "file_path": "memory/journal.py",
      "start_line": 81,
      "end_line": 86,
      "content": "def record_usage(model, provider, input_tokens, ...)",
      "chunk_type": "function",
      "language": "python",
      "confidence_score": 0.878,
      "metadata": {"_distance": 0.774}
    }
  ]
}
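
For anyone wiring this into an agent, a minimal client sketch using only the standard library might look like the following. The base URL and port are assumptions (use whatever address cce serve --http is bound to), and the helper names are illustrative, not part of CCE.

```python
import json
import urllib.request


def build_search_request(base_url, query, top_k=5):
    """Build (but do not send) a POST /search request with a JSON body."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/search",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(base_url, query, top_k=5, timeout=10.0):
    """Send the request and return the list stored under the "results" key."""
    request = build_search_request(base_url, query, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("results", [])
```

Calling search("http://localhost:8000", "cost tracking", top_k=3) against a running server would return the chunk dicts shown in the response above. Token auth is left out here because the header format is not described in this PR.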

Real-world test

Tested live against an indexed Python project on my instance of a Hermes agent, a personal AI assistant that clones repos to a VPS and queries them for code analysis. The query "cost tracking" with top_k=3 returned three correctly ranked functions across two files with confidence scores, matching the results from the cce search CLI.

Background

I've been using CCE inside my Hermes agent to give it semantic search over cloned repos, getting ~93% token reduction compared to reading full files (400 tokens served vs 6,151 full file tokens on a real query). This endpoint makes that integration cleaner and opens the same pattern to any agent framework that speaks HTTP.

No new logic, no new dependencies, no new commands. Just surfaces what's already there.

Exposes the hybrid retrieval pipeline as a single HTTP endpoint, enabling custom Python agents to query CCE without subprocess management. The HTTP server previously only exposed /ingest and /health, with no query surface at all. This adds /search as a thin wrapper around the existing HybridRetriever pipeline (the same path used by the context_search MCP tool).

Accepts: {"query": "...", "top_k": 10, "confidence_threshold": 0.2}
Returns: ranked chunks with file_path, line range, content, confidence_score

rajkumarsakthivel

Clean PR, useful feature. A few things to address before merge:

1. Input validation on top_k and confidence_threshold
int(data.get("top_k", 10)) will raise ValueError on non-numeric input (e.g. "top_k": "abc"). Same for confidence_threshold. The existing handle_vector_search has the same gap, but since this is a new public endpoint, worth adding a try/except or clamping:

try:
    top_k = max(1, min(int(data.get("top_k", 10)), 100))
    confidence_threshold = max(0.0, min(float(data.get("confidence_threshold", 0.2)), 1.0))
except (TypeError, ValueError):
    return web.json_response({"error": "top_k must be int, confidence_threshold must be float"}, status=400)

2. Missing encoding="utf-8" on read_text()
We're adding encoding="utf-8" to all file I/O in #110 (Windows cp1252 crash). Not blocking for this PR but FYI for when it rebases.

3. No savings tracking
The MCP context_search handler records token savings via _record(). This endpoint bypasses that, so queries through /search won't show up in cce savings. Fine for v1, but worth a comment noting the gap.

4. Query length limit
The MCP server caps query length at _MAX_QUERY_CHARS = 10_000. This endpoint has no cap, so a malicious/buggy client could send a multi-MB query string that gets embedded. Consider adding a similar guard.

5. Auth coverage
Confirmed: the _make_auth_middleware already covers all routes, so /search inherits token auth when CCE_API_TOKEN is set. Good.

Otherwise, the implementation is clean. Thin wrapper, no new logic, follows the existing patterns. The PR description is excellent.

Address review feedback on the new endpoint: clamp top_k (1-100) and confidence_threshold (0.0-1.0) and return 400 on non-numeric input instead of raising; cap query length at 10,000 chars to match the MCP server's guard; and note that /search does not record token savings (unlike the MCP context_search handler).
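
The validation described in this commit can be sketched as a pure function, separate from the aiohttp handler. This is a sketch under assumptions, not the code in 26f850b: parse_search_params and the error strings are hypothetical, while the limits (1-100, 0.0-1.0, 10,000 chars) and the defaults come from the review thread. The handler would turn a non-None error into a 400 response.

```python
MAX_QUERY_CHARS = 10_000  # mirrors the _MAX_QUERY_CHARS guard cited in the review


def parse_search_params(data):
    """Validate a decoded JSON body. Returns (params, error); one is always None."""
    query = data.get("query")
    if not isinstance(query, str) or not query.strip():
        return None, "query must be a non-empty string"
    if len(query) > MAX_QUERY_CHARS:
        return None, f"query exceeds {MAX_QUERY_CHARS} characters"
    try:
        # Clamp instead of rejecting out-of-range numbers.
        top_k = max(1, min(int(data.get("top_k", 10)), 100))
        threshold = max(0.0, min(float(data.get("confidence_threshold", 0.2)), 1.0))
    except (TypeError, ValueError, OverflowError):
        # OverflowError covers int(float("inf")), which a huge JSON number can produce.
        return None, "top_k must be int, confidence_threshold must be float"
    return {"query": query, "top_k": top_k, "confidence_threshold": threshold}, None
```

Keeping the checks in a plain function like this also makes the 400 branches testable without starting a server.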

tradewithmeai · 2026-06-21T02:27:12Z

Thanks for the thorough review, @rajkumarsakthivel — all addressed in 26f850b:

1. Input validation — top_k is now clamped to 1–100 and confidence_threshold to 0.0–1.0, with a 400 on non-numeric input (essentially your snippet).

3. Savings tracking — added a comment noting /search bypasses _record(), so queries through it won't appear in cce savings. Left functional as-is for v1 per your call.

4. Query length — added _MAX_QUERY_CHARS = 10_000 mirroring the MCP server's guard, returning a 400 when exceeded.

On 2: there's no read_text() in serve_http.py, so nothing to change in this PR — I'll keep the encoding="utf-8" point in mind for when this rebases onto #110.

And thanks for confirming 5 — good to know the auth middleware already covers the new route.

rajkumarsakthivel

Request changes — tests needed before merge.

The handler logic is solid: constructor args match HybridRetriever.__init__ exactly, auth middleware covers all routes, validation mirrors the MCP server, _MAX_QUERY_CHARS = 10_000 is consistent. The implementation itself is ready.

Missing: tests

No test coverage for handle_search anywhere in the suite — not even a happy-path smoke test. Given the input validation added in the last round (empty query, query too long, non-numeric top_k/confidence_threshold), those branches deserve coverage. Please add at minimum:

Happy path: valid query returns results list
400 on empty query
400 on over-length query
400 on non-numeric top_k
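
The four cases above could be expressed as one table-driven check. This is only a sketch: post_search is a hypothetical callable taking a payload dict and returning (status, body), standing in for however the suite actually drives the handler (for example an aiohttp test client). fake_post_search exists purely so the sketch runs on its own and is not the real endpoint.

```python
def check_search_contract(post_search):
    """Run the four requested cases; return a list of failures (empty means pass)."""
    cases = [
        ("happy path", {"query": "cost tracking", "top_k": 3}, 200),
        ("empty query", {"query": ""}, 400),
        ("over-length query", {"query": "x" * 10_001}, 400),
        ("non-numeric top_k", {"query": "cost tracking", "top_k": "abc"}, 400),
    ]
    failures = []
    for name, payload, expected_status in cases:
        status, body = post_search(payload)
        if status != expected_status:
            failures.append(f"{name}: expected {expected_status}, got {status}")
        elif status == 200 and not isinstance(body.get("results"), list):
            failures.append(f"{name}: 'results' is not a list")
    return failures


def fake_post_search(payload):
    """Stand-in endpoint with the validation rules discussed in this PR."""
    query = payload.get("query", "")
    if not query or len(query) > 10_000:
        return 400, {"error": "invalid query"}
    try:
        int(payload.get("top_k", 10))
    except (TypeError, ValueError):
        return 400, {"error": "top_k must be int"}
    return 200, {"results": []}
```

In a real suite each row would more likely become its own test function so failures are reported individually.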

Minor: confidence_threshold not read from config

POST /search hardcodes 0.2 as the default. If a project overrides retrieval_confidence_threshold in .context-engine.yaml, MCP respects it but /search ignores it. Low severity for v1 — a follow-up issue is fine — but worth documenting in a comment.
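
For the follow-up, one possible precedence is request body first, then the project's retrieval_confidence_threshold, then the hardcoded 0.2. A minimal sketch, assuming the config value has already been loaded and is passed in as a float or None (how CCE loads .context-engine.yaml is not shown in this PR, and resolve_confidence_threshold is a hypothetical name):

```python
DEFAULT_CONFIDENCE_THRESHOLD = 0.2  # the value /search currently hardcodes


def resolve_confidence_threshold(data, config_threshold=None):
    """Pick the threshold: request body, else project config, else 0.2.

    Raises TypeError/ValueError on non-numeric input; the caller maps that to a 400.
    """
    fallback = DEFAULT_CONFIDENCE_THRESHOLD if config_threshold is None else config_threshold
    raw = data.get("confidence_threshold", fallback)
    return max(0.0, min(float(raw), 1.0))  # clamp to the 0.0-1.0 range
```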

Cosmetic: dead code in response serialiser

getattr(c, "confidence_score", None) — Chunk.confidence_score has a default of 0.0 and is always set. The None fallback never triggers. Use c.confidence_score directly.
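
To illustrate the point, here is a small sketch with a stand-in Chunk. The dataclass below is an assumption modelled on the fields in the response example, not CCE's actual class. Because confidence_score has a default, the attribute always exists and direct access is enough.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in; field names follow the /search response example."""
    id: str
    file_path: str
    start_line: int
    end_line: int
    content: str
    chunk_type: str
    language: str
    confidence_score: float = 0.0  # defaulted, so it is never missing
    metadata: dict = field(default_factory=dict)


def serialise_chunk(c):
    """Convert a chunk into the JSON-ready dict returned under "results"."""
    return {
        "id": c.id,
        "file_path": c.file_path,
        "start_line": c.start_line,
        "end_line": c.end_line,
        "content": c.content,
        "chunk_type": c.chunk_type,
        "language": c.language,
        "confidence_score": c.confidence_score,  # direct access, no getattr fallback
        "metadata": c.metadata,
    }
```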

tradewithmeai requested review from fazleelahhee and rajkumarsakthivel as code owners May 24, 2026 22:19

rajkumarsakthivel reviewed Jun 16, 2026

Merge branch 'main' into feat/http-search-endpoint

9c3ec7a

rajkumarsakthivel requested changes Jul 3, 2026

tradewithmeai wants to merge 3 commits into elara-labs:main from tradewithmeai:feat/http-search-endpoint

3 participants
