This guide covers getting a full PhyslibSearch stack running on your machine for development.
| Tool | Minimum version | Install |
|---|---|---|
| Python | 3.11 | python.org |
| PostgreSQL | 14 | postgresql.org/download |
| Node.js | 20 | nodejs.org |
| Lean 4 + elan | latest | `curl https://elan.lean-lang.org/elan-init.sh -sSf \| sh` |
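If you want to sanity-check the table above before continuing, here is a small sketch (not part of the repo; the binary names `psql`, `node`, and `lake` are the usual defaults, so adjust for your system):

```python
import shutil
import sys

def check_prereqs() -> list[str]:
    """Return a list of missing prerequisites (empty means all found)."""
    missing = []
    if sys.version_info < (3, 11):
        missing.append("Python >= 3.11")
    # psql ships with PostgreSQL, node with Node.js, lake with Lean 4 (via elan)
    for tool in ("psql", "node", "lake"):
        if shutil.which(tool) is None:
            missing.append(tool)
    return missing

print(check_prereqs())
```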
```
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Create a dedicated database:

```
createdb physlibsearch
```

If you want to reset from scratch at any point:

```
dropdb physlibsearch && createdb physlibsearch
```

jixia is the Lean 4 project parser. Build it from source:
```
git clone https://github.com/frenzymath/jixia.git
cd jixia
lake build   # first build takes ~70 s
cd ..
```

Toolchain matching: jixia's `lean-toolchain` must match the toolchain of the project you're indexing. Run `cat jixia/lean-toolchain` and compare with `cat /path/to/physlib/lean-toolchain`. If they differ, update one to match the other before building.
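The same comparison can be scripted; a minimal sketch (the helper name is ours, not part of the repo):

```python
from pathlib import Path

def toolchains_match(a: str, b: str) -> bool:
    """True if two lean-toolchain files pin the same toolchain,
    ignoring surrounding whitespace."""
    return Path(a).read_text().strip() == Path(b).read_text().strip()

# e.g. toolchains_match("jixia/lean-toolchain", "/path/to/physlib/lean-toolchain")
```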
```
cp .env.example .env
```

Open `.env` and fill in at minimum:

| Variable | What to set |
|---|---|
| JIXIA_PATH | Absolute path to `jixia/.lake/build/bin/jixia` |
| LEAN_SYSROOT | Run `lake env` in PhysLib and copy the LEAN_SYSROOT value |
| CONNECTION_STRING | `"dbname=physlibsearch user=YOUR_USER password=YOUR_PASSWORD"` |
| GEMINI_API_KEY | Your key from aistudio.google.com |

The remaining variables have sensible defaults.
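CONNECTION_STRING uses libpq's space-separated `key=value` format. If you want to double-check that yours parses the way you expect, a minimal sketch (the helper is ours for illustration; it does not handle quoted values containing spaces):

```python
def parse_conninfo(conninfo: str) -> dict[str, str]:
    """Split a libpq-style key=value connection string into a dict."""
    return dict(part.split("=", 1) for part in conninfo.split())

params = parse_conninfo("dbname=physlibsearch user=YOUR_USER password=YOUR_PASSWORD")
print(params)
```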
If you want to use OpenRouter (or any OpenAI-compatible endpoint) instead of Gemini directly for the fast model, add:

```
LLM_API_KEY = "sk-or-..."
LLM_BASE_URL = "https://openrouter.ai/api/v1"
GEMINI_FAST_MODEL = "google/gemini-3-flash-preview"   # use the endpoint's model name
```

For development it's easiest to index a small Lean project rather than all of PhysLib. Set `DRY_RUN=true` to verify the pipeline runs without spending any API quota:

```
DRY_RUN=true python -m database jixia /path/to/physlib Physlib
DRY_RUN=true python -m database informal
DRY_RUN=true python -m database vector-db
```

For a real index of PhysLib:
```
python -m database jixia /path/to/physlib Physlib,QuantumInfo
python -m database informal
python -m database vector-db
```

Or drive the whole pipeline via make:

```
# Set env vars first, then:
export DBNAME=physlibsearch
export INDEXED_REPO_PATH=/path/to/physlib
export MODULE_NAMES=Physlib,QuantumInfo
export CHROMA_PATH=chroma
make index   # runs reset → jixia → informal → vector-db in sequence
```

Then start the API server:

```
uvicorn server:app --reload --port 8000
```

The API is now at http://localhost:8000. Interactive docs at http://localhost:8000/docs.
```
cd frontend
npm install
npm run dev
```

Open http://localhost:3000.
The frontend reads NEXT_PUBLIC_API_URL (defaults to http://localhost:8000). To point it at a different backend:
```
NEXT_PUBLIC_API_URL=http://my-server:8000 npm run dev
```

```
# Python linting + formatting (ruff)
ruff check .
ruff format .

# Frontend linting
cd frontend && npm run lint
```

CI runs `ruff check` and `ruff format --check` on every push (see .github/workflows/lint.yaml).
The schema is defined in database/create_schema.py. To (re-)apply it:
```
python -m database schema
```

This is idempotent: safe to run on an empty database. It creates all tables, types, views, and the physlibsearch operational schema.
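Idempotent schema scripts typically rely on guarded DDL such as `CREATE TABLE IF NOT EXISTS`. The actual statements live in `database/create_schema.py`; the sketch below only illustrates why re-running guarded DDL is safe, using sqlite3 and a made-up table name so it runs standalone:

```python
import sqlite3

# In-memory DB purely for illustration; the real schema targets PostgreSQL.
conn = sqlite3.connect(":memory:")
ddl = "CREATE TABLE IF NOT EXISTS declarations (name TEXT PRIMARY KEY)"
conn.execute(ddl)
conn.execute(ddl)  # re-running is a no-op rather than an error
```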
```
python search.py "conservation of momentum"
python search.py "Schrödinger equation" "Hamiltonian operator" --json
python search.py -n 20 "entropy"
```

Flags:

- `-n N`: number of results (default 5)
- `--json`: output as JSON
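For reference, the flag surface above corresponds to roughly this argparse layout (a sketch only; `search.py`'s real parser may differ):

```python
import argparse

parser = argparse.ArgumentParser(prog="search.py")
parser.add_argument("queries", nargs="+", help="one or more search queries")
parser.add_argument("-n", type=int, default=5, help="number of results (default 5)")
parser.add_argument("--json", action="store_true", help="output as JSON")

args = parser.parse_args(["-n", "20", "entropy"])
```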