Auto-Researcher is an autonomous, multi-agent system that performs deep academic research, analyzes complex papers, synthesizes comprehensive reviews with verified citations, and provides rich interactive features for exploring results.
The frontend is a polished React 19 app with animated dashboards, zero-knowledge encryption, biometric unlock, an interactive knowledge graph, image galleries, follow-up chat, citation style switching, executive briefs, research timelines, and PDF file upload.
| Feature | Description |
|---|---|
| Multi-agent pipeline | Three autonomous agents (Researcher, Analyst, Critic) collaborate in a feedback loop with up to 3 revision passes |
| Academic search | Searches arXiv and PDF repositories via Tavily and DuckDuckGo, parses full-text documents, and ranks sources by relevance |
| 10+ LLM providers | Ollama, OpenAI, Anthropic, Google, DeepSeek, Mistral, Groq, Perplexity, Together, OpenRouter (local or cloud) |
| Knowledge graph | Interactive force-directed graph of cited papers; click any node to open the source |
| Structured reports | Rich Markdown with Executive Summary, Key Findings, Critical Analysis, Methodological Notes, and Implications |
| Image gallery | Automatically fetches related images and charts/graphs for any research topic (powered by DuckDuckGo Images) |
| Follow-up chat | Ask questions about the report; answers include formatted Markdown with source citation badges linked to the actual papers |
| Citation styles | Switch between Inline [S#], APA, MLA, Chicago, and IEEE citation formats |
| Executive brief | Concise summary extracted from the report's key sections (no LLM call needed) |
| Research timeline | Extracts year-based milestones from the report and arXiv source URLs into a visual timeline |
| PDF upload & analysis | Upload multiple PDF files (drag & drop), extract text, and include the content as additional context for research |
| DOI resolution | Automatically resolves DOIs for arXiv sources and displays clickable DOI badges on reference cards |
| Print as PDF | Generates a styled HTML page optimized for printing as PDF |
| Text-to-speech | Reads the report aloud with markdown syntax stripped for clean audio |
| Zero-knowledge encryption | Passphrase-protected with WebAuthn biometric unlock (fingerprint, Face ID, Windows Hello). API keys encrypted end-to-end in your browser |
| Critique & revision | Automatic scoring, hallucination detection, and citation validation with configurable strictness |
| Deep controls | Configure search depth, source count, critic strictness, and custom model overrides |
| Research queue | Click multiple topics and they run sequentially, with no parallel conflicts |
| Trending topics | Fetch trending research topics from the API with localStorage caching and manual refresh |
| Collapsible sidebar | Compact icon-only mode with history browsing, semantic search, and passphrase management |
| Biometric unlock | Fingerprint, Face ID, or Windows Hello via WebAuthn PRF extension with emergency recovery codes |
| Dark mode | Full theme support persisted to localStorage with theme-aware favicon |
| Welcome page | Rich marketing landing page with hero section, animated metrics, and feature overview |
The system uses a Graph-based Multi-Agent Architecture (built with LangGraph) with real-time streaming via Server-Sent Events (SSE):
- The Researcher
  - Parallel Search: Simultaneously queries Tavily and DuckDuckGo for maximum coverage
  - Academic Filtering: Targets arxiv.org, .edu, .ac.uk, and researchgate.net
  - Smart Parsing: Downloads PDFs with PyMuPDF (Fitz), extracts high-density text, and ignores references/bibliographies to save context window
- The Analyst
  - High-Density Synthesis: Drafts comprehensive reports (2400+ words for deep searches)
  - Thematic Grouping: Automatically organizes findings into logical themes
  - Structured Output: Generates Markdown with Executive Summary, Key Findings, Methodological Notes, and Implications
- The Critic
  - Fact-Checking: Reviews the draft for hallucinations and vague generalizations
  - Quantitative Enforcement: Rejects drafts that lack specific numbers and data
  - Feedback Loop: Triggers up to 3 revision cycles if the quality score falls below the configured threshold
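The feedback loop between the three agents can be sketched in plain Python. This is only an illustration of the control flow described above, not the project's LangGraph workflow: `ResearchState`, `run_pipeline`, the three callables, and the 0-10 score scale with a default threshold of 7.0 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

MAX_REVISIONS = 3  # mirrors the "up to 3 revision cycles" described above


@dataclass
class ResearchState:
    """Hypothetical shared state passed between the agents."""
    topic: str
    sources: List[str] = field(default_factory=list)
    draft: str = ""
    feedback: str = ""
    score: float = 0.0
    revisions: int = 0


def run_pipeline(
    topic: str,
    research: Callable[[str], List[str]],
    analyze: Callable[[ResearchState], str],
    critique: Callable[[ResearchState], Tuple[float, str]],
    threshold: float = 7.0,  # assumed 0-10 scale; the real threshold is configurable
) -> ResearchState:
    state = ResearchState(topic=topic)
    state.sources = research(topic)      # Researcher: gather sources once
    state.draft = analyze(state)         # Analyst: write the first draft
    while True:
        # Critic: score the current draft and explain what is missing
        state.score, state.feedback = critique(state)
        if state.score >= threshold or state.revisions >= MAX_REVISIONS:
            return state
        state.revisions += 1
        state.draft = analyze(state)     # Analyst: revise using the feedback
```

The loop stops as soon as the Critic's score reaches the threshold, or after the third revision even if the score is still too low, so a run always terminates.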
Real-time Streaming: The backend pushes events (SSE) so you can watch agents transition live (Researching → Drafting → Critiquing) as they work.
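For readers unfamiliar with SSE, each event is a small block of text lines terminated by a blank line. The sketch below shows how stage transitions could be serialized in that wire format; the event name `status` and the `stage` payload key are assumptions made for the example, not the backend's actual schema.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one event in the text/event-stream wire format."""
    payload = json.dumps(data)  # single-line JSON, safe for one "data:" field
    return f"event: {event}\ndata: {payload}\n\n"


def stream_stages(stages: Iterable[str]) -> Iterator[str]:
    """Yield one SSE message per pipeline stage (hypothetical event schema)."""
    for stage in stages:
        yield format_sse("status", {"stage": stage})
```

A generator like `stream_stages(["Researching", "Drafting", "Critiquing"])` could then be handed to a streaming HTTP response with the `text/event-stream` content type.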
| Layer | Technology |
|---|---|
| Backend | Python, FastAPI, LangGraph, LangChain |
| Streaming | Server-Sent Events (SSE) |
| Frontend | React 19, Vite, TypeScript |
| Styling | Tailwind CSS v4, Framer Motion |
| Graphs | React Force Graph (2D) |
| Routing | React Router v7 |
| LLM Engine (Local) | Ollama (Llama 3, Mistral, etc.) |
| LLM Engine (Cloud) | OpenAI, Anthropic, Google, DeepSeek, Mistral, Groq, Perplexity, Together, OpenRouter |
| Search | Tavily API + DuckDuckGo fallback |
| Images | DuckDuckGo Image Search (DDGS) |
| PDF parsing | PyMuPDF (Fitz) |
| Auth | WebAuthn PRF (biometric unlock) + PBKDF2 encryption |
| Markdown | react-markdown |
Every research report includes a Related Images and Charts & Visualizations gallery:
- Fetches images from DuckDuckGo using the research topic as query
- Charts/graphs are fetched with enriched queries ("chart graph data visualization")
- Click any image to open a lightbox with keyboard navigation (←, →, Esc) and dimension info
- Images show source domain and link to the original page
- Results are cached in-memory on the backend for 1 hour
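A one-hour in-memory cache of this kind can be approximated with a small time-to-live dictionary. The `TTLCache` class below is a hypothetical sketch, not the backend's code; the injectable `clock` parameter exists only to make expiry easy to test.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

CACHE_TTL_SECONDS = 60 * 60  # one hour, as described above


class TTLCache:
    """Minimal in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
```

Keying the cache on the search query means a repeated topic within the hour avoids a second image search; entries are lost on restart because nothing is persisted.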
The Ask Follow-up panel lets you ask questions about the completed report:
- Answers are formatted in Markdown with headings, paragraphs, bullet points, and bold terms
- Source citation badges ([S1]) link directly to the referenced paper URLs
- Code blocks include a copy button on hover
- Suggested questions (Summarize key findings, What are the limitations?) for quick interaction
- Copy message button on each assistant response
- Relative timestamps ("Just now", "2m ago") on every message
- Uses the same LLM provider selected for the research (no separate configuration needed)
Switch between citation formats on the fly; the report re-renders instantly:
- Inline: [S1], [S2] (default)
- APA: (Author et al., n.d.)
- MLA: (Author 1)
- Chicago: Superscript numbers
- IEEE: Bracketed numbers [1]
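Because the report stores citations as inline `[S#]` markers, switching styles can be treated as a text substitution. The sketch below covers only the two purely numeric styles; APA and MLA need author metadata and are left out. The function name `restyle_citations` and the use of `<sup>` tags for Chicago superscripts are assumptions for illustration.

```python
import re

_MARKER = re.compile(r"\[S(\d+)\]")  # inline markers such as [S1], [S12]


def restyle_citations(text: str, style: str) -> str:
    """Rewrite inline [S#] markers into another numeric citation style."""
    if style == "inline":
        return text
    if style == "ieee":
        return _MARKER.sub(r"[\1]", text)            # [S1] -> [1]
    if style == "chicago":
        return _MARKER.sub(r"<sup>\1</sup>", text)   # [S1] -> superscript 1
    raise ValueError(f"unsupported style: {style}")
```

Since the stored report is never modified, switching back to the inline style is just a matter of rendering the original text again.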
Click the Brief button to generate a simplified executive summary extracted from the report's Introduction / Executive Summary and Key Findings sections; no LLM call is needed.
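Extracting a brief without an LLM amounts to splitting the Markdown report on its headings and keeping the wanted sections. The following is a minimal sketch under that assumption; `split_sections` and `build_brief` are hypothetical helper names, and the real extraction may shorten or reformat the text further.

```python
import re
from typing import Dict, Iterable, List

_HEADING = re.compile(r"^#{1,6}\s+(.*?)\s*$")


def split_sections(markdown: str) -> Dict[str, str]:
    """Map each heading title (lowercased) to the text beneath it."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in markdown.splitlines():
        match = _HEADING.match(line)
        if match:
            current = match.group(1).lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}


def build_brief(
    markdown: str,
    wanted: Iterable[str] = ("introduction", "executive summary", "key findings"),
) -> str:
    """Join the wanted sections, skipping any that are missing or empty."""
    sections = split_sections(markdown)
    parts = [sections[name] for name in wanted if sections.get(name)]
    return "\n\n".join(parts)
```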
Click the Timeline button to extract year-based milestones from the report content and arXiv source URLs. Years are parsed from text (1900-2029) and arXiv submission dates, and displayed in a scrollable timeline with connecting lines.
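Both year sources can be handled with regular expressions. The sketch below assumes new-style arXiv identifiers (`YYMM.NNNNN`, in use since 2007), whose first four digits encode the submission year and month; old-style identifiers are simply ignored. The function names are hypothetical.

```python
import re
from typing import Iterable, List

_YEAR = re.compile(r"\b(19\d{2}|20[0-2]\d)\b")  # matches 1900 through 2029
_ARXIV_ID = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(\d{2})(\d{2})\.\d{4,5}", re.IGNORECASE
)


def years_from_text(text: str) -> List[int]:
    """Return the distinct years mentioned in the text, sorted ascending."""
    return sorted({int(year) for year in _YEAR.findall(text)})


def years_from_arxiv_urls(urls: Iterable[str]) -> List[int]:
    """Derive submission years from new-style arXiv IDs (YYMM.NNNNN)."""
    years = set()
    for url in urls:
        match = _ARXIV_ID.search(url)
        # Require a plausible month so unrelated digit runs are not misread.
        if match and 1 <= int(match.group(2)) <= 12:
            years.add(2000 + int(match.group(1)))
    return sorted(years)
```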
Upload PDFs via drag-and-drop or file picker in the research form:
- Multiple file upload: Upload and select multiple PDFs to include as research context
- Text extraction: PDF text is extracted with PyMuPDF and truncated to the configured max context length
- File management: Uploaded files persist in localStorage with selection checkboxes, expandable previews, and individual removal
- Selected files are concatenated and sent as additional context with [U1] markers in the agent pipeline
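The concatenation step can be sketched as follows. This assumes the text has already been extracted from each PDF, and it truncates each file separately so that every `[U#]` marker survives; the function name, the per-file limit, and the exact marker layout are illustrative guesses rather than the pipeline's actual format.

```python
from typing import List, Tuple


def build_upload_context(
    files: List[Tuple[str, str]],
    max_chars_per_file: int = 20_000,  # assumed limit; the real one is configurable
) -> str:
    """Join (filename, extracted_text) pairs into one context string.

    Each file gets a [U#] marker so the agents can cite uploaded material
    separately from searched sources.
    """
    blocks = []
    for index, (name, text) in enumerate(files, start=1):
        body = text.strip()[:max_chars_per_file]
        blocks.append(f"[U{index}] {name}\n{body}")
    return "\n\n".join(blocks)
```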
DOIs for arXiv source URLs are resolved automatically via the arXiv API:
- Displays clickable DOI badges on reference cards
- Prefixed with the Unlink icon for quick identification
- DOIs are cached in-memory to avoid repeated API calls
- arXiv sources without DOIs show an "arXiv" label
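The arXiv API returns Atom XML, and an entry that has an associated DOI carries it in an `arxiv:doi` element. The sketch below parses that element and caches the result per arXiv ID. The network call is deliberately left out: `fetch` is a hypothetical callable that returns the Atom response for an ID, and `resolve_doi` is an illustrative name, not the project's function.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

ARXIV_NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

_doi_cache: Dict[str, Optional[str]] = {}  # in-memory only, lost on restart


def doi_from_atom(atom_xml: str) -> Optional[str]:
    """Return the DOI of the first entry in an arXiv Atom feed, if present."""
    root = ET.fromstring(atom_xml)
    node = root.find("atom:entry/arxiv:doi", ARXIV_NS)
    if node is None or not node.text:
        return None
    return node.text.strip()


def resolve_doi(arxiv_id: str, fetch: Callable[[str], str]) -> Optional[str]:
    """Resolve a DOI once per arXiv ID; `fetch` is a caller-supplied stub."""
    if arxiv_id not in _doi_cache:
        _doi_cache[arxiv_id] = doi_from_atom(fetch(arxiv_id))
    return _doi_cache[arxiv_id]
```

Caching `None` as well as real DOIs means a source without a DOI is also looked up only once, which is when the UI would fall back to the "arXiv" label.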
Click the Print icon to generate a styled HTML print view of the report with proper typography, table styling, code highlighting, and a formatted sources section. Opens in a new tab and triggers the browser's print dialog.
Click the volume icon to have the report read aloud. Markdown syntax (headings, bold, links, code blocks, citations) is automatically stripped so only clean text is spoken.
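Stripping Markdown before speech synthesis can be done with a handful of regular expressions. The sketch below is a simplified approximation that handles the syntax listed above; it is not the app's actual implementation, and edge cases such as nested emphasis or tables are not covered.

```python
import re


def strip_markdown(text: str) -> str:
    """Reduce Markdown to plain text suitable for a speech synthesizer."""
    text = re.sub(r"`{3}.*?`{3}", " ", text, flags=re.DOTALL)         # fenced code blocks
    text = re.sub(r"`([^`]*)`", r"\1", text)                          # inline code
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", " ", text)                 # images
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)              # links: keep the label
    text = re.sub(r"\s*\[S\d+\]", "", text)                           # citation markers
    text = re.sub(r"^#{1,6}[ \t]*", "", text, flags=re.MULTILINE)     # heading hashes
    text = re.sub(r"^[ \t]*[-*+][ \t]+", "", text, flags=re.MULTILINE)  # bullet markers
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)                   # bold and italic
    return re.sub(r"[ \t]+", " ", text).strip()
```

The order matters: code and links are handled before citation markers and emphasis so that brackets and asterisks inside them are not misread.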
- Python 3.10+
- Node.js 18+
- Ollama installed and running (for local mode)
# Clone the repository
git clone https://github.com/royxlead/auto-researcher-python.git
cd auto-researcher-python
# Setup Backend
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# Setup Frontend
cd frontend
npm install

Create a .env file in the project root:
TAVILY_API_KEY=your_key_here
# Local Mode
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3:8b
# Cloud Mode (optional, any of these)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIza...
DEEPSEEK_API_KEY=sk-...

# Terminal 1 - Backend
python run.py
# Terminal 2 - Frontend
cd frontend && npm run dev

Open http://localhost:5173 in your browser. The Welcome page greets you with a live dashboard, example report preview, and feature overview. Click Launch the app or navigate to /app to start researching.
- Enter a research topic (e.g., "Impact of solid-state batteries on EV range")
- Configure your research:
  - Depth: Fast / Balanced / Deep (1-10)
  - Papers: 5-50 sources
  - Strictness: The Critic's threshold (Lenient / Balanced / Strict)
  - Provider: Ollama (local) or any cloud provider
  - Model: Custom model name override
- Click the arrow to start and watch the agents work in real time
- Interact with the report:
  - Read Aloud: Text-to-Speech
  - Visualize: Explore the Knowledge Graph
  - Images: Browse related images and charts
  - Chat: Ask follow-up questions
  - Export: Download as Markdown or Print as PDF
  - Cite: Switch citation styles
Click any trending topic chip to instantly kick off research. Click several and they queue up. The refresh button (↻) fetches updated topics from the API.
Set a passphrase in the sidebar to encrypt your API keys end-to-end. Optionally register a biometric credential (fingerprint, Face ID, Windows Hello) for one-tap unlock on repeat visits. If you lose access, use your 5-word emergency recovery code.
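The project performs this encryption in the browser (see frontend/src/lib/crypto.ts, described as AES-256-GCM + PBKDF2). As an illustration of the key-derivation half only, the Python sketch below turns a passphrase into a 256-bit key with PBKDF2 from the standard library. The iteration count, salt length, and SHA-256 hash are assumptions for the example; the app's real parameters are not stated here.

```python
import hashlib
import os
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 600_000  # assumed work factor, not taken from the project
KEY_LENGTH = 32              # 32 bytes = 256 bits, the key size AES-256-GCM needs


def derive_key(
    passphrase: str,
    salt: Optional[bytes] = None,
    iterations: int = PBKDF2_ITERATIONS,
) -> Tuple[bytes, bytes]:
    """Derive a 256-bit key from a passphrase; returns (key, salt)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for a new passphrase
    key = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=KEY_LENGTH
    )
    return key, salt
```

The salt is not secret and has to be stored next to the ciphertext, because the same passphrase and salt are needed to derive the same key when unlocking.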
auto-researcher-python/
├── run.py                          # Backend entry point (uvicorn)
├── requirements.txt
├── src/
│   ├── api.py                      # FastAPI routes + SSE streaming
│   ├── config.py                   # Environment configuration
│   ├── schemas.py                  # Pydantic models
│   ├── agents/
│   │   ├── graph.py                # LangGraph workflow definition
│   │   ├── nodes.py                # Agent node functions
│   │   └── state.py                # Graph state schema
│   ├── tools/
│   │   ├── search.py               # Tavily + DuckDuckGo search
│   │   ├── pdf.py                  # PDF download + parsing (PyMuPDF)
│   │   ├── ranking.py              # Source relevance ranking (TF-IDF)
│   │   ├── validation.py           # Citation validation
│   │   ├── graph.py                # Knowledge graph extraction
│   │   ├── chat.py                 # Follow-up chat prompt builder
│   │   ├── images.py               # Image + chart search (DDGS)
│   │   ├── summarize.py            # Executive brief + timeline extraction
│   │   ├── doi.py                  # arXiv DOI resolution
│   │   └── __init__.py
│   ├── evaluation/
│   │   └── retrieval.py            # Retrieval evaluation
│   └── utils/
│       ├── crypto.py               # Server-side crypto helpers
│       └── tracing.py              # LangSmith tracing
├── frontend/
│   ├── index.html                  # HTML with favicon + apple-touch-icon
│   ├── src/
│   │   ├── main.tsx                # React entry + BrowserRouter
│   │   ├── App.tsx                 # Research app (route: /app)
│   │   ├── pages/
│   │   │   └── Welcome.tsx         # Marketing landing page (route: /)
│   │   ├── components/
│   │   │   ├── Sidebar.tsx         # Collapsible nav + passphrase mgmt
│   │   │   ├── ResearchForm.tsx    # Topic input + config + PDF upload
│   │   │   ├── ReportView.tsx      # Report viewer + all features
│   │   │   ├── LoadingState.tsx    # Real-time progress dashboard
│   │   │   ├── KnowledgeGraph.tsx  # Force-directed citation graph
│   │   │   ├── ImageGallery.tsx    # Image/chart gallery with lightbox
│   │   │   ├── ChatPanel.tsx       # Follow-up chat with markdown
│   │   │   ├── ErrorBoundary.tsx   # Error fallback UI
│   │   │   └── BrandIcon.tsx       # SVG brand icon component
│   │   ├── hooks/
│   │   │   └── useResearch.ts      # Research state + queue logic
│   │   └── lib/
│   │       ├── api.ts              # API client + all endpoints
│   │       ├── crypto.ts           # AES-256-GCM + PBKDF2 encryption
│   │       ├── webauthn.ts         # WebAuthn PRF biometric unlock
│   │       └── favicon.ts          # Dynamic theme-aware favicon swap
│   └── package.json
└── assets/
    ├── HomeScreen.png
    ├── SearchScreen.png
    └── Features.png
- Cloud Mode: Multiple providers (OpenAI, Anthropic, Google, etc.)
- Knowledge Graph: Interactive node-link diagram of cited papers
- Zero-knowledge encryption: Passphrase + WebAuthn biometric unlock
- Trending topics: API-fetched, cached, with refresh button
- Research queue: Sequential topic processing
- Welcome page: Marketing landing page with live metrics
- Image gallery: Related images and charts with lightbox
- Follow-up chat: Ask questions with formatted Markdown + source badges
- Citation styles: APA, MLA, Chicago, IEEE, Inline
- Executive brief: Simplified summary extracted from the report
- Research timeline: Year-based visual milestones
- PDF upload: Multi-file upload with text extraction
- DOI badges: Resolved DOIs for arXiv sources
- PDF export: Print-ready HTML report
- Theme-aware favicon: Dynamic light/dark SVG icon
- Sidebar collapse: Compact icon-only mode
- Multi-document chat: Chat with collected sources
- Zotero / Mendeley integration: Direct export to reference managers
- Streaming chat: Word-by-word response generation
Contributions are welcome! See CONTRIBUTING.md for detailed setup instructions, development workflow, and PR guidelines.
MIT. See LICENSE.
See CHANGELOG.md for the full version history.


