RepoGPT is a premium developer-onboarding and codebase intelligence platform designed to eliminate code discovery friction. When you paste a public GitHub repository URL, the system clones the repository, traverses it, and parses the codebase locally using customized AST-signature extractors. From the parsed result it generates interactive charts, semantic document libraries, and context-aware chat interfaces.
It compiles code hierarchies in real time and interfaces directly with local LLMs (via Ollama) or falls back to a high-fidelity local semantic retrieval engine (TF-IDF + Cosine Similarity) when offline.
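To make the fallback retrieval idea concrete, here is a minimal sketch of TF-IDF weighting with cosine-similarity ranking. It is not RepoGPT's actual implementation: the function names (`tokenize`, `rankChunks`, and so on), the tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch; none of these names come from the RepoGPT codebase.
type SparseVector = Map<string, number>;

// Lowercase and split on anything that is not a letter, digit, or underscore.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9_]+/).filter((t) => t.length > 0);
}

// Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
function inverseDocFrequencies(docs: string[][]): SparseVector {
  const df: SparseVector = new Map();
  for (const tokens of docs) {
    new Set(tokens).forEach((t) => df.set(t, (df.get(t) ?? 0) + 1));
  }
  const idf: SparseVector = new Map();
  df.forEach((count, t) => idf.set(t, Math.log((1 + docs.length) / (1 + count)) + 1));
  return idf;
}

// Term frequency times IDF; terms unknown to the corpus are dropped.
function weigh(tokens: string[], idf: SparseVector): SparseVector {
  const counts: SparseVector = new Map();
  for (const t of tokens) counts.set(t, (counts.get(t) ?? 0) + 1);
  const vec: SparseVector = new Map();
  counts.forEach((tf, t) => {
    const w = idf.get(t);
    if (w !== undefined) vec.set(t, tf * w);
  });
  return vec;
}

// Cosine similarity of two sparse vectors; 0 when either vector is empty.
function cosine(a: SparseVector, b: SparseVector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((w, t) => {
    dot += w * (b.get(t) ?? 0);
    normA += w * w;
  });
  b.forEach((w) => {
    normB += w * w;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Rank code chunks against a query, best match first.
function rankChunks(query: string, chunks: string[]): { index: number; score: number }[] {
  const docs = chunks.map(tokenize);
  const idf = inverseDocFrequencies(docs);
  const q = weigh(tokenize(query), idf);
  return docs
    .map((tokens, index) => ({ index, score: cosine(q, weigh(tokens, idf)) }))
    .sort((x, y) => y.score - x.score);
}
```

In a setup like this, the top-ranked chunks would be passed to the LLM as context, or shown directly when no model is available.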
- ⚡ Real-Time Repository Ingestion: Securely clones public Git repositories, automatically scans the folder structure, and generates dynamic metadata profiles (size, languages, file tree).
- 🔍 AST-Signature Extraction Engine: Recursively scans and parses source files across major ecosystems (JavaScript, TypeScript, Python, Java, Go, Rust, PHP) to map classes, function scopes, routes, libraries, and import/export structures.
- 📊 Interactive Architecture Visualization: Employs `@xyflow/react` to render interactive, zoomable codebase graphs in four distinct modes:
  - Overview (Concentric): Folders group at the center with files orbiting.
  - File Tree: A top-down hierarchical layout mapping file depth.
  - Imports & Dependencies: Concentric rings mapping module imports.
  - Core Flows: Service boundaries (API controllers, databases, middleware) mapped together.
- 💬 Context-Aware Semantic Code Chat: Converse with any parsed repository. RepoGPT matches your question against local code snippets using TF-IDF tokenization and cosine similarity to retrieve the relevant code context, then pipes the result through a local LLM or the fallback parser. Suggested-query pills stay visible just above the input bar.
- 📖 Automated Developer Onboarding Guides: Synthesizes developer manuals including setup walkthroughs, API reference tables, and architectural overviews.
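The Overview (Concentric) mode described above can be pictured as two rings: folders on a small inner ring and files on a larger outer one. The sketch below is only an illustration of that geometry. The `GraphNode` type mirrors the `id` / `position` / `data` shape that React Flow nodes use, while the function names and the radii (150 and 400) are assumptions.

```typescript
// Hypothetical sketch; the names and radii are illustrative assumptions.
interface GraphNode {
  id: string;
  position: { x: number; y: number };
  data: { label: string };
}

// Spread the given ids evenly around a circle of the given radius.
function ringLayout(ids: string[], radius: number, center = { x: 0, y: 0 }): GraphNode[] {
  return ids.map((id, i) => {
    const angle = (2 * Math.PI * i) / Math.max(ids.length, 1);
    return {
      id,
      position: {
        x: center.x + radius * Math.cos(angle),
        y: center.y + radius * Math.sin(angle),
      },
      data: { label: id },
    };
  });
}

// Folders sit on the inner ring, files orbit on the outer ring.
function concentricLayout(folders: string[], files: string[]): GraphNode[] {
  return [...ringLayout(folders, 150), ...ringLayout(files, 400)];
}
```

An array shaped like this could be handed to a React Flow canvas as its node list, with edges supplied separately.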
RepoGPT uses a highly optimized 5-stage ingestion pipeline to fetch and process repositories:
- Git Blobless Clone: Clones the repo with `--filter=blob:none` to download only metadata initially, fetching file contents on-demand.
- Noise Exclusions: Bypasses testing, documentation, and asset folders (`tests`, `docs`, `website`, `.github`) to speed up file walks.
- AST Traverser: Scans the files to parse imports, exports, functions, and class symbols.
- Local Storage Store: Saves the resulting repository map and chunks into JSON cache folders under `data/`.
- Interactive Dashboard: Displays file trees, React Flow diagrams, and semantic chat.
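The first two pipeline stages can be sketched as follows. This is an assumed shape, not the project's code: `buildCloneArgs`, `isNoisePath`, and `filterSourcePaths` are hypothetical helpers, and only the `--filter=blob:none` flag and the four excluded folder names come from the list above.

```typescript
// Hypothetical sketch; helper names are illustrative assumptions.

// Folders skipped during the file walk, as listed in the pipeline description.
const NOISE_DIRS = new Set(["tests", "docs", "website", ".github"]);

// Arguments for a blobless partial clone: `git clone --filter=blob:none <url> <dir>`.
function buildCloneArgs(repoUrl: string, destDir: string): string[] {
  return ["clone", "--filter=blob:none", repoUrl, destDir];
}

// True when any directory segment of the path is a noise folder.
// The final segment (the file name) is ignored, so `src/docs.ts` is kept.
function isNoisePath(relativePath: string): boolean {
  return relativePath
    .split("/")
    .slice(0, -1)
    .some((segment) => NOISE_DIRS.has(segment));
}

// Keep only the paths worth handing to the parser.
function filterSourcePaths(paths: string[]): string[] {
  return paths.filter((p) => !isNoisePath(p));
}
```

The argument array is meant to be passed to a process runner such as Node's `child_process.execFile("git", args, callback)`, which avoids shell interpolation of the user-supplied URL.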
RepoGPT/
├── data/ # Local File-Based database (tracked files, chats, indices)
│ ├── indexes/ # AST and semantic search indices per repository
│ └── chats/ # Saved RAG chat sessions
├── public/ # Static media assets and branding elements
│ ├── Favicon.png # Website Favicon
│ ├── Logo.png # Primary transparent logo
│ └── system_architecture.png # Generated architecture diagram asset
└── src/
    ├── app/ # Next.js App Router workspace
    │   ├── api/ # Fullstack API Endpoints
    │   │   ├── analyze/ # Ingests, clones, and parses repositories
    │   │   ├── chat/ # Routes RAG prompts to LLM / local retriever
    │   │   ├── docs/ # Dynamically synthesizes manuals
    │   │   ├── file/ # Safely streams source file contents
    │   │   └── visualize/ # Exposes parsed node and edge coordinates
    │   ├── dashboard/ # Multi-tab dashboard pages
    │   ├── globals.css # Global styling and custom scrollbars
    │   ├── icon.png # App icon source
    │   ├── layout.tsx # Root HTML layout and metadata configurations
    │   └── page.tsx # Interactive landing page with clone progress stepper
    ├── components/ # Premium animated Tailwind + Framer Motion components
    │   ├── BorderGlow.tsx # Hover glow border wrappers
    │   ├── MagicRings.tsx # Orbiting vector background elements
    │   ├── SplitText.tsx # Character-staggered typography entrance animations
    │   └── Stepper.tsx # Ingestion phase tracker
    └── lib/ # Core modules and helper libraries
        ├── parser.ts # AST traverser, symbol resolver, and summary compiler
        ├── rag.ts # TF-IDF vectorizer, cosine similarity retriever, and Ollama adapter
        └── storage.ts # File-based database read/write adapter
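As a rough picture of what a signature extractor like the one attributed to `parser.ts` produces, the sketch below pulls imports, function names, and class names out of a JavaScript or TypeScript source string. It deliberately swaps in regular expressions for a real AST traversal, so it will miss many constructs (arrow functions, methods, re-exports). The `FileSignature` type and `extractSignature` name are assumptions.

```typescript
// Hypothetical sketch: a regex stand-in for AST-based extraction.
interface FileSignature {
  imports: string[];
  functions: string[];
  classes: string[];
}

function extractSignature(source: string): FileSignature {
  // Collect the first capture group of every match of a global pattern.
  const collect = (pattern: RegExp): string[] => {
    const found: string[] = [];
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) found.push(match[1]);
    return found;
  };
  return {
    // `import x from "mod"`, `import { a } from 'mod'`, and bare `import "mod"`.
    imports: collect(/^\s*import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]/gm),
    // Named function declarations, optionally exported or async.
    functions: collect(/^\s*(?:export\s+)?(?:async\s+)?function\s+([A-Za-z_$][\w$]*)/gm),
    // Class declarations, optionally exported, default, or abstract.
    classes: collect(/^\s*(?:export\s+)?(?:default\s+)?(?:abstract\s+)?class\s+([A-Za-z_$][\w$]*)/gm),
  };
}
```

A production parser would more likely walk a syntax tree from a language-aware parser, one per supported ecosystem, but the output shape (a compact per-file summary of symbols) would be similar.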
Follow these instructions to download, install, and run RepoGPT on your local machine.
Ensure you have the following software installed:
- Node.js (v18.x or newer): Essential to run the Next.js development server. Download it from nodejs.org.
- Git CLI: Needed to clone the codebase and ingest target repositories. Download it from git-scm.com. Ensure `git` is added to your environment `PATH`.
- Ollama (Optional): Needed only if you want conversational AI chat powered by local LLMs. Download it from ollama.com.
Open your terminal (Command Prompt, PowerShell, or Terminal on macOS/Linux) and run:
# 1. Clone the project code
git clone https://github.com/jaymore4501/RepoGPT.git
# 2. Navigate into the cloned project folder
cd RepoGPT
# 3. Install all necessary dependencies
npm install

Once dependencies are installed, start the local development server:

npm run dev

Your terminal will print a local address (usually http://localhost:3000). Open this link in your web browser to access the RepoGPT landing page.
To interact with the codebase using conversational AI:
- Launch the Ollama app on your machine.
- Run the following command in a new terminal window to download and run the code-focused model:
ollama run deepseek-coder
- Once the model is loaded, refresh your RepoGPT browser tab. The badge on the Chat page will change to Ollama Active.
- Note: If Ollama is offline or not installed, RepoGPT automatically switches to Fallback Mode, which uses TF-IDF retrieval with cosine similarity to find relevant files and details.
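One way to decide between the two modes is to probe the local Ollama server and fall back when it does not answer. The sketch below assumes Ollama's default address (`http://localhost:11434`) and uses its `GET /api/tags` endpoint, which lists locally available models. The `detectChatMode` and `badgeLabel` helpers are hypothetical, and RepoGPT's actual detection logic may differ.

```typescript
// Hypothetical sketch; helper names are illustrative assumptions.
type ChatMode = "ollama" | "fallback";

// Minimal fetch-like signature so the probe can be exercised with a stub.
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

// Assumption: Ollama is listening on its default local address.
const OLLAMA_URL = "http://localhost:11434";

async function detectChatMode(fetcher: Fetcher, baseUrl: string = OLLAMA_URL): Promise<ChatMode> {
  try {
    // Any successful response from /api/tags means the server is up.
    const res = await fetcher(`${baseUrl}/api/tags`);
    return res.ok ? "ollama" : "fallback";
  } catch {
    // Connection refused or similar network error: treat Ollama as offline.
    return "fallback";
  }
}

// Text for the status badge on the Chat page.
function badgeLabel(mode: ChatMode): string {
  return mode === "ollama" ? "Ollama Active" : "Fallback Mode";
}
```

In a runtime with a global `fetch` (Node.js 18 or newer), the probe could be called as `detectChatMode(fetch)`. Injecting the fetcher keeps the function easy to test without a running server.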
For a detailed walkthrough on deploying RepoGPT to cloud platforms (DigitalOcean App Platform, Render, Railway) or running it on a self-hosted Ubuntu VPS, please refer to the dedicated Deployment Guide.
Build the optimized production bundle and start the server:
# Build the application
npm run build
# Start the compiled bundle
npm run start

The server will start on port 3000.
This project is licensed under the MIT License.

