| 12+ LLM Providers | 60-80% Cost Reduction | 699 Tests Passing | 0 Code Changes Required |
npm install -g lynkr
The first run creates a .env file. Edit it with your provider settings.
Option A: Free & Local (Ollama) - Recommended for Testing
# Install Ollama first: https://ollama.com
ollama pull qwen2.5-coder:latest
Create/edit .env in your project directory:
# Provider
MODEL_PROVIDER=ollama
FALLBACK_ENABLED=false
# Ollama Configuration
OLLAMA_ENDPOINT=http://localhost:11434
OLLAMA_MODEL=qwen2.5-coder:latest
# Server
PORT=8081
# Optional: Limits (remove for unlimited)
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
# Disable overly strict command filtering
POLICY_SAFE_COMMANDS_ENABLED=false
Option B: Cloud (OpenRouter) - Recommended for Production
# Get API key from https://openrouter.ai
Create/edit .env:
# Provider
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key-here
FALLBACK_ENABLED=false
# Server
PORT=8081
# Optional: Limits (remove for unlimited)
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
# Optional: Enable caching
PROMPT_CACHE_ENABLED=true
SEMANTIC_CACHE_ENABLED=true
Option C: Enterprise (AWS Bedrock)
Create/edit .env:
# Provider
MODEL_PROVIDER=bedrock
AWS_BEDROCK_API_KEY=your-aws-key
AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
FALLBACK_ENABLED=false
# Server
PORT=8081
# Optional: Limits (remove for unlimited)
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
Option D: Enterprise (Databricks)
Create/edit .env:
# Provider
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
DATABRICKS_API_KEY=your-token
FALLBACK_ENABLED=false
# Server
PORT=8081
# Optional: Limits (remove for unlimited)
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
Then start Lynkr:
lynkr start
Claude Code
Windows (Command Prompt):
set ANTHROPIC_BASE_URL=http://localhost:8081
set ANTHROPIC_API_KEY=dummy
claude "write a hello world in python"
Linux/macOS:
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=dummy
claude "write a hello world in python"
Cursor IDE
- Settings → Models → Override Base URL
- Set to: http://localhost:8081/v1
- API Key: any-value
Codex CLI
Edit ~/.codex/config.toml:
model_provider = "lynkr"
[model_providers.lynkr]
base_url = "http://localhost:8081/v1"
wire_api = "responses"
✅ Done! Your AI tool now uses your chosen provider.
Problem: You're running an older version (< 9.3.0).
Solution: Update to the latest version:
npm install -g lynkr@latest
If you must use an older version, set NODE_ENV=production before starting.
This is just a warning - you can ignore it. Tier routing is optional.
To remove the warning, add to .env:
TIER_SIMPLE=ollama:qwen2.5-coder:latest
TIER_MEDIUM=ollama:qwen2.5-coder:latest
TIER_COMPLEX=ollama:qwen2.5-coder:latest
TIER_REASONING=ollama:qwen2.5-coder:latest
Solution: Add to .env:
FALLBACK_ENABLED=false
Problem: Ollama is not running.
Solution:
ollama serve
Keep this terminal open, and start Lynkr in a new terminal.
Problem: Lynkr is not running or wrong port.
Solution: Check Lynkr is running on the correct port:
curl http://localhost:8081/
Should return: {"service":"Lynkr","version":"9.x.x","status":"running"}
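The check above can be scripted. The following is a minimal sketch, assuming the response has the JSON shape shown above and contains no extra whitespace; the function names `lynkr_body_ok` and `check_lynkr` are hypothetical helpers, not part of Lynkr.

```shell
# Return success if a response body looks like a healthy Lynkr reply.
# Pure string check (no network), assuming compact JSON as shown above.
lynkr_body_ok() {
  printf '%s' "$1" | grep -q '"service":"Lynkr"' &&
    printf '%s' "$1" | grep -q '"status":"running"'
}

# Fetch the root endpoint on the given port (default 8081) and apply the check.
check_lynkr() {
  lynkr_port="${1:-8081}"
  lynkr_body="$(curl -fsS --max-time 5 "http://localhost:${lynkr_port}/")" || {
    echo "Lynkr is not reachable on port ${lynkr_port}" >&2
    return 1
  }
  if lynkr_body_ok "$lynkr_body"; then
    echo "Lynkr is running on port ${lynkr_port}"
  else
    echo "Unexpected response on port ${lynkr_port}: ${lynkr_body}" >&2
    return 1
  fi
}
```

Run `check_lynkr 8081` after `lynkr start`. A non-zero exit status means either nothing is listening on that port or something other than Lynkr answered.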
AI coding tools lock you into one provider. Lynkr breaks that lock.
Claude Code / Cursor / Codex / Cline / Continue
↓
Lynkr
↓
Ollama | Bedrock | Azure | OpenRouter | OpenAI
What you get:
- ✅ Use free local models (Ollama, llama.cpp) with Claude Code
- ✅ Route through your company's infrastructure (Databricks, Azure, Bedrock)
- ✅ Cut costs 60-80% with smart token optimization
- ✅ Zero code changes - just change one environment variable
| Provider | Type | Example Models | Cost |
|---|---|---|---|
| Ollama | Local | qwen2.5-coder, deepseek-coder, llama3 | Free |
| llama.cpp | Local | Any GGUF model | Free |
| LM Studio | Local | Local models with GUI | Free |
| OpenRouter | Cloud | GPT-4o, Claude 3.5, Llama 3, Gemini | $ |
| AWS Bedrock | Cloud | Claude, Llama, Mistral, Titan | $$ |
| Databricks | Cloud | Claude Sonnet 4.5, Opus 4.6 | $$$ |
| Azure OpenAI | Cloud | GPT-4o, o1, o3 | $$$ |
| Azure Anthropic | Cloud | Claude Sonnet, Opus | $$$ |
| OpenAI | Cloud | GPT-4o, o3-mini | $$$ |
| DeepSeek | Cloud | DeepSeek R1, Reasoner | $ |
4 local providers for 100% offline, free usage. 10+ cloud providers for scale.
Route different request types to different models automatically:
# .env file
MODEL_PROVIDER=ollama
FALLBACK_ENABLED=false
# Use small/fast models for simple tasks
TIER_SIMPLE=ollama:qwen2.5:3b
# Use medium models for normal coding
TIER_MEDIUM=ollama:qwen2.5:7b
# Use powerful models for complex architecture
TIER_COMPLEX=ollama:deepseek-r1:14b
TIER_REASONING=ollama:deepseek-r1:14b
# Optional: Limits for long conversations (remove for unlimited)
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
Lynkr analyzes each request and routes it to the appropriate tier. Simple questions use fast models. Complex refactoring uses powerful models.
Result: 70-90% of requests use cheaper/faster models. Only hard problems hit expensive models.
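Each `TIER_*` value has the form `provider:model`, and the model part can itself contain colons (for example `qwen2.5:3b`). The sketch below shows one way to split such a value on the first colon only. It illustrates the format and is not Lynkr's actual parser; `tier_provider` and `tier_model` are hypothetical helper names.

```shell
# Everything before the first colon is the provider.
tier_provider() { printf '%s\n' "${1%%:*}"; }

# Everything after the first colon is the model (may contain more colons).
tier_model() { printf '%s\n' "${1#*:}"; }

# Example values taken from the configuration above.
TIER_SIMPLE="ollama:qwen2.5:3b"
TIER_COMPLEX="ollama:deepseek-r1:14b"

for spec in "$TIER_SIMPLE" "$TIER_COMPLEX"; do
  echo "provider=$(tier_provider "$spec") model=$(tier_model "$spec")"
done
```

Splitting on the last colon instead would turn `ollama:qwen2.5:3b` into provider `ollama:qwen2.5` and model `3b`, which is why the first colon is used as the separator here.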
Copy-paste ready configuration for immediate use:
# .env - Minimal Ollama Setup
# ============================================
# REQUIRED: Provider Configuration
# ============================================
MODEL_PROVIDER=ollama
FALLBACK_ENABLED=false
# ============================================
# REQUIRED: Ollama Settings
# ============================================
OLLAMA_ENDPOINT=http://localhost:11434
OLLAMA_MODEL=qwen2.5-coder:latest
# ============================================
# REQUIRED: Server Configuration
# ============================================
PORT=8081
HOST=0.0.0.0
# ============================================
# REQUIRED: Claude Code/Cursor Compatibility
# ============================================
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
POLICY_SAFE_COMMANDS_ENABLED=false
# ============================================
# OPTIONAL: Performance (Recommended)
# ============================================
LOG_LEVEL=warn
LOAD_SHEDDING_ENABLED=true
LOAD_SHEDDING_HEAP_THRESHOLD=0.85
Steps:
- Install Ollama: curl -fsSL https://ollama.com/install.sh | sh
- Pull model: ollama pull qwen2.5-coder:latest
- Copy above to .env in your project directory
- Run: lynkr start
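Before running `lynkr start`, it can help to confirm that the copied .env actually defines the keys from the template above. This is a minimal sketch; `env_get` and `missing_keys` are hypothetical helpers, and the parsing assumes plain `KEY=value` lines without quoting or inline comments.

```shell
# Print the value of KEY from a dotenv-style file (last assignment wins).
# Commented-out lines do not match because they start with '#'.
env_get() {
  grep -E "^$1=" "$2" | tail -n 1 | cut -d= -f2-
}

# Print every listed key that is absent or empty in the file.
missing_keys() {
  mk_file="$1"
  shift
  for mk_key in "$@"; do
    [ -n "$(env_get "$mk_key" "$mk_file")" ] || echo "$mk_key"
  done
}

# Check the minimal Ollama setup if a .env exists in the current directory.
if [ -f .env ]; then
  missing="$(missing_keys .env MODEL_PROVIDER OLLAMA_ENDPOINT OLLAMA_MODEL PORT)"
  if [ -n "$missing" ]; then
    echo "Missing in .env:" $missing >&2
  else
    echo ".env has all keys for the minimal Ollama setup"
  fi
fi
```

The same check works for the other setups by swapping in their keys, for example `OPENROUTER_API_KEY` for OpenRouter.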
Optimized for cost savings with smart routing:
# .env - Production OpenRouter Setup
# ============================================
# REQUIRED: Provider Configuration
# ============================================
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key-here
FALLBACK_ENABLED=false
# ============================================
# REQUIRED: Server Configuration
# ============================================
PORT=8081
HOST=0.0.0.0
# ============================================
# TIER ROUTING: Smart Cost Optimization
# ============================================
# Simple queries → Cheap/fast model
TIER_SIMPLE=openrouter:google/gemini-flash-1.5
# Normal coding → Balanced model
TIER_MEDIUM=openrouter:anthropic/claude-3.5-sonnet
# Complex refactoring → Powerful model
TIER_COMPLEX=openrouter:anthropic/claude-opus-4
# Deep reasoning → Most capable model
TIER_REASONING=openrouter:anthropic/claude-opus-4
# ============================================
# REQUIRED: Claude Code/Cursor Compatibility
# ============================================
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
POLICY_SAFE_COMMANDS_ENABLED=false
# ============================================
# OPTIONAL: Token Optimization (60-80% savings)
# ============================================
PROMPT_CACHE_ENABLED=true
SEMANTIC_CACHE_ENABLED=true
SEMANTIC_CACHE_THRESHOLD=0.95
TOOL_INJECTION_ENABLED=false
# ============================================
# OPTIONAL: Performance Tuning
# ============================================
LOG_LEVEL=warn
LOAD_SHEDDING_ENABLED=true
LOAD_SHEDDING_HEAP_THRESHOLD=0.85
Expected savings: 70-90% of requests use Gemini Flash (
For teams using Databricks Model Serving:
# .env - Enterprise Databricks Setup
# ============================================
# REQUIRED: Provider Configuration
# ============================================
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
DATABRICKS_API_KEY=dapi1234567890abcdef
FALLBACK_ENABLED=false
# ============================================
# REQUIRED: Model Configuration
# ============================================
# Option 1: Single model (no tier routing)
DATABRICKS_MODEL=databricks-meta-llama-3-1-405b-instruct
# Option 2: Tier routing (comment out above, uncomment below)
# TIER_SIMPLE=databricks:databricks-meta-llama-3-1-70b-instruct
# TIER_MEDIUM=databricks:databricks-claude-sonnet-4-5
# TIER_COMPLEX=databricks:databricks-claude-opus-4-6
# TIER_REASONING=databricks:databricks-claude-opus-4-6
# ============================================
# REQUIRED: Server Configuration
# ============================================
PORT=8081
HOST=0.0.0.0
# ============================================
# REQUIRED: Claude Code/Cursor Compatibility
# ============================================
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
POLICY_SAFE_COMMANDS_ENABLED=false
# ============================================
# OPTIONAL: Enterprise Features
# ============================================
LOG_LEVEL=info
LOAD_SHEDDING_ENABLED=true
LOAD_SHEDDING_HEAP_THRESHOLD=0.85
# Optional: Metrics for monitoring
# PROMETHEUS_METRICS_ENABLED=true
Use free Ollama and fall back to cloud when needed:
# .env - Hybrid Setup (Advanced)
# ============================================
# PRIMARY: Local Ollama
# ============================================
MODEL_PROVIDER=ollama
OLLAMA_ENDPOINT=http://localhost:11434
OLLAMA_MODEL=qwen2.5-coder:latest
# ============================================
# FALLBACK: Cloud Provider
# ============================================
FALLBACK_ENABLED=true
FALLBACK_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key-here
# ============================================
# TIER ROUTING: Mix Local + Cloud
# ============================================
TIER_SIMPLE=ollama:qwen2.5:3b
TIER_MEDIUM=ollama:qwen2.5:7b
TIER_COMPLEX=openrouter:anthropic/claude-3.5-sonnet
TIER_REASONING=openrouter:anthropic/claude-opus-4
# ============================================
# REQUIRED: Server Configuration
# ============================================
PORT=8081
HOST=0.0.0.0
# ============================================
# REQUIRED: Claude Code/Cursor Compatibility
# ============================================
POLICY_MAX_STEPS=50
POLICY_MAX_TOOL_CALLS=100
POLICY_SAFE_COMMANDS_ENABLED=false
# ============================================
# OPTIONAL: Performance
# ============================================
LOG_LEVEL=warn
LOAD_SHEDDING_ENABLED=true
Best of both worlds: 80% of requests stay local (free). Complex tasks use cloud (paid).
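The primary-then-fallback behaviour that `FALLBACK_ENABLED` controls can be pictured with the sketch below. It is an illustration of the general pattern, not Lynkr's implementation: `call_ollama` and `call_openrouter` are hypothetical stand-ins for real provider calls, and `OLLAMA_UP` is an invented switch used only to simulate an outage.

```shell
# Hypothetical stand-ins: each prints a reply on success or fails.
call_ollama() { [ "${OLLAMA_UP:-1}" = "1" ] && echo "ollama: $1"; }
call_openrouter() { echo "openrouter: $1"; }

# Try the primary provider first; use the fallback only when it is enabled.
route_request() {
  if call_ollama "$1"; then
    return 0
  fi
  if [ "${FALLBACK_ENABLED:-false}" = "true" ]; then
    call_openrouter "$1"
  else
    echo "primary provider failed and fallback is disabled" >&2
    return 1
  fi
}
```

With `FALLBACK_ENABLED=false`, as in the single-provider setups earlier, a failed primary call surfaces as an error instead of silently switching to a paid provider.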
| Issue | Solution |
|---|---|
| "Service temporarily overloaded" | Ollama model too large for RAM. Use smaller model or increase --max-old-space-size |
| "Route not found: HEAD /" | Ignore - harmless health check from Claude Code |
| "Hallucinated tool calls" | Normal - Lynkr automatically filters invalid tools |
| "Safe Command DSL blocked" | Add POLICY_SAFE_COMMANDS_ENABLED=false to .env |
| "spawn graphify ENOENT" | Optional feature. Set CODE_GRAPH_ENABLED=false in .env (see Advanced Features section for installation) |
| Slow first request (20+ sec) | Ollama loading model into memory. Add OLLAMA_KEEP_ALIVE=30m in Ollama config |
| No response after N turns | Remove POLICY_MAX_STEPS and POLICY_MAX_TOOL_CALLS from .env (unlimited by default in v9.3.0+) |
# Enable all optimizations
PROMPT_CACHE_ENABLED=true
SEMANTIC_CACHE_ENABLED=true
TOOL_INJECTION_ENABLED=false
CODE_MODE_ENABLED=true
MEMORY_ENABLED=true
MEMORY_TTL=3600000 # 1 hour
LOAD_SHEDDING_ENABLED=true
LOAD_SHEDDING_HEAP_THRESHOLD=0.85
curl -X POST http://localhost:8081/v1/admin/reload
Graphify provides AST-based code analysis for smarter routing decisions.
Installation (Rust required):
# Install Rust if not already installed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
# Build and install graphify
git clone https://github.com/safishamsi/graphify
cd graphify
cargo build --release
sudo cp target/release/graphify /usr/local/bin/
# Verify installation
graphify --version
Enable in .env:
CODE_GRAPH_ENABLED=true
CODE_GRAPH_WORKSPACE=/path/to/your/project # Optional, defaults to cwd
Features:
- AST-based complexity scoring
- Structural code analysis (19 languages supported)
- Enhanced routing decisions based on code structure
Note: Graphify is completely optional. If not installed, Lynkr falls back to simpler complexity analysis.
NPM (recommended)
npm install -g lynkr
One-line installer
curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
Homebrew
brew tap vishalveerareddy123/lynkr
brew install lynkr
Docker
git clone https://github.com/Fast-Editor/Lynkr.git
cd Lynkr
docker-compose up -d
From source
git clone https://github.com/Fast-Editor/Lynkr.git
cd Lynkr
npm install
cp .env.example .env
npm start
| Guide | Description |
|---|---|
| Installation | All installation methods |
| Provider Setup | Configuration for all 12+ providers |
| Claude Code | Claude Code CLI integration |
| Cursor IDE | Cursor setup + troubleshooting |
| Codex CLI | Codex configuration |
| Tier Routing | Smart model routing by complexity |
| Token Optimization | 60-80% cost reduction |
| Troubleshooting | Common issues and solutions |
| API Reference | REST API endpoints |
| Production | Enterprise deployment |
| Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter |
|---|---|---|---|
| Daily coding (8h) | $10-30/day | $0 (free) | $2-8/day |
| Monthly (heavy use) | $300-900 | $0 | $60-240 |
With tier routing + token optimization: additional 60-80% savings on cloud providers.
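As a rough worked example, the sketch below applies the 60-80% range to the monthly OpenRouter figures in the table. It assumes the savings come on top of those figures, which the table does not state explicitly; `remaining_cost` is a hypothetical helper using integer dollars.

```shell
# Remaining cost after a percentage saving (integer arithmetic).
remaining_cost() {
  echo $(( $1 * (100 - $2) / 100 ))
}

low="$(remaining_cost 60 80)"    # lowest bill with the largest saving
high="$(remaining_cost 240 60)"  # highest bill with the smallest saving
echo "Estimated monthly cost after optimization: \$${low}-\$${high}"
```

Under that assumption, a $60-240 monthly bill would drop to roughly $12-96.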
| Feature | Lynkr | LiteLLM | OpenRouter | PortKey |
|---|---|---|---|---|
| Setup | npm install -g lynkr | Python + Docker + Postgres | Account signup | Docker stack |
| Claude Code native | ✅ Drop-in | ❌ | | |
| Cursor native | ✅ Drop-in | ❌ | | |
| Local models | Ollama, llama.cpp, LM Studio, MLX | Ollama only | ❌ | ❌ |
| Tier routing | Auto complexity-based | ❌ Manual | Cost-based only | ❌ Manual |
| Token optimization | 60-80% built-in | ❌ | ❌ | Cache only |
| Self-hosted | ✅ Node.js only | ✅ Python stack | ❌ SaaS | ✅ Docker |
| Dependencies | Node.js 20+ | Python, Prisma, PostgreSQL | None | Docker, Python |
Lynkr's edge: Purpose-built for AI coding tools. Zero-config for Claude Code, Cursor, and Codex. Installs in one command, runs anywhere Node.js runs.
- GitHub Discussions — Ask questions
- Report Issues — Bug reports
- NPM Package — Official releases
- DeepWiki — AI-powered docs
Apache 2.0 — See LICENSE.
Built by Vishal Veera Reddy for developers who want control over their AI tools.