SwitchIt is a lightweight, OpenAI-compatible AI gateway. I built it because I needed a simple gateway for my projects running on resource-constrained hardware like Set-Top Boxes (STB).
- OpenAI Compatible: Exposes /v1/chat/completions and /v1/models endpoints, making it easy to point my coding tools or OpenAI clients to it.
- Gemini Translation: Translates request and response structures (including SSE streams) to the Google Gemini API.
- OpenAI Pass-Through: Proxies requests to other OpenAI-compatible upstreams (e.g., AgentRouter, LiteLLM) when I want to use other models.
- Custom Headers: Can send custom headers (like User-Agent) to upstreams if needed (disabled by default).
- Priority & Failover: Orders providers by priority. If a high-priority provider fails (e.g., rate limit 429 or auth error), it automatically fails over to the next provider in the list.
- Budget Control: Tracks daily and monthly spend in USD. Blocks requests with a 429 error once budgets are hit to prevent runaway coding agent bills.
- SQLite Request Logs: Logs requests in a local SQLite database, automatically pruning entries older than 7 days (configurable) to save space.
- TUI Dashboard: A terminal-based monitor (switchit-tui) that displays real-time spend gauges, token usage, and recent request history.
- Lightweight: Built in Rust and runs on a single-threaded Tokio event loop to keep memory usage under 15 MB at idle.
- Hot-Reload: Polls the config file for changes every 5 seconds and reloads it without restarting the server.
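
The Budget Control behaviour from the list above can be sketched in a few lines of Rust. This is a minimal illustration, not SwitchIt's actual code: the `Budget` struct, the `Decision` enum, and the `check`/`record` methods are hypothetical names, and only the daily/monthly limits and the 429 status come from this README.

```rust
/// Hypothetical spend tracker. Only the two budget limits and the
/// 429 rejection are taken from the README; the rest is assumed.
#[derive(Debug, Clone, PartialEq)]
struct Budget {
    daily_budget: f64,     // USD
    monthly_budget: f64,   // USD
    spent_today: f64,      // USD spent so far today
    spent_this_month: f64, // USD spent so far this month
}

#[derive(Debug, PartialEq)]
enum Decision {
    /// Forward the request to a provider.
    Allow,
    /// Reject the request with this HTTP status code.
    Reject(u16),
}

impl Budget {
    fn new(daily_budget: f64, monthly_budget: f64) -> Self {
        Budget {
            daily_budget,
            monthly_budget,
            spent_today: 0.0,
            spent_this_month: 0.0,
        }
    }

    /// Called before forwarding a request: block once either limit is reached.
    fn check(&self) -> Decision {
        if self.spent_today >= self.daily_budget
            || self.spent_this_month >= self.monthly_budget
        {
            Decision::Reject(429)
        } else {
            Decision::Allow
        }
    }

    /// Called after a response: add the request's cost to both counters.
    fn record(&mut self, cost_usd: f64) {
        self.spent_today += cost_usd;
        self.spent_this_month += cost_usd;
    }
}

fn main() {
    let mut budget = Budget::new(1.00, 20.00);
    budget.record(0.25);
    println!("{:?}", budget.check()); // prints "Allow"
    budget.record(0.75);
    println!("{:?}", budget.check()); // prints "Reject(429)"
}
```

The sketch leaves out resetting the counters at day and month boundaries and persisting them between restarts, both of which a real gateway would need.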
switchit/
├── Cargo.toml              # Workspace configuration
├── config.example.toml     # Template config file
├── config.toml             # Local config (gitignored)
├── switchit.service        # systemd unit file for Linux
└── crates/
    ├── switchit-common/    # Shared structs & types
    ├── switchit-daemon/    # Axum gateway & provider handlers
    └── switchit-tui/       # Ratatui terminal dashboard
- Priority Routing: Providers are sorted by priority. The gateway always tries the highest-priority provider first, so consecutive requests land on the same upstream and benefit from its prompt caching.
- Non-retryable Errors: If the gateway gets a rate limit (429) or authentication error (401/403), it immediately fails over to the next provider.
- Retryable Errors: For connection timeouts or 5xx server errors, it retries with exponential backoff before failing over.
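
The retry and failover rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: `Action`, `decide`, `backoff`, the 250 ms base delay, the 8 s cap, and the handling of statuses the list does not mention are all hypothetical and not taken from SwitchIt's source.

```rust
use std::time::Duration;

// Assumed tuning values; the README does not state the real ones.
const BASE_DELAY_MS: u64 = 250;
const MAX_DELAY_MS: u64 = 8_000;

#[derive(Debug, PartialEq)]
enum Action {
    /// The provider answered successfully.
    Done,
    /// Wait this long, then retry the same provider.
    RetryAfter(Duration),
    /// Give up on this provider and move to the next one by priority.
    FailOver,
}

/// Exponential backoff: the delay doubles with each attempt, up to a cap.
fn backoff(attempt: u32) -> Duration {
    let factor = 2u64.saturating_pow(attempt);
    Duration::from_millis(BASE_DELAY_MS.saturating_mul(factor).min(MAX_DELAY_MS))
}

/// `status` is `None` when no HTTP response arrived (e.g. a connection timeout).
/// `attempt` counts retries already made against the current provider.
fn decide(status: Option<u16>, attempt: u32, max_retries: u32) -> Action {
    match status {
        Some(200..=299) => Action::Done,
        // Rate limit and auth errors: retrying the same provider will not help.
        Some(401) | Some(403) | Some(429) => Action::FailOver,
        // Timeouts and 5xx: retry with backoff while attempts remain.
        None | Some(500..=599) if attempt < max_retries => {
            Action::RetryAfter(backoff(attempt))
        }
        // Retries exhausted, or a status the README does not cover (assumption).
        _ => Action::FailOver,
    }
}

fn main() {
    for (status, attempt) in [(Some(429), 0), (Some(503), 0), (Some(503), 1), (None, 3)] {
        println!("{:?} attempt {} -> {:?}", status, attempt, decide(status, attempt, 3));
    }
}
```

A caller would walk the providers in priority order, calling `decide` after each response and moving to the next provider whenever it returns `Action::FailOver`.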
To compile:
cargo build --workspace

To build a size-optimized production binary:
cargo build --release --workspace

Copy the template config file:
cp config.example.toml config.toml

Edit config.toml to add keys and adjust daily/monthly budgets or log retention:
[server]
listen = "127.0.0.1:3000"
log_level = "info"
ctl_port = 3001
[storage]
backend = "sqlite"
path = "switchit.db"
retention_days = 7
[limits]
daily_budget = 1.00
monthly_budget = 20.00
usage_file = "usage.json"

export GEMINI_API_KEY="AIzaSy..."
./target/debug/switchit-daemon --config config.toml

To monitor stats, token usage, and logs in real time:
./target/debug/switchit-tui --config config.toml

curl http://127.0.0.1:3000/health
# Response: "ok"

curl http://127.0.0.1:3000/v1/models -H "Authorization: Bearer sk-local-key"

curl http://127.0.0.1:3000/v1/chat/completions \
-H "Authorization: Bearer sk-local-key" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-v4-flash",
"messages": [{"role": "user", "content": "1+1="}]
}'

SwitchIt exposes a standard OpenAI-compatible API at http://<IP>:3000/v1 (offering /v1/chat/completions and /v1/models). This allows you to easily connect it to any AI coding assistant, IDE extension, or CLI tool that supports custom OpenAI endpoints.
- Kilo Code / Cline:
  - Open settings in the extension panel.
  - Choose OpenAI Compatible (or Custom Provider) as the API provider.
  - Set Base URL to http://10.0.0.15:3000/v1 (replace with your gateway's IP).
  - Set API Key to sk-local-key (or leave blank if auth is disabled in your config.toml).
  - Select or type your desired model (e.g., gemini-2.5-flash).
- Continue: Add the following block to your ~/.continue/config.json:
{
  "models": [
    {
      "title": "SwitchIt Gateway",
      "provider": "openai",
      "model": "gemini-2.5-flash",
      "apiBase": "http://10.0.0.15:3000/v1",
      "apiKey": "sk-local-key"
    }
  ]
}
Claude Code (claude) can be configured to point to a custom API proxy or gateway by setting environment variables:
- Via environment variables (per-session):
export ANTHROPIC_BASE_URL="http://10.0.0.15:3000/v1"
export ANTHROPIC_AUTH_TOKEN="sk-local-key" # (if auth is enabled)
claude
- Via settings file (persistent) (~/.claude/settings.json):
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://10.0.0.15:3000/v1",
    "ANTHROPIC_AUTH_TOKEN": "sk-local-key"
  }
}
Aider natively supports custom OpenAI-compatible backends:
export OPENAI_API_BASE="http://10.0.0.15:3000/v1"
export OPENAI_API_KEY="sk-local-key"
aider --model openai/gemini-2.5-flash

If you are using a client or CLI tool that wraps the standard Google Gemini SDK and respects custom endpoints via environment variables:
export GOOGLE_GEMINI_BASE_URL="http://10.0.0.15:3000/v1"
export GEMINI_API_KEY="sk-local-key"