diff --git a/src/pages/docs/self-hosting.mdx b/src/pages/docs/self-hosting.mdx
index dd3c9fbf..e2dd80c7 100644
--- a/src/pages/docs/self-hosting.mdx
+++ b/src/pages/docs/self-hosting.mdx
@@ -1,49 +1,28 @@
 ---
-title: "Self-Hosting Future AGI: Deploy on Your Own Infrastructure"
-description: "Deploy the full Future AGI platform on your own infrastructure using Docker Compose. Follow the step-by-step guide to get all services running locally."
+title: "Self-hosting Future AGI"
+description: "Run the entire Future AGI platform on your own infrastructure with Docker Compose. Your traces, datasets, evaluations, and model calls stay inside your network"
 ---

-## About
+Future AGI is fully open-source. Self-hosting runs the **entire stack on your own machines**, so all traces, datasets, evaluations, and model calls stay within your network. The backend is Django, the frontend is React + Vite, and the LLM gateway is Go, all deployed together with Docker Compose

-Future AGI is fully open-source. Self-hosting runs the entire stack on your machines — all traces, datasets, evaluations, and model calls stay within your network. Backend is Django, frontend is React + Vite, LLM gateway is Go.
+## When to self-host

-Not sure if you need this? The hosted version at [app.futureagi.com](https://app.futureagi.com) is easier to operate. Self-host when you need **data residency**, **air-gapped environments**, **cost control at scale**, or **deep customization**.
+The [**cloud hosted version**](https://app.futureagi.com) is the easiest way to run Future AGI, with nothing to operate. Self-host when you need:

-## Quick start
+- **Data residency**: keep all data inside your own network
+- **Air-gapped environments**: run with no outbound dependencies
+- **Cost control at scale**: own the infrastructure
+- **Deep customization**: modify the open-source stack to fit your needs

-```bash
-git clone https://github.com/future-agi/future-agi.git
-cd future-agi
-cp .env.example .env
-docker pull futureagi/future-agi:v1.8.19_base
-docker compose up
-```
-
-First boot builds from source (~10–15 min). After `Application startup complete`:
-
-| Service | URL |
-|---|---|
-| Frontend | http://localhost:3000 |
-| Backend API | http://localhost:8000 |
-| PeerDB UI | http://localhost:3001 — `peerdb` / `peerdb` |
-
-## Deployment options
-
-| Option | Status |
-|---|---|
-| Docker Compose | Available |
-| Helm / Kubernetes | Coming soon |
-| Air-gapped | Coming soon |
+## What you deploy

-## Architecture
-
-21 containers across four layers.
+Self-hosting brings up the full platform (around **21 containers, with no external dependencies**) across four layers:

 ```
 Browser
 └─ frontend (React/nginx)
-    └─ backend (Django) ──── gateway (Go) ──── OpenAI · Anthropic · Gemini · Bedrock
-         ├── postgres primary DB + WAL replication
+    └─ backend (Django) ──── gateway (Go) ──── OpenAI · Anthropic · Gemini · Bedrock
+         ├── postgres primary database
          ├── clickhouse analytics store
          ├── redis cache / pub-sub
          ├── minio object storage
@@ -52,51 +31,31 @@ Browser
 postgres ──── PeerDB CDC ──── clickhouse (continuous replication)
 ```

-**Application** — `frontend` · `backend` · `worker` · `gateway` · `serving` · `code-executor`
-
-**Data** — `postgres` · `clickhouse` · `redis` · `minio`
+- **Application**: `frontend`, `backend`, `worker`, `gateway`, `serving`, `code-executor`
+- **Data**: `postgres`, `clickhouse`, `redis`, `minio`
+- **Workflow**: `temporal`
+- **CDC**: PeerDB (continuous Postgres → ClickHouse replication)

-**Workflow** — `temporal`
+Everything runs on your machines; nothing leaves your network. The full service-by-service breakdown lives in [Configure](/docs/self-hosting/configure)

-**CDC (PeerDB)** — `peerdb-catalog` · `peerdb-temporal` · `peerdb-minio` · `peerdb-flow-api` · `peerdb-flow-worker` · `peerdb-flow-snapshot-worker` · `peerdb-server` · `peerdb-ui` · `peerdb-temporal-init` · `peerdb-init`
+## Deployment options

-| Layer | Service | Purpose |
-|---|---|---|
-| App | `frontend` | React SPA served by nginx |
-| App | `backend` | Django REST + gRPC + WebSocket API |
-| App | `worker` | Temporal worker — evals, agent loops, data jobs |
-| App | `gateway` | Go LLM proxy — routing, retries, rate limits, logging |
-| App | `serving` | Embeddings and small model inference |
-| App | `code-executor` | nsjail-sandboxed eval code runner (`privileged: true` required) |
-| Data | `postgres` | Primary DB — users, traces, datasets, evals, prompts |
-| Data | `clickhouse` | Analytics DB — replicated from Postgres via PeerDB |
-| Data | `redis` | Cache, rate limits, WebSocket pub/sub |
-| Data | `minio` | S3-compatible object storage (swap for S3 in prod) |
-| Workflow | `temporal` | Durable workflow engine — shares main Postgres |
-| CDC | PeerDB stack | Continuous Postgres → ClickHouse replication (10 services) |
+| Option | Status |
+|---|---|
+| Docker Compose | Available |
+| Helm / Kubernetes | Coming soon |
+| Air-gapped | Coming soon |

-## Next Steps
+## Where to go next

-
-
- Hardware tiers, platform compatibility, ports reference.
-
-
- Setup, deployment modes, day-to-day operations.
-
-
- Full `.env` reference — secrets, ports, flags, keys.
-
-
- LLM gateway providers, PeerDB mirrors, Temporal workers.
-
-
- Create accounts via email or Django shell.
+
+
+ System requirements, prerequisites, environment variables, and setup
-
- Hardening, backups, monitoring, upgrades.
+
+ Fixes for common errors and answers to frequent questions
-
- Solutions for every known error.
+
+ Get help from the Future AGI team and community
diff --git a/src/pages/docs/self-hosting/configuration/system.mdx b/src/pages/docs/self-hosting/configuration/system.mdx
new file mode 100644
index 00000000..11be1545
--- /dev/null
+++ b/src/pages/docs/self-hosting/configuration/system.mdx
@@ -0,0 +1,143 @@
+---
+title: "System Configuration"
+description: "Configure the moving parts beyond .env — the LLM gateway config.yaml with provider keys, PeerDB Postgres-to-ClickHouse replication mirrors, and Temporal worker concurrency."
+---
+
+## Introduction
+
+A few parts of the stack are configured outside `.env`: the LLM gateway needs a `config.yaml` listing its providers, PeerDB needs its replication mirrors running, and Temporal workers can be tuned for throughput. This page covers all three. Set your secrets and provider keys in [Environment Variables](/docs/self-hosting/configuration/environment) first — the config here references them.
+
+## LLM Gateway
+
+The gateway is a Go proxy that routes every model call the platform makes. It reads a `config.yaml` that lists which providers it may use and which models each exposes.
+
+
+Model calls fail until this file exists. The gateway ships with `config.example.yaml` (OpenAI enabled) but **not** a live `config.yaml` — you create one in the steps below.
+
+
+
+
+```bash
+cp futureagi/agentcc-gateway/config.example.yaml \
+  futureagi/agentcc-gateway/config.yaml
+```
+
+
+
+Edit `config.yaml` — uncomment the providers you want and reference their keys with `${VAR}` interpolation. Set the matching keys (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, …) in `.env`. See the provider examples below.
+
+
+
+Point the gateway volume at your `config.yaml` in `docker-compose.yml`:
+
+```yaml
+volumes:
+  - ./futureagi/agentcc-gateway/config.yaml:/app/config.yaml:ro
+```
+
+```bash
+docker compose up -d --force-recreate gateway
+```
+
+
+
+
+`config.yaml` is gitignored and holds live API keys. Treat it as a secret — never commit it.
+
+
+### Provider Examples
+
+
+
+```yaml
+providers:
+  openai:
+    api_key: "${OPENAI_API_KEY}"
+    api_format: "openai"
+    models: [gpt-4o, gpt-4o-mini]
+
+  anthropic:
+    api_key: "${ANTHROPIC_API_KEY}"
+    api_format: "anthropic"
+    models: [claude-opus-4-5, claude-sonnet-4-5]
+
+  gemini:
+    api_key: "${GOOGLE_API_KEY}"
+    api_format: "gemini"
+    models: [gemini-2.0-flash, gemini-1.5-pro]
+```
+
+
+```yaml
+providers:
+  bedrock:
+    api_key: "${AWS_SECRET_ACCESS_KEY}"
+    api_format: "bedrock"
+    region: "${AWS_REGION}"
+    access_key: "${AWS_ACCESS_KEY_ID}"
+    models: [anthropic.claude-3-5-sonnet-20241022-v2:0]
+```
+
+
+```yaml
+providers:
+  vertex:
+    base_url: "https://us-central1-aiplatform.googleapis.com"
+    api_key: "${GOOGLE_ACCESS_TOKEN}"
+    api_format: "gemini"
+    headers:
+      x-gcp-project: "${GCP_PROJECT_ID}"
+      x-gcp-location: "us-central1"
+    models: [gemini-2.0-flash-001]
+```
+
+Vertex uses a Bearer token, not a static API key. Rotate `GOOGLE_ACCESS_TOKEN` with a sidecar that calls `gcloud auth print-access-token`.
+
+
+
+For routing rules, rate limits, caching, and the full config reference, see [Agent Command Center → Self-Hosted](/docs/command-center/deployment/self-hosted).
+
+## PeerDB Replication
+
+PeerDB continuously replicates Postgres tables into ClickHouse (change-data-capture) so trace and eval analytics stay fast. It runs on its own — the only thing you typically touch is a first-boot timing fix.
+
+
+**First-boot timing.** `peerdb-init` runs the moment the stack starts, sometimes before Django has finished its migrations. If mirrors show "not started" in the PeerDB UI, re-run init once the backend is up:
+
+```bash
+docker compose logs -f backend # wait for "Application startup complete"
+docker compose run --rm peerdb-init bash /setup.sh # re-run init
+```
+
+
+Verify at [http://localhost:3001](http://localhost:3001) — mirrors should move to `running` within seconds. Re-run the same init command after any upgrade that changes replicated tables.
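The first-boot fix in the PeerDB section relies on watching the backend log by eye. As a rough sketch, that wait could be scripted. `wait_for_marker` below is a hypothetical helper, not something shipped in the repository, and it assumes the backend prints the startup line exactly as quoted in the callout.

```bash
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not part of the stack):
# read log lines from stdin and succeed as soon as one contains the marker.
wait_for_marker() {
  local marker="$1" line
  while IFS= read -r line; do
    case "$line" in
      *"$marker"*) return 0 ;;  # marker seen: stop reading
    esac
  done
  return 1                       # input ended without the marker
}

# Intended use with the two commands from the PeerDB section (left commented
# so the snippet has no side effects when sourced):
#   docker compose logs -f backend | wait_for_marker "Application startup complete" \
#     && docker compose run --rm peerdb-init bash /setup.sh
```

Matching on a substring keeps the check independent of any timestamp or service prefix in the log output. If the log stream ends before the marker appears, the helper returns 1 and the init command is skipped rather than run too early.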
+
+## Temporal Workers
+
+Temporal runs the platform's background jobs and evaluation pipelines. How those jobs are distributed across workers depends on one flag.
+
+**All-queue (default).** One worker polls every task queue. Controlled by `TEMPORAL_ALL_QUEUES=true` in `.env`. This is the right setup for most self-hosted deployments.
+
+**Per-queue (dev overlay).** Six dedicated workers, one per queue, brought up by the [dev overlay](/docs/self-hosting/install#dev-overlay):
+
+| Service | Queue | Typical concurrency |
+|---|---|---|
+| `worker-default` | `default` | 100 |
+| `worker-tasks-s` | `tasks_s` | 200 |
+| `worker-tasks-l` | `tasks_l` | 50 |
+| `worker-tasks-xl` | `tasks_xl` | 10 |
+| `worker-trace-ingestion` | `trace_ingestion` | 100 |
+| `worker-agent-compass` | `agent_compass` | 50 |
+
+Tune throughput with `TEMPORAL_MAX_CONCURRENT_ACTIVITIES` and `TEMPORAL_MAX_CONCURRENT_WORKFLOW_TASKS` in `.env`. The Temporal UI is available in dev mode at [http://localhost:8085](http://localhost:8085).
+
+## Dive Deeper
+
+
+
+ Hardening, backups, and monitoring before going live.
+
+
+ Fixes for gateway, PeerDB, and Temporal errors.
+
+