Public HTTP API gateway (Rust + axum).
Authenticates clients, validates requests against the OpenAPI contract, and
forwards to the internal gRPC orchestrator. The gateway is stateless (no
database) — per-user API-key lookup and quota counting are delegated to the
orchestrator over gRPC.

    client --HTTP/JSON--> gateway --gRPC--> orchestrator --gRPC--> sidecars

The contract lives in FoldForge/proto,
vendored here as a git submodule at ./proto and compiled by build.rs with
tonic-build.

    git clone --recurse-submodules git@github.com:FoldForge/gateway.git
    cd gateway
    cargo run   # listens on 0.0.0.0:8080
    curl localhost:8080/v1/healthz

If you cloned without submodules:

    git submodule update --init --recursive

| route | auth | purpose |
|---|---|---|
| `POST /v1/workflows` | yes | submit a workflow (consumes quota for API-key callers) |
| `GET /v1/workflows` | yes | list workflows (paginated, `?label=k=v` filter) |
| `GET /v1/workflows/{id}` | yes | get one workflow |
| `DELETE /v1/workflows/{id}` | yes | cancel a workflow |
| `POST /v1/workflows/{id}/retry` | yes | re-queue a FAILED workflow |
| `GET /v1/workflows/{id}/events` | yes | resumable SSE progress stream (`Last-Event-ID`) |
| `GET /v1/workflows/{id}/artifacts/{*key}` | yes | stream an artifact the caller's workflow produced (tenant-scoped) |
| `GET /v1/healthz` | no | shallow liveness |
| `GET /v1/readyz` | no | deep readiness (probes orchestrator + DB) |
| `GET /metrics` | no | Prometheus request metrics |

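The resumable events route can be pictured with a small sketch. This is not the gateway's code: `ProgressEvent`, `to_sse_frame`, and `events_to_replay` are hypothetical names, and the sketch assumes event IDs are integers that increase monotonically within one workflow and that the event log is sorted by ID. It shows only the idea behind `Last-Event-ID`: a reconnecting client reports the last ID it saw, and the server replays what came after.

```rust
/// One progress event. Hypothetical type; `id` is assumed to increase
/// monotonically within a single workflow's stream.
#[derive(Debug, Clone, PartialEq)]
pub struct ProgressEvent {
    pub id: u64,
    pub data: String,
}

/// Renders an event in the SSE wire format: an `id:` line, a `data:` line,
/// and a blank line that terminates the event. Assumes `data` is a single
/// line; multi-line payloads need one `data:` line per line.
pub fn to_sse_frame(event: &ProgressEvent) -> String {
    format!("id: {}\ndata: {}\n\n", event.id, event.data)
}

/// Returns the events a reconnecting client still needs.
/// `last_event_id` is the raw `Last-Event-ID` header value, if present.
/// A missing or unparsable value replays the whole log (an assumption of
/// this sketch, not a documented gateway behavior).
pub fn events_to_replay<'a>(
    log: &'a [ProgressEvent],
    last_event_id: Option<&str>,
) -> &'a [ProgressEvent] {
    let last_seen = last_event_id.and_then(|raw| raw.trim().parse::<u64>().ok());
    match last_seen {
        Some(id) => {
            // `log` is sorted by id, so everything at or before `id` forms a prefix.
            let start = log.partition_point(|e| e.id <= id);
            &log[start..]
        }
        None => log,
    }
}
```

A client that reconnects with `Last-Event-ID: 1` against a log holding events 1, 2, and 3 would receive events 2 and 3 under these assumptions.
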

Two credential kinds on the `Authorization: Bearer <token>` header:

- Static `api_token`: admin / break-glass, always valid, not quota-limited. The only credential unless `api_keys_enabled = true`.
- Per-user API keys (when enabled): any other bearer is SHA-256-hashed (the
  raw key never leaves the gateway), validated against the orchestrator's key
  store via gRPC (cached for `auth_cache_ttl_seconds`), and `POST /v1/workflows`
  is counted against the key's quota. Create keys with the orchestrator CLI:
  `foldforge-orchestrator apikey create <principal> [limit] [window_s]`.

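The decision between the two credential kinds can be sketched as follows. This is an illustration rather than the gateway's implementation: `Credential`, `parse_bearer`, and `classify` are hypothetical names, and the SHA-256 hashing, the orchestrator lookup, and the auth cache are deliberately left out. The sketch also matches the `Bearer` scheme case-sensitively and compares tokens with plain `==`, both simplifications; a production comparison would normally be constant-time.

```rust
/// Which of the two credential kinds a bearer token represents.
/// Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
pub enum Credential<'a> {
    /// Matches the static `api_token`: admin, not quota-limited.
    StaticAdmin,
    /// Any other bearer while per-user keys are enabled; the caller would
    /// next hash it and validate the hash against the orchestrator.
    ApiKey(&'a str),
    /// Not the static token, and per-user keys are disabled.
    Rejected,
}

/// Extracts the token from an `Authorization` header value.
/// Returns `None` for other schemes or an empty token.
pub fn parse_bearer(header: &str) -> Option<&str> {
    let token = header.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Classifies a token. Plain `==` is used for brevity only.
pub fn classify<'a>(token: &'a str, api_token: &str, api_keys_enabled: bool) -> Credential<'a> {
    if token == api_token {
        Credential::StaticAdmin
    } else if api_keys_enabled {
        Credential::ApiKey(token)
    } else {
        Credential::Rejected
    }
}
```

With `api_keys_enabled` left at `false`, every bearer other than the static token is rejected in this sketch, which mirrors the statement that the static token is then the only credential.
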
| var | default | meaning |
|---|---|---|
| `BIND_ADDR` | `0.0.0.0:8080` | HTTP bind address |
| `ORCHESTRATOR_ENDPOINT` | `http://127.0.0.1:50051` | gRPC upstream |
| `API_TOKEN` | `dev-token` | static admin / break-glass bearer token |
| `ORCHESTRATOR_RPC_TIMEOUT_SECONDS` | `10` | per-call gRPC deadline (not the SSE stream) |
| `API_KEYS_ENABLED` | `false` | enable per-user API keys + quotas |
| `AUTH_CACHE_TTL_SECONDS` | `30` | API-key auth cache TTL (= revocation latency bound) |
| `AUTH_CACHE_MAX_ENTRIES` | `10000` | auth cache size cap |
| `OBJECT_STORE_CONNECT_TIMEOUT_SECONDS` | `5` | artifact-store TCP/TLS connect bound |
| `OBJECT_STORE_READ_TIMEOUT_SECONDS` | `30` | artifact-store per-read (idle) stall bound |
| `R2_ENDPOINT` | (empty) | object-store endpoint for artifact downloads (empty → the download endpoint returns 503 once the caller is authorized) |
| `R2_REGION` | `auto` | object-store region (sigv4) |
| `R2_BUCKET` | `foldforge` | the single bucket the artifact endpoint serves |
| `R2_ACCESS_KEY_ID` / `R2_SECRET_ACCESS_KEY` | (empty) | object-store credentials (never hardcoded) |

(The FOLDFORGE_GATEWAY__ prefix is omitted from the var names above for brevity;
e.g. the bind address is FOLDFORGE_GATEWAY__BIND_ADDR.)
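The prefix rule can be illustrated with a minimal lookup helper. This is a sketch, not how the gateway actually loads its configuration: `setting` and `PREFIX` are hypothetical names, and the environment is passed in as a map so the example stays deterministic.

```rust
use std::collections::HashMap;

/// Prefix carried by every gateway variable, as described above.
const PREFIX: &str = "FOLDFORGE_GATEWAY__";

/// Looks up the short name (for example `BIND_ADDR`) under the full prefixed
/// key and falls back to the documented default when it is unset.
/// Hypothetical helper for illustration.
pub fn setting(env: &HashMap<String, String>, name: &str, default: &str) -> String {
    let key = format!("{}{}", PREFIX, name);
    env.get(&key)
        .cloned()
        .unwrap_or_else(|| default.to_string())
}
```

In a real process the map could be built with `std::env::vars().collect()`; an unprefixed `BIND_ADDR` would be ignored by this helper, and only `FOLDFORGE_GATEWAY__BIND_ADDR` would override the default.
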
Production-shaped control-plane edge: bearer + per-user-key auth, request
validation, retry, resumable SSE, artifact streaming, Prometheus metrics,
readiness/liveness split, and per-call gRPC deadlines. Reads/mutations and artifact
downloads are tenant-scoped — the caller's principal is forwarded to the
orchestrator as ff-principal metadata, and a non-owner gets 404 (no existence
leak). Errors are structured {code, message, retryable}: actionable 4xx messages
reach the caller verbatim, while 5xx return a stable "internal server error" and
log the real cause server-side (no internal detail leaks). See
../foldforge/docs/MILESTONE-hardening.md
and MILESTONE-hardening-2.md for the hardening history.
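The error policy described above (4xx messages verbatim, 5xx replaced by a stable message) can be sketched like this. The names `ErrorBody` and `public_error` and the example error codes are hypothetical; only the `{code, message, retryable}` shape and the "internal server error" text come from the description, and the sketch logs to stderr where a real service would use its logging framework.

```rust
/// The structured error shape returned to callers.
#[derive(Debug, PartialEq)]
pub struct ErrorBody {
    pub code: String,
    pub message: String,
    pub retryable: bool,
}

/// Builds the caller-visible error for a given HTTP status.
/// For 5xx the real cause is logged and replaced by a stable message;
/// for everything else the message is passed through unchanged.
/// Hypothetical helper for illustration.
pub fn public_error(status: u16, code: &str, cause: &str, retryable: bool) -> ErrorBody {
    let message = if (500..=599).contains(&status) {
        // Keep the detail server-side only.
        eprintln!("internal error ({}): {}", code, cause);
        "internal server error".to_string()
    } else {
        cause.to_string()
    };
    ErrorBody {
        code: code.to_string(),
        message,
        retryable,
    }
}
```

Under this sketch a 404 with the cause "workflow not found" reaches the caller unchanged, while a 502 whose cause names an internal address is reduced to "internal server error".
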
MIT