docs: add API key rate limits guide for Kubernetes installation#448

Open
jpshackelford wants to merge 11 commits into main from
docs/api-key-rate-limits-guide

@jpshackelford
Contributor

Summary

Add documentation for the per-API-key rate limiting feature in the Runtime API. This feature was implemented in OpenHands/runtime-api PR #457 (Linear: APP-1117).

Changes

  • New guide: enterprise/k8s-install/rate-limits.mdx - Comprehensive documentation for configuring per-API-key rate limits
  • Updated navigation: Added the new page to docs.json
  • Updated index: Added card link in enterprise/k8s-install/index.mdx

Guide Contents

The new rate limits guide covers:

  1. Overview - What rate limiting does and why it's useful
  2. How it works - The authentication flow and rate limit enforcement (fixed window strategy)
  3. Configuration examples - curl commands for creating/updating API keys with max_requests_per_minute
  4. API Reference - Full documentation of CreateApiKeyRequest, UpdateApiKeyRequest, and ApiKeyResponse models
  5. Recommended configurations - Examples for:
    • Evaluation/testing keys (100 req/min)
    • Production integration keys (500 req/min)
    • High-volume automation keys (1000 req/min)
  6. Monitoring and troubleshooting - How to track and debug rate limit issues
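
The fixed window strategy named in item 2 can be illustrated with a minimal sketch. This is not the Runtime API's code: the class name `FixedWindowLimiter`, the in-memory counter, and the injectable clock are all assumptions made for illustration. The only behaviors taken from this PR are that limits are per API key, expressed as `max_requests_per_minute`, and that a key with no limit configured is not rate limited.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Per-key fixed-window counter (illustrative only, hypothetical class)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (window index, requests counted in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str, max_requests_per_minute: Optional[int]) -> bool:
        # From the PR: a key without a configured limit is never limited.
        if max_requests_per_minute is None:
            return True
        window = int(self._clock() // 60)  # index of the current one-minute window
        seen_window, count = self._counters.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new minute has started, so the counter resets
        if count >= max_requests_per_minute:
            return False  # a server would answer this request with HTTP 429
        self._counters[key] = (window, count + 1)
        return True
```

Because the counter resets at each window boundary, a client can send up to twice the limit in a short burst that straddles two windows. That is the usual trade-off of a fixed window compared with a sliding window.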

Related


This PR was created by an AI assistant (OpenHands) on behalf of @jpshackelford.

@jpshackelford jpshackelford marked this pull request as ready for review April 10, 2026 19:01
Add documentation for the per-API-key rate limiting feature in the Runtime API.
This feature was implemented in OpenHands/runtime-api PR #457 (APP-1117).

The guide covers:
- How rate limiting works (per-key, fixed window strategy)
- Configuring rate limits when creating or updating API keys
- API reference for CreateApiKeyRequest, UpdateApiKeyRequest, ApiKeyResponse
- Recommended configurations for different use cases
- Monitoring and troubleshooting rate limit issues

Co-authored-by: openhands <openhands@all-hands.dev>
Contributor

@all-hands-bot all-hands-bot left a comment

🔴 Needs rework - Critical accuracy issues

Verified against the runtime-api source. The API endpoints, authentication mechanism, and response formats documented here do not match the actual implementation. Anyone following this guide as written would hit errors immediately.

…minute is not set

Add warning to make it clear that API keys without max_requests_per_minute
configured will have no rate limiting applied.

Co-authored-by: openhands <openhands@all-hands.dev>
Remove examples that might imply specific default values.

Co-authored-by: openhands <openhands@all-hands.dev>
Remove generic troubleshooting advice and add specific details about
the log message format and log level (WARNING) for rate limit events.

Co-authored-by: openhands <openhands@all-hands.dev>
Addressed all critical accuracy issues identified in PR review:

- Fixed API endpoints: /api/keys → /api/admin/api-keys
- Fixed authentication mechanism: documented JWT challenge-response
  flow instead of incorrect X-Admin-Key header
- Fixed key_type enum values: documented 'evaluation' and 'openhands_app'
  instead of invalid 'production' value
- Fixed id field type: UUID string instead of integer
- Fixed API key prefix: 'ah-' instead of 'sk-'
- Fixed rate limit error message format: '500 per 1 minute' instead of
  '500/minute' to match limits library output
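
When diagnosing 429 responses, the limit can be pulled out of an error message with a small parser. The sketch below assumes only the '500 per 1 minute' phrase format quoted above. The helper `parse_limit`, the `ParsedLimit` type, the list of granularities, and the surrounding message text in the usage example are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ParsedLimit(NamedTuple):
    amount: int        # e.g. 500
    multiples: int     # e.g. 1
    granularity: str   # e.g. "minute"


# Matches phrases such as "500 per 1 minute" anywhere in a message.
_LIMIT_RE = re.compile(r"(\d+) per (\d+) (second|minute|hour|day|month|year)")


def parse_limit(message: str) -> Optional[ParsedLimit]:
    """Return the first '<amount> per <multiples> <granularity>' phrase, if any."""
    match = _LIMIT_RE.search(message)
    if match is None:
        return None
    return ParsedLimit(int(match.group(1)), int(match.group(2)), match.group(3))
```

For example, `parse_limit("Rate limit exceeded: 500 per 1 minute")` yields `ParsedLimit(amount=500, multiples=1, granularity='minute')`, while the previously documented `'500/minute'` form does not match and yields `None`.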

Co-authored-by: openhands <openhands@all-hands.dev>
Completely rewrote the documentation to be practical for cluster administrators:

- Added clear explanation of the internal API key architecture (OpenHands
  Server → Runtime API, not user-facing)
- Clarified that this is separate from user API keys (sk-oh-*)
- Added step-by-step kubectl commands to retrieve the admin password
- Provided complete bash script for the PBKDF2 authentication flow
- Showed how to find the existing 'default' key ID
- Added guidance on choosing appropriate rate limit values
- Included troubleshooting section with kubectl log commands
- Removed API reference tables (not needed for this single-key workflow)

Co-authored-by: openhands <openhands@all-hands.dev>
Added comprehensive troubleshooting section explaining:

- Rate limiting history: hardcoded 100 req/min in older versions vs
  configurable (no limit by default) in newer versions
- Chart version to image tag mapping table (0.2.8 = sha-1a920e8, etc.)
- Step-by-step diagnostic commands to check chart version, running image,
  and error message format
- How to identify old vs new rate limiting by error message format
- Upgrade instructions to get the new configurable rate limiting

This helps administrators who upgraded but are still seeing 429 errors
understand that they may be running an older image with hardcoded limits.

Co-authored-by: openhands <openhands@all-hands.dev>
Replaced the manual step-by-step instructions with a single, well-commented
bash script (set-rate-limit.sh) that administrators can copy and run.

The script:
- Takes runtime API URL and rate limit as arguments
- Retrieves admin password from Kubernetes secret automatically
- Handles the PBKDF2 challenge-response authentication
- Finds the 'default' API key and shows its current rate limit
- Updates the rate limit and confirms the change
- Includes clear error messages at each step
- Uses extensive comments to explain what each section does

This makes it much easier for administrators to configure rate limits
without having to understand and execute each step manually.
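
For readers unfamiliar with PBKDF2 challenge-response, the general idea can be sketched as follows. This PR does not spell out the protocol, so everything here is an assumption: that the server supplies a salt and a random challenge, the hash function, the iteration count, and the use of HMAC over the challenge. The function names are hypothetical. It shows the technique in general, not the Runtime API's actual flow.

```python
import hashlib
import hmac


def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Stretch the admin password with PBKDF2-HMAC-SHA256 (assumed parameters)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def answer_challenge(password: str, salt: bytes, challenge: bytes,
                     iterations: int = 100_000) -> str:
    """Client side: prove knowledge of the password without sending it."""
    key = derive_key(password, salt, iterations)
    return hmac.new(key, challenge, hashlib.sha256).hexdigest()


def verify_response(stored_key: bytes, challenge: bytes, response: str) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = hmac.new(stored_key, challenge, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, response)
```

The point of the pattern is that only the derived response crosses the wire, and a fresh challenge per attempt prevents a captured response from being replayed.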

Co-authored-by: openhands <openhands@all-hands.dev>
Added 'Accessing the Runtime API' section explaining two options:

Option A: External URL - for deployments with runtime-api ingress enabled
Option B: Port-forward - for deployments without external ingress, using
  kubectl port-forward to svc/oh-main-runtime-api

This ensures the script works regardless of whether the Runtime API is
exposed externally or only accessible within the cluster.

Co-authored-by: openhands <openhands@all-hands.dev>
Rewrote the script to run commands inside the runtime-api pod via kubectl exec:

- No longer requires external Runtime API URL
- No need for curl or python3 installed locally
- Only prerequisite is kubectl access to the cluster
- Script finds the runtime-api pod automatically
- Runs Python inside the pod (which already has Python installed)
- Uses localhost:5000 to connect to the API from within the pod

This is much simpler for administrators since it works regardless of
whether the Runtime API has external ingress configured.
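
The kind of request the in-pod Python might build can be sketched with the standard library alone. The `localhost:5000` address, the `/api/admin/api-keys` path, and the `max_requests_per_minute` field come from this PR. The `PATCH` method, the `/{key_id}` path suffix, the bearer-token header, and the function name `build_update_request` are assumptions. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:5000"  # in-pod address mentioned in the PR


def build_update_request(key_id: str, limit: Optional[int],
                         token: str) -> urllib.request.Request:
    """Build (but do not send) a rate limit update for one API key."""
    # A limit of None serializes to JSON null, i.e. "remove the limit".
    body = json.dumps({"max_requests_per_minute": limit}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/admin/api-keys/{key_id}",  # path suffix is assumed
        data=body,
        method="PATCH",  # assumed HTTP method
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth header format
            "Content-Type": "application/json",
        },
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`, which is left out here so the sketch runs without a live API.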

Co-authored-by: openhands <openhands@all-hands.dev>
Added --check flag that allows administrators to inspect the current rate
limit configuration without making any changes. This is useful for:

- Verifying the current state before making changes
- Troubleshooting rate limit issues
- Confirming changes after an update

Usage:
  ./set-rate-limit.sh --check    # View current limit
  ./set-rate-limit.sh 500        # Set limit to 500
  ./set-rate-limit.sh null       # Remove limit
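
The three invocations above map onto two actions: inspect, or set (where `null` means no limit). A minimal sketch of that mapping follows. The function name `parse_argument`, the returned tuple shape, and the rejection of zero are assumptions, not the script's actual logic.

```python
from typing import Optional, Tuple


def parse_argument(arg: str) -> Tuple[str, Optional[int]]:
    """Map a single CLI argument to (action, limit)."""
    if arg == "--check":
        return ("check", None)   # read-only: show the current limit
    if arg == "null":
        return ("set", None)     # remove the limit entirely
    if arg.isascii() and arg.isdigit() and int(arg) > 0:
        return ("set", int(arg))  # set a positive requests-per-minute limit
    raise ValueError(f"expected --check, null, or a positive integer, got {arg!r}")
```

Rejecting anything else up front keeps a typo from being sent to the API as a limit value.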

Co-authored-by: openhands <openhands@all-hands.dev>
3 participants