docs: add API key rate limits guide for Kubernetes installation#448
Open
jpshackelford wants to merge 11 commits into main
Add documentation for the per-API-key rate limiting feature in the Runtime API. This feature was implemented in OpenHands/runtime-api PR #457 (APP-1117).

The guide covers:
- How rate limiting works (per-key, fixed window strategy)
- Configuring rate limits when creating or updating API keys
- API reference for CreateApiKeyRequest, UpdateApiKeyRequest, ApiKeyResponse
- Recommended configurations for different use cases
- Monitoring and troubleshooting rate limit issues

Co-authored-by: openhands <openhands@all-hands.dev>
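For readers unfamiliar with the "per-key, fixed window" strategy named above, here is a minimal sketch of the idea. It is not the Runtime API's code: the class name `FixedWindowLimiter`, the injectable `clock`, and the in-memory counter are all assumptions made for illustration. Only the `max_requests_per_minute` field name, and the rule that a key without it is not limited, come from this PR.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FixedWindowLimiter:
    """Per-key fixed-window counter (hypothetical, for illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.time,
                 window_seconds: int = 60) -> None:
        self._clock = clock
        self._window = window_seconds
        # Maps api_key_id -> (window index, requests counted in that window).
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, api_key_id: str,
              max_requests_per_minute: Optional[int]) -> bool:
        # A key with no configured limit is never throttled.
        if max_requests_per_minute is None:
            return True
        # All timestamps inside the same 60-second bucket share one index.
        window_index = int(self._clock() // self._window)
        seen_window, count = self._counters.get(api_key_id, (window_index, 0))
        if seen_window != window_index:
            count = 0  # A new window has started, so the counter resets.
        if count >= max_requests_per_minute:
            return False  # The caller would answer with HTTP 429 here.
        self._counters[api_key_id] = (window_index, count + 1)
        return True
```

Because the counter resets at each window boundary, a client can briefly exceed the nominal rate by sending one burst at the end of a window and another at the start of the next. That is the usual trade-off of a fixed window compared with a sliding one.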
28a0111 to
c700f06
Compare
all-hands-bot (Contributor) requested changes on Apr 10, 2026
🔴 Needs rework - Critical accuracy issues
Verified against runtime-api source. The API endpoints, authentication mechanism, and response formats documented here do not match the actual implementation. This documentation would fail immediately if anyone tried to use it.
…minute is not set

Add a warning to make it clear that API keys without max_requests_per_minute configured will have no rate limiting applied.

Co-authored-by: openhands <openhands@all-hands.dev>

Remove examples that might imply specific default values.

Co-authored-by: openhands <openhands@all-hands.dev>

Remove generic troubleshooting advice and add specific details about the log message format and log level (WARNING) for rate limit events.

Co-authored-by: openhands <openhands@all-hands.dev>
Addressed all critical accuracy issues identified in PR review:
- Fixed API endpoints: /api/keys → /api/admin/api-keys
- Fixed authentication mechanism: documented JWT challenge-response flow instead of incorrect X-Admin-Key header
- Fixed key_type enum values: documented 'evaluation' and 'openhands_app' instead of invalid 'production' value
- Fixed id field type: UUID string instead of integer
- Fixed API key prefix: 'ah-' instead of 'sk-'
- Fixed rate limit error message format: '500 per 1 minute' instead of '500/minute' to match limits library output

Co-authored-by: openhands <openhands@all-hands.dev>
Completely rewrote the documentation to be practical for cluster administrators:
- Added clear explanation of the internal API key architecture (OpenHands Server → Runtime API, not user-facing)
- Clarified that this is separate from user API keys (sk-oh-*)
- Added step-by-step kubectl commands to retrieve the admin password
- Provided complete bash script for the PBKDF2 authentication flow
- Showed how to find the existing 'default' key ID
- Added guidance on choosing appropriate rate limit values
- Included troubleshooting section with kubectl log commands
- Removed API reference tables (not needed for this single-key workflow)

Co-authored-by: openhands <openhands@all-hands.dev>
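The commit above mentions a PBKDF2 challenge-response flow without showing it. The sketch below illustrates the general shape of such a flow only. The salt, iteration count, SHA-256 hash, HMAC step, and every function name are assumptions, not the Runtime API's actual protocol, and the JWT that the real flow issues on success is not modelled here.

```python
import hashlib
import hmac


def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # PBKDF2-HMAC-SHA256 stretches the admin password into a fixed-size key.
    # The salt and iteration count are hypothetical values supplied by the server.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def answer_challenge(password: str, salt: bytes, challenge: bytes,
                     iterations: int = 100_000) -> str:
    # Client side: prove knowledge of the password without sending it,
    # by keying an HMAC over the server's one-time challenge (assumed step).
    key = derive_key(password, salt, iterations)
    return hmac.new(key, challenge, hashlib.sha256).hexdigest()


def verify_response(stored_key: bytes, challenge: bytes, response: str) -> bool:
    # Server side: recompute the expected answer from the stored derived key
    # and compare in constant time to avoid leaking timing information.
    expected = hmac.new(stored_key, challenge, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, response)
```

In a flow like this the password never crosses the wire, and a captured response cannot be replayed because each challenge is used once.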
Added comprehensive troubleshooting section explaining:
- Rate limiting history: hardcoded 100 req/min in older versions vs configurable (no limit by default) in newer versions
- Chart version to image tag mapping table (0.2.8 = sha-1a920e8, etc.)
- Step-by-step diagnostic commands to check chart version, running image, and error message format
- How to identify old vs new rate limiting by error message format
- Upgrade instructions to get the new configurable rate limiting

This helps administrators who upgraded but are still seeing 429 errors understand that they may be running an older image with hardcoded limits.

Co-authored-by: openhands <openhands@all-hands.dev>
Replaced the manual step-by-step instructions with a single, well-commented bash script (set-rate-limit.sh) that administrators can copy and run.

The script:
- Takes runtime API URL and rate limit as arguments
- Retrieves admin password from Kubernetes secret automatically
- Handles the PBKDF2 challenge-response authentication
- Finds the 'default' API key and shows its current rate limit
- Updates the rate limit and confirms the change
- Includes clear error messages at each step
- Uses extensive comments to explain what each section does

This makes it much easier for administrators to configure rate limits without having to understand and execute each step manually.

Co-authored-by: openhands <openhands@all-hands.dev>
Added 'Accessing the Runtime API' section explaining two options:
- Option A: External URL - for deployments with runtime-api ingress enabled
- Option B: Port-forward - for deployments without external ingress, using kubectl port-forward to svc/oh-main-runtime-api

This ensures the script works regardless of whether the Runtime API is exposed externally or only accessible within the cluster.

Co-authored-by: openhands <openhands@all-hands.dev>
Rewrote the script to run commands inside the runtime-api pod via kubectl exec:
- No longer requires external Runtime API URL
- No need for curl or python3 installed locally
- Only prerequisite is kubectl access to the cluster
- Script finds the runtime-api pod automatically
- Runs Python inside the pod (which already has Python installed)
- Uses localhost:5000 to connect to the API from within the pod

This is much simpler for administrators since it works regardless of whether the Runtime API has external ingress configured.

Co-authored-by: openhands <openhands@all-hands.dev>
Added --check flag that allows administrators to inspect the current rate limit configuration without making any changes. This is useful for:
- Verifying the current state before making changes
- Troubleshooting rate limit issues
- Confirming changes after an update

Usage:
    ./set-rate-limit.sh --check   # View current limit
    ./set-rate-limit.sh 500       # Set limit to 500
    ./set-rate-limit.sh null      # Remove limit

Co-authored-by: openhands <openhands@all-hands.dev>
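The three invocations above map onto two actions: inspect, or set a value (where `null` clears the limit). A minimal sketch of that mapping follows, written in Python because the script runs Python inside the pod. The function names and the exact request body shape are hypothetical. Only the `max_requests_per_minute` field name and the three argument forms come from this PR.

```python
import json
from typing import Optional, Tuple


def parse_argument(arg: str) -> Tuple[str, Optional[int]]:
    """Translate one CLI argument into an (action, limit) pair."""
    if arg == "--check":
        return ("check", None)  # Read-only: report the current limit.
    if arg == "null":
        return ("set", None)    # Clearing the limit disables rate limiting.
    if not (arg.isascii() and arg.isdigit()) or int(arg) <= 0:
        raise ValueError(f"expected --check, null, or a positive integer, got {arg!r}")
    return ("set", int(arg))


def build_update_body(limit: Optional[int]) -> str:
    # Assumed payload shape for the key update request; None serializes
    # to JSON null, which is how the limit would be removed.
    return json.dumps({"max_requests_per_minute": limit})
```

Rejecting zero and non-numeric input up front keeps a typo from being sent to the API as a limit that would block every request.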
Summary
Add documentation for the per-API-key rate limiting feature in the Runtime API. This feature was implemented in OpenHands/runtime-api PR #457 (Linear: APP-1117).
Changes
- enterprise/k8s-install/rate-limits.mdx - Comprehensive documentation for configuring per-API-key rate limits
- docs.json
- enterprise/k8s-install/index.mdx

Guide Contents
The new rate limits guide covers:
- max_requests_per_minute
- CreateApiKeyRequest, UpdateApiKeyRequest, and ApiKeyResponse models

Related
This PR was created by an AI assistant (OpenHands) on behalf of @jpshackelford.