docs: clarify public-URL requirement for custom inference endpoints#179
hongyi-chen wants to merge 1 commit into
Surface the public-reachability requirement up front and correct the "internal gateways" framing that led customers (e.g. Octopus Energy with an internal LiteLLM proxy) to expect internal-only endpoints to work.
- Add an up-front statement that the endpoint must be reachable at a public URL, calling out internal-only services like a private LiteLLM proxy as rejected.
- Qualify "internal gateway" mentions in the frontmatter description, Key features, and How it works with the public-URL requirement.
- Rename "Using local models" to "Network requirements" and broaden it to cover internal gateways/proxies in addition to local models.

Co-Authored-By: Oz <oz-agent@warp.dev>
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Powered by Oz
Overview
This PR clarifies that custom inference endpoints must be reachable from Warp's servers at a public URL and updates the internal gateway/local model guidance accordingly. The heading rename does not appear to break existing anchors, and the changed links/anchors map to headings in the diff.
Concerns
- Add a brief access-control caveat to the internal gateway exposure guidance so readers do not publish an unauthenticated private proxy while satisfying the public URL requirement.
Verdict
Found: 0 critical, 0 important, 1 suggestion
Approve with nits
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
This requirement applies to any endpoint that isn't already publicly accessible:

To route through a model running on your own machine (for example, Ollama, LM Studio, vLLM, or llama.cpp), expose it through a tunneling service like [ngrok](https://ngrok.com/) and use the public tunnel URL as the base URL in your endpoint configuration.
* **Internal gateways and proxies** - An internal LiteLLM proxy, corporate AI gateway, or other service that only resolves inside your private network or VPN can't be reached by Warp. Expose it at a public HTTPS URL — for example, through a load balancer, an API gateway, or a tunneling service — before configuring it in Warp.
💡 [SUGGESTION] [SECURITY] Preserve authentication or access controls when telling readers to expose an internal gateway publicly, so the fix for reachability does not imply publishing an unauthenticated proxy.
* **Internal gateways and proxies** - An internal LiteLLM proxy, corporate AI gateway, or other service that only resolves inside your private network or VPN can't be reached by Warp. Expose it at a public HTTPS URL — for example, through a load balancer, an API gateway, or a tunneling service — before configuring it in Warp.
* **Internal gateways and proxies** - An internal LiteLLM proxy, corporate AI gateway, or other service that only resolves inside your private network or VPN can't be reached by Warp. Expose it at a public HTTPS URL, and keep the public endpoint authenticated or protected by access controls — for example, through a load balancer, an API gateway, or a tunneling service — before configuring it in Warp.
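One way a reader could sanity-check the access-control caveat is to confirm that the public endpoint refuses requests that carry no credentials. The sketch below is illustrative only: the function names are hypothetical, and it assumes the gateway signals missing credentials with HTTP 401 or 403, which may not hold for every proxy.

```python
import urllib.error
import urllib.request


def rejects_anonymous(status: int) -> bool:
    """True if the status code indicates the request was refused for lack of credentials."""
    # Assumption: the gateway answers unauthenticated calls with 401 or 403.
    return status in (401, 403)


def anonymous_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET with no Authorization header and return the HTTP status code."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code
```

For example, `rejects_anonymous(anonymous_status("https://llm.example.com/v1/models"))` should be true for a protected gateway; the hostname and path here are placeholders, not values from the docs.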
Summary
Improves the Custom inference endpoint docs page to fix confusion raised by Octopus Energy (and flagged by Trevor and Hong Yi): the page promoted "internal gateways / internal infrastructure you already run," which led customers to expect an internal-only LiteLLM proxy to work. Because requests route through Warp's servers, the endpoint must be reachable at a public URL, but that requirement was buried mid-page under "Using local models."
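As a rough illustration of the requirement, the following sketch flags base URLs that clearly cannot be reached from outside a private network. It is a heuristic under stated assumptions, not Warp's actual validation: the function name and the list of local-only hostname suffixes are hypothetical, and a DNS name that only resolves inside a VPN would still pass because no resolution is attempted.

```python
import ipaddress
from urllib.parse import urlparse

# Assumption: these suffixes are treated as local-only for illustration.
LOCAL_SUFFIXES = (".local", ".internal", ".localhost")


def obviously_not_public(base_url: str) -> list[str]:
    """Return reasons the URL clearly cannot be reached from the public internet."""
    reasons = []
    parsed = urlparse(base_url)
    host = parsed.hostname or ""

    if parsed.scheme != "https":
        reasons.append("scheme is not https")
    if not host:
        reasons.append("no hostname")
        return reasons
    if host == "localhost" or host.endswith(LOCAL_SUFFIXES):
        reasons.append("hostname is local-only")

    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; a DNS name cannot be judged without resolving it.
        return reasons
    if not addr.is_global:
        reasons.append("IP address is not globally routable")
    return reasons
```

An empty list means only that nothing was obviously wrong; for instance, `obviously_not_public("https://10.0.0.5:4000")` reports a non-routable address, while a tunnel URL on a public hostname returns no reasons.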
Changes
No anchors elsewhere link to `#using-local-models`, so the heading rename is safe. Style lint passes with no new issues.

Conversation: https://staging.warp.dev/conversation/c53a093e-9d6c-4bb0-9089-9b0c8fe0e738
Run: https://oz.staging.warp.dev/runs/019e9398-8a60-77a5-bc9b-703a7efe858d
This PR was generated with Oz.