docs: clarify public-URL requirement for custom inference endpoints #179

Open
hongyi-chen wants to merge 1 commit into main from docs/clarify-custom-inference-public-url

@hongyi-chen
Collaborator

Summary

Improves the Custom inference endpoint docs page to fix confusion raised by Octopus Energy (and flagged by Trevor + Hong Yi): the page promoted "internal gateways / internal infrastructure you already run," which led customers to expect an internal-only LiteLLM proxy to work. Because requests route through Warp's servers, the endpoint must be reachable at a public URL — that requirement was buried mid-page under "Using local models."

Changes

  • Surface the requirement up front — Added a bold statement directly after the intro that the endpoint must be reachable at a public URL, explicitly calling out internal-only services (like a private LiteLLM proxy) as rejected, with a link to the details.
  • Correct the "internal gateway" framing — Qualified the internal-gateway mentions in the frontmatter description, Key features, and How it works so they no longer imply internal-only endpoints work.
  • Rename and broaden "Using local models" → "Network requirements" — The section now covers internal gateways/proxies (the exact Octopus Energy / LiteLLM case) in addition to local models, while keeping the existing tunneling guidance.

No anchors elsewhere link to #using-local-models, so the heading rename is safe. Style lint passes with no new issues.

Conversation: https://staging.warp.dev/conversation/c53a093e-9d6c-4bb0-9089-9b0c8fe0e738
Run: https://oz.staging.warp.dev/runs/019e9398-8a60-77a5-bc9b-703a7efe858d

This PR was generated with Oz.

Surface the public-reachability requirement up front and correct the
"internal gateways" framing that led customers (e.g. Octopus Energy with
an internal LiteLLM proxy) to expect internal-only endpoints to work.

- Add an up-front statement that the endpoint must be reachable at a
  public URL, calling out internal-only services like a private LiteLLM
  proxy as rejected.
- Qualify "internal gateway" mentions in the frontmatter description,
  Key features, and How it works with the public-URL requirement.
- Rename "Using local models" to "Network requirements" and broaden it
  to cover internal gateways/proxies in addition to local models.

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label Jun 4, 2026
@vercel

vercel Bot commented Jun 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Preview, Comment | Jun 4, 2026 5:10pm |


@hongyi-chen hongyi-chen marked this pull request as ready for review June 4, 2026 17:54
@oz-for-oss
Contributor

oz-for-oss Bot commented Jun 4, 2026

@hongyi-chen

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Contributor

@oz-for-oss oz-for-oss Bot left a comment


Overview

This PR clarifies that custom inference endpoints must be reachable from Warp's servers at a public URL and updates the internal gateway/local model guidance accordingly. The heading rename does not appear to break existing anchors, and the changed links/anchors map to headings in the diff.

Concerns

  • Add a brief access-control caveat to the internal gateway exposure guidance so readers do not publish an unauthenticated private proxy while satisfying the public URL requirement.

Verdict

Found: 0 critical, 0 important, 1 suggestion

Approve with nits


This requirement applies to any endpoint that isn't already publicly accessible:

To route through a model running on your own machine (for example, Ollama, LM Studio, vLLM, or llama.cpp), expose it through a tunneling service like [ngrok](https://ngrok.com/) and use the public tunnel URL as the base URL in your endpoint configuration.
* **Internal gateways and proxies** - An internal LiteLLM proxy, corporate AI gateway, or other service that only resolves inside your private network or VPN can't be reached by Warp. Expose it at a public HTTPS URL — for example, through a load balancer, an API gateway, or a tunneling service — before configuring it in Warp.
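
As a rough illustration of the requirement above, the following Python sketch pre-checks a base URL before it is entered in Warp. It is a heuristic written for this discussion, not Warp's actual validation: the function name `looks_publicly_reachable` and the `LOCAL_SUFFIXES` list are hypothetical, and the check performs no DNS lookup, so a hostname that only resolves inside a VPN would still pass.

```python
import ipaddress
from urllib.parse import urlsplit

# Hostname suffixes that usually indicate an internal-only service.
# Illustrative only, not an exhaustive or official list.
LOCAL_SUFFIXES = (".local", ".internal", ".lan")


def looks_publicly_reachable(base_url: str) -> bool:
    """Heuristic: HTTPS scheme and a host that is not obviously private."""
    parts = urlsplit(base_url)
    host = parts.hostname
    if parts.scheme != "https" or not host:
        return False
    if host == "localhost" or host.endswith(LOCAL_SUFFIXES):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: it cannot be judged without resolving it.
        return True
    # Literal IPs must be globally routable (not loopback, RFC 1918, etc.).
    return addr.is_global
```

Under these assumptions, an address such as `https://10.0.0.5:4000/v1` or `http://localhost:11434` is flagged, while a public tunnel URL passes.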
Contributor


💡 [SUGGESTION] [SECURITY] Preserve authentication or access controls when telling readers to expose an internal gateway publicly, so the fix for reachability does not imply publishing an unauthenticated proxy.

Suggested change
- * **Internal gateways and proxies** - An internal LiteLLM proxy, corporate AI gateway, or other service that only resolves inside your private network or VPN can't be reached by Warp. Expose it at a public HTTPS URL — for example, through a load balancer, an API gateway, or a tunneling service — before configuring it in Warp.
+ * **Internal gateways and proxies** - An internal LiteLLM proxy, corporate AI gateway, or other service that only resolves inside your private network or VPN can't be reached by Warp. Expose it at a public HTTPS URL, and keep the public endpoint authenticated or protected by access controls — for example, through a load balancer, an API gateway, or a tunneling service — before configuring it in Warp.
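
One way a reader might confirm that a newly exposed gateway still enforces access control is to send a request with no credentials and check that it is refused. The sketch below uses only the Python standard library. The helper names are hypothetical, and it assumes the gateway answers unauthenticated requests with HTTP 401 or 403, which depends on how the gateway is configured.

```python
import urllib.error
import urllib.request


def rejects_anonymous(status: int) -> bool:
    """True when an unauthenticated request was refused (401 or 403)."""
    return status in (401, 403)


def probe_without_credentials(url: str, timeout: float = 5.0) -> int:
    """Send a GET with no Authorization header and return the HTTP status."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code
```

A call such as `rejects_anonymous(probe_without_credentials("https://gateway.example.com/v1/models"))` (placeholder host and path) returning `False` would suggest the endpoint is open to anyone who finds the URL.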
