From b923c84c8dc6a4fae374847a4d135dd057ec9d94 Mon Sep 17 00:00:00 2001
From: "promptless[bot]"
Date: Fri, 26 Jun 2026 18:34:25 +0000
Subject: [PATCH 1/3] Update Serverless create-endpoint flow with new deployment paths (DOCS-446)

---
 serverless/endpoints/overview.mdx | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/serverless/endpoints/overview.mdx b/serverless/endpoints/overview.mdx
index 3d40078c..245e3852 100644
--- a/serverless/endpoints/overview.mdx
+++ b/serverless/endpoints/overview.mdx
@@ -44,15 +44,18 @@ Before creating an endpoint, ensure you have a [handler function](/serverless/wo
 1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless) and click **New Endpoint**.
-2. Choose your deployment source:
-   - **Import Git Repository**: See [Deploy from GitHub](/serverless/workers/github-integration)
-   - **Import from Docker Registry**: See [Deploy from Docker Hub](/serverless/workers/deploy)
-   - **Ready-to-Deploy Repos**: Select a preconfigured endpoint
-3. Configure your endpoint:
-   - **Endpoint Name** and **Type** (Queue-based or Load balancer)
-   - **GPU Configuration** and worker settings
+2. Choose your deployment path:
+   - **Hello World**: Runpod forks a starter worker template into a new GitHub repo in your account. Choose Queue-based or Load balancing, then click **Deploy**.
+   - **Hugging Face LLM**: Search for any text-generation model on Hugging Face (for example, type "Gemma" to find Gemma 4), select it, and click **Create Endpoint**. Runpod deploys a vLLM endpoint for you.
+   - **Docker**: Deploy from a container image. Select a saved Serverless template to fill in the container configuration automatically, or skip the template and enter an image name manually. See [Deploy from Docker](/serverless/workers/deploy).
+   - **GitHub**: Select a repository, filtering by code owner if needed. Runpod checks for a Dockerfile and runs a background check on your handler: queue-based endpoints check for handler files, and load balancing endpoints check for a `/ping` path. See [Deploy from GitHub](/serverless/workers/github-integration).
+   - **Hub**: Opens the Hub browser, where you can browse and deploy prebuilt workers. This replaces the previous "Ready-to-Deploy Repos" option. See [Hub overview](/hub/overview).
+   - **Flash**: A guided setup flow for [Flash](/flash/overview) that walks you through installing the SDK, initializing your project, and sending your first command. Steps complete automatically as you progress.
+3. For the GitHub, Docker, and Hello World paths, configure your endpoint before deploying:
+   - **Endpoint name** and **type** (Queue-based or Load balancing)
+   - **GPU** configuration and worker scaling
    - **Model** (optional): Enter a Hugging Face URL for [cached models](/serverless/endpoints/model-caching)
-   - **Environment Variables**: See [environment variables](/serverless/development/environment-variables)
+   - **Environment variables** and container configuration. See [environment variables](/serverless/development/environment-variables).
 4. Click **Deploy Endpoint**.

From 9579db9e8fb1cc7736305f12eeb51b0ae7108659 Mon Sep 17 00:00:00 2001
From: "promptless[bot]"
Date: Tue, 30 Jun 2026 16:19:36 +0000
Subject: [PATCH 2/3] Link Queue-based and Load balancing to Flash create-endpoints anchors

---
 serverless/endpoints/overview.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/serverless/endpoints/overview.mdx b/serverless/endpoints/overview.mdx
index 245e3852..5f9f97b0 100644
--- a/serverless/endpoints/overview.mdx
+++ b/serverless/endpoints/overview.mdx
@@ -45,7 +45,7 @@ Before creating an endpoint, ensure you have a [handler function](/serverless/wo
 1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless) and click **New Endpoint**.
 2. Choose your deployment path:
-   - **Hello World**: Runpod forks a starter worker template into a new GitHub repo in your account. Choose Queue-based or Load balancing, then click **Deploy**.
+   - **Hello World**: Runpod forks a starter worker template into a new GitHub repo in your account. Choose [Queue-based](/flash/create-endpoints#queue-based-endpoints) or [Load balancing](/flash/create-endpoints#load-balanced-endpoints), then click **Deploy**.
    - **Hugging Face LLM**: Search for any text-generation model on Hugging Face (for example, type "Gemma" to find Gemma 4), select it, and click **Create Endpoint**. Runpod deploys a vLLM endpoint for you.
    - **Docker**: Deploy from a container image. Select a saved Serverless template to fill in the container configuration automatically, or skip the template and enter an image name manually. See [Deploy from Docker](/serverless/workers/deploy).
    - **GitHub**: Select a repository, filtering by code owner if needed. Runpod checks for a Dockerfile and runs a background check on your handler: queue-based endpoints check for handler files, and load balancing endpoints check for a `/ping` path. See [Deploy from GitHub](/serverless/workers/github-integration).

From 6cb60365aa2a63d4665e44ef24a621dd983e3878 Mon Sep 17 00:00:00 2001
From: lgunreddi
Date: Tue, 30 Jun 2026 13:06:04 -0400
Subject: [PATCH 3/3] Update overview.mdx

---
 serverless/endpoints/overview.mdx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/serverless/endpoints/overview.mdx b/serverless/endpoints/overview.mdx
index 5f9f97b0..2a4f55dc 100644
--- a/serverless/endpoints/overview.mdx
+++ b/serverless/endpoints/overview.mdx
@@ -45,14 +45,14 @@ Before creating an endpoint, ensure you have a [handler function](/serverless/wo
 1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless) and click **New Endpoint**.
 2. Choose your deployment path:
-   - **Hello World**: Runpod forks a starter worker template into a new GitHub repo in your account. Choose [Queue-based](/flash/create-endpoints#queue-based-endpoints) or [Load balancing](/flash/create-endpoints#load-balanced-endpoints), then click **Deploy**.
+   - **Hello World**: Runpod forks a starter worker template into a new GitHub repo in your account. Choose Queue-based or Load balancing, then click **Deploy**.
    - **Hugging Face LLM**: Search for any text-generation model on Hugging Face (for example, type "Gemma" to find Gemma 4), select it, and click **Create Endpoint**. Runpod deploys a vLLM endpoint for you.
    - **Docker**: Deploy from a container image. Select a saved Serverless template to fill in the container configuration automatically, or skip the template and enter an image name manually. See [Deploy from Docker](/serverless/workers/deploy).
    - **GitHub**: Select a repository, filtering by code owner if needed. Runpod checks for a Dockerfile and runs a background check on your handler: queue-based endpoints check for handler files, and load balancing endpoints check for a `/ping` path. See [Deploy from GitHub](/serverless/workers/github-integration).
    - **Hub**: Opens the Hub browser, where you can browse and deploy prebuilt workers. This replaces the previous "Ready-to-Deploy Repos" option. See [Hub overview](/hub/overview).
    - **Flash**: A guided setup flow for [Flash](/flash/overview) that walks you through installing the SDK, initializing your project, and sending your first command. Steps complete automatically as you progress.
 3. For the GitHub, Docker, and Hello World paths, configure your endpoint before deploying:
-   - **Endpoint name** and **type** (Queue-based or Load balancing)
+   - **Endpoint name** and **type** ([Queue-based](/flash/create-endpoints#queue-based-endpoints) or [Load balancing](/flash/create-endpoints#load-balanced-endpoints))
    - **GPU** configuration and worker scaling
    - **Model** (optional): Enter a Hugging Face URL for [cached models](/serverless/endpoints/model-caching)
    - **Environment variables** and container configuration. See [environment variables](/serverless/development/environment-variables).
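
The GitHub path described in this series says Runpod checks a repository for a Dockerfile, then for handler files (queue-based) or a `/ping` path (load balancing). The series does not say how that check is implemented, so the following is only a rough local approximation that a repository owner might run before selecting a repo. The function name `preflight`, the `endpoint_type` strings, and the heuristics (searching Python sources for `runpod.serverless.start` or the literal `/ping`) are all assumptions made for illustration, not Runpod's actual logic.

```python
from pathlib import Path


def preflight(repo_dir: str, endpoint_type: str) -> list[str]:
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: the heuristics below are assumptions, not Runpod's
    real background check.
    """
    root = Path(repo_dir)
    problems = []

    # Check 1: a Dockerfile exists somewhere in the repository.
    if not any(root.rglob("Dockerfile")):
        problems.append("no Dockerfile found")

    # Read every Python source once; unreadable bytes are ignored.
    sources = [p.read_text(errors="ignore") for p in root.rglob("*.py")]

    if endpoint_type == "queue":
        # Assumption: a "handler file" is one that starts the Runpod worker loop.
        if not any("runpod.serverless.start" in s for s in sources):
            problems.append("no handler file calling runpod.serverless.start found")
    elif endpoint_type == "load_balancing":
        # Assumption: the health route appears as the literal string "/ping".
        if not any("/ping" in s for s in sources):
            problems.append("no /ping route found")
    else:
        raise ValueError(f"unknown endpoint type: {endpoint_type!r}")

    return problems
```

Calling `preflight(".", "queue")` from a repository root returns the list of missing pieces, which makes it easy to wire into a pre-push hook or CI step if that is useful.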