diff --git a/docs/cli/Guides/swarm-vllm-s3.md b/docs/cli/Guides/swarm-vllm-s3.md
new file mode 100644
index 00000000..5861d465
--- /dev/null
+++ b/docs/cli/Guides/swarm-vllm-s3.md
@@ -0,0 +1,217 @@
+---
+id: "swarm-vllm-s3"
+title: "Super Swarm: LLM Deployment with S3 Storage"
+slug: "/guides/swarm-vllm-s3"
+sidebar_position: 21
+---
+
+This guide provides step-by-step instructions for deploying an LLM on Super Swarm using S3 object storage, with Qwen2.5 as an example. To deploy a different model, modify the deployment script.
+
+## Prerequisites
+
+- [kubectl](https://kubernetes.io/docs/tasks/tools/)
+- [helm](https://helm.sh/docs/intro/install/)
+- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
+- A domain to construct an API endpoint hostname
+
+## 1. Download the deployment script
+
+Download and rename the deployment script [`deploy_qwen_s3.sh`](/files/deploy_qwen_s3.sh).
+
+In the script, find `BASE_DOMAIN="${BASE_DOMAIN:-superprotocol.com}"` and replace `superprotocol.com` with your domain.
+
+Modify the deployment configuration and `vllmConfig` if you are deploying another model.
+
+## 2. Sign in to Super Swarm
+
+In the Super Swarm dashboard, sign in using either Google (recommended) or MetaMask.
+
+
+
+
+## 3. Create a service account
+
+**3.1.** Open **Service Accounts** and click **Create Service Account**:
+
+
+
+
+
+**3.2.** Provide a name and click **Create**:
+
+
+
+
+
+**3.3.** Copy and save both the Access key and the Secret key, then click **Done**:
+
+
+
+
+## 4. Create a bucket
+
+**4.1.** Open **Object Storage** and click **Create Bucket**:
+
+
+
+
+
+**4.2.** Provide a name for the bucket and click **Create Bucket**:
+
+
+
+
+
+## 5. Provide access to the bucket
+
+**5.1.** In Object Storage, click **Policy Rules**:
+
+
+
+
+
+**5.2.** Click **+Grant Access** in the top-right corner, select a Service Account, and click **Grant Access**:
+
+
+
+
+## 6. Download a model from Hugging Face
+
+This guide uses Qwen2.5 as an example. If you already have the model, skip this step.
+
+**6.1.** Install [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/installation).
+
+**6.2.** Download the model:
+
+```shell
+hf download Qwen/Qwen2.5-1.5B-Instruct --local-dir ./qwen-1.5b
+```
+
+## 7. Upload the model
+
+**7.1.** In **Object Storage**, click **Connect Info** to see your S3 Endpoint, Bucket ID, and the region:
+
+
+
+
+
+
+
+
+
+**7.2.** Export the following variables to set up the connection:
+
+```shell
+export AWS_ACCESS_KEY_ID=""
+export AWS_SECRET_ACCESS_KEY=""
+export AWS_DEFAULT_REGION="us-east-1"
+export S3_ENDPOINT=""
+export S3_BUCKET=""
+```
+
+Replace:
+- `` and `` with the keys you obtained in [Step 3](/cli/guides/swarm-vllm-s3#3-create-a-service-account).
+- `` and `` with corresponding values in the **Connect Info**.
+
+Ensure `AWS_DEFAULT_REGION` matches the region in the **Connect Info**.
+
+**7.3.** Upload the model:
+
+```shell
+aws s3 sync ./qwen-1.5b "s3://${S3_BUCKET}/models/qwen-1.5b/" \
+ --endpoint-url "${S3_ENDPOINT}" \
+ --exclude ".cache/*"
+```
+
+**7.4.** Verify that the model files were uploaded:
+
+```shell
+aws s3 ls "s3://${S3_BUCKET}/models/qwen-1.5b/" \
+ --endpoint-url "${S3_ENDPOINT}"
+```
+
+## 8. Create a Kubernetes cluster
+
+**8.1.** Go to **Kubernetes** and click **Create Cluster**:
+
+
+
+
+
+**8.2.** Provide a name, add a **GPU** to the cluster, allocate resources, and click **Create Cluster**:
+
+
+
+
+## 9. Download the cluster configuration file
+
+
+
+
+## 10. Point `kubectl` to the configuration file
+
+Execute the following command:
+
+```shell
+export KUBECONFIG=-kubeconfig.yaml
+```
+
+Replace `-kubeconfig.yaml` with the name of the cluster configuration file.
+
+## 11. Set the API key
+
+Choose a password that will protect your API endpoints. Execute the following command and type your chosen secret (characters won't be displayed):
+
+```shell
+read -rs API_KEY && export API_KEY
+```
+
+## 12. Deploy the model
+
+Execute the deployment script:
+
+```shell
+bash deploy_qwen_s3.sh
+```
+
+## 13. Confirm DNS records
+
+Back in the Super Swarm dashboard, go to **Ingresses** and check the hostname listed there:
+
+
+
+
+
+At your DNS provider, add a CNAME record pointing to the hostname and a TXT record for domain verification.
+
+Ensure the statuses have changed to **Verified** and **Delegated**. This may take a couple of minutes.
+
+
+
+
+## 14. Publish the cluster
+
+Go to **Kubernetes** and publish the cluster.
+
+
+
+
+
+## 15. Send a test request
+
+In the following test request, replace `` with your domain.
+
+```shell
+curl https://qwen-vllm-s3./v1/chat/completions \
+ -H "Authorization: Bearer ${API_KEY}" \
+ -H "Content-Type: application/json" \
+ -d '{
+ "model": "qwen",
+ "messages": [{"role": "user", "content": "Hello! What model are you?"}],
+ "max_tokens": 100
+ }'
+```
+
+## Support
+
+If you have any issues or questions, contact Super Protocol on [Discord](https://discord.gg/superprotocol) or via the [contact form](https://superprotocol.zendesk.com/hc/en-us/requests/new).
\ No newline at end of file
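Editor's sketch for the S3 guide above: `deploy_qwen_s3.sh` exits early when `API_KEY`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `S3_ENDPOINT`, or `S3_BUCKET` is unset, so it may help to check them before uploading or deploying. The helper name `missing_s3_vars` is hypothetical and not part of the script; this is a minimal sketch, assuming a POSIX-compatible shell.

```shell
# Hypothetical helper (not part of deploy_qwen_s3.sh): print the names of
# required variables that are unset or empty, separated by spaces.
missing_s3_vars() {
  missing=""
  for name in API_KEY AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY S3_ENDPOINT S3_BUCKET; do
    # Indirect lookup of the variable called $name; ":-" keeps this safe under `set -u`.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
}
```

One possible use is `[ -z "$(missing_s3_vars)" ] || echo "Set first: $(missing_s3_vars)" >&2`, which prints nothing when all five variables are set.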
diff --git a/docs/cli/Guides/swarm-vllm.md b/docs/cli/Guides/swarm-vllm.md
index 52f4a9ae..0cf02b08 100644
--- a/docs/cli/Guides/swarm-vllm.md
+++ b/docs/cli/Guides/swarm-vllm.md
@@ -1,50 +1,63 @@
---
id: "swarm-vllm"
-title: "vLLM on Super Swarm"
+title: "Super Swarm: LLM Deployment"
slug: "/guides/swarm-vllm"
sidebar_position: 20
---
-This guide provides step-by-step instructions for deploying MedGemma and Apertus on Super Swarm using vLLM.
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This guide provides step-by-step instructions for deploying an LLM on Super Swarm using [vLLM](https://github.com/vllm-project/vllm), with MedGemma and Apertus as examples. Modify the deployment script if you want to launch another model.
## Prerequisites
- [kubectl](https://kubernetes.io/docs/tasks/tools/)
- [helm](https://helm.sh/docs/intro/install/)
-- A domain
+- A domain to construct API endpoint hostnames
- For [MedGemma](https://huggingface.co/google/medgemma-1.5-4b-it): a Hugging Face token from an account that has already accepted the model's terms
-Also, download and rename deployment scripts:
+## 1. Download and update deployment scripts
+
+
+
+ Download and rename the deployment script [`deploy_medgemma_official.sh`](/files/deploy_medgemma_official.sh)
+
+
+ Download and rename the deployment script [`deploy_apertus_official.sh`](/files/deploy_apertus_official.sh)
+
+
+
+In the script, find `BASE_DOMAIN="${BASE_DOMAIN:-superprotocol.com}"` and replace `superprotocol.com` with your domain.
-- [`deploy_medgemma_official.sh`](/files/deploy_medgemma_official.sh)
-- [`deploy_apertus_official.sh`](/files/deploy_apertus_official.sh)
+Modify the deployment parameters if you are using another model.
-## 1. Sign in to Super Swarm
+## 2. Sign in to Super Swarm
-In the Super Swarm dashboard, sign in using MetaMask:
+In the Super Swarm dashboard, sign in using either Google (recommended) or MetaMask.
-
+
-## 2. Create a Kubernetes cluster
+## 3. Create a Kubernetes cluster
-2.1. Go to **Kubernetes** and press **Create Cluster**:
+**3.1.** Go to **Kubernetes** and click **Create Cluster**:
-
+
-2.2. Add a GPU to the cluster, allocate resources, and press **Create Cluster**:
+**3.2.** Provide a name, add a **GPU** to the cluster, allocate resources, and click **Create Cluster**:
-
+
-## 3. Download the cluster configuration file
+## 4. Download the cluster configuration file
-
+
-## 4. Point `kubectl` to the configuration file
+## 5. Point `kubectl` to the configuration file
Execute the following command:
@@ -54,13 +67,9 @@ export KUBECONFIG=-kubeconfig.yaml
Replace `-kubeconfig.yaml` with the name of the downloaded configuration file.
-## 5. Update the scripts
-
-In both scripts (`deploy_medgemma_official.sh` and `deploy_apertus_official.sh`), find `BASE_DOMAIN="${BASE_DOMAIN:-monai-swarm.win}"` and replace `monai-swarm.win` with your domain.
-
## 6. Set the API key
-Choose any password that will protect your API endpoints. Execute the following command and type your chosen secret (characters won't be displayed):
+Choose a password that will protect your API endpoints. Execute the following command and type your chosen secret (characters won't be displayed):
```shell
read -rs API_KEY && export API_KEY
@@ -68,45 +77,44 @@ read -rs API_KEY && export API_KEY
## 7. Deploy the model
-### Apertus
-
-```shell
-bash deploy_apertus_official.sh
-```
-
-The deployment usually takes 5-7 minutes.
-
-A working Apertus config is already set in the script:
-
-```
-dtype=bfloat16
-max-model-len=32768
-gpu-memory-utilization=0.55
-max-num-seqs=8
-max-num-batched-tokens=4096
-```
-
-### MedGemma
-
-```shell
-export HF_TOKEN=hf_xxx
-bash deploy_medgemma_official.sh
-```
-
-Replace `hf_xxx` with an HF_TOKEN.
-
-Alternatively, create a `.hf_token` file with the token next to `deploy_medgemma_official.sh`; the script will read it automatically.
-
-A working MedGemma config is already set in the script:
-
-```
-dtype=bfloat16
-max-model-len=8192
-gpu-memory-utilization=0.40
---mm-processor-cache-gb 1
-max-num-seqs=4
-max-num-batched-tokens=2048
-```
+
+
+ ```shell
+ export HF_TOKEN=hf_xxx
+ bash deploy_medgemma_official.sh
+ ```
+
+ Replace `hf_xxx` with your Hugging Face token.
+
+ Alternatively, create a `.hf_token` file with the token next to `deploy_medgemma_official.sh`; the script will read it automatically.
+
+ A working MedGemma configuration is already set in the script:
+
+ ```
+ dtype=bfloat16
+ max-model-len=8192
+ gpu-memory-utilization=0.40
+ --mm-processor-cache-gb 1
+ max-num-seqs=4
+ max-num-batched-tokens=2048
+ ```
+
+
+ ```shell
+ bash deploy_apertus_official.sh
+ ```
+
+ A working Apertus configuration is already set in the script:
+
+ ```
+ dtype=bfloat16
+ max-model-len=32768
+ gpu-memory-utilization=0.55
+ max-num-seqs=8
+ max-num-batched-tokens=4096
+ ```
+
+
## 8. Check Kubernetes
@@ -126,58 +134,72 @@ Expected output:
Back in the Super Swarm dashboard, go to **Ingresses** and note the two hostnames listed there.
-
+
For each hostname, add a CNAME record pointing to it and a TXT record for domain verification at your DNS provider.
-## 10. Publish the cluster
-
-In the Super Swarm dashboard, go to **Kubernetes** and publish the cluster.
+Back in the Super Swarm dashboard, ensure the statuses are **Verified** and **Delegated**. This may take a couple of minutes.
-
+
-## 11. Send test requests
-
-In the test requests below, replace:
-
-- `` with your domain.
-- `` with the key you set in [Step 6](/cli/guides/swarm-vllm#6-set-the-api-key).
+## 10. Publish the cluster
-### Apertus
+Go to **Kubernetes** and publish the cluster.
-```shell
-curl https://apertus-vllm./v1/completions \
- -H 'Authorization: Bearer ' \
- -H 'Content-Type: application/json' \
- -d '{
- "model": "swiss-ai/Apertus-8B-2509",
- "prompt": "Write a concise technical summary of Kubernetes GPU scheduling.",
- "temperature": 0,
- "max_tokens": 200
- }'
-```
+
+
-### MedGemma
+## 11. Send test requests
-```shell
-curl https://medgemma-vllm./v1/chat/completions \
- -H 'Authorization: Bearer ' \
- -H 'Content-Type: application/json' \
- -d '{
- "model": "google/medgemma-1.5-4b-it",
- "messages": [
- {
- "role": "user",
- "content": [
- {"type": "text", "text": "Describe this image briefly."},
- {"type": "image_url", "image_url": {"url": "data:image/png;base64,PASTE_BASE64_HERE"}}
- ]
- }
- ],
- "temperature": 0,
- "max_tokens": 120
- }'
-```
\ No newline at end of file
+
+
+ In the following test request, replace:
+
+ - `` with your domain.
+ - `` with a base64-encoded image. To convert an image, use the command: `base64 -i your-image.png`.
+
+ Ensure that `image/png` matches your actual file type; use `image/jpeg` for JPG files, for example.
+
+ ```shell
+ curl https://medgemma-vllm./v1/chat/completions \
+ -H "Authorization: Bearer ${API_KEY}" \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "model": "google/medgemma-1.5-4b-it",
+ "messages": [
+ {
+ "role": "user",
+ "content": [
+ {"type": "text", "text": "Describe this image briefly."},
+ {"type": "image_url", "image_url": {"url": "data:image/png;base64,"}}
+ ]
+ }
+ ],
+ "temperature": 0,
+ "max_tokens": 120
+ }'
+ ```
+
+
+ In the following test request, replace `` with your domain.
+
+ ```shell
+ curl https://apertus-vllm./v1/completions \
+ -H "Authorization: Bearer ${API_KEY}" \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "model": "swiss-ai/Apertus-8B-2509",
+ "prompt": "Write a concise technical summary of Kubernetes GPU scheduling.",
+ "temperature": 0,
+ "max_tokens": 200
+ }'
+ ```
+
+
+
+## Support
+
+If you have any issues or questions, contact Super Protocol on [Discord](https://discord.gg/superprotocol) or via the [contact form](https://superprotocol.zendesk.com/hc/en-us/requests/new).
\ No newline at end of file
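Editor's sketch for the MedGemma test request above: the guide asks for a base64-encoded image and a MIME type that matches the file. The helper below builds the whole `data:` URL from a file name. The name `image_data_url` is hypothetical, and only the two types the guide mentions (PNG and JPEG) are handled; treat it as an illustration under those assumptions.

```shell
# Hypothetical helper: print a data URL for an image file, choosing the MIME
# type from the file extension. Returns 1 for extensions it does not know.
image_data_url() {
  case "$1" in
    *.png) mime="image/png" ;;
    *.jpg|*.jpeg) mime="image/jpeg" ;;
    *) echo "Unsupported image type: $1" >&2; return 1 ;;
  esac
  # Read from stdin so the call works with both GNU and BSD/macOS base64;
  # tr removes the line wrapping that GNU base64 adds by default.
  printf 'data:%s;base64,%s\n' "$mime" "$(base64 < "$1" | tr -d '\n')"
}
```

The printed value is what goes into the `"url"` field of the `image_url` entry. For large images, the JSON body may be easier to write to a file and pass to `curl` with `-d @file`.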
diff --git a/docs/cli/images/swarm-connect-info.png b/docs/cli/images/swarm-connect-info.png
new file mode 100644
index 00000000..0bb18c99
Binary files /dev/null and b/docs/cli/images/swarm-connect-info.png differ
diff --git a/docs/cli/images/swarm-create-bucket.png b/docs/cli/images/swarm-create-bucket.png
new file mode 100644
index 00000000..a43438aa
Binary files /dev/null and b/docs/cli/images/swarm-create-bucket.png differ
diff --git a/docs/cli/images/create-kubernetes-space.png b/docs/cli/images/swarm-create-kubernetes-space.png
similarity index 100%
rename from docs/cli/images/create-kubernetes-space.png
rename to docs/cli/images/swarm-create-kubernetes-space.png
diff --git a/docs/cli/images/swarm-create-service-account-keys.png b/docs/cli/images/swarm-create-service-account-keys.png
new file mode 100644
index 00000000..a495e42d
Binary files /dev/null and b/docs/cli/images/swarm-create-service-account-keys.png differ
diff --git a/docs/cli/images/swarm-create-service-account-window.png b/docs/cli/images/swarm-create-service-account-window.png
new file mode 100644
index 00000000..1f31dff2
Binary files /dev/null and b/docs/cli/images/swarm-create-service-account-window.png differ
diff --git a/docs/cli/images/swarm-create-service-account.png b/docs/cli/images/swarm-create-service-account.png
new file mode 100644
index 00000000..37aa9739
Binary files /dev/null and b/docs/cli/images/swarm-create-service-account.png differ
diff --git a/docs/cli/images/swarm-ingresses-s3-verified.png b/docs/cli/images/swarm-ingresses-s3-verified.png
new file mode 100644
index 00000000..878ff977
Binary files /dev/null and b/docs/cli/images/swarm-ingresses-s3-verified.png differ
diff --git a/docs/cli/images/swarm-ingresses-s3.png b/docs/cli/images/swarm-ingresses-s3.png
new file mode 100644
index 00000000..1625e653
Binary files /dev/null and b/docs/cli/images/swarm-ingresses-s3.png differ
diff --git a/docs/cli/images/swarm-ingresses-verified.png b/docs/cli/images/swarm-ingresses-verified.png
new file mode 100644
index 00000000..878ff977
Binary files /dev/null and b/docs/cli/images/swarm-ingresses-verified.png differ
diff --git a/docs/cli/images/swarm-ingresses-vllm-verified.png b/docs/cli/images/swarm-ingresses-vllm-verified.png
new file mode 100644
index 00000000..7551feef
Binary files /dev/null and b/docs/cli/images/swarm-ingresses-vllm-verified.png differ
diff --git a/docs/cli/images/swarm-ingresses-vllm.png b/docs/cli/images/swarm-ingresses-vllm.png
new file mode 100644
index 00000000..eb23ac48
Binary files /dev/null and b/docs/cli/images/swarm-ingresses-vllm.png differ
diff --git a/docs/cli/images/ingresses.png b/docs/cli/images/swarm-ingresses.png
similarity index 100%
rename from docs/cli/images/ingresses.png
rename to docs/cli/images/swarm-ingresses.png
diff --git a/docs/cli/images/kubernetes-create-cluster.png b/docs/cli/images/swarm-kubernetes-create-cluster.png
similarity index 100%
rename from docs/cli/images/kubernetes-create-cluster.png
rename to docs/cli/images/swarm-kubernetes-create-cluster.png
diff --git a/docs/cli/images/kubernetes-download-kubeconfig.png b/docs/cli/images/swarm-kubernetes-download-kubeconfig.png
similarity index 100%
rename from docs/cli/images/kubernetes-download-kubeconfig.png
rename to docs/cli/images/swarm-kubernetes-download-kubeconfig.png
diff --git a/docs/cli/images/kubernetes-publish-cluster.png b/docs/cli/images/swarm-kubernetes-publish-cluster.png
similarity index 100%
rename from docs/cli/images/kubernetes-publish-cluster.png
rename to docs/cli/images/swarm-kubernetes-publish-cluster.png
diff --git a/docs/cli/images/swarm-log-in.png b/docs/cli/images/swarm-log-in.png
deleted file mode 100644
index e7abee2f..00000000
Binary files a/docs/cli/images/swarm-log-in.png and /dev/null differ
diff --git a/docs/cli/images/swarm-object-storage-connect-info.png b/docs/cli/images/swarm-object-storage-connect-info.png
new file mode 100644
index 00000000..a0ae8e3a
Binary files /dev/null and b/docs/cli/images/swarm-object-storage-connect-info.png differ
diff --git a/docs/cli/images/swarm-object-storage-policy-rules.png b/docs/cli/images/swarm-object-storage-policy-rules.png
new file mode 100644
index 00000000..b6bd7faf
Binary files /dev/null and b/docs/cli/images/swarm-object-storage-policy-rules.png differ
diff --git a/docs/cli/images/swarm-object-storage.png b/docs/cli/images/swarm-object-storage.png
new file mode 100644
index 00000000..76a71e73
Binary files /dev/null and b/docs/cli/images/swarm-object-storage.png differ
diff --git a/docs/cli/images/swarm-policy-rules-grant-access.png b/docs/cli/images/swarm-policy-rules-grant-access.png
new file mode 100644
index 00000000..bb388350
Binary files /dev/null and b/docs/cli/images/swarm-policy-rules-grant-access.png differ
diff --git a/docs/cli/images/swarm-sign-in.png b/docs/cli/images/swarm-sign-in.png
new file mode 100644
index 00000000..d73fae84
Binary files /dev/null and b/docs/cli/images/swarm-sign-in.png differ
diff --git a/static/files/deploy_apertus_official.sh b/static/files/deploy_apertus_official.sh
index 1487a1c7..333dcde1 100755
--- a/static/files/deploy_apertus_official.sh
+++ b/static/files/deploy_apertus_official.sh
@@ -3,7 +3,7 @@ set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-BASE_DOMAIN="${BASE_DOMAIN:-monai-swarm.win}"
+BASE_DOMAIN="${BASE_DOMAIN:-superprotocol.com}"
API_HOST="${API_HOST:-apertus-vllm.${BASE_DOMAIN}}"
MODEL_NAME="${MODEL_NAME:-swiss-ai/Apertus-8B-2509}"
MODEL_ENTRY_NAME="${MODEL_ENTRY_NAME:-apertus}"
diff --git a/static/files/deploy_medgemma_official.sh b/static/files/deploy_medgemma_official.sh
index 7845a04e..4cc0bc05 100755
--- a/static/files/deploy_medgemma_official.sh
+++ b/static/files/deploy_medgemma_official.sh
@@ -3,7 +3,7 @@ set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-BASE_DOMAIN="${BASE_DOMAIN:-monai-swarm.win}"
+BASE_DOMAIN="${BASE_DOMAIN:-superprotocol.com}"
API_HOST="${API_HOST:-medgemma-vllm.${BASE_DOMAIN}}"
MODEL_NAME="${MODEL_NAME:-google/medgemma-1.5-4b-it}"
MODEL_ENTRY_NAME="${MODEL_ENTRY_NAME:-medgemma}"
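Editor's sketch: the vLLM guide says `deploy_medgemma_official.sh` accepts the token either from `HF_TOKEN` or from a `.hf_token` file next to the script. The lookup below shows one way such a fallback could work. It is not copied from the script: the function name `resolve_hf_token` and the precedence (environment variable first, then file) are assumptions.

```shell
# Hypothetical helper, not copied from deploy_medgemma_official.sh: print the
# token from HF_TOKEN if set, else from a .hf_token file in the given directory.
resolve_hf_token() {
  dir="${1:-.}"
  if [ -n "${HF_TOKEN:-}" ]; then
    printf '%s\n' "$HF_TOKEN"
  elif [ -r "$dir/.hf_token" ]; then
    # Strip whitespace and newlines that editors often append to the file.
    tr -d '[:space:]' < "$dir/.hf_token"
    echo
  else
    echo "No Hugging Face token found" >&2
    return 1
  fi
}
```

Calling `resolve_hf_token "$(dirname "$0")"` from a script would look for the file beside that script.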
diff --git a/static/files/deploy_qwen_s3.sh b/static/files/deploy_qwen_s3.sh
new file mode 100644
index 00000000..8de83c74
--- /dev/null
+++ b/static/files/deploy_qwen_s3.sh
@@ -0,0 +1,235 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+# ===========================
+# VALIDATE REQUIRED VARS
+# ===========================
+if [ -z "${API_KEY:-}" ]; then
+ echo "ERROR: API_KEY must be set. Execute:" >&2
+ echo " read -rs API_KEY && export API_KEY" >&2
+ exit 1
+fi
+
+if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
+ echo "ERROR: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set." >&2
+ echo " export AWS_ACCESS_KEY_ID=" >&2
+ echo " export AWS_SECRET_ACCESS_KEY=" >&2
+ exit 1
+fi
+
+if [ -z "${S3_ENDPOINT:-}" ] || [ -z "${S3_BUCKET:-}" ]; then
+ echo "ERROR: S3_ENDPOINT and S3_BUCKET must be set." >&2
+ echo " export S3_ENDPOINT=" >&2
+ echo " export S3_BUCKET=" >&2
+ exit 1
+fi
+
+S3_MODEL_PATH="${S3_MODEL_PATH:-models/qwen-1.5b}"
+
+# ===========================
+# DEPLOYMENT CONFIG
+# ===========================
+BASE_DOMAIN="${BASE_DOMAIN:-superprotocol.com}"
+API_HOST="${API_HOST:-qwen-vllm-s3.${BASE_DOMAIN}}"
+MODEL_NAME="s3://${S3_BUCKET}/${S3_MODEL_PATH}"
+MODEL_ENTRY_NAME="${MODEL_ENTRY_NAME:-qwen}"
+RELEASE_NAME="${RELEASE_NAME:-vllm-s3}"
+IMAGE_REPOSITORY="${IMAGE_REPOSITORY:-vllm/vllm-openai}"
+IMAGE_TAG="${IMAGE_TAG:-v0.8.5}"
+GPU_MEMORY_UTILIZATION="${GPU_MEMORY_UTILIZATION:-0.85}"
+MAX_MODEL_LEN="${MAX_MODEL_LEN:-4096}"
+CPU_REQUEST="${CPU_REQUEST:-4}"
+MEMORY_REQUEST="${MEMORY_REQUEST:-16Gi}"
+GPU_COUNT="${GPU_COUNT:-1}"
+PVC_STORAGE="${PVC_STORAGE:-10Gi}"
+INGRESS_CLASS="${INGRESS_CLASS:-nginx}"
+
+need() { command -v "$1" >/dev/null 2>&1 || { echo "Missing dependency: $1" >&2; exit 1; }; }
+need kubectl
+need helm
+
+NAMESPACE="${NAMESPACE:-$(kubectl config view --minify -o jsonpath='{..namespace}' 2>/dev/null || true)}"
+if [ -z "${NAMESPACE}" ]; then
+ NAMESPACE="llm"
+fi
+
+SECRET_NAME="${RELEASE_NAME}-auth"
+S3_SECRET_NAME="${RELEASE_NAME}-s3-creds"
+SERVICE_NAME="${RELEASE_NAME}-${MODEL_ENTRY_NAME}-engine-service"
+INGRESS_NAME="${RELEASE_NAME}-api-ingress"
+
+echo "==> Runtime: vLLM (official helm chart) + S3 model"
+echo "==> Namespace: ${NAMESPACE}"
+echo "==> Release: ${RELEASE_NAME}"
+echo "==> API host: ${API_HOST}"
+echo "==> Model (S3): ${MODEL_NAME}"
+echo "==> S3 endpoint: ${S3_ENDPOINT}"
+echo "==> Image: ${IMAGE_REPOSITORY}:${IMAGE_TAG}"
+echo
+
+kubectl get ns "${NAMESPACE}" >/dev/null 2>&1 || kubectl create ns "${NAMESPACE}"
+
+helm repo add vllm https://vllm-project.github.io/production-stack >/dev/null 2>&1 || true
+helm repo update >/dev/null 2>&1
+
+# API key secret
+cat < "${VALUES_FILE}" < Values file:"
+cat "${VALUES_FILE}"
+echo
+
+KUBECONFIG="${KUBECONFIG:-}" helm upgrade --install "${RELEASE_NAME}" vllm/vllm-stack \
+ --namespace "${NAMESPACE}" \
+ -f "${VALUES_FILE}" \
+ --skip-crds \
+ --wait --timeout=20m
+
+cat < Pods:"
+kubectl -n "${NAMESPACE}" get pods -o wide
+echo
+echo "==> Services:"
+kubectl -n "${NAMESPACE}" get svc -o wide
+echo
+echo "==> Ingress:"
+kubectl -n "${NAMESPACE}" get ingress -o wide
+echo
+echo "==> Waiting for vLLM pod readiness..."
+kubectl -n "${NAMESPACE}" wait --for=condition=ready pod \
+ -l "model=${MODEL_ENTRY_NAME},helm-release-name=${RELEASE_NAME}" \
+ --timeout=900s
+echo
+echo "==================================="
+echo "Ready. API base URL: http://${API_HOST}/v1"
+echo "Model: ${MODEL_NAME}"
+echo "Smoke test:"
+echo " curl http://${API_HOST}/v1/models -H \"Authorization: Bearer \${API_KEY}\""
+echo "==================================="
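Editor's sketch: the script sets `BASE_DOMAIN` and `API_HOST` with the `${VAR:-default}` expansion, so a value exported by the caller takes precedence over the built-in default. The function below reproduces only that resolution logic so it can be tried in isolation; the name `resolve_api_host` is hypothetical and does not appear in the script.

```shell
# Same parameter-expansion pattern as the script: use the caller's value if
# it is set and non-empty, otherwise fall back to the default after ":-".
resolve_api_host() {
  base_domain="${BASE_DOMAIN:-superprotocol.com}"
  printf '%s\n' "${API_HOST:-qwen-vllm-s3.${base_domain}}"
}
```

Given this pattern, running `BASE_DOMAIN=example.com bash deploy_qwen_s3.sh` should have the same effect as editing the default inside the script (`example.com` is a placeholder domain).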