
ci(workflows): use gpu runner label #7476

Open
njzjz-bot wants to merge 1 commit into deepmodeling:develop from njzjz-bot:openclaw/use-gpu-runner


@njzjz-bot


Problem

  • The deepmodeling organization now has a dynamic runner that jobs select with the gpu label.
  • The CUDA workflows still targeted the old nvidia runner label.

Change

  • Update CUDA workflow runs-on labels from nvidia to gpu.

Notes

  • Ran git diff --check and confirmed that no runs-on: nvidia entry remains under .github/workflows.
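
The "no runs-on: nvidia remains" check can also be scripted. The following is a minimal sketch, not the command used for this PR. The function name find_stale_labels is hypothetical, and the sketch assumes the label is written as a plain scalar (runs-on: nvidia), so list forms such as runs-on: [self-hosted, nvidia] are not detected.

```python
import re
from pathlib import Path


def find_stale_labels(workflow_dir, old_label="nvidia"):
    """Return (file name, line number, stripped line) for each stale runs-on line.

    Hypothetical helper: it only recognizes the scalar form `runs-on: <label>`,
    optionally quoted and optionally followed by a comment.
    """
    pattern = re.compile(
        r"^\s*runs-on:\s*[\"']?" + re.escape(old_label) + r"[\"']?\s*(#.*)?$"
    )
    root = Path(workflow_dir)
    if not root.is_dir():
        return []
    hits = []
    for path in sorted(root.rglob("*")):
        # Only look at YAML workflow files.
        if not path.is_file() or path.suffix not in {".yml", ".yaml"}:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.match(line):
                hits.append((path.name, number, line.strip()))
    return hits
```

Calling find_stale_labels(".github/workflows") from the repository root returns an empty list when no workflow still uses the old label.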

Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)

Use the new dynamic runner label for CUDA workflows in the deepmodeling organization.

njzjz requested a review from mohanchen on June 16, 2026 18:44

njzjz-bot commented Jun 16, 2026

Author

Current dynamic GPU runner setup

This PR switches GPU jobs from runs-on: nvidia to runs-on: gpu because deepmodeling has deployed the new dynamic runner setup. In this setup, the gpu label is the stable GitHub Actions entry point, and the backend is already based on Actions Runner Controller (ARC).

The current scheme can be summarized as:

ARC + Alibaba Cloud ACK / ECI for dynamically provisioned GPU runners

This cloud-native, container-based runner architecture is the recommended choice when CI jobs can run inside Docker containers.

  • Core component: GitHub's officially supported Actions Runner Controller (ARC).
  • How it works:
    1. ARC is deployed in an Alibaba Cloud Kubernetes environment, such as ACK.
    2. ARC watches the GitHub Actions job queue through the runner scale set API.
    3. When a CI job using the gpu label enters the queued state, ARC dynamically schedules a fresh runner Pod in the cluster.
    4. After the job finishes, the Pod is automatically released.
  • Elastic and cost-efficient setup: with Alibaba Cloud ECI virtual nodes, runner Pods can be backed by serverless container instances instead of pre-reserved ECS worker nodes. This keeps idle cost close to zero. When no GPU CI jobs are queued, there is no runner capacity to maintain. When a job arrives, an isolated container instance is created on demand, billed by actual runtime, and destroyed after completion.
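
The lifecycle in steps 1 to 4 and the pay-per-runtime billing can be pictured with a toy model. This is only a sketch of the behavior described above, not ARC's API or Alibaba Cloud's billing logic. The names Job, ToyScaleSet, and handle, the job names, and the runtimes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    label: str       # value of runs-on in the workflow
    runtime_s: int   # hypothetical job duration in seconds


@dataclass
class ToyScaleSet:
    """Toy stand-in for an ARC-backed runner pool (assumption, not ARC code)."""
    label: str = "gpu"
    active_pods: set = field(default_factory=set)
    billed_s: int = 0
    log: list = field(default_factory=list)

    def handle(self, job):
        # Jobs with a different label are not picked up by this pool.
        if job.label != self.label:
            self.log.append(f"skip {job.name}")
            return False
        pod = f"runner-{job.name}"
        self.active_pods.add(pod)        # fresh runner Pod per queued job
        self.log.append(f"create {pod}")
        self.billed_s += job.runtime_s   # billed only for actual runtime
        self.active_pods.discard(pod)    # Pod released after the job finishes
        self.log.append(f"release {pod}")
        return True


pool = ToyScaleSet()
for job in [Job("cuda-test", "gpu", 600), Job("lint", "ubuntu-latest", 60)]:
    pool.handle(job)
```

After the loop, pool.active_pods is empty and pool.billed_s counts only the gpu job, which mirrors the "no queued jobs, no capacity to maintain" property in the model.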

So this PR is not introducing a separate future ARC proposal. It is updating the workflows to use the gpu label provided by the current ARC-backed dynamic runner scheme.


Authored by OpenClaw (version: 2026.5.28 e932160, model: custom-chat-jinzhezeng-group/gpt-5.5)
