Steps to reproduce
Problem
When dstack-shim starts a job container, the entrypoint runs an install_pkg openssh-server step (plus, in the non-shim flow, install_pkg curl and a curl download of the runner binary). On hosts that only have outbound internet access through an HTTP proxy, this step fails with a network/DNS error even though the shim itself was started with http_proxy/https_proxy exported and Docker pulls succeed (the Docker daemon reads its own proxy config).
The root cause is that no proxy env vars are forwarded into the container, so apt-get/yum/apk inside the entrypoint shell never see them.
Repro
- Run dstack-shim on a host that requires http_proxy/https_proxy for outbound traffic.
- Submit any job that uses an image without sshd preinstalled (e.g., a plain ubuntu:22.04).
- Container entrypoint exits non-zero in apt-get update / apt-get install -y openssh-server.
Where the gap is
The entrypoint chain is built in getSSHShellCommands() and runs install_pkg openssh-server:
https://github.com/dstackai/dstack/blob/master/runner/internal/shim/docker.go#L984-L1000
The container is created in createContainer with Env populated only from PJRT_DEVICE (plus GPU/HABANA vars later); nothing forwards proxy vars from the shim's own environment:
https://github.com/dstackai/dstack/blob/master/runner/internal/shim/docker.go#L862-L901
TaskConfig has no env field at all, so the server can't pass them through either:
https://github.com/dstackai/dstack/blob/master/runner/internal/shim/models.go#L83-L105
The same install snippet exists on the Python side (non-shim flow) and has the same gap:
https://github.com/dstackai/dstack/blob/master/src/dstack/_internal/core/backends/base/compute.py#L982-L1027
Docker does not auto-propagate the daemon's environment into containers, so even if proxy vars are set on the host (e.g., via get_shim_env or cloud-init), they don't reach the entrypoint.
Suggested fix
Minimal change in createContainer (runner/internal/shim/docker.go): forward proxy vars from the shim's own environment into the container Env:
for _, name := range []string{
	"http_proxy", "https_proxy", "no_proxy",
	"HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
} {
	if v, ok := os.LookupEnv(name); ok {
		envVars = append(envVars, name+"="+v)
	}
}
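To make the loop above easier to unit-test, it could be pulled into a small helper that takes the lookup function as a parameter. This is only a sketch: appendProxyEnv and proxyEnvNames are hypothetical names that do not exist in dstack, and the PJRT_DEVICE value in main is a placeholder.

```go
package main

import (
	"fmt"
	"os"
)

// proxyEnvNames lists the conventional proxy variables in both spellings,
// because tools differ on which case they read.
var proxyEnvNames = []string{
	"http_proxy", "https_proxy", "no_proxy",
	"HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
}

// appendProxyEnv is a hypothetical helper (not part of dstack). It appends
// NAME=value entries to envVars for every proxy variable that lookup reports
// as set. Passing os.LookupEnv gives the behaviour of the loop in the issue;
// a fake lookup can be passed in tests.
func appendProxyEnv(envVars []string, lookup func(string) (string, bool)) []string {
	for _, name := range proxyEnvNames {
		if v, ok := lookup(name); ok {
			envVars = append(envVars, name+"="+v)
		}
	}
	return envVars
}

func main() {
	// Placeholder for the env slice that createContainer already builds.
	envVars := []string{"PJRT_DEVICE=TPU"}
	envVars = appendProxyEnv(envVars, os.LookupEnv)
	fmt.Println(envVars)
}
```

Injecting the lookup keeps the helper independent of the process environment, so a test can check exactly which entries are forwarded without mutating os.Environ.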
A more flexible follow-up would be to add an Env field to TaskConfig so the server can pass per-task env vars to the shim.
Environment
- dstack version: 0.20.21
- Backend: ssh
- Host OS: Ubuntu 26.04 LTS
- Container image: nvidia/cuda:12.8.0-devel-ubuntu24.04
Actual behaviour
No response
Expected behaviour
No response
dstack version
0.20.21
Server logs
Exited (none)
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package openssh-server
Additional information
No response