fix(atelet): stream sandbox asset downloads instead of buffering in memory by dims · Pull Request #281 · agent-substrate/substrate

Davanum Srinivas (dims) · 2026-06-19T13:22:43Z

Previously, fetchAsset retrieved each asset through FetchFromGCS, which reads the entire object into a byte slice before hashing and writing it. Peak memory therefore scaled with the asset size. This was acceptable for the roughly 50 MB runsc binary, but with the move to micro-VMs, the runtime's kernel and root filesystem images range from hundreds of megabytes to several gigabytes and would exhaust the memory of atelet, which is shared across all actors on a node.

The download now streams directly to the temporary file, computing the SHA-256 digest in the same pass via io.MultiWriter and bounding the transfer with an io.LimitReader. Peak memory is now the size of the io.Copy buffer, independent of the asset size. The LimitReader also provides a secondary safeguard for disk usage against a misconfigured or malicious URL that returns an unbounded stream. The limit is 8 GiB and is declared as a variable so that tests may lower it.
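As a rough illustration of the single-pass approach described above, here is a minimal sketch and not the code in this PR. The names `streamToFile`, `maxAssetBytes`, `errTooLarge`, and `demo` are hypothetical. It assumes the limit is enforced by reading one byte past the cap so that an oversized stream can be told apart from one that ends exactly at the limit.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"os"
	"strings"
)

// maxAssetBytes is a hypothetical stand-in for the 8 GiB cap.
// It is a variable so that tests can lower it.
var maxAssetBytes int64 = 8 << 30

var errTooLarge = errors.New("asset exceeds size limit")

// streamToFile copies src to dst and hashes the same bytes in one pass.
// Memory use is bounded by io.Copy's internal buffer, not by the asset size.
func streamToFile(src io.Reader, dst io.Writer) (string, int64, error) {
	h := sha256.New()
	// Allow one byte past the cap so an oversized stream is detectable.
	limited := io.LimitReader(src, maxAssetBytes+1)
	n, err := io.Copy(io.MultiWriter(dst, h), limited)
	if err != nil {
		return "", n, err
	}
	if n > maxAssetBytes {
		return "", n, errTooLarge
	}
	return hex.EncodeToString(h.Sum(nil)), n, nil
}

// demo streams a small in-memory payload into a throwaway temp file.
func demo() (string, int64, error) {
	tmp, err := os.CreateTemp("", "asset-*.tmp")
	if err != nil {
		return "", 0, err
	}
	defer os.Remove(tmp.Name())
	defer tmp.Close()
	return streamToFile(strings.NewReader("hello world"), tmp)
}

func main() {
	digest, n, err := demo()
	fmt.Println(digest, n, err)
}
```

Lowering `maxAssetBytes` before calling `demo` exercises the oversized path without needing a multi-gigabyte fixture, which is presumably why the real limit is a variable rather than a constant.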

The digest is verified only after the copy completes. A size or hash failure therefore leaves the data at the temporary path, which is subsequently removed; it is never renamed to the content-addressed cache path, so a failed download cannot corrupt the cache.
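The verify-then-rename ordering can be sketched as follows. This is an illustrative sketch under assumptions, not the PR's fetchAsset: `installAsset` and `demo` are hypothetical names, and the sketch assumes the temporary file lives in the cache directory so that the rename stays on one filesystem.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// installAsset (hypothetical) streams src to a temp file in cacheDir and
// renames it to the content-addressed path only if the digest matches.
func installAsset(src io.Reader, cacheDir, wantHex string) (string, error) {
	tmp, err := os.CreateTemp(cacheDir, "asset-*.tmp")
	if err != nil {
		return "", err
	}
	tmpPath := tmp.Name()
	// Clean up on every failure path. After a successful rename the temp
	// path no longer exists, so this Remove fails harmlessly.
	defer os.Remove(tmpPath)

	h := sha256.New()
	_, copyErr := io.Copy(io.MultiWriter(tmp, h), src)
	closeErr := tmp.Close()
	if copyErr != nil {
		return "", copyErr
	}
	if closeErr != nil {
		return "", closeErr
	}
	// Verify only after the whole stream has been written and hashed.
	if got := hex.EncodeToString(h.Sum(nil)); got != wantHex {
		return "", fmt.Errorf("digest mismatch: got %s, want %s", got, wantHex)
	}
	final := filepath.Join(cacheDir, wantHex)
	if err := os.Rename(tmpPath, final); err != nil {
		return "", err
	}
	return final, nil
}

// demo installs a payload into a scratch cache directory and reports how
// many entries the directory holds afterwards, plus the install error.
func demo(corrupt bool) (int, error) {
	dir, err := os.MkdirTemp("", "cache-*")
	if err != nil {
		return -1, err
	}
	defer os.RemoveAll(dir)

	sum := sha256.Sum256([]byte("kernel image bytes"))
	payload := "kernel image bytes"
	if corrupt {
		payload = "tampered bytes"
	}
	_, installErr := installAsset(strings.NewReader(payload), dir, hex.EncodeToString(sum[:]))

	entries, err := os.ReadDir(dir)
	if err != nil {
		return -1, err
	}
	return len(entries), installErr
}

func main() {
	n, err := demo(false)
	fmt.Println("matching digest:", n, err)
	n, err = demo(true)
	fmt.Println("mismatched digest:", n, err)
}
```

Because the rename is the last step, a reader of the cache directory sees either a fully verified file at the content-addressed path or nothing at all.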

This change also introduces ategcs.Open, a streaming reader, along with the accompanying TestFetchAssetStreaming.


Tests pass
Appropriate changes to documentation are included in the PR


Davanum Srinivas (dims) · 2026-06-19T13:22:54Z

cc Tim Hockin (@thockin) Benjamin Elder (@BenTheElder)

Benjamin Elder (BenTheElder) self-requested a review June 22, 2026 05:32

Benjamin Elder (BenTheElder) self-assigned this Jun 22, 2026

Davanum Srinivas (dims) wants to merge 1 commit into agent-substrate:main from dims:fix/atelet-stream-asset-download
