Describe the bug
Summary
We observed transient read failures immediately after calling Aws::S3::TransferManager#download_file with a String destination path. The first read attempt fails; a retry succeeds.
Environment
- aws-sdk-s3: 1.213.0
- aws-sdk-core: 3.242.0
- Ruby: 4.0.3p0
- Runtime: Kubernetes / Linux
- Service: nds
Observed error
DuckDB::Error: IO Error: Could not read enough bytes from file "/projection-cache/bundled/.duckdb_hardlink_036c9176-bfd7-4228-a936-a14491ba0be4.duckdb": attempted to read 262144 bytes from location 1323008
Representative log:
2026-05-26T10:00:37.641503943Z stdout F pid=1 tid=4037d WARN: DuckDB::Error: IO Error: Could not read enough bytes from file "/projection-cache/bundled/.duckdb_hardlink_036c9176-bfd7-4228-a936-a14491ba0be4.duckdb": attempted to read 262144 bytes from location 1323008
Usage pattern
We use Aws::S3::TransferManager and call:
transfer_manager.download_file(destination_path, bucket: bucket, key: s3_key)
# then file is read immediately by another component
Regression Issue
Expected Behavior
When using Aws::S3::TransferManager#download_file with a String destination path, the destination should never be observed in a partially written state by concurrent readers. Immediately reading the destination path after download_file returns should be safe and should not intermittently fail with short-read / incomplete-file errors.
Current Behavior
We intermittently see read failures immediately after download_file, then success on retry.
Representative error:
DuckDB::Error: IO Error: Could not read enough bytes from file "/projection-cache/bundled/.duckdb_hardlink_036c9176-bfd7-4228-a936-a14491ba0be4.duckdb": attempted to read 262144 bytes from location 1323008
Representative log:
2026-05-26T10:00:37.641503943Z stdout F pid=1 tid=4037d WARN: DuckDB::Error: IO Error: Could not read enough bytes from file "/projection-cache/bundled/.duckdb_hardlink_036c9176-bfd7-4228-a936-a14491ba0be4.duckdb": attempted to read 262144 bytes from location 1323008
Operational pattern:
- first attempt fails
- retry succeeds
This suggests a transient file-availability/consistency window.
Reproduction Steps
- Use Aws::S3::TransferManager and call:
  transfer_manager.download_file(destination_path, bucket: bucket, key: key)
- Immediately open/read destination_path from another component/thread/process.
- Repeat under concurrent load with repeated writes to the same destination path.
- Observe occasional short-read/incomplete-read failures on first attempt; retry usually succeeds.
Notes:
- We cannot provide a minimal deterministic standalone repro yet (issue is intermittent).
- We observed this in Kubernetes production workloads.
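While we have no deterministic repro against the SDK, the kind of window we suspect can be illustrated without S3. The sketch below is only an illustration of an in-place overwrite as seen by a reader; it does not claim to mirror how Aws::S3::FileDownloader writes. The helper name sizes_seen_during_in_place_write is hypothetical.

```ruby
# Hypothetical illustration, not SDK code: overwrite an existing file in place
# and record the on-disk size a concurrent reader could observe at each step.
def sizes_seen_during_in_place_write(path, payload, chunk_size)
  observed = []
  # Mode "wb" truncates the existing file to 0 bytes as soon as it is opened.
  File.open(path, "wb") do |file|
    observed << File.size(path)
    offset = 0
    while offset < payload.bytesize
      file.write(payload.byteslice(offset, chunk_size))
      file.flush # push Ruby's buffer to the OS so the new size is visible
      observed << File.size(path)
      offset += chunk_size
    end
  end
  observed
end
```

Any size in the returned list that is smaller than the payload is a state in which a reader opening the same path would hit a short read, which matches the shape of the DuckDB error above.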
Possible Solution
No response
Additional Information/Context
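We suspect small-object behavior differences in Aws::S3::FileDownloader, around multipart_download:
aws-sdk-ruby/gems/aws-sdk-s3/lib/aws-sdk-s3/file_downloader.rb, line 140 in cef4dd5
def multipart_download(opts)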
As a mitigation on our side, we now download to a temporary destination and File.rename it to the final path at the application layer, so the final cache path is only exposed once the file is complete.
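A minimal sketch of that mitigation, assuming the temporary file lives in the same directory as the final path so that File.rename stays on one filesystem (where rename is atomic on POSIX systems). The helper name download_atomically and the temp-file naming scheme are hypothetical and not part of aws-sdk-s3.

```ruby
require "fileutils"
require "securerandom"

# Hypothetical application-level helper, not part of aws-sdk-s3.
# Yields a temporary path next to final_path; the block writes the full file
# there, and only then is it renamed over final_path.
def download_atomically(final_path)
  dir = File.dirname(final_path)
  tmp_path = File.join(dir, ".#{File.basename(final_path)}.#{SecureRandom.hex(8)}.tmp")
  begin
    yield tmp_path
    # Readers see either the old file or the complete new one, never a partial.
    File.rename(tmp_path, final_path)
  ensure
    # No-op after a successful rename; removes leftovers if the block raised.
    FileUtils.rm_f(tmp_path)
  end
  final_path
end

# Usage with the call from this report (transfer_manager, bucket and s3_key
# are the application's own objects):
#
#   download_atomically(destination_path) do |tmp_path|
#     transfer_manager.download_file(tmp_path, bucket: bucket, key: s3_key)
#   end
```

If the block raises, the previous file at the final path is left untouched, so a failed download never replaces a good cache entry.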
Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version
aws-sdk-s3 1.213.0
Environment details (Version of Ruby, OS environment)
- aws-sdk-s3: 1.213.0
- aws-sdk-core: 3.242.0
- Ruby: 4.0.3p0
- Runtime: Kubernetes / Linux
- Service: nds