Describe the feature
For workloads that issue many small GetObject requests concurrently, the S3CrtClient currently forces an oversized memory footprint per request and offers no way to tell the CRT how big an object actually is. Two related changes would remove this:
- Allow partSize (or a per-request part size) below the current 8 MiB floor for the read path.
- Expose object_size_hint on GetObjectRequest (and related read requests) so it is passed through to aws_s3_meta_request_options.object_size_hint.
Current behavior
1. The SDK clamps partSize to an 8 MiB minimum for all operations, including reads.
In S3CrtClient::init (generated/src/aws-cpp-sdk-s3-crt/source/S3CrtClient.cpp, ~L375):
static const size_t DEFAULT_PART_SIZE = 8 * 1024 * 1024; // 8MB
s3CrtConfig.part_size = config.partSize < DEFAULT_PART_SIZE ? DEFAULT_PART_SIZE : config.partSize;
Any partSize below 8 MiB is silently raised to 8 MiB before it ever reaches the CRT.
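A minimal sketch of that clamp in isolation, so the effective value can be checked for a few inputs. EffectivePartSize is a hypothetical helper written for this issue, not an SDK function; it only reproduces the ternary quoted above.

```cpp
#include <cstddef>

// Same value as the constant quoted from S3CrtClient::init.
static constexpr std::size_t DEFAULT_PART_SIZE = 8 * 1024 * 1024; // 8 MiB

// Hypothetical helper (not part of the SDK): applies the same ternary the
// SDK uses when it fills in s3CrtConfig.part_size.
constexpr std::size_t EffectivePartSize(std::size_t configuredPartSize) {
    return configuredPartSize < DEFAULT_PART_SIZE ? DEFAULT_PART_SIZE
                                                  : configuredPartSize;
}

// Values below the floor are raised to 8 MiB; values at or above it pass through.
static_assert(EffectivePartSize(1 * 1024 * 1024) == DEFAULT_PART_SIZE,
              "1 MiB is raised to 8 MiB");
static_assert(EffectivePartSize(6 * 1024 * 1024) == DEFAULT_PART_SIZE,
              "6 MiB is raised to 8 MiB");
static_assert(EffectivePartSize(16 * 1024 * 1024) == 16 * 1024 * 1024,
              "16 MiB is left unchanged");
```

The checks are compile-time only, so the snippet can be dropped into any translation unit without touching the SDK.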
2. aws-c-s3 imposes no minimum part size on reads — the 5 MiB minimum is upload-only.
In aws_s3_client_make_meta_request (aws-c-s3/source/s3_client.c), the AWS_S3_META_REQUEST_TYPE_GET_OBJECT branch passes part_size straight through, while the minimum (g_s3_min_upload_part_size, ~L1399) is enforced only in the PUT_OBJECT branch. Constants (aws-c-s3/source/s3_util.c):
const size_t g_s3_min_upload_part_size = MB_TO_BYTES(5); // upload-only
const uint64_t g_default_part_size_fallback = MB_TO_BYTES(8); // dynamic default
So the SDK's 8 MiB read floor is a wrapper-imposed limit with no corresponding constraint in the C library.
3. No per-request CRT options are surfaced.
The GetObject paths build options with AWS_ZERO_STRUCT(options) and only set endpoint, callbacks, shutdown_callback, type, signing_config, and message. object_size_hint, per-request part_size, and multipart_upload_threshold are left zeroed — even though aws_s3_meta_request_options exposes them (aws-c-s3/include/aws/s3/s3_client.h: object_size_hint ~L1009, per-request part_size ~L896).
Use Case
Our workload reads a large number of small objects (well under 8 MiB) in parallel. With S3CrtClient this is memory-bound rather than network-bound:
- Each auto-ranged-get meta request's first (size-discovery) request reserves a full part_size buffer from the CRT buffer pool's primary area, regardless of the real object size. With the SDK's effective 8 MiB floor, every in-flight small download pins ~8 MiB.
- This caps achievable concurrency for a given memoryLimitBytes and wastes pool memory on objects that may be a few KiB each.
The underlying CRT already supports the right primitives; the SDK simply does not surface them.
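To put rough numbers on the concurrency cap, here is a back-of-the-envelope sketch. It assumes a deliberately simplified model in which every in-flight download pins exactly one buffer and the whole memory budget is available for those buffers; the real buffer pool has additional reservations and overhead. MaxInFlight, the 2 GiB budget, and the 64 KiB buffer size are illustrative assumptions, not SDK or CRT values.

```cpp
#include <cstdint>

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;
constexpr std::uint64_t kGiB = 1024 * kMiB;

// Hypothetical helper: upper bound on concurrent downloads when each one
// pins bufferBytesPerRequest out of memoryBudgetBytes. Returns 0 for a
// zero-sized buffer instead of dividing by zero.
constexpr std::uint64_t MaxInFlight(std::uint64_t memoryBudgetBytes,
                                    std::uint64_t bufferBytesPerRequest) {
    return bufferBytesPerRequest == 0 ? 0
                                      : memoryBudgetBytes / bufferBytesPerRequest;
}

// Illustrative 2 GiB budget: 8 MiB per request allows 256 downloads,
// while 64 KiB per request would allow 32768.
static_assert(MaxInFlight(2 * kGiB, 8 * kMiB) == 256, "8 MiB buffers");
static_assert(MaxInFlight(2 * kGiB, 64 * kKiB) == 32768, "64 KiB buffers");
```

Under this model the same budget supports 128 times as many small downloads once the buffer matches the object size rather than the 8 MiB floor.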
Proposed Solution
- Surface object_size_hint on GetObjectRequest (and other read meta requests), wiring it to options.object_size_hint in S3CrtClient.
- Relax the read-path part-size floor. Either stop clamping for GET_OBJECT, or apply the 8 MiB floor only where the CRT actually requires it (the PUT_OBJECT path already enforces its own minimum safely).
- (Nice to have) Expose per-request part_size (aws_s3_meta_request_options.part_size) so part sizing can be tuned per download without a client-wide setting.
- Fix the partSize doc comment in S3CrtClientConfiguration.h (~L87), which is inaccurate:
"defaults to 8MB, if user set it to be less than 5MB, CRT will set it to 5MB."
The real floor is 8 MiB (not 5 MiB), it is applied by the SDK wrapper (not the CRT), and it currently applies to both reads and writes. (Example of the inaccuracy: setting partSize = 6 MiB — legal per the docs — silently yields 8 MiB.)
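One possible shape for the relaxed floor, as a sketch only. RequestKind and PartSizeForRequest are hypothetical names introduced for illustration and do not exist in the SDK or the CRT. Treating a configured value of 0 as "use the default" and keeping the existing 8 MiB floor for uploads are both assumptions of this sketch, chosen so that upload behavior does not change.

```cpp
#include <cstddef>

constexpr std::size_t kMiB = 1024 * 1024;
// Same value as DEFAULT_PART_SIZE in S3CrtClient::init.
constexpr std::size_t kSdkDefaultPartSize = 8 * kMiB;

// Hypothetical stand-in for the two meta request types discussed here.
enum class RequestKind { GetObject, PutObject };

// Hypothetical helper: picks the part size to hand to the CRT.
//  - 0 is treated as "unset" and maps to the default (assumption).
//  - Reads pass the configured value through unchanged.
//  - Uploads keep today's 8 MiB floor.
constexpr std::size_t PartSizeForRequest(RequestKind kind,
                                         std::size_t configuredPartSize) {
    return configuredPartSize == 0
               ? kSdkDefaultPartSize
               : (kind == RequestKind::GetObject
                      ? configuredPartSize
                      : (configuredPartSize < kSdkDefaultPartSize
                             ? kSdkDefaultPartSize
                             : configuredPartSize));
}

static_assert(PartSizeForRequest(RequestKind::GetObject, 1 * kMiB) == 1 * kMiB,
              "reads are no longer raised to 8 MiB");
static_assert(PartSizeForRequest(RequestKind::PutObject, 6 * kMiB) == 8 * kMiB,
              "uploads keep the current floor");
static_assert(PartSizeForRequest(RequestKind::GetObject, 0) == 8 * kMiB,
              "unset falls back to the default");
```

Because part_size is currently a client-wide setting, a per-request-type choice like this would presumably be applied through the per-request part_size field mentioned above rather than in S3CrtClient::init.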
Other Information
Enabler
awslabs/aws-c-s3#639 ("Use object_size_hint to size the discovery buffer for small, non-ranged GetObject") makes the CRT right-size the size-discovery buffer from object_size_hint instead of reserving a full part_size. Once that lands, passing object_size_hint from the SDK directly reduces per-request memory for small downloads — which is the main win for this workload.
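The expected effect can be illustrated with a simplified model. This is not code from aws-c-s3 or from the referenced PR, whose actual sizing logic may differ; DiscoveryBufferBytes is a hypothetical function, and using 0 to mean "no hint supplied" is a convention of this sketch only.

```cpp
#include <cstdint>

constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;

// Hypothetical model of the size-discovery reservation:
//  - no hint (0 in this sketch): reserve a full part, as happens today;
//  - hint smaller than a part: reserve only the hinted size;
//  - hint at or above a part: a single buffer is still capped at one part.
constexpr std::uint64_t DiscoveryBufferBytes(std::uint64_t partSizeBytes,
                                             std::uint64_t hintBytes) {
    return (hintBytes != 0 && hintBytes < partSizeBytes) ? hintBytes
                                                         : partSizeBytes;
}

static_assert(DiscoveryBufferBytes(8 * kMiB, 0) == 8 * kMiB,
              "without a hint a full part is reserved");
static_assert(DiscoveryBufferBytes(8 * kMiB, 16 * kKiB) == 16 * kKiB,
              "a small hint shrinks the reservation");
static_assert(DiscoveryBufferBytes(8 * kMiB, 64 * kMiB) == 8 * kMiB,
              "a large hint does not exceed one part");
```

In this model a 16 KiB object with a hint pins 16 KiB instead of 8 MiB, which is the reduction this request is after.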
Acknowledgements