[DOC-14129]: Magma Compaction rate limiter documentation #4122
base: release/8.1
Changes from all commits
0d4b89e
6048ae0
c04f983
1130cbc
6bf5b6b
3fdb80e
721650e
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| = Magma Compaction Rate Limiting | ||
| :description: This API is used to set the rate limit for Magma compaction threads. | ||
| :stem: asciimath | ||
| [abstract] | ||
| {description} | ||
|
|
||
| == Description | ||
|
|
||
| Use these API calls to limit the disk I/O bandwidth (in bytes per second) that Magma Storage Engine Compaction may consume. | ||
| The limit is global to the Data Service and shared across all buckets on the node. | ||
|
|
||
|
|
||
| Magma supports two types of compaction: | ||
|
|
||
| Log-Structured Merge (LSM) Tree Compactions:: | ||
| These compactions are short-running and essential for preserving the structure of the Log-Structured Merge tree. | ||
| Delaying these operations can lead to increased read latency, as the read path may need to use multiple files, resulting in higher read amplification. | ||
| If this type of compaction is lagging for level-0 of the LSM Tree, the system performs level-0 compaction within the writer threads themselves. | ||
|
|
||
| Data Log Compactions:: | ||
| Data compactions can be de-prioritized compared to LSM Tree compactions. | ||
| The only consequence of slowing these operations is increased space usage. | ||
|
|
||
| == Usage | ||
|
|
||
| Rate limiting helps wherever compaction can surge and contend with front-end I/O, for example: | ||
|
|
||
| * Short-burst, write-heavy workloads, where spikes in writes drive spikes in compaction. | ||
| * Large collection drops, which mark a large amount of data for removal at once. | ||
| * A sudden large number of document expirations, such as a batch of items with similar TTLs expiring together. | ||
|
|
||
| The trade-off is that compaction runs more slowly, so disk usage may stay higher for longer. | ||
| Set the limit low enough to protect front-end write bursts, but high enough that compaction keeps pace with the rate at which data is written. | ||
|
|
||
|
|
||
| == Recommended Value | ||
|
|
||
| A practical starting point is 100 MB/s (104857600 bytes/s), or about 10× your incoming write rate, where the incoming write rate is estimated as writes per second per node × average document size. | ||
| For example, 10,000 writes/s of 1 KB documents is approximately 10 MB/s incoming, giving a ~100 MB/s limit. | ||
|
|
||
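The sizing rule in the Recommended Value section can be worked through as a small shell sketch. The function name `suggest_limit` and its arguments are hypothetical, chosen only for illustration; the multiplier of 10 comes from the guidance in the text, and the sample inputs are the example figures, not measurements.

```shell
# Hypothetical helper: estimate a starting value for magma_compaction_rate_limit.
# Arguments (all assumptions for illustration):
#   $1  writes per second per node
#   $2  average document size in bytes
#   $3  headroom multiplier (defaults to 10, per the guidance in the text)
suggest_limit() {
  local writes_per_sec=$1
  local avg_doc_bytes=$2
  local multiplier=${3:-10}
  # incoming write rate (bytes/s) x multiplier = suggested limit (bytes/s)
  echo $(( writes_per_sec * avg_doc_bytes * multiplier ))
}

# Example from the text: 10,000 writes/s of 1 KB (1024-byte) documents.
suggest_limit 10000 1024   # prints 102400000
```

With 1 KB taken as 1024 bytes, the result is 102400000 bytes/s, which is close to the rounded 100 MB/s starting point quoted in the text. Treat the output as a first estimate to refine against observed disk usage and front-end latency.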
| == Settings | ||
|
|
||
| [cols="1,2,1"] | ||
| |=== | ||
| | Setting | Description | Range | ||
|
|
||
| | `magma_compaction_rate_limit` | ||
| | Controls the I/O bandwidth | ||
| (given in bytes per second) | ||
| that compactions across all shards consume. | ||
|
Can we say "compactions on all shards on one node consume"? |
||
| Setting this value to 0 disables compaction rate limiting. | ||
| | stem:[0 … 2^64 - 1] | ||
| (default: 0) | ||
|
|
||
|
|
||
| |=== | ||
|
|
||
| == HTTP Methods and URIs | ||
|
|
||
| [source, shell] | ||
| ---- | ||
| GET /pools/default/settings/memcached/global | ||
| POST /pools/default/settings/memcached/global | ||
|
|
||
| ---- | ||
|
|
||
| == Examples | ||
|
|
||
| .Retrieve the compaction limit settings | ||
|
I'd prefer if we changed this to "Compaction Rate Limit Settings" (and on line 85) |
||
| [source, shell] | ||
| ---- | ||
| curl -u Administrator:password -X GET http://localhost:8091/pools/default/settings/memcached/global | ||
| ---- | ||
|
|
||
| Running this command returns the global settings: | ||
|
|
||
| [source, json] | ||
| ---- | ||
| { | ||
| "max_connections": 20000, | ||
| "system_connections": 5000, | ||
| "magma_enable_compaction_dataonly_ratelimiting": false | ||
|
If we don't have data-only rate limiting exposed, looking for this name instead of |
||
| } | ||
| ---- | ||
|
|
||
| .Set the compaction rate limit | ||
| [source, shell] | ||
| ---- | ||
| curl -u Administrator:password -X POST http://localhost:8091/pools/default/settings/memcached/global \ | ||
| -d magma_compaction_rate_limit=500 | jq | ||
| ---- | ||
|
|
||
| The call executes and returns a JSON object containing the new setting. | ||
|
|
||
| [source, json] | ||
| ---- | ||
| { | ||
| "max_connections": 20000, | ||
| "system_connections": 5000, | ||
| "magma_compaction_rate_limit": 500, | ||
| "magma_enable_compaction_dataonly_ratelimiting": false | ||
| } | ||
| ---- | ||
Would it be worth adding that "though the setting is global, the application is on a per-node basis, allowing compactions on each node to consume a maximum of (compaction rate limit) bytes per second"
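The set and retrieve examples in the diff could be combined into a pair of small shell helpers, sketched below. The function names (`validate_limit`, `set_limit`, `get_limit`) and the `CB_HOST` and `CB_AUTH` variables are hypothetical; the endpoint, the setting name, and the placeholder credentials are taken from the examples above. The sketch assumes `curl` and `jq` are installed and that the GET response includes `magma_compaction_rate_limit`, as in the second JSON example.

```shell
# Hypothetical defaults; override with your own host and credentials.
CB_HOST="${CB_HOST:-http://localhost:8091}"
CB_AUTH="${CB_AUTH:-Administrator:password}"
ENDPOINT="$CB_HOST/pools/default/settings/memcached/global"

# Accept only non-negative integers (bytes per second).
# The upper bound of the documented range is not checked here.
validate_limit() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Set the compaction rate limit; a value of 0 disables rate limiting.
set_limit() {
  validate_limit "$1" || { echo "invalid limit: $1" >&2; return 1; }
  curl -s -u "$CB_AUTH" -X POST "$ENDPOINT" \
    -d "magma_compaction_rate_limit=$1"
}

# Read back only the compaction rate limit from the global settings.
get_limit() {
  curl -s -u "$CB_AUTH" -X GET "$ENDPOINT" | jq '.magma_compaction_rate_limit'
}
```

For example, `set_limit 104857600 && get_limit` would apply the 100 MB/s starting point and then read the value back. Validating the input locally avoids sending a value such as `10MB` that the setting, which is given in bytes per second, would not accept as written.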