This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit 2250a8f

Author: Tyler Reid
Merge pull request #370 from grafana/treid/not-found-index-details
Clarify the gsutil mv command for moving corrupted blocks
2 parents f0ed263 + d346fbb commit 2250a8f

2 files changed: +7 -8 lines changed

CHANGELOG.md

+1
@@ -41,6 +41,7 @@
 * "Sharding Initial State Sync" row - information about the initial state sync procedure when sharding is enabled.
 * "Sharding Runtime State Sync" row - information about various state operations which occur when sharding is enabled (replication, fetch, marge, persist).
 * [ENHANCEMENT] Added 256MB memory ballast to querier. #369
+* [ENHANCEMENT] Update gsutil command for `not healthy index found` playbook #370
 * [BUGFIX] Fixed `CortexIngesterHasNotShippedBlocks` alert false positive in case an ingester instance had ingested samples in the past, then no traffic was received for a long period and then it started receiving samples again. #308
 * [BUGFIX] Alertmanager: fixed `--alertmanager.cluster.peers` CLI flag passed to alertmanager when HA is enabled. #329
 * [BUGFIX] Fixed `CortexInconsistentRuntimeConfig` metric. #335

cortex-mixin/docs/playbooks.md

+6 -8
@@ -372,24 +372,22 @@ How to **investigate**:
 The compactor may fail to compact blocks due a corrupted block index found in one of the source blocks:
 
 ```
-level=error ts=2020-07-12T17:35:05.516823471Z caller=compactor.go:339 component=compactor msg="failed to compact user blocks" user=REDACTED err="compaction: group 0@6672437747845546250: block with not healthy index found /data/compact/0@6672437747845546250/REDACTED; Compaction level 1; Labels: map[__org_id__:REDACTED]: 1/1183085 series have an average of 1.000 out-of-order chunks: 0.000 of these are exact duplicates (in terms of data and time range)"
+level=error ts=2020-07-12T17:35:05.516823471Z caller=compactor.go:339 component=compactor msg="failed to compact user blocks" user=REDACTED-TENANT err="compaction: group 0@6672437747845546250: block with not healthy index found /data/compact/0@6672437747845546250/REDACTED-BLOCK; Compaction level 1; Labels: map[__org_id__:REDACTED]: 1/1183085 series have an average of 1.000 out-of-order chunks: 0.000 of these are exact duplicates (in terms of data and time range)"
 ```
 
 When this happen you should:
 1. Rename the block prefixing it with `corrupted-` so that it will be skipped by the compactor and queriers. Keep in mind that doing so the block will become invisible to the queriers too, so its series/samples will not be queried. If the corruption affects only 1 block whose compaction `level` is 1 (the information is stored inside its `meta.json`) then Cortex guarantees no data loss because all the data is replicated across other blocks. In all other cases, there may be some data loss once you rename the block and stop querying it.
 2. Ensure the compactor has recovered
 3. Investigate offline the root cause (eg. download the corrupted block and debug it locally)
 
-To rename a block stored on GCS you can use the `gsutil` CLI:
-
+To rename a block stored on GCS you can use the `gsutil` CLI command:
 ```
-# Replace the placeholders:
-# - BUCKET: bucket name
-# - TENANT: tenant ID
-# - BLOCK: block ID
-
 gsutil mv gs://BUCKET/TENANT/BLOCK gs://BUCKET/TENANT/corrupted-BLOCK
 ```
+Where:
+- `BUCKET` is the gcs bucket name the compactor is using. The cell's bucket name is specified as the `blocks_storage_bucket_name` in the cell configuration
+- `TENANT` is the tenant id reported in the example error message above as `REDACTED-TENANT`
+- `BLOCK` is the last part of the file path reported as `REDACTED-BLOCK` in the example error message above
 
 ### CortexBucketIndexNotUpdated
 
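
The placeholder substitution the updated playbook describes can be sketched in Python. This is only an illustration, not part of the commit: the function name `build_rename_command` and the regular expression are assumptions based on the single example error message in the diff, and the bucket name must still be looked up by hand. The sketch only builds the `gsutil mv` command string; it does not run it.

```python
import re
import shlex

# Assumption: the tenant is the value of the `user=` field, and the block ID is
# the last component of the path that follows "block with not healthy index found".
LOG_PATTERN = re.compile(
    r"user=(?P<tenant>\S+)\s.*?"
    r"block with not healthy index found (?P<path>\S+?);"
)


def build_rename_command(log_line: str, bucket: str) -> str:
    """Return the `gsutil mv` command that prefixes the block with `corrupted-`."""
    match = LOG_PATTERN.search(log_line)
    if match is None:
        raise ValueError("log line does not look like a 'not healthy index' error")

    tenant = match.group("tenant")
    # The block ID is the last part of the reported file path.
    block = match.group("path").rstrip("/").rsplit("/", 1)[-1]

    src = f"gs://{bucket}/{tenant}/{block}"
    dst = f"gs://{bucket}/{tenant}/corrupted-{block}"
    # Quote both URLs so the result can be pasted into a shell safely.
    return " ".join(["gsutil", "mv", shlex.quote(src), shlex.quote(dst)])
```

Called with the example error line from the playbook and a hypothetical bucket name such as `my-bucket`, the function returns a `gsutil mv` command whose source ends in `REDACTED-TENANT/REDACTED-BLOCK` and whose destination ends in `REDACTED-TENANT/corrupted-REDACTED-BLOCK`. Printing the command instead of executing it leaves the final review to the operator, which seems prudent given that renaming a block hides it from the queriers.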
