10 changes: 7 additions & 3 deletions README.md
@@ -52,16 +52,20 @@ Provisioned monitoring assets live under:

- `monitoring/prometheus/prometheus.yml`
- `monitoring/grafana/dashboards/elastickv-cluster-overview.json`
- `monitoring/grafana/dashboards/elastickv-cluster-summary.json`
- `monitoring/grafana/dashboards/elastickv-dynamodb.json`
- `monitoring/grafana/dashboards/elastickv-raft-status.json`
- `monitoring/grafana/dashboards/elastickv-redis-summary.json`
- `monitoring/grafana/dashboards/elastickv-pebble-internals.json`
- `monitoring/grafana/provisioning/`
- `monitoring/docker-compose.yml`

The provisioned dashboards are organized by operator task:

- `Elastickv Cluster Overview` is the landing page for leader identity, cluster-wide latency/error posture, and per-node Raft health
- `Elastickv Request Health` is the DynamoDB/API drilldown for slow operations, noisy nodes, and hot/erroring tables
- `Elastickv Cluster` is the landing page for leader identity, cluster-wide latency/error posture, and per-node Raft health
- `Elastickv DynamoDB` is the DynamoDB-compatible API drilldown for slow operations, noisy nodes, and hot/erroring tables
- `Elastickv Raft Status` is the control-plane drilldown for membership, leader changes, failed proposals, node state, index drift, backlog, and leader contact
- `Elastickv Redis` is the Redis-compatible API drilldown for per-command throughput/latency/errors, with a collapsible `Hot Path` row for GET fast-path (PR #560) verification
- `Elastickv Pebble Internals` is the storage-engine drilldown for block cache, L0 pressure, compactions, memtables, and store write conflicts

If you bind `--metricsAddress` to a non-loopback address, `--metricsToken` is required. Prometheus must send the same bearer token, for example:

6 changes: 4 additions & 2 deletions docs/redis_hotpath_dashboard.md
@@ -1,9 +1,11 @@
# Redis Hot Path Dashboard (PR #560 verification)

-`monitoring/grafana/dashboards/elastickv-redis-hotpath.json` is the
+The "Hot Path (legacy PR #560)" collapsed row at the bottom of
+`monitoring/grafana/dashboards/elastickv-redis-summary.json` is the
operator view for the Redis GET hot path. It was added to confirm that
PR #560 (`a45ca291` "perf(redis): fast-path GET to avoid ~17-seek type
-probe") landed cleanly in production.
+probe") landed cleanly in production. Expand the row to see the
+panels described below.

## How to confirm #560 worked

2 changes: 1 addition & 1 deletion internal/raftengine/etcd/engine.go
@@ -750,7 +750,7 @@ func (e *Engine) LastQuorumAck() time.Time {
// heartbeat channel was full. Monotonic across the life of the engine.
// Surfaced to Prometheus via the monitoring package so the hot-path
// dashboard can graph stepCh saturation alongside LinearizableRead
-// rate (see monitoring/grafana/dashboards/elastickv-redis-hotpath.json).
+// rate (see the "Hot Path" row in monitoring/grafana/dashboards/elastickv-redis-summary.json).
func (e *Engine) DispatchDropCount() uint64 {
	if e == nil {
		return 0
2 changes: 1 addition & 1 deletion kv/coordinator.go
@@ -60,7 +60,7 @@ type LeaseReadObserver interface {
// WithLeaseReadObserver wires a LeaseReadObserver onto a Coordinate.
// This is the mechanism monitoring uses to surface the lease-hit ratio
// panel on the Redis hot-path dashboard (see
-// monitoring/grafana/dashboards/elastickv-redis-hotpath.json).
+// the "Hot Path" row in monitoring/grafana/dashboards/elastickv-redis-summary.json).
//
// Typed-nil guard: a caller passing a typed-nil pointer
// (e.g. `var o *myObserver; WithLeaseReadObserver(o)`) produces an
@@ -1280,7 +1280,7 @@
},
"timepicker": {},
"timezone": "browser",
-"title": "Elastickv Cluster Overview",
+"title": "Elastickv Cluster",
"uid": "elastickv-cluster",
"version": 2
}
@@ -15,7 +15,7 @@
}
]
},
-"description": "DynamoDB-compatible request health for elastickv: throughput, latency, errors, tables, and noisy nodes.",
+"description": "DynamoDB-compatible API health for elastickv: per-operation throughput and latency, per-table request/error breakdown, and item-volume panels.",
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
@@ -918,7 +918,7 @@
"elastickv",
"dynamodb",
"requests",
-"summary"
+"api"
],
"templating": {
"list": [
@@ -1019,7 +1019,7 @@
},
"timepicker": {},
"timezone": "browser",
-"title": "Elastickv Request Health",
-"uid": "elastickv-cluster-summary",
+"title": "Elastickv DynamoDB",
+"uid": "elastickv-dynamodb",
"version": 2
}