docs: How to deploy a replicated ClickHouse cluster with embedded ClickHouse Keeper #790
zlcnju wants to merge 7 commits into
…kHouse Keeper
Covers coordination topology choice (external ZooKeeper / standalone Keeper / embedded Keeper), a full ClickHouseInstallation manifest with init-container-generated raft configuration, quorum and replication verification, and recommended settings for log storage workloads. Notes that the platform ClickHouse Operator (0.20.x) does not include the upstream ClickHouseKeeperInstallation controller, so embedded or standalone Keeper is required.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Walkthrough
A new Markdown guide is added that documents deploying a replicated ClickHouse cluster using ClickHouse Keeper embedded in each ClickHouse Pod via the ClickHouse Operator. The guide covers topology selection, prerequisite environment variables, a headless Service manifest, a full …
Embedded Keeper Deployment Documentation
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md`:
- Around line 286-287: Update the paragraph to explicitly state that scaling
shards requires updating both SHARDS_COUNT and REPLICAS_COUNT because the Raft
config generation loop uses SHARDS_COUNT as well as REPLICAS_COUNT; mention the
server_id formula `SHARD * REPLICAS_COUNT + REPLICA + 1`, the init container
that writes into the in-memory emptyDir and merges via `include_from`, and that
static fragments under `layout.shards[].files` remain unchanged—so when adding
shards you must increase SHARDS_COUNT (not just REPLICAS_COUNT) to avoid
truncating the generated member list.
- Around line 212-241: The init script uses DOMAIN=$(hostname -d) which returns
only the DNS search suffix, breaking the regex that expects the full pod FQDN;
change DOMAIN assignment to use the full hostname (DOMAIN=$(hostname -f) or
`hostname --fqdn`) so the existing regex that extracts DOMAIN_NAME and
DOMAIN_SUFFIX from the full FQDN works correctly; keep the current regex and
downstream variables (DOMAIN_NAME, DOMAIN_SUFFIX, MY_ID, KEEPER_ID) unchanged so
the generated Raft peer hostnames are correct.
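The first comment's point about `SHARDS_COUNT` can be illustrated with a minimal sketch. It assumes the init script enumerates members with a nested shard/replica loop and uses the `server_id` formula quoted above; the function names `server_id` and `list_members` are hypothetical and do not come from the article's script.

```shell
#!/bin/sh
# Hypothetical sketch, not the article's init script.

# server_id = SHARD * REPLICAS_COUNT + REPLICA + 1
# Arguments: shard index, replicas per shard, replica index (indices are 0-based).
server_id() {
  echo $(( $1 * $2 + $3 + 1 ))
}

# Print one line per expected Raft member for a layout of
# $1 shards with $2 replicas each.
list_members() {
  shards=$1
  replicas=$2
  shard=0
  while [ "$shard" -lt "$shards" ]; do
    replica=0
    while [ "$replica" -lt "$replicas" ]; do
      echo "$(server_id "$shard" "$replicas" "$replica") shard=$shard replica=$replica"
      replica=$((replica + 1))
    done
    shard=$((shard + 1))
  done
}

# Defaults mirror the 1 shard x 3 replicas layout described in the PR.
list_members "${SHARDS_COUNT:-1}" "${REPLICAS_COUNT:-3}"
```

With 2 shards of 3 replicas the ids run from 1 to 6. If the layout grows to 2 shards but `SHARDS_COUNT` stays at 1, the loop stops at id 3, which is the truncated member list the comment warns about.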
📒 Files selected for processing (1)
docs/en/solutions/How_to_Deploy_a_Replicated_ClickHouse_Cluster_with_Embedded_ClickHouse_Keeper.md
…chop number
Use the platform clickhouse-operator component version (v4.2.3 on ACP 4.2) in the Environment and operator-managed-Keeper note, since the upstream Altinity 0.20.x base version is not meaningful to ACP users.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…m.zookeeper_connection)
Lead the quorum verification with the dedicated clickhouse-keeper-client CLI, clarify that mntr is ClickHouse Keeper's own 4lw command (not an external ZooKeeper), and switch the connection check to system.zookeeper_connection.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Parameterize the image, StorageClass, and PVC size with
${CLICKHOUSE_IMAGE}/${STORAGE_CLASS}/${STORAGE_SIZE} instead of bare
<...> placeholders, matching the existing ${NAMESPACE}/${CHI_NAME} style.
- List all required environment variables up front in Prerequisites.
- Render manifests with envsubst using an explicit variable allow-list so
the init container's runtime shell variables are preserved.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
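The allow-list rendering described in this commit can be sketched as follows. This is a minimal illustration that assumes GNU gettext's `envsubst` is installed; the template fragment, the example values, and the `render` helper are hypothetical stand-ins, not the article's manifest.

```shell
#!/bin/sh
# Hypothetical values; the real ones come from the article's Prerequisites.
export NAMESPACE="logging" CHI_NAME="demo"

# Only variables named in the allow-list are substituted; any other
# ${...} reference passes through untouched.
render() {
  envsubst '${NAMESPACE} ${CHI_NAME}'
}

# Stand-in template fragment. The quoted heredoc delimiter keeps the
# calling shell from expanding anything before envsubst sees it.
render <<'EOF'
metadata:
  name: ${CHI_NAME}
  namespace: ${NAMESPACE}
script: echo "server_id=${MY_ID}"
EOF
```

Because `MY_ID` is not in the allow-list, it reaches the rendered output as the literal `${MY_ID}` and is left for the init container's shell to expand at runtime.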
Remove the note about the upstream ClickHouseKeeperInstallation controller not being present; keep only a neutral description of how the operator consumes spec.configuration.zookeeper.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Address review feedback: the init container's raft member list loops over both SHARDS_COUNT and REPLICAS_COUNT, so both must match the layout. Add a note that embedded Keeper suits small clusters and many-shard clusters should use a standalone Keeper quorum.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Match the ecosystem layout (kafka/redis/zookeeper/... each have their own directory) by placing the article under docs/en/solutions/ecosystem/clickhouse/.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Adds a How-To article: How to Deploy a Replicated ClickHouse Cluster with Embedded ClickHouse Keeper (`docs/en/solutions/`).
- … `ClickHouseKeeperInstallation` (CHK) controller (the CHK CRD is not installed), so Keeper must run embedded in the ClickHouse Pods or as a standalone StatefulSet.
- … `ClickHouseInstallation` manifest: 1 shard × 3 replicas forming a 3-member Keeper Raft quorum, static `keeper_config.xml` injected per shard, dynamic `server_id`/`raft_configuration` generated by an init container via `include_from`, anti-affinity, raft-port readiness probe, per-replica Service exposing 9181/9444.
- … `mntr` four-letter command, `system.zookeeper`, ReplicatedMergeTree smoke test across replicas, `system.replicas` health.
- … `max_execution_time`, retention TTLs) and the trade-offs/constraints of the embedded topology.

Validation
Validated on an ACP 4.2 environment (ClickHouse Operator v4.2.3 / chop 0.20.0, ClickHouse Server 25.x):
🤖 Generated with Claude Code