[configure] "\"rejected connection\" EOF Warnings in etcd Pod Logs Are TLS-Probe Noise"#381

Open
jing2uo wants to merge 1 commit into main from kb/2026-02/rejected-connection-eof-warnings-in-etcd

@jing2uo jing2uo commented Apr 24, 2026

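Adds a new ACP KB article, filed under the configure area.

✅ Automated verification passed and the PR is eligible for auto-merge: 3 / 3 verification steps ran successfully on a real Kubernetes cluster using the article's commands (2026-04-24T05:05:30Z).

Suggested reviewers for the configure area

Picked automatically from the active members of this area listed in kb/OWNERS.md + kb/KB_REVIEWERS.md; please ignore this if you were @-mentioned by mistake.

@changluyi @zhangzujian @oilbeater

Contributors without a GitHub handle (please ping manually for anything related to this area):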

Summary by CodeRabbit

Release Notes

  • Documentation
    • New troubleshooting guide for understanding etcd "rejected connection" log warnings with EOF errors
    • Includes cluster health verification checklists, log filtering recommendations, and diagnostic commands to identify warning sources

coderabbitai Bot commented Apr 24, 2026

Walkthrough

A new solution document is introduced explaining that etcd "rejected connection" logs ending with error: EOF typically originate from TLS handshake completion without subsequent gRPC requests in healthy clusters, and are generally informational rather than indicative of real issues.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Documentation**<br>`docs/en/solutions/rejected_connection_EOF_Warnings_in_etcd_Pod_Logs_Are_TLS_Probe_Noise.md` | New solution document explaining that etcd "rejected connection" + EOF error patterns are TLS probe noise in healthy clusters. Includes health verification checklist (endpoint health, /readyz, operator conditions), filtering guidance for log collectors, diagnostic commands to count warnings and identify probe sources, and criteria for determining when the warnings indicate real TLS issues. |
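
To make the filtering guidance summarized above concrete, the following is a minimal sketch of the kind of predicate a log collector could apply. It is not taken from the article. It assumes etcd writes JSON log lines carrying `msg`, `remote-addr`, and `error` keys; the helper name `is_probe_noise` and the sample lines are hypothetical.

```python
import json


def is_probe_noise(line: str) -> bool:
    """Return True only for a "rejected connection" warning whose error is exactly EOF."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False  # not structured JSON: keep the line rather than guess
    if not isinstance(entry, dict):
        return False
    msg = str(entry.get("msg", ""))
    # Both conditions must hold; any other error string is left in the stream.
    return msg.startswith("rejected connection") and entry.get("error") == "EOF"


# Hypothetical sample lines, shaped like the warnings discussed in this PR.
sample = [
    '{"level":"warn","msg":"rejected connection","remote-addr":"10.0.0.5:41234","error":"EOF"}',
    '{"level":"warn","msg":"rejected connection","remote-addr":"10.0.0.9:50000",'
    '"error":"remote error: tls: bad certificate"}',
    "a plain-text line that is not JSON",
]

kept = [line for line in sample if not is_probe_noise(line)]
for line in kept:
    print(line)
```

The predicate is deliberately narrow: a rejected connection with any error other than `EOF` stays visible, since that is the case the article says may indicate a real TLS problem.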

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 A warning hopped by, oh what a fright,
But EOF whispers—just TLS in flight!
No quorum's broken, no latency's high,
Just probes shaking hands then waving goodbye. 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main change: documenting that 'rejected connection' EOF warnings in etcd logs are TLS-probe noise, which is the core focus of the new documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/en/solutions/rejected_connection_EOF_Warnings_in_etcd_Pod_Logs_Are_TLS_Probe_Noise.md (1)

96-96: Minor wording cleanup

“with one of the following” reads cleaner than “together with one of the following.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/solutions/rejected_connection_EOF_Warnings_in_etcd_Pod_Logs_Are_TLS_Probe_Noise.md`
at line 96, The phrase "A cluster that is actually unhealthy will show the same
line together with one of the following, and this is the case that needs
investigation:" should be simplified to "A cluster that is actually unhealthy
will show the same line with one of the following, and this is the case that
needs investigation:" — update that sentence in the document (locate the
sentence by the exact text "A cluster that is actually unhealthy will show the
same line together with one of the following, and this is the case that needs
investigation:") to remove the word "together" for clearer wording.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 67311695-ca9c-4c5f-8e56-fde125cd89d6

📥 Commits

Reviewing files that changed from the base of the PR and between 5c78e3d and 9bc16d1.

📒 Files selected for processing (1)
  • docs/en/solutions/rejected_connection_EOF_Warnings_in_etcd_Pod_Logs_Are_TLS_Probe_Noise.md

Comment on lines +35 to +36
The behaviour is driven by upstream `api-server` and `kube-controller-manager` components that perform TCP-level liveness/readiness checks against the etcd endpoint. They open a connection, complete the handshake to confirm the serving certificate is valid, and close without issuing a request — this is a cheap way to verify that etcd is accepting TLS traffic without consuming any API quota or writing to the raft log.

⚠️ Potential issue | 🟠 Major

Overstated source attribution for the probe pattern

This section is too definitive and likely inaccurate as written. kube-controller-manager is not typically a direct etcd TCP/TLS prober in standard control-plane setups, and kube-apiserver etcd interactions are generally request-level rather than “handshake-only then close.” Please soften this to “possible sources” unless you have packet/process evidence.

Suggested wording adjustment
-The behaviour is driven by upstream `api-server` and `kube-controller-manager` components that perform TCP-level liveness/readiness checks against the etcd endpoint. They open a connection, complete the handshake to confirm the serving certificate is valid, and close without issuing a request — this is a cheap way to verify that etcd is accepting TLS traffic without consuming any API quota or writing to the raft log.
+This behaviour is commonly caused by control-plane or monitoring probes that open a TLS connection and close it before sending a gRPC payload. The exact source is deployment-specific; confirm it by correlating `remote-addr` with control-plane/monitoring endpoints before attributing it to a specific component.
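
One way to do the `remote-addr` correlation suggested above is to group the warnings by source host and label each host against an inventory of known endpoints. The sketch below is illustrative only. It assumes JSON log lines with `msg` and `remote-addr` keys; the function name, the `known_endpoints` mapping, and every address in it are hypothetical placeholders to be replaced with real node and monitoring IPs.

```python
import json
from collections import Counter


def count_rejections_by_host(lines):
    """Count "rejected connection" warnings per source host (port stripped)."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines
        if not isinstance(entry, dict):
            continue
        if not str(entry.get("msg", "")).startswith("rejected connection"):
            continue
        remote = str(entry.get("remote-addr", ""))
        # "10.0.0.5:41234" -> "10.0.0.5"; the ephemeral port is not useful here.
        host = remote.rsplit(":", 1)[0] if ":" in remote else remote
        counts[host or "unknown"] += 1
    return counts


# Hypothetical inventory; fill in from the cluster's own node and monitoring IPs.
known_endpoints = {"10.0.0.5": "control-plane node", "10.0.0.7": "monitoring agent"}

# Hypothetical sample lines.
logs = [
    '{"msg":"rejected connection","remote-addr":"10.0.0.5:41234","error":"EOF"}',
    '{"msg":"rejected connection","remote-addr":"10.0.0.5:41290","error":"EOF"}',
    '{"msg":"rejected connection","remote-addr":"10.0.0.9:50000","error":"EOF"}',
]

counts = count_rejections_by_host(logs)
for host, n in counts.most_common():
    print(f"{host}\t{n}\t{known_endpoints.get(host, 'unidentified')}")
```

Hosts that come back as unidentified are the ones worth tracing further before the article attributes the warnings to any specific component.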
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/en/solutions/rejected_connection_EOF_Warnings_in_etcd_Pod_Logs_Are_TLS_Probe_Noise.md`
around lines 35 - 36, The wording in the paragraph that attributes the TCP/TLS
handshake-only probe pattern definitively to upstream components is too strong;
change the sentence referencing `api-server` and `kube-controller-manager` to
present them as possible sources rather than definitive actors (e.g., replace
"The behaviour is driven by upstream `api-server` and `kube-controller-manager`"
with language like "Possible sources include the `api-server` and
`kube-controller-manager`"), and add a short qualifier that concrete
packet/process evidence is required to confirm the exact probe origin.
