Skip to content

rename to vsphere-passthrough-node-controller + add crash-fence controller#7

Open
Varashi wants to merge 2 commits into
mainfrom
feat/crash-fence-controller
Open

rename to vsphere-passthrough-node-controller + add crash-fence controller#7
Varashi wants to merge 2 commits into
mainfrom
feat/crash-fence-controller

Conversation

@Varashi
Copy link
Copy Markdown
Owner

@Varashi Varashi commented Jun 3, 2026

Adds a second, optional controller (fence.py, fence.enabledoff by default) sharing this image and reusing the vCenter client + node↔VM map.

Why

A passthrough-GPU VM can't be vSphere-HA-restarted elsewhere during a host crash, so the node stays down and its RWO volume stays attached to the dead node — k8s won't auto-detach an unreachable node's volume (split-brain protection). A rescheduled stateful pod then hangs on Multi-Attach indefinitely. The node.kubernetes.io/out-of-service taint fixes this (force-detach + force-delete), but must only ever hit a confirmed-dead node.

What it does

Two-gate fence, sustained for fence.graceSeconds:

  1. k8s — node NotReady
  2. vCenter — that node's VM runtime.connectionState ∈ {disconnected, inaccessible, orphaned}

A clean (maintenance) power-off leaves the VM connected; only a real host loss makes it disconnected — so this is disjoint from the maintenance controller and the two never collide. Un-fences on recovery (connected + Ready).

Taint/un-taint only — power-on stays with vSphere HA (it restarts passthrough VMs on the original host once it reconnects); eviction is handled by tolerationSeconds + the taint.

Isolation

Separate Deployment, own ServiceAccount, least-privilege ClusterRole (nodes get/list/watch/patch only — no pods/eviction), independent kill switch (fence.enabled) and fence.dryRun. Off by default — a mis-fire is destructive.

Validation

Rollout

  1. Merge + tag v0.5.0 (CI builds image incl. fence.py + publishes chart).
  2. Bump the HR in Varashi/k8s, set fence.enabled: true + fence.dryRun: true, watch its decisions.
  3. Flip dryRun: false once trusted.

Closes the controller design in Varashi/k8s#166.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added an optional "crash-fence" controller to detect unplanned host crashes and automatically taint/un-taint nodes to enable volume detachment and pod rescheduling (disabled by default, configurable).
  • Documentation

    • Updated README and CHANGELOG with controller details, configuration knobs, and v0.5.0 notes; updated chart install/verify instructions.
  • Chores

    • Project and chart renamed to vsphere-passthrough-node-controller; chart and image defaults/version bumped to v0.5.0; CI/release workflow and install notes updated.
  • Tests

    • Added unit tests covering the fence controller fencing/unfencing logic.
  • Misc

    • Updated .gitignore to ignore Python cache.

…rash)

Adds fence.py — a second, optional Deployment (fence.enabled, off by
default) sharing this image and reusing the vCenter client + node↔VM map.
Automates non-graceful node shutdown for passthrough-GPU workers that
vSphere HA can't restart elsewhere during a host crash: applies the
node.kubernetes.io/out-of-service taint to a node confirmed dead by BOTH
k8s (NotReady) and vCenter (VM connectionState disconnected/inaccessible/
orphaned), sustained graceSeconds, so RWO volumes force-detach and pods
reschedule; removes it on recovery (connected + Ready).

Disjoint from the maintenance controller (clean power-off stays
'connected', only a crash goes 'disconnected') — no coordination needed.
Taint/un-taint only; power-on stays with vSphere HA. Own SA +
least-privilege ClusterRole (nodes only) + kill switch + dryRun.

test_fence.py covers the two-gate guard, grace hold, no-double-apply,
un-fence on recovery, and the partition/notfound guards (8/8).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 3, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

The pull request adds a new optional crash-fence controller (fence.py) that monitors GPU nodes during unplanned ESXi host failures. When both Kubernetes node readiness and vCenter VM connection state indicate a host crash, it applies an out-of-service taint to force pod detachment in stuck Multi-Attach scenarios. The feature includes shared infrastructure helpers in controller.py, comprehensive test coverage, Helm deployment templates, Docker image updates, and documentation.

Changes

Crash-fence controller feature

Layer / File(s) Summary
Shared taint and VM state constants and helpers
controller.py
Module constants define the out-of-service taint key/value and VM dead connection states. VSphereClient.get_vm_connection_state() queries vm.runtime.connectionState, and K8sClient gains taint query/apply/remove methods that preserve existing taints and support dry-run logging.
Fence controller main logic and reconciliation loop
fence.py
FenceController runs a polling loop that reconciles GPU nodes using two-gate logic: applies the out-of-service taint only when Kubernetes node is not Ready AND vCenter VM is disconnected, after both persist for a grace period; removes the taint when both conditions clear. Handles transient Kubernetes errors with retry.
Fence controller unit tests and test infrastructure
test_fence.py
Provides eight test scenarios covering two-gate fencing logic by stubbing heavy dependencies (kubernetes, pyVmomi) and injecting fake implementations. Tests cover no-fence cases, grace period timing, duplicate prevention, and recovery/un-taint flows, with result aggregation and exit status reporting.
Helm deployment template for fence controller
chart/templates/fence.yaml
Conditionally renders ServiceAccount, optional least-privilege ClusterRole with node list/watch/get/patch/update, and a single-replica Deployment running fence.py. Wires configuration via vCenter secret and environment variables for poll/grace/dry-run/GPU label/TLS settings.
Helm chart values for fence controller configuration
chart/values.yaml
Adds fence config block (disabled by default) with poll/grace timing, dry-run mode, and resource sizing for the Deployment; also updates image repository default.
Release metadata, container image, and documentation
chart/Chart.yaml, Dockerfile, README.md, CHANGELOG.md, .gitignore, chart/templates/*, .github/workflows/*
Bumps chart version to 0.5.0, renames chart/project to vsphere-passthrough-node-controller, includes fence.py in Docker image, updates Helm helper names and NOTES, updates CI/release workflow references, documents crash-fence in README/CHANGELOG, and ignores Python bytecode cache.

Sequence Diagram(s)

sequenceDiagram
  participant FenceController
  participant K8sClient
  participant VSphereClient
  FenceController->>K8sClient: list nodes (GPU label)
  K8sClient-->>FenceController: node list
  loop for each node
    FenceController->>K8sClient: get node Ready status
    K8sClient-->>FenceController: Ready state
    FenceController->>VSphereClient: get_vm_connection_state
    VSphereClient-->>FenceController: "connected" or "disconnected"/"inaccessible"/"orphaned"
    alt both gates active + grace elapsed
      FenceController->>K8sClient: apply_out_of_service_taint
      K8sClient-->>FenceController: taint applied
    else both gates inactive + taint present
      FenceController->>K8sClient: remove_out_of_service_taint
      K8sClient-->>FenceController: taint removed
    end
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Poem

🐰 A crash-fence grows to catch the stragglers,
Two gates must both swing open wide,
When nodes fall silent, vCenter whispers,
The taint descends—let pods decide! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately and concisely summarizes the two main changes: project rename to vsphere-passthrough-node-controller and addition of the crash-fence controller feature.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/crash-fence-controller

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Line 10: The changelog declares "## [0.5.0]" but the reference links at the
bottom still treat 0.4.3 as the latest; add a "[0.5.0]:" link definition
pointing to the compare URL for v0.4.3...v0.5.0 (or the repo tag URL for v0.5.0)
and update the "[Unreleased]:" reference to compare against v0.5.0 (change its
right-hand/latest tag from v0.4.3 to v0.5.0); locate the "## [0.5.0]" header and
the link definitions named "[0.5.0]:" and "[Unreleased]:" to make these edits so
the release compare navigation is correct.

In `@chart/templates/fence.yaml`:
- Around line 27-30: The ClusterRole rules in fence.yaml grant both "patch" and
"update" on the "nodes" resource; remove the "update" verb and keep only "patch"
(and any existing verbs like "get","list","watch") to enforce least-privilege
for taint operations—locate the rules block (apiGroups: [""], resources:
["nodes"], verbs: [...]) and delete "update" from the verbs array.
- Line 5: The conditional in the template uses the redundant Go template "and"
operator with a single argument ({{- if and .Values.serviceAccount.create }});
change it to a direct boolean check by replacing the use of "and" with a simple
if on .Values.serviceAccount.create (i.e., use {{- if
.Values.serviceAccount.create }}), ensuring any matching {{ end }} remains
unchanged.

In `@controller.py`:
- Around line 484-488: has_out_of_service_taint currently matches taints by key
only which can erroneously detect/remove taints; update it and the related
functions apply_out_of_service_taint and remove_out_of_service_taint to match
the full taint identity (key, value, effect) — or at minimum key+effect —
instead of key-only. Locate uses of OUT_OF_SERVICE_TAINT_KEY and change the
predicate in has_out_of_service_taint to compare (t.key, t.value, t.effect) to
the controller's OUT_OF_SERVICE taint triple, ensure apply_out_of_service_taint
creates the taint with the same (key,value,effect) tuple, and change
remove_out_of_service_taint to filter/remove only taints matching that full
tuple so you don't touch taints the controller didn't create.

In `@README.md`:
- Around line 121-129: Update the README examples to reference the 0.5.0
chart/tag wherever examples show 0.4.3 so the documented fence.* fields (dryRun,
graceSeconds, pollSeconds) match the supported release, and update the
source-layout description that currently says “single controller.py” to reflect
the new split (mention both controller.py and fence.py) so readers know fence.py
exists; search for occurrences of the chart tag/version, the phrase "single
controller.py", and the fence.* example block to make the changes consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 8030bd06-6a49-48c2-bef3-c134b0815e0d

📥 Commits

Reviewing files that changed from the base of the PR and between d69ef1d and 19c498c.

📒 Files selected for processing (10)
  • .gitignore
  • CHANGELOG.md
  • Dockerfile
  • README.md
  • chart/Chart.yaml
  • chart/templates/fence.yaml
  • chart/values.yaml
  • controller.py
  • fence.py
  • test_fence.py

Comment thread CHANGELOG.md

## [Unreleased]

## [0.5.0] — 2026-06-03
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the 0.5.0 reference links and move Unreleased baseline forward.

0.5.0 is declared, but the link definitions still treat 0.4.3 as latest. This breaks compare navigation for the new release.

Suggested link fixes
-[Unreleased]: https://github.com/Varashi/gpu-node-vsphere-maintenance-controller/compare/v0.4.3...HEAD
+[Unreleased]: https://github.com/Varashi/gpu-node-vsphere-maintenance-controller/compare/v0.5.0...HEAD
+[0.5.0]: https://github.com/Varashi/gpu-node-vsphere-maintenance-controller/compare/v0.4.4...v0.5.0
+[0.4.4]: https://github.com/Varashi/gpu-node-vsphere-maintenance-controller/compare/v0.4.3...v0.4.4
 [0.4.3]: https://github.com/Varashi/gpu-node-vsphere-maintenance-controller/compare/v0.4.2...v0.4.3
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` at line 10, The changelog declares "## [0.5.0]" but the
reference links at the bottom still treat 0.4.3 as the latest; add a "[0.5.0]:"
link definition pointing to the compare URL for v0.4.3...v0.5.0 (or the repo tag
URL for v0.5.0) and update the "[Unreleased]:" reference to compare against
v0.5.0 (change its right-hand/latest tag from v0.4.3 to v0.5.0); locate the "##
[0.5.0]" header and the link definitions named "[0.5.0]:" and "[Unreleased]:" to
make these edits so the release compare navigation is correct.

{{- $fullname := include "gpu-node-vsphere-maintenance-controller.fullname" . -}}
{{- $fenceName := printf "%s-fence" $fullname -}}
{{- $fenceSA := ternary $fenceName (default "default" .Values.serviceAccount.name) .Values.serviceAccount.create -}}
{{- if and .Values.serviceAccount.create }}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove redundant and operator from conditional.

The and operator with a single argument is non-idiomatic Go template syntax. This should be simplified to a direct boolean check.

🔧 Proposed fix
-{{- if and .Values.serviceAccount.create }}
+{{- if .Values.serviceAccount.create }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{{- if and .Values.serviceAccount.create }}
{{- if .Values.serviceAccount.create }}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@chart/templates/fence.yaml` at line 5, The conditional in the template uses
the redundant Go template "and" operator with a single argument ({{- if and
.Values.serviceAccount.create }}); change it to a direct boolean check by
replacing the use of "and" with a simple if on .Values.serviceAccount.create
(i.e., use {{- if .Values.serviceAccount.create }}), ensuring any matching {{
end }} remains unchanged.

Comment on lines +27 to +30
rules:
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch", "patch", "update"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider removing "update" verb for stricter least-privilege.

The ClusterRole includes both "patch" and "update" verbs, but Kubernetes controllers performing taint operations typically only need "patch". The "update" verb enables full object replacement and is more permissive than necessary for this use case.

🔒 Proposed change for stricter RBAC
 rules:
   - apiGroups: [""]
     resources: ["nodes"]
-    verbs: ["get", "list", "watch", "patch", "update"]
+    verbs: ["get", "list", "watch", "patch"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
rules:
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch", "patch", "update"]
rules:
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch", "patch"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@chart/templates/fence.yaml` around lines 27 - 30, The ClusterRole rules in
fence.yaml grant both "patch" and "update" on the "nodes" resource; remove the
"update" verb and keep only "patch" (and any existing verbs like
"get","list","watch") to enforce least-privilege for taint operations—locate the
rules block (apiGroups: [""], resources: ["nodes"], verbs: [...]) and delete
"update" from the verbs array.

Comment thread controller.py
Comment on lines +484 to +488
def has_out_of_service_taint(self, node_name) -> bool:
node = self.get_node(node_name)
return any(
t.key == OUT_OF_SERVICE_TAINT_KEY for t in (node.spec.taints or [])
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Match fence taints by full identity, not key-only.

The current checks/removal use only t.key == OUT_OF_SERVICE_TAINT_KEY. If a node has the same key with a different effect/value, fencing can be skipped and un-fencing can remove taints this controller did not create.

Use (key, value, effect) (or at least key + effect) consistently in has_out_of_service_taint, apply_out_of_service_taint, and remove_out_of_service_taint.

Suggested fix
+    def _is_fence_taint(self, t) -> bool:
+        return (
+            t.key == OUT_OF_SERVICE_TAINT_KEY
+            and getattr(t, "value", None) == OUT_OF_SERVICE_TAINT_VALUE
+            and getattr(t, "effect", None) == "NoExecute"
+        )
+
     def has_out_of_service_taint(self, node_name) -> bool:
         node = self.get_node(node_name)
-        return any(
-            t.key == OUT_OF_SERVICE_TAINT_KEY for t in (node.spec.taints or [])
-        )
+        return any(self._is_fence_taint(t) for t in (node.spec.taints or []))
@@
-        if any(t.key == OUT_OF_SERVICE_TAINT_KEY for t in taints):
+        if any(self._is_fence_taint(t) for t in taints):
             return  # already fenced
@@
-        if not any(t.key == OUT_OF_SERVICE_TAINT_KEY for t in taints):
+        if not any(self._is_fence_taint(t) for t in taints):
             return
@@
-        kept = [t for t in taints if t.key != OUT_OF_SERVICE_TAINT_KEY]
+        kept = [t for t in taints if not self._is_fence_taint(t)]
         self._patch_taints(node_name, kept)

Also applies to: 500-501, 518-525

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@controller.py` around lines 484 - 488, has_out_of_service_taint currently
matches taints by key only which can erroneously detect/remove taints; update it
and the related functions apply_out_of_service_taint and
remove_out_of_service_taint to match the full taint identity (key, value,
effect) — or at minimum key+effect — instead of key-only. Locate uses of
OUT_OF_SERVICE_TAINT_KEY and change the predicate in has_out_of_service_taint to
compare (t.key, t.value, t.effect) to the controller's OUT_OF_SERVICE taint
triple, ensure apply_out_of_service_taint creates the taint with the same
(key,value,effect) tuple, and change remove_out_of_service_taint to
filter/remove only taints matching that full tuple so you don't touch taints the
controller didn't create.

Comment thread README.md
Comment on lines +121 to +129
Enable it (and start with `dryRun` to watch its decisions):

```yaml
fence:
enabled: true
dryRun: true # logs "would fence" without tainting; flip to false when confident
graceSeconds: 60 # both gates must hold this long before fencing
pollSeconds: 20
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update adjacent README examples to 0.5.0 to keep fence docs actionable.

This section documents fence.* values, but later examples still pin chart tag 0.4.3; that version won’t include this feature. Also, the README still says the source layout is a “single controller.py”, which is now outdated after adding fence.py.

Suggested doc updates
-  --version 0.4.3 \
+  --version 0.5.0 \

-    tag: 0.4.3
+    tag: 0.5.0

-Source layout is deliberately tiny — a single `controller.py` plus a
+Source layout is deliberately tiny — `controller.py` + `fence.py` plus a
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 121 - 129, Update the README examples to reference
the 0.5.0 chart/tag wherever examples show 0.4.3 so the documented fence.*
fields (dryRun, graceSeconds, pollSeconds) match the supported release, and
update the source-layout description that currently says “single controller.py”
to reflect the new split (mention both controller.py and fence.py) so readers
know fence.py exists; search for occurrences of the chart tag/version, the
phrase "single controller.py", and the fence.* example block to make the changes
consistently.

The product now covers two passthrough-node host-lifecycle concerns —
planned maintenance and (new) crash fencing — so "maintenance-controller"
no longer fits. Renames repo/image/chart/templates/helpers/docs to
vsphere-passthrough-node-controller (image + chart at
ghcr.io/varashi/[charts/]vsphere-passthrough-node-controller).

GitHub repo renamed (redirects cover old git/web refs). The internal
maintenance-state annotation domain (vsphere-maintenance.boeye.net/*) is
left unchanged — it's controller state, not user-facing, and no nodes
carry it currently.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Varashi Varashi changed the title feat: optional crash-fence controller (out-of-service taint on host crash) rename to vsphere-passthrough-node-controller + add crash-fence controller Jun 3, 2026
@Varashi
Copy link
Copy Markdown
Owner Author

Varashi commented Jun 3, 2026

Folded in the product rename gpu-node-vsphere-maintenance-controller → vsphere-passthrough-node-controller (GitHub repo renamed; redirects cover old refs). The product now spans two passthrough-node host-lifecycle concerns (planned maintenance + crash fencing), so the old name no longer fit. New artifacts on release: ghcr.io/varashi/vsphere-passthrough-node-controller (image) + oci://ghcr.io/varashi/charts/vsphere-passthrough-node-controller (chart). Internal annotation domain vsphere-maintenance.boeye.net/* left unchanged. The k8s-repo cutover (OCIRepository + HR + namespace rename + enable fence in dryRun) is a separate PR to merge after v0.5.0 is published.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
chart/templates/fence.yaml (2)

27-30: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consider removing "update" verb for stricter least-privilege.

The ClusterRole includes both "patch" and "update" verbs, but Kubernetes controllers performing taint operations typically only need "patch". The "update" verb enables full object replacement and is more permissive than necessary for this use case.

🔒 Proposed change for stricter RBAC
 rules:
   - apiGroups: [""]
     resources: ["nodes"]
-    verbs: ["get", "list", "watch", "patch", "update"]
+    verbs: ["get", "list", "watch", "patch"]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@chart/templates/fence.yaml` around lines 27 - 30, The ClusterRole rules for
resource "nodes" currently list both "patch" and "update"; remove the "update"
verb to adhere to least-privilege (keep "patch" for taint operations) by editing
the rules block that targets resources: ["nodes"] (the ClusterRole/rules stanza
in the fence template) so it no longer includes "update" while retaining "get",
"list", "watch", and "patch".

5-5: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove redundant and operator from conditional.

The and operator with a single argument is non-idiomatic Go template syntax. This should be simplified to a direct boolean check.

🔧 Proposed fix
-{{- if and .Values.serviceAccount.create }}
+{{- if .Values.serviceAccount.create }}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@chart/templates/fence.yaml` at line 5, The conditional uses the redundant Go
template "and" operator with a single argument; replace the expression that
references "and .Values.serviceAccount.create" with a direct boolean check of
".Values.serviceAccount.create" in the template conditional (i.e., remove the
"and" operator) so the if statement becomes a simple "{{- if
.Values.serviceAccount.create }}" style check.
CHANGELOG.md (1)

189-200: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the 0.5.0 and 0.4.4 reference links and move Unreleased baseline forward.

The changelog declares ## [0.5.0] and ## [0.4.4] but the link definitions still treat 0.4.3 as latest. This breaks compare navigation for the new releases.

🔗 Suggested link fixes
-[Unreleased]: https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.3...HEAD
+[Unreleased]: https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.5.0...HEAD
+[0.5.0]: https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.4...v0.5.0
+[0.4.4]: https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.3...v0.4.4
 [0.4.3]: https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.2...v0.4.3
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` around lines 189 - 200, Update the changelog link definitions
so the new releases are referenced and Unreleased compares against the newest
tag: add link definitions for [0.4.4] and [0.5.0] (e.g. [0.4.4]:
https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.3...v0.4.4
and [0.5.0]:
https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.3...v0.5.0),
then change the [Unreleased] baseline to compare from v0.5.0
(https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.5.0...HEAD)
so the "Unreleased" link points to the correct latest released tag.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@CHANGELOG.md`:
- Around line 189-200: Update the changelog link definitions so the new releases
are referenced and Unreleased compares against the newest tag: add link
definitions for [0.4.4] and [0.5.0] (e.g. [0.4.4]:
https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.3...v0.4.4
and [0.5.0]:
https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.4.3...v0.5.0),
then change the [Unreleased] baseline to compare from v0.5.0
(https://github.com/Varashi/vsphere-passthrough-node-controller/compare/v0.5.0...HEAD)
so the "Unreleased" link points to the correct latest released tag.

In `@chart/templates/fence.yaml`:
- Around line 27-30: The ClusterRole rules for resource "nodes" currently list
both "patch" and "update"; remove the "update" verb to adhere to least-privilege
(keep "patch" for taint operations) by editing the rules block that targets
resources: ["nodes"] (the ClusterRole/rules stanza in the fence template) so it
no longer includes "update" while retaining "get", "list", "watch", and "patch".
- Line 5: The conditional uses the redundant Go template "and" operator with a
single argument; replace the expression that references "and
.Values.serviceAccount.create" with a direct boolean check of
".Values.serviceAccount.create" in the template conditional (i.e., remove the
"and" operator) so the if statement becomes a simple "{{- if
.Values.serviceAccount.create }}" style check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 714fa2e1-53cb-49df-8080-13f3f34992ac

📥 Commits

Reviewing files that changed from the base of the PR and between 19c498c and b5e3502.

📒 Files selected for processing (16)
  • .github/workflows/ci.yaml
  • .github/workflows/release.yaml
  • CHANGELOG.md
  • Dockerfile
  • README.md
  • chart/Chart.yaml
  • chart/templates/NOTES.txt
  • chart/templates/_helpers.tpl
  • chart/templates/clusterrole.yaml
  • chart/templates/clusterrolebinding.yaml
  • chart/templates/configmap.yaml
  • chart/templates/deployment.yaml
  • chart/templates/fence.yaml
  • chart/templates/secret.yaml
  • chart/templates/serviceaccount.yaml
  • chart/values.yaml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant