Skip to content

<fix>[storage]: fix encrypted data volume host selection#4299

Open
ZStack-Robot wants to merge 1 commit into
feature-zsv-5.1.0-encryptionfrom
sync/zstackio/codex/encrypted-data-volume-create@@2
Open

<fix>[storage]: fix encrypted data volume host selection#4299
ZStack-Robot wants to merge 1 commit into
feature-zsv-5.1.0-encryptionfrom
sync/zstackio/codex/encrypted-data-volume-create@@2

Conversation

@ZStack-Robot

Copy link
Copy Markdown
Collaborator

Summary

Fix ZSV-12438 encrypted data volume creation paths where LUKS secret staging or key-provider binding is missing.

Testing

  • cbok zsv compile deployed to 172.26.213.50
  • Manual create-VM matrix on Local/NFS/SBLK/RBD: 20/20 PASS
  • Groovy regression coverage added/updated

Resolves: ZSV-12438

Changes

  • Select a connected KVM host when standalone encrypted data-volume instantiation needs LUKS secret staging.
  • Clean key-provider binding when encrypted instantiation fails.

sync from gitlab !10256

@coderabbitai

coderabbitai Bot commented Jun 18, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@MatheMatrix, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 45 minutes and 38 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ce5f4b4e-f50e-4649-b715-fe57623debc3

📥 Commits

Reviewing files that changed from the base of the PR and between c35fb73 and a3ffdf2.

📒 Files selected for processing (7)
  • header/src/main/java/org/zstack/header/storage/primary/PSCapacityExtensionPoint.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageBase.java
  • plugin/localstorage/src/main/java/org/zstack/storage/primary/local/LocalStorageAllocatorFactory.java
  • storage/src/main/java/org/zstack/storage/encrypt/VolumeEncryptedInitialExtension.java
  • storage/src/main/java/org/zstack/storage/volume/VolumeManagerImpl.java
  • test/src/test/groovy/org/zstack/storage/volume/VolumeManagerImplHostSelectionTest.groovy
  • test/src/test/groovy/org/zstack/test/unittest/storage/encrypt/VolumeEncryptedInitialExtensionTest.groovy

概述

本 PR 为加密卷就地创建流程增加了宿主机自动选择能力:在调用方未提供 hostUuid 时,通过新增的 PSCapacityExtensionPoint 扩展点方法从安装 URL 推导或从已连接 KVM 主机中兜底选取目标主机,同步修复了 Ceph LUKS 路径的目标主机回填逻辑,并移除了 VolumeEncryptedInitialExtension 中对 hostUuid 的前置依赖。

变更内容

加密卷就地创建宿主机自动选择流程

层级 / 文件 摘要
PSCapacityExtensionPoint 契约与 LocalStorage 实现
header/.../PSCapacityExtensionPoint.java, plugin/localstorage/.../LocalStorageAllocatorFactory.java
接口新增默认方法 getHostUuidFromAllocatedInstallUrlLocalStorageAllocatorFactory 覆盖并委托 LocalStorageUtils 解析安装 URL 中的主机 UUID。
VolumeEncryptedInitialExtension 移除 hostUuid 前置守卫
storage/.../VolumeEncryptedInitialExtension.java
删除 preInstantiateVolumemsg.getHostUuid() 的空值检测与提前返回,加密卷处理流程不再依赖调用方传入 hostUuid
VolumeManagerImpl 宿主机选择编排与失败清理
storage/.../VolumeManagerImpl.java
注入 VolumeEncryptedResourceKeyBackendcreate-data-volume-from-template 流程引入 selectedHostUuid/encryptInPlaceRequired 状态;分配回调中通过扩展点推导或兜底选择 KVM 主机;下载与加密阶段统一使用 selectedHostUuid;新增 requiresEncryptInPlacegetHostUuidFromAllocatedInstallUrlselectHostForEncryptInPlace 三个私有辅助方法及失败路径 key provider 清理方法。
Ceph LUKS 路径目标主机回填
plugin/ceph/.../CephPrimaryStorageBase.java
createEmptyVolume 的 LUKS 分支在 destHostUuid 为空时调用 findConnectedHostForCephLuks 回填目标主机,prepareLuksEnvelopeDekOnHost 统一使用解析后的 UUID。
单元测试
test/.../VolumeManagerImplHostSelectionTest.groovy, test/.../VolumeEncryptedInitialExtensionTest.groovy
验证 selectHostForEncryptInPlace 正确过滤 Connected/Disconnected 主机引用;验证 preInstantiateVolume 在无 hostUuid 时正确触发快照 key 继承、key provider 挂载和 DEK 材料化。

时序图

sequenceDiagram
  rect rgba(135, 206, 235, 0.5)
    Note over VolumeManagerImpl: create-data-volume-from-template
    VolumeManagerImpl->>VolumeManagerImpl: 计算 encryptInPlaceRequired
    VolumeManagerImpl->>PSCapacityExtensionPoint: getHostUuidFromAllocatedInstallUrl(installUrl)
    PSCapacityExtensionPoint-->>VolumeManagerImpl: hostUuid 或 null
  end
  rect rgba(255, 200, 100, 0.5)
    Note over VolumeManagerImpl: 宿主机兜底选择
    VolumeManagerImpl->>VolumeManagerImpl: selectHostForEncryptInPlace(psUuid)
    VolumeManagerImpl->>DatabaseFacade: 查询 PrimaryStorageHostRefVO (Connected)
    DatabaseFacade-->>VolumeManagerImpl: connectedHostUuids
    VolumeManagerImpl->>DatabaseFacade: 查询 HostVO (KVM/Connected/Enabled)
    DatabaseFacade-->>VolumeManagerImpl: HostInventory(selectedHostUuid)
  end
  rect rgba(144, 238, 144, 0.5)
    Note over VolumeManagerImpl: 加密执行与失败清理
    VolumeManagerImpl->>VolumeInPlaceEncryptor: 使用 selectedHostUuid 执行加密
    alt 创建失败
      VolumeManagerImpl->>VolumeEncryptedResourceKeyBackend: detachKeyProviderFromVolume(volume)
    end
  end
Loading

代码审查复杂度评估

🎯 3 (Moderate) | ⏱️ ~25 分钟

小诗

🐇 小兔子敲代码,加密卷找不到主机愁眉锁,
扩展点一出手,URL 解析不用愁,
KVM 已连接,自动来牵手,
失败也不怕,清理有保守,
测试全覆盖,Bug 无处走!✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed 标题准确概括了主要变更内容,即修复加密数据卷的主机选择问题。
Description check ✅ Passed 描述与变更集相关,解释了修复的具体问题和实现方案,包括测试验证信息。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/zstackio/codex/encrypted-data-volume-create@@2

Comment @coderabbitai help to get the list of available commands and usage tips.

Standalone encrypted data-volume creation can miss VM host context.

Select a connected KVM host when LUKS secret staging is required.

Clean key-provider binding if encrypted instantiation fails.

Tests:
- storage and ceph module compile
- related storage encryption unit tests

Resolves: ZSV-12438

Change-Id: I6f4a843b7c15cead75d75f248d77f8760d7fe106
@MatheMatrix MatheMatrix force-pushed the sync/zstackio/codex/encrypted-data-volume-create@@2 branch from c35fb73 to a3ffdf2 Compare June 18, 2026 02:51
@zstack-robot-2

Copy link
Copy Markdown
Collaborator

Comment from yaohua.wu:

Review: MR !10256 — ZSV-12438

结论: APPROVED(含若干 Suggestion)

背景

ZSV-12438(P0):加密虚拟机内/独立新建加密数据盘报错,zce(ceph)、local 存储均复现。根因是独立加密数据盘 instantiate 时缺少 host 来 staging LUKS secret,且 key-provider 绑定缺失。本 MR 在 zstack 侧承载主机自动选择 + 失败清理的基础逻辑。

变更评估

  • VolumeEncryptedInitialExtension:移除 preInstantiateVolumehostUuid 的空值前置返回,使独立加密数据盘(无 VM、无 hostUuid)也能完成 key provider 绑定与 DEK 材料化。这是核心修复方向,正确。
  • VolumeManagerImpl:create-from-template 流程引入 selectedHostUuid/encryptInPlaceRequired,优先从 allocatedInstallUrl 推导宿主机(local),否则 selectHostForEncryptInPlace 兜底选连接态 KVM 主机;失败路径 detachVolumeKeyProviderOnCreateFailure 清理绑定。skip() 重构为 !encryptInPlaceRequired 与旧逻辑等价,无回归。
  • CephPrimaryStorageBase:LUKS createempty 在 destHost 为空时回填连接态主机,findConnectedHostForCephLuks 增加 Connected/Enabled 过滤 + 排序,避免选到失联/禁用主机。错误信息更明确。
  • 扩展点 getHostUuidFromAllocatedInstallUrl 设计合理:默认返回 null,仅 LocalStorage 实现解析 URL。
  • 单测覆盖 selectHostForEncryptInPlace(Connected 过滤)与无 hostUuid 的 preInstantiateVolume,方向到位。

发现项

🟡 Warning — CephPrimaryStorageBase.java(~L1798)

destHostUuid = findConnectedHostForCephLuks();
HostVO selectedHost = dbf.findByUuid(destHostUuid, HostVO.class);

findConnectedHostForCephLuks() 可能返回 null(无连接态主机)。建议在 findByUuid 前显式判空,避免依赖 findByUuid(null, ...) 的实现兼容性;当前虽有后续 isBlank(destHostUuid) 兜底报错,但 findByUuid(null) 在部分实现下可能抛异常而非返回 null。

🟢 Suggestion — VolumeManagerImpl.selectHostForEncryptInPlace(~L733)
hostRefs 为空(如 NFS 等未填充 PrimaryStorageHostRefVO 的共享存储)时,会跳过 connected 过滤,仅按 cluster + HostVO 状态查询。逻辑可接受,但建议补一行注释说明"无 PS-host ref 表数据时按 cluster 内连接态主机兜底",便于后续维护。

🟢 Suggestion — detachVolumeKeyProviderOnCreateFailure 仅在当前 create 回调失败分支调用。请确认其它创建/instantiate 失败入口(如非模板路径)是否也需要同样的绑定清理,避免遗留孤儿 EncryptedResourceKeyRefVO

🟢 Suggestion — findConnectedHostForCephLuksselectHostForEncryptInPlace 存在相似的"选连接态 KVM 主机"逻辑,分处两层可以理解;若后续有第三处复用,建议下沉为公共工具方法。

关联 MR

  • premium !14397(同源分支 codex/encrypted-data-volume-create@@2):在 API 拦截器层为 APICreateDataVolumeMsg/APICreateDataVolumeFromVolumeTemplateMsg(encrypted=true)及 attach 未实例化加密卷增加"默认 key provider 可用性"校验。
  • 跨仓一致性:premium 侧对"未实例化加密卷"走默认 provider 校验(isVolumeNotInstantiated),与本 MR VolumeEncryptedInitialExtension 在 instantiate 时绑定默认 provider 的行为前后呼应、语义一致。两 MR 必须一起合并:premium 的前置校验依赖本 MR 提供的"未实例化也能绑定默认 provider"能力,否则会出现"校验通过但实际绑定逻辑缺失"。

兼容性

目标分支 feature-zsv-5.1.0-encryptionmerge_status: can_be_merged,与目标分支无冲突。


🤖 Robot Reviewer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants