KVM: fix UEFI disk-only instance snapshot NVRAM handling #13020

Open
Kunalbehbud wants to merge 4 commits into apache:4.22 from Kunalbehbud:fix/kvm-uefi-disk-only-instance-snapshot-nvram-4.22

@Kunalbehbud

Description

KVM disk-only instance snapshots do not capture the active UEFI NVRAM state, which makes revert unsafe for UEFI guests.

This PR fixes the KVM disk-only instance snapshot flow to:

  • plumb optional NVRAM sidecar metadata through the management and KVM agent commands
  • copy the active UEFI NVRAM file during snapshot creation and restore it during revert
  • clean up the NVRAM sidecar during delete and merge flows
  • gate create, revert, and sidecar cleanup on host UEFI/NVRAM capabilities and clear stale capability details on reconnect
  • suspend UEFI guests before copying NVRAM, freezing filesystems first when quiescevm=true
  • preserve successful snapshot metadata if post-snapshot thaw/resume fails, surfacing the issue via warning/alert instead of discarding the snapshot
  • reject reverting UEFI disk-only snapshots that do not contain NVRAM state
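
The ordering described in the list above (freeze if quiescing, suspend, copy the NVRAM, then undo in reverse while keeping the snapshot if only the undo step fails) can be sketched as follows. This is a minimal Python illustration, not the PR's Java code: `FakeGuest`, `SnapshotResult`, and `snapshot_nvram` are hypothetical names, and resuming before thawing is an assumption about the cleanup order.

```python
import shutil
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SnapshotResult:
    nvram_sidecar: Path
    warnings: list = field(default_factory=list)


class FakeGuest:
    """Hypothetical stand-in for hypervisor calls; records the call order."""

    def __init__(self, fail_on=()):
        self.calls = []
        self.fail_on = set(fail_on)

    def _do(self, name):
        self.calls.append(name)
        if name in self.fail_on:
            raise RuntimeError(f"{name} failed")

    def freeze(self):
        self._do("freeze")

    def thaw(self):
        self._do("thaw")

    def suspend(self):
        self._do("suspend")

    def resume(self):
        self._do("resume")


def snapshot_nvram(guest, active_nvram, sidecar, quiesce):
    """Copy the NVRAM file while the guest is paused, then undo the pause."""
    frozen = suspended = False
    warnings = []
    try:
        if quiesce:
            guest.freeze()
            frozen = True
        guest.suspend()
        suspended = True
        shutil.copy2(active_nvram, sidecar)
    finally:
        # Undo in reverse order (assumed). A failure here is recorded as a
        # warning so that a successful copy is not discarded.
        for needed, action in ((suspended, guest.resume), (frozen, guest.thaw)):
            if not needed:
                continue
            try:
                action()
            except RuntimeError as exc:
                warnings.append(str(exc))
    return SnapshotResult(Path(sidecar), warnings)
```

If the copy itself fails, the exception still propagates after the cleanup runs, so only post-copy thaw or resume problems are downgraded to warnings.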

Older UEFI disk-only snapshots created without an NVRAM sidecar are intentionally rejected on revert.
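
As a rough illustration of that rejection rule, the check could look like the sketch below. It is written in Python for brevity; `DiskOnlySnapshot`, `RevertRejected`, and `check_revert_allowed` are hypothetical names and do not come from the PR.

```python
from dataclasses import dataclass
from typing import Optional


class RevertRejected(Exception):
    """Raised when a revert would leave a UEFI guest without matching NVRAM."""


@dataclass(frozen=True)
class DiskOnlySnapshot:
    name: str
    vm_uses_uefi: bool
    nvram_sidecar: Optional[str] = None  # path of the sidecar, if one was saved


def check_revert_allowed(snapshot: DiskOnlySnapshot) -> None:
    # Legacy UEFI snapshots have no sidecar; reverting only the disks would
    # pair old disk contents with whatever NVRAM state is currently active.
    if snapshot.vm_uses_uefi and not snapshot.nvram_sidecar:
        raise RevertRejected(
            f"snapshot {snapshot.name} has no NVRAM sidecar; revert refused"
        )
```

Non-UEFI snapshots pass the check unchanged, so the guard only affects the guests this PR targets.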

This PR only covers disk-only instance snapshots for KVM UEFI VMs. snapshotMemory=true / internal snapshots remain out of scope.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

N/A

How Has This Been Tested?

  • mvn -pl engine/orchestration,engine/storage/snapshot,plugins/hypervisors/kvm -am -Dtest=AgentManagerImplTest,KvmFileBasedStorageVmSnapshotStrategyTest,LibvirtDiskOnlyVMSnapshotCommandWrapperTest -Dsurefire.failIfNoSpecifiedTests=false test
  • Live-tested on a 4.22.1 KVM environment with a UEFI VM:
    • created a disk-only instance snapshot
    • reverted the snapshot and booted the VM successfully
    • deleted a parent snapshot and verified the merge path removed the old NVRAM sidecar
    • corrupted the active NVRAM after the merge, reverted again, verified the active NVRAM checksum matched the snapshot sidecar before boot, and booted successfully
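
The checksum comparison in the last step can be reproduced with a few lines of Python. The description does not say which checksum tool was used, so SHA-256 is an assumption here, and the function names are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def nvram_matches_sidecar(active_nvram, sidecar):
    """True when the active NVRAM file is byte-identical to the sidecar."""
    return sha256_of(active_nvram) == sha256_of(sidecar)
```

Running `nvram_matches_sidecar` after the revert and before booting the VM mirrors the manual verification described above.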

How did you try to break this feature and the system with this change?

  • tried reverting a UEFI disk-only snapshot without an NVRAM sidecar and verified the revert is rejected
  • covered stale host capability details on reconnect
  • covered the case where freeze verification fails after a successful freeze and verified cleanup still thaws the guest
  • covered post-snapshot thaw/resume failures and verified the snapshot metadata is preserved while the warning/alert path is exercised
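
For the stale-capability case, the idea is that a capability the reconnecting agent no longer reports must be removed rather than left at its old value. A minimal Python sketch, assuming host details are a flat string map; the key names below are placeholders, not the detail keys used in the PR:

```python
# Placeholder key names, for illustration only.
CAPABILITY_KEYS = ("example.uefi.supported", "example.nvram.sidecar.supported")


def sync_capabilities(stored, reported):
    """Return stored host details updated from what the agent reported."""
    updated = dict(stored)
    for key in CAPABILITY_KEYS:
        if str(reported.get(key, "")).lower() == "true":
            updated[key] = "true"
        else:
            # Absent or false on reconnect: drop the stale entry.
            updated.pop(key, None)
    return updated
```

Details outside `CAPABILITY_KEYS` are left untouched, so the sync cannot clobber unrelated host metadata.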

@sureshanaparti
Contributor

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov

codecov bot commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 52.65589% with 205 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.66%. Comparing base (d75acb6) to head (f657823).
⚠️ Report is 30 commits behind head on 4.22.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...LibvirtCreateDiskOnlyVMSnapshotCommandWrapper.java | 53.16% | 60 Missing and 14 partials ⚠️ |
| ...napshot/KvmFileBasedStorageVmSnapshotStrategy.java | 62.38% | 30 Missing and 11 partials ⚠️ |
| ...LibvirtRevertDiskOnlyVMSnapshotCommandWrapper.java | 55.00% | 19 Missing and 8 partials ⚠️ |
| ...java/com/cloud/agent/manager/AgentManagerImpl.java | 41.93% | 9 Missing and 9 partials ⚠️ |
| ...t/api/storage/CreateDiskOnlyVmSnapshotCommand.java | 0.00% | 10 Missing ⚠️ |
| ...t/api/storage/DeleteDiskOnlyVmSnapshotCommand.java | 38.46% | 8 Missing ⚠️ |
| ...t/api/storage/RevertDiskOnlyVmSnapshotCommand.java | 42.85% | 8 Missing ⚠️ |
| ...LibvirtDeleteDiskOnlyVMSnapshotCommandWrapper.java | 66.66% | 6 Missing and 2 partials ⚠️ |
| ...ervisor/kvm/resource/LibvirtComputingResource.java | 0.00% | 7 Missing ⚠️ |
| ...nt/api/storage/CreateDiskOnlyVmSnapshotAnswer.java | 33.33% | 4 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #13020      +/-   ##
============================================
+ Coverage     17.61%   17.66%   +0.05%     
- Complexity    15693    15764      +71     
============================================
  Files          5919     5921       +2     
  Lines        532005   532795     +790     
  Branches      65057    65147      +90     
============================================
+ Hits          93696    94117     +421     
- Misses       427746   428037     +291     
- Partials      10563    10641      +78     

| Flag | Coverage Δ |
|---|---|
| uitests | 3.70% <ø> (-0.01%) ⬇️ |
| unittests | 18.73% <52.65%> (+0.05%) ⬆️ |

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17485

winterhazel requested a review from JoaoJandre April 14, 2026 14:10

@github-actions

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

kunal.behbudzade added 3 commits April 14, 2026 20:46

  • Add the command payload, answer metadata, and host capability plumbing required for KVM disk-only instance snapshots to carry UEFI NVRAM state between management and the KVM agent. Also synchronize host capability booleans on reconnect so stale UEFI/NVRAM support details are removed when an older agent reconnects.
  • Copy the active UEFI NVRAM file as a sidecar during disk-only instance snapshot creation, restore it on revert, and clean it up during delete and merge flows. Also tighten host capability checks, preserve successful snapshot metadata when post-snapshot thaw or resume fails, and reject reverting UEFI disk-only snapshots that do not contain NVRAM state.
  • Cover host capability synchronization, UEFI NVRAM sidecar handling across create/revert/delete flows, and the running-VM recovery paths for disk-only instance snapshots.

Kunalbehbud force-pushed the fix/kvm-uefi-disk-only-instance-snapshot-nvram-4.22 branch from f657823 to 2bc9051 April 14, 2026 17:50

@Kunalbehbud
Author

Rebased this branch on the latest 4.22 to resolve the merge conflict in AgentManagerImpl after the recent VDDK host detail changes.

Local verification passed again on top of the updated base:
mvn -pl engine/orchestration,engine/storage/snapshot,plugins/hypervisors/kvm -am -Dtest=AgentManagerImplTest,KvmFileBasedStorageVmSnapshotStrategyTest,LibvirtDiskOnlyVMSnapshotCommandWrapperTest -Dsurefire.failIfNoSpecifiedTests=false test

The new GitHub Actions runs for this fork push are currently in action_required, so they appear to be waiting for maintainer approval before re-running.
