Live VM migration fails, and the VM is stopped, when 'Error'-state VM snapshots exist #12998

@sureshanaparti

problem

Live migration of a VM fails when 'Error'-state VM snapshots exist for it, and the VM is stopped as a result.

2026-04-10 08:06:55,931 DEBUG [c.c.a.m.ClusteredAgentAttache] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118, ctx-42e19861]) (logid:bfef3797) Seq 1-5825124643027026430: Routed from 32986137363417
2026-04-10 08:06:55,933 DEBUG [c.c.a.t.Request] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118, ctx-42e19861]) (logid:bfef3797) Seq 1-5825124643027026430: Sending  { Cmd , MgmtId: 32986137363417, via: 1(ref-trl-11538-k-Mol8-suresh-anaparti-kvm1), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.RestoreVMSnapshotCommand":{"snapshots":[{"id":"1","snapshotName":"i-2-3-VM_VS_20260410080110","type":"DiskAndMemory","createTime":"1775808070000","quiescevm":"true"}],"snapshotAndParents":{},"volumeTOs":[{"uuid":"4fbf60a0-9ecb-4fa5-8238-35363309b8fe","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"d90d639c-b4df-387d-a3cc-c4a9fad1a939","name":"ref-trl-11538-k-Mol8-suresh-anaparti-kvm-pri2","id":"2","poolType":"NetworkFilesystem","host":"10.0.32.4","path":"/acs/primary/ref-trl-11538-k-Mol8-suresh-anaparti/ref-trl-11538-k-Mol8-suresh-anaparti-kvm-pri2","port":"2049","url":"NetworkFilesystem://10.0.32.4/acs/primary/ref-trl-11538-k-Mol8-suresh-anaparti/ref-trl-11538-k-Mol8-suresh-anaparti-kvm-pri2/?ROLE=Primary&STOREUUID=d90d639c-b4df-387d-a3cc-c4a9fad1a939","isManaged":"false"}},"name":"ROOT-3","size":"(8.00 GB) 8589934592","path":"4fbf60a0-9ecb-4fa5-8238-35363309b8fe","volumeId":"3","vmName":"i-2-3-VM","accountId":"2","format":"QCOW2","provisioningType":"THIN","poolId":"2","id":"3","deviceId":"0","hypervisorType":"KVM","directDownload":"false","deployAsIs":"false","followRedirects":"false"},{"uuid":"f98a16a9-d53a-4c32-b1c4-f20340435757","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"c3812f1e-6c98-408b-80bf-8c294ecef883","name":"pflexcspool1","id":"5","poolType":"PowerFlex","host":"10.0.32.170","path":"3357f56800000000","port":"0","url":"PowerFlex://10.0.32.170/3357f56800000000/?ROLE=Primary&STOREUUID=c3812f1e-6c98-408b-80bf-8c294ecef883","isManaged":"true"}},"name":"MyDATA01","size":"(8.00 GB) 
8589934592","path":"8a4d19710000005a:vol-20-2f1e-7565","volumeId":"20","vmName":"i-2-3-VM","accountId":"2","format":"QCOW2","provisioningType":"THIN","poolId":"5","id":"20","deviceId":"1","hypervisorType":"KVM","directDownload":"false","deployAsIs":"false","followRedirects":"false"}],"vmName":"i-2-3-VM","guestOSType":"CentOS 5.5 (64-bit)","wait":"0","bypassHostMaintenance":"false"}}] }
2026-04-10 08:06:55,950 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:[]) (logid:) Seq 1-5825124643027026430: Processing:  { Ans: , MgmtId: 32986137363417, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":{"result":"false","details":"java.lang.NullPointerException: Cannot invoke "java.lang.Boolean.booleanValue()" because the return value of "com.cloud.agent.api.VMSnapshotTO.getCurrent()" is null
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRestoreVMSnapshotCommandWrapper.execute(LibvirtRestoreVMSnapshotCommandWrapper.java:69)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRestoreVMSnapshotCommandWrapper.execute(LibvirtRestoreVMSnapshotCommandWrapper.java:39)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
        at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1973)
        at com.cloud.agent.Agent.processRequest(Agent.java:778)
        at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1193)
        at com.cloud.utils.nio.Task.call(Task.java:83)
        at com.cloud.utils.nio.Task.call(Task.java:29)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)
","wait":"0","bypassHostMaintenance":"false"}}] }
...
2026-04-10 08:06:55,951 INFO  [c.c.v.ClusteredVirtualMachineManagerImpl] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118, ctx-42e19861]) (logid:bfef3797) Migration was unsuccessful.  Cleaning up: VM instance {"id":3,"instanceName":"i-2-3-VM","state":"Running","type":"User","uuid":"7b213c30-2290-4801-beb9-5a70a3b5ce3d"}

2026-04-10 08:06:59,232 DEBUG [c.c.a.t.Request] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118, ctx-42e19861]) (logid:bfef3797) Seq 1-5825124643027026433: Sending  { Cmd , MgmtId: 32986137363417, via: 1(ref-trl-11538-k-Mol8-suresh-anaparti-kvm1), Ver: v1, Flags: 100011, [{"com.cloud.agent.api.StopCommand":{"isProxy":"false","checkBeforeCleanup":"false","forceStop":"false","vlanToPersistenceMap":{"2671":"true"},"volumesToDisconnect":[],"vmName":"i-2-3-VM","executeInSequence":"false","wait":"0","bypassHostMaintenance":"false"}}] }
2026-04-10 08:06:59,232 WARN  [c.c.v.ClusteredVirtualMachineManagerImpl] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118, ctx-42e19861]) (logid:bfef3797) Unable to transition to a new state from Running via OperationFailed

2026-04-10 08:06:59,235 INFO  [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118, ctx-42e19861]) (logid:bfef3797) Rethrow exception java.lang.ClassCastException: class com.cloud.agent.api.Answer cannot be cast to class com.cloud.agent.api.RestoreVMSnapshotAnswer (com.cloud.agent.api.Answer and com.cloud.agent.api.RestoreVMSnapshotAnswer are in unnamed module of loader 'app')
2026-04-10 08:06:59,235 DEBUG [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118]) (logid:bfef3797) Done with run of VM work job: com.cloud.vm.VmWorkMigrate for VM 3, job origin: 117
2026-04-10 08:06:59,236 ERROR [c.c.v.VmWorkJobDispatcher] (Work-Job-Executor-35:[ctx-2b815626, job-117/job-118]) (logid:bfef3797) Unable to complete AsyncJob {"accountId":2,"cmd":"com.cloud.vm.VmWorkMigrate","cmdInfo":"rO0ABXNyABpjb20uY2xvdWQudm0uVm1Xb3JrTWlncmF0ZRdxQXtPtzYqAgAGSgAJc3JjSG9zdElkTAAJY2x1c3RlcklkdAAQTGphdmEvbGFuZy9Mb25nO0wABmhvc3RJZHEAfgABTAAFcG9kSWRxAH4AAUwAB3N0b3JhZ2V0AA9MamF2YS91dGlsL01hcDtMAAZ6b25lSWRxAH4AAXhyABNjb20uY2xvdWQudm0uVm1Xb3Jrn5m2VvAlZ2sCAARKAAlhY2NvdW50SWRKAAZ1c2VySWRKAAR2bUlkTAALaGFuZGxlck5hbWV0ABJMamF2YS9sYW5nL1N0cmluZzt4cAAAAAAAAAACAAAAAAAAAAIAAAAAAAAAA3QAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAAAAAAAAAnNyAA5qYXZhLmxhbmcuTG9uZzuL5JDMjyPfAgABSgAFdmFsdWV4cgAQamF2YS5sYW5nLk51bWJlcoaslR0LlOCLAgAAeHAAAAAAAAAAAXEAfgAJcQB-AAlwcQB-AAk","cmdVersion":0,"completeMsid":null,"created":"Fri Apr 10 08:06:34 UTC 2026","id":118,"initMsid":32986137363417,"instanceId":null,"instanceType":null,"lastPolled":null,"lastUpdated":null,"processStatus":0,"removed":null,"result":null,"resultCode":0,"status":"IN_PROGRESS","userId":2,"uuid":"76ff9b90-ac44-4ed9-89d9-43a3a777b16f"}, job origin: 117 java.lang.ClassCastException: class com.cloud.agent.api.Answer cannot be cast to class com.cloud.agent.api.RestoreVMSnapshotAnswer (com.cloud.agent.api.Answer and com.cloud.agent.api.RestoreVMSnapshotAnswer are in unnamed module of loader 'app')
        at com.cloud.vm.VirtualMachineManagerImpl.checkVmOnHost(VirtualMachineManagerImpl.java:2405)
        at com.cloud.vm.VirtualMachineManagerImpl.migrate(VirtualMachineManagerImpl.java:2875)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrate(VirtualMachineManagerImpl.java:2733)
        at com.cloud.vm.VirtualMachineManagerImpl.orchestrateMigrate(VirtualMachineManagerImpl.java:5524)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at com.cloud.vm.VmWorkJobHandlerProxy.handleVmWorkJob(VmWorkJobHandlerProxy.java:102)
        at com.cloud.vm.VirtualMachineManagerImpl.handleVmWorkJob(VirtualMachineManagerImpl.java:5621)
        at com.cloud.vm.VmWorkJobDispatcher.runJob(VmWorkJobDispatcher.java:99)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:661)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:609)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:840)

versions

ACS 4.20.3 RC4 + Advanced Zone + KVM hypervisor

The steps to reproduce the bug

  1. Deploy a VM
  2. Create a VM snapshot that fails (if it succeeds, set its state to Error in the DB)
  3. Live migrate VM to another host

What to do about it?

Expected: live migration of the VM succeeds even when failed (Error-state) VM snapshots exist, and the VM is not stopped if the migration fails.
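
The two exceptions in the log point to two separate gaps: a null `Boolean` returned by `VMSnapshotTO.getCurrent()` is unboxed on the KVM agent, and the resulting generic `Answer` is then cast to `RestoreVMSnapshotAnswer` on the management server without a type check. Below is a minimal Java sketch of defensive handling for both. The classes `SnapshotTO`, `Answer`, and `RestoreAnswer` and the helper methods are hypothetical stand-ins written for illustration, not CloudStack code, and the actual fix may look different (for example, it might skip Error-state snapshots before the command is built).

```java
import java.util.Optional;

public class SnapshotGuards {
    // Hypothetical stand-in for a snapshot transfer object whose flag may be unset.
    static class SnapshotTO {
        private final Boolean current;
        SnapshotTO(Boolean current) { this.current = current; }
        Boolean getCurrent() { return current; }
    }

    // Hypothetical stand-ins for a generic agent answer and a command-specific subclass.
    static class Answer {
        final boolean result;
        final String details;
        Answer(boolean result, String details) { this.result = result; this.details = details; }
    }

    static class RestoreAnswer extends Answer {
        RestoreAnswer(boolean result, String details) { super(result, details); }
    }

    // Null-safe check: a missing flag is treated as "not current" instead of throwing
    // a NullPointerException during auto-unboxing.
    static boolean isCurrent(SnapshotTO snapshot) {
        return Boolean.TRUE.equals(snapshot.getCurrent());
    }

    // Type-safe narrowing: a generic failure answer yields an empty Optional instead of
    // a ClassCastException, so the caller can report the failure details cleanly.
    static Optional<RestoreAnswer> asRestoreAnswer(Answer answer) {
        if (answer instanceof RestoreAnswer) {
            return Optional.of((RestoreAnswer) answer);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(isCurrent(new SnapshotTO(null)));
        Answer failure = new Answer(false, "agent-side error");
        System.out.println(asRestoreAnswer(failure).isPresent());
    }
}
```

With checks of this kind, a failed restore would surface as an ordinary unsuccessful answer that the caller can handle, rather than as an exception that aborts the migration clean-up path.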
