Attempting to rescue a volume-based instance doesn't return an error but leaves the VM in an error state

Bug #1978958 reported by Jonathan Heathcote
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Medium
Assigned to: Rajesh Tailor
Milestone: (none)

Bug Description

Description
===========

When attempting to rescue a volume-backed instance using an image without the hw_rescue_device and hw_rescue_bus properties set (i.e. one which doesn't support the stable rescue mode), I would expect the API call to fail and leave the instance in its existing state. Instead, Nova appears to go ahead with the rescue, which fails (since non-stable rescue of volume-backed instances isn't supported), leaving the instance broken and unrecoverable.
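
For context, stable rescue only works when the rescue image sets the hw_rescue_device and/or hw_rescue_bus Glance image properties. A minimal illustration of such metadata (the values shown are examples, not the only valid ones):

    # Illustrative only: the image-metadata hints that enable stable rescue.
    # An image with neither of these set hits the failure described above
    # when the instance is volume-backed.
    stable_rescue_image_properties = {
        "hw_rescue_device": "disk",    # e.g. "disk" or "cdrom"
        "hw_rescue_bus": "virtio",     # e.g. "virtio" or "scsi"
    }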

Steps to reproduce
==================

1. Create an instance that boots from a volume (in my case, a volume in Ceph)

2. Attempt to trigger a rescue using an image *without* the hw_rescue_device and hw_rescue_bus properties set (a quick way to confirm the image lacks these hints is sketched after this step):

     $ openstack --os-compute-api-version 2.87 server rescue --image <IMAGE UUID> <INSTANCE UUID>
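
One way to confirm up front that the chosen rescue image lacks these hints is to inspect its metadata, for example with "openstack image show <IMAGE UUID> -c properties". A rough equivalent using openstacksdk is sketched below; the cloud name "mycloud" is an assumption, not something from this report:

    # Sketch: inspect a candidate rescue image for the stable-rescue hints
    # before calling the rescue API. "mycloud" is an assumed clouds.yaml
    # entry; replace the placeholder with a real image UUID.
    import openstack

    conn = openstack.connect(cloud="mycloud")
    image = conn.image.find_image("<IMAGE UUID>")
    if image is None:
        raise SystemExit("rescue image not found")
    # openstacksdk exposes non-standard image metadata via .properties
    props = image.properties or {}
    if props.get("hw_rescue_device") or props.get("hw_rescue_bus"):
        print("image supports stable rescue")
    else:
        print("image has no hw_rescue_device/hw_rescue_bus set; rescuing "
              "a volume-backed instance with it fails as described here")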

Expected result
===============

I would have expected step 2 to fail with an error along the lines of "Cannot rescue a volume-backed instance", leaving the instance unchanged.

From a crude skim of the Nova source, the relevant check appears to be attempted here: https://github.com/openstack/nova/blob/f766db261634c8f95f874ba132159f148de9e8bf/nova/compute/api.py#L4514-L4517.

Actual result
=============

The rescue command returns normally and the instance enters the rescue state very briefly before ending up in the Error state with the message:

    Instance 05043c69-3533-4396-a610-98e564fa3ead cannot be rescued: Driver Error: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-43

Looking at the nova-compute logs on the hypervisor shows the following:

    2022-06-16 11:53:40.765 32936 INFO nova.virt.libvirt.driver [req-bd8be6a4-2195-498e-9209-caed5c525524 ce20f804a02fc250ec44394149d69a682aed245fd50194cd917fd238c2dc52c1 b259e9c9e29949f287d63e3d05ae9a8a - default default] [instance: 05043c69-3533-4396-a610-98e564fa3ead] Attempting rescue
    2022-06-16 11:53:40.768 32936 INFO nova.virt.libvirt.driver [req-bd8be6a4-2195-498e-9209-caed5c525524 ce20f804a02fc250ec44394149d69a682aed245fd50194cd917fd238c2dc52c1 b259e9c9e29949f287d63e3d05ae9a8a - default default] [instance: 05043c69-3533-4396-a610-98e564fa3ead] Creating image
    2022-06-16 11:53:40.943 32936 WARNING nova.compute.manager [req-a39b605e-e2db-45ce-8f7f-c92a192d9fbe 023555cbd3a2484f89984da17ae9bbfb 0c6d3af6910e429894c203af88fc96c4 - default default] [instance: 05043c69-3533-4396-a610-98e564fa3ead] Received unexpected event network-vif-unplugged-464bd8c6-2b41-4648-abe8-d900c88dd996 for instance with vm_state active a>
    2022-06-16 11:53:50.665 32936 INFO nova.virt.libvirt.driver [-] [instance: 05043c69-3533-4396-a610-98e564fa3ead] Instance destroyed successfully.
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest [req-bd8be6a4-2195-498e-9209-caed5c525524 ce20f804a02fc250ec44394149d69a682aed245fd50194cd917fd238c2dc52c1 b259e9c9e29949f287d63e3d05ae9a8a - default default] Error launching a defined domain with XML: <domain type='kvm'>
      <name>instance-00015604</name>
      <uuid>05043c69-3533-4396-a610-98e564fa3ead</uuid>
      ...snip...
        <disk type='network' device='disk'>
          <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
          <auth username='cinder'>
            <secret type='ceph' uuid='...'/>
          </auth>
          <source protocol='rbd' name='ephemeral-vms/05043c69-3533-4396-a610-98e564fa3ead_disk.rescue'>
            <host name='...snip...' port='6789'/>
            <host name='...snip...' port='6789'/>
            <host name='...snip...' port='6789'/>
          </source>
          <target dev='sda' bus='scsi'/>
          <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <disk type='network' device='disk'>
          <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>
          <auth username='cinder'>
            <secret type='ceph' uuid='...'/>
          </auth>
          <source protocol='rbd' name='ephemeral-vms/05043c69-3533-4396-a610-98e564fa3ead_disk'>
            <host name='...snip...' port='6789'/>
            <host name='...snip...' port='6789'/>
            <host name='...snip...' port='6789'/>
          </source>
          <target dev='sdb' bus='scsi'/>
          <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        ...snip...
    : libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest Traceback (most recent call last):
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/guest.py", line 159, in launch
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest return self._domain.createWithFlags(flags)
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 193, in doit
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest result = proxy_call(self._autowrap, f, *args, **kwargs)
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 151, in proxy_call
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest rv = execute(f, *args, **kwargs)
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 132, in execute
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest six.reraise(c, e, tb)
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/six.py", line 719, in reraise
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest raise value
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 86, in tworker
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest rv = meth(*args, **kwargs)
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/libvirt.py", line 1265, in createWithFlags
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:51.807 32936 ERROR nova.virt.libvirt.guest
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [req-bd8be6a4-2195-498e-9209-caed5c525524 ce20f804a02fc250ec44394149d69a682aed245fd50194cd917fd238c2dc52c1 b259e9c9e29949f287d63e3d05ae9a8a - default default] [instance: 05043c69-3533-4396-a610-98e564fa3ead] Error trying to Rescue Instance: libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] Traceback (most recent call last):
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 4207, in rescue_instance
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] self.driver.rescue(context, instance, network_info,
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 4145, in rescue
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] self._create_guest(
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 7197, in _create_guest
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] guest.launch(pause=pause)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/guest.py", line 162, in launch
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] LOG.exception('Error launching a defined domain with XML: %s',
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] self.force_reraise()
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] raise self.value
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/guest.py", line 159, in launch
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] return self._domain.createWithFlags(flags)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 193, in doit
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] result = proxy_call(self._autowrap, f, *args, **kwargs)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 151, in proxy_call
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] rv = execute(f, *args, **kwargs)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 132, in execute
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] six.reraise(c, e, tb)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/six.py", line 719, in reraise
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] raise value
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 86, in tworker
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] rv = meth(*args, **kwargs)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/libvirt.py", line 1265, in createWithFlags
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead] libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:51.808 32936 ERROR nova.compute.manager [instance: 05043c69-3533-4396-a610-98e564fa3ead]
    2022-06-16 11:53:52.310 32936 INFO nova.compute.manager [req-bd8be6a4-2195-498e-9209-caed5c525524 ce20f804a02fc250ec44394149d69a682aed245fd50194cd917fd238c2dc52c1 b259e9c9e29949f287d63e3d05ae9a8a - default default] [instance: 05043c69-3533-4396-a610-98e564fa3ead] Successfully reverted task state from rescuing on failure for instance.
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server [req-bd8be6a4-2195-498e-9209-caed5c525524 ce20f804a02fc250ec44394149d69a682aed245fd50194cd917fd238c2dc52c1 b259e9c9e29949f287d63e3d05ae9a8a - default default] Exception during message handling: nova.exception.InstanceNotRescuable: Instance 05043c69-3533-4396-a610-98e564fa3ead cannot be rescued: Driver Error: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 4207, in rescue_instance
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server self.driver.rescue(context, instance, network_info,
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 4145, in rescue
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server self._create_guest(
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/driver.py", line 7197, in _create_guest
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server guest.launch(pause=pause)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/guest.py", line 162, in launch
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server LOG.exception('Error launching a defined domain with XML: %s',
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server self.force_reraise()
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server raise self.value
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/virt/libvirt/guest.py", line 159, in launch
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server return self._domain.createWithFlags(flags)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 193, in doit
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, **kwargs)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 151, in proxy_call
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server rv = execute(f, *args, **kwargs)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 132, in execute
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server six.reraise(c, e, tb)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/six.py", line 719, in reraise
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server raise value
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/eventlet/tpool.py", line 86, in tworker
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server rv = meth(*args, **kwargs)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/libvirt.py", line 1265, in createWithFlags
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_messaging/rpc/dispatcher.py", line 229, in _do_dispatch
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/exception_wrapper.py", line 71, in wrapped
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server _emit_versioned_exception_notification(
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server self.force_reraise()
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server raise self.value
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/exception_wrapper.py", line 63, in wrapped
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 183, in decorated_function
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server LOG.warning("Failed to revert task state for instance. "
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server self.force_reraise()
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server raise self.value
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 154, in decorated_function
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/utils.py", line 1433, in decorated_function
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 211, in decorated_function
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server compute_utils.add_instance_fault_from_exc(context,
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 227, in __exit__
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server self.force_reraise()
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server raise self.value
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 200, in decorated_function
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server return function(self, context, *args, **kwargs)
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server File "/openstack/venvs/nova-24.0.0/lib/python3.8/site-packages/nova/compute/manager.py", line 4214, in rescue_instance
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server raise exception.InstanceNotRescuable(
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server nova.exception.InstanceNotRescuable: Instance 05043c69-3533-4396-a610-98e564fa3ead cannot be rescued: Driver Error: internal error: qemu unexpectedly closed the monitor: 2022-06-16T11:53:51.583795Z qemu-system-x86_64: -blockdev {"driver":"rbd","pool":"ephemeral-vms","image":"05043c69-3533-4396-a610-98e564fa3ead_disk","server":[{"host":"10.81.240.10","port":"6789"},{"host":"10.81.240.11","port":"6789"},{"host":"10.81.240.12","port":"6789"}],"user":"cinder","auth-client-required":["cephx","none"],"key-secret":"libvirt-1-storage-secret0","node-name":"libvirt-1-storage","cache":{"direct":false,"no-flush":false},"auto-read-only":true,"discard":"unmap"}: error reading header from 05043c69-3533-4396-a610-98e564fa3ead_disk: No such file or directory
    2022-06-16 11:53:52.312 32936 ERROR oslo_messaging.rpc.server
    2022-06-16 11:53:55.555 32936 INFO nova.compute.manager [-] [instance: 05043c69-3533-4396-a610-98e564fa3ead] VM Stopped (Lifecycle Event)
    2022-06-16 11:53:55.610 32936 INFO nova.compute.manager [req-7790122f-6942-4296-bdfa-37362785baec - - - - -] [instance: 05043c69-3533-4396-a610-98e564fa3ead] During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4). Updating power_state in the DB to match the hypervisor.

Environment
===========

Nova at commit f766db261634c8f95f874ba132159f148de9e8bf (installed from Git by OpenStack-Ansible)

Hypervisor: Libvirt + KVM (Libvirt 6.0.0-0ubuntu8.16)

Storage type: Ceph

Networking: Neutron (using Linux networking, no OVS)

Revision history for this message
melanie witt (melwitt) wrote :

Thank you for the descriptive bug report!

I looked at the code you found [1], and while it is indeed a validation check for volume-backed rescue, unfortunately it only checks that the required API microversion (2.87) was requested for volume-backed rescue [2].

That said, I think a validation check of the image metadata could easily be added in that method to fail fast when we know the rescue is not going to work (a rough sketch of such a check follows the links below).

[1] https://github.com/openstack/nova/blob/f766db261634c8f95f874ba132159f148de9e8bf/nova/compute/api.py#L4514-L4517
[2] https://github.com/openstack/nova/blob/f766db261634c8f95f874ba132159f148de9e8bf/nova/api/openstack/compute/rescue.py#L60
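
For illustration, here is a minimal standalone sketch of the kind of fail-fast check described above. The function and exception names are hypothetical stand-ins, not nova's actual code:

    # Hypothetical sketch only; not nova's real implementation. It assumes
    # the rescue image's properties are already available as a plain dict.

    class InstanceNotRescuable(Exception):
        """Stand-in for nova.exception.InstanceNotRescuable."""


    def check_volume_backed_rescue(instance_uuid, image_properties):
        """Reject a volume-backed rescue before the guest is touched.

        Stable (device) rescue relies on the hw_rescue_device /
        hw_rescue_bus image properties; without them nova falls back to
        the legacy rescue path, which does not work for volume-backed
        instances (see the traceback above) and leaves the VM in ERROR.
        """
        if not (image_properties.get('hw_rescue_device') or
                image_properties.get('hw_rescue_bus')):
            raise InstanceNotRescuable(
                'Instance %s cannot be rescued: the rescue image does not '
                'set hw_rescue_device/hw_rescue_bus, which is required to '
                'rescue a volume-backed instance' % instance_uuid)


    # Example: this rescue image carries no hw_rescue_* hints, so the
    # request is rejected up front instead of breaking the instance.
    try:
        check_volume_backed_rescue('05043c69-3533-4396-a610-98e564fa3ead',
                                   {'os_distro': 'ubuntu'})
    except InstanceNotRescuable as exc:
        print(exc)

The fix that eventually merged (see the commits below) adds an equivalent image-metadata check so the request fails with a clear error message without changing the instance state.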

Changed in nova:
importance: Undecided → Medium
status: New → Triaged
tags: added: low-hanging-fruit rescue
Rajesh Tailor (ratailor)
Changed in nova:
assignee: nobody → Rajesh Tailor (ratailor)
Revision history for this message
sean mooney (sean-k-mooney) wrote :

https://bugs.launchpad.net/nova/+bug/1926601 is the same issue and was filed earlier, so I have marked this bug as a duplicate of that older one and set the assignee to Rajesh.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/852737

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/852737
Committed: https://opendev.org/openstack/nova/commit/6eed55bf55469f4ceaa7d4d4eb1be635e14bc73b
Submitter: "Zuul (22348)"
Branch: master

commit 6eed55bf55469f4ceaa7d4d4eb1be635e14bc73b
Author: Rajesh Tailor <email address hidden>
Date: Wed Aug 10 18:15:04 2022 +0530

    Fix rescue volume-based instance

    As of now, when attempting to rescue a volume-based instance
    using an image without the hw_rescue_device and/or hw_rescue_bus
    properties set, the rescue api call fails (as non-stable rescue
    for volume-based instances are not supported) leaving the instance
    in error state.

    This change checks for hw_rescue_device/hw_rescue_bus image
    properties before attempting to rescue and if the property
    is not set, then fail with proper error message, without changing
    instance state.

    Related-Bug: #1978958
    Closes-Bug: #1926601
    Change-Id: Id4c8c5f3b32985ac7d3d7c833b82e0876f7367c1

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/zed)

Related fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/nova/+/872116

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/yoga)

Related fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/nova/+/872118

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/xena)

Related fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/875343

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/wallaby)

Related fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/875344

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/875347

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/nova/+/872116
Committed: https://opendev.org/openstack/nova/commit/d00a848a735f98b028f5930798ee69ef205c8e2e
Submitter: "Zuul (22348)"
Branch: stable/zed

commit d00a848a735f98b028f5930798ee69ef205c8e2e
Author: Rajesh Tailor <email address hidden>
Date: Wed Aug 10 18:15:04 2022 +0530

    Fix rescue volume-based instance

    As of now, when attempting to rescue a volume-based instance
    using an image without the hw_rescue_device and/or hw_rescue_bus
    properties set, the rescue api call fails (as non-stable rescue
    for volume-based instances are not supported) leaving the instance
    in error state.

    This change checks for hw_rescue_device/hw_rescue_bus image
    properties before attempting to rescue and if the property
    is not set, then fail with proper error message, without changing
    instance state.

    Related-Bug: #1978958
    Closes-Bug: #1926601
    Change-Id: Id4c8c5f3b32985ac7d3d7c833b82e0876f7367c1
    (cherry picked from commit 6eed55bf55469f4ceaa7d4d4eb1be635e14bc73b)

tags: added: in-stable-zed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/nova/+/872118
Committed: https://opendev.org/openstack/nova/commit/4073aa51f79be54e2e6e8143666a7c1f9a00e03d
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 4073aa51f79be54e2e6e8143666a7c1f9a00e03d
Author: Rajesh Tailor <email address hidden>
Date: Wed Aug 10 18:15:04 2022 +0530

    Fix rescue volume-based instance

    As of now, when attempting to rescue a volume-based instance
    using an image without the hw_rescue_device and/or hw_rescue_bus
    properties set, the rescue api call fails (as non-stable rescue
    for volume-based instances are not supported) leaving the instance
    in error state.

    This change checks for hw_rescue_device/hw_rescue_bus image
    properties before attempting to rescue and if the property
    is not set, then fail with proper error message, without changing
    instance state.

    Related-Bug: #1978958
    Closes-Bug: #1926601
    Change-Id: Id4c8c5f3b32985ac7d3d7c833b82e0876f7367c1
    (cherry picked from commit 6eed55bf55469f4ceaa7d4d4eb1be635e14bc73b)
    (cherry picked from commit d00a848a735f98b028f5930798ee69ef205c8e2e)

tags: added: in-stable-yoga
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/875343
Committed: https://opendev.org/openstack/nova/commit/c977027b1933e408c58508e883f6a799ffacc4cc
Submitter: "Zuul (22348)"
Branch: stable/xena

commit c977027b1933e408c58508e883f6a799ffacc4cc
Author: Rajesh Tailor <email address hidden>
Date: Wed Aug 10 18:15:04 2022 +0530

    Fix rescue volume-based instance

    As of now, when attempting to rescue a volume-based instance
    using an image without the hw_rescue_device and/or hw_rescue_bus
    properties set, the rescue api call fails (as non-stable rescue
    for volume-based instances are not supported) leaving the instance
    in error state.

    This change checks for hw_rescue_device/hw_rescue_bus image
    properties before attempting to rescue and if the property
    is not set, then fail with proper error message, without changing
    instance state.

    Related-Bug: #1978958
    Closes-Bug: #1926601
    Change-Id: Id4c8c5f3b32985ac7d3d7c833b82e0876f7367c1
    (cherry picked from commit 6eed55bf55469f4ceaa7d4d4eb1be635e14bc73b)
    (cherry picked from commit d00a848a735f98b028f5930798ee69ef205c8e2e)
    (cherry picked from commit 4073aa51f79be54e2e6e8143666a7c1f9a00e03d)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/victoria)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/875347
Reason: stable/victoria branch of openstack/nova is about to be deleted. To be able to do that, all open patches need to be abandoned. Please cherry pick the patch to unmaintained/victoria if you want to further work on this patch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/wallaby)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/875344
Reason: stable/wallaby branch of openstack/nova is about to be deleted. To be able to do that, all open patches need to be abandoned. Please cherry pick the patch to unmaintained/wallaby if you want to further work on this patch.
