The instance is volume-backed and its power state is PAUSED; shelving the instance fails

Bug #1864624 reported by Qiu Fossen on 2020-02-25
This bug affects 4 people
Affects: OpenStack Compute (nova)
Importance: Undecided
Assigned to: Qiu Fossen

Bug Description

The instance is volume-backed and its power state is PAUSED; shelving the instance fails. The reason is that Nova cannot attempt a clean shutdown of a paused guest instance: some hypervisors will fail the clean shutdown if the guest is not running.
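A minimal sketch of the guard that is missing (assumption: `FakeGuest` and the state names are simplified stand-ins for illustration, not nova's actual `Guest` or `power_state` code): only attempt a clean shutdown when the guest is actually running, and fall back to a hard power-off otherwise.

```python
# Simplified illustration, not nova's actual code: a paused guest makes
# virDomainShutdown() fail, so power-off should skip the clean-shutdown
# attempt when the domain is not running.

RUNNING, PAUSED = "running", "paused"


class FakeGuest:
    """Hypothetical stand-in for nova.virt.libvirt.guest.Guest."""

    def __init__(self, state):
        self.state = state
        self.powered_off = False

    def shutdown(self):
        # Mirrors libvirt's behavior: a clean shutdown is only valid
        # for a running domain.
        if self.state != RUNNING:
            raise RuntimeError(
                "Requested operation is not valid: domain is not running")
        self.powered_off = True

    def destroy(self):
        # A hard power-off works regardless of the current state.
        self.powered_off = True


def power_off(guest):
    # The guard this bug asks for: clean shutdown only for a running
    # guest; destroy a paused one directly.
    if guest.state == RUNNING:
        guest.shutdown()
    else:
        guest.destroy()
```

With this guard, `power_off(FakeGuest(PAUSED))` succeeds instead of raising, which is the behavior the reporter expects shelve to have.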

Description
===========
The instance is booted from a volume and then paused, putting it in the PAUSED state.
Shelving the instance then fails: its status remains paused rather than changing to shelved.

Steps to reproduce
==================
1. Boot an instance from a volume.
2. Pause the instance.
3. Shelve the instance.
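The steps above correspond roughly to these client commands (assumptions: an existing bootable volume `vol1`, flavor `m1.small`, and network `net1`; all names are illustrative):

```shell
# Illustrative reproduction against a live cloud; resource names are assumptions.
openstack server create --volume vol1 --flavor m1.small --network net1 paused-vm
openstack server pause paused-vm
openstack server shelve paused-vm
# Status stays PAUSED instead of becoming SHELVED_OFFLOADED:
openstack server show paused-vm -c status
```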

Expected result
===============
This instance's status is shelved.

Actual result
=============
This instance's status is paused.

Environment
===========
1. OpenStack Rocky release
2. Hypervisor: Libvirt + KVM
3. Storage: Ceph
4. Networking: Neutron with OpenVSwitch

Logs&Configs
============
The relevant log excerpt:
ERROR oslo_messaging.rpc.server [req-2b83d684-eb08-4337-bcd8-39315db5bd4f ada583e8d8b24df695ecc6bcad83e0d8 749b546d5bcd4425ae53cbe1ba419f01 - - -] Exception during message handling: libvirtError: Requested operation is not valid: domain is not running
ERROR oslo_messaging.rpc.server Traceback (most recent call last):
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/exception_wrapper.py", line 79, in wrapped
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
ERROR oslo_messaging.rpc.server self.force_reraise()
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/exception_wrapper.py", line 69, in wrapped
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 188, in decorated_function
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
ERROR oslo_messaging.rpc.server self.force_reraise()
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 158, in decorated_function
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/utils.py", line 1141, in decorated_function
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 216, in decorated_function
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
ERROR oslo_messaging.rpc.server self.force_reraise()
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 204, in decorated_function
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 4967, in shelve_offload_instance
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
ERROR oslo_messaging.rpc.server return f(*args, **kwargs)
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 4966, in do_shelve_offload_instance
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 4981, in _shelve_offload_instance
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/compute/manager.py", line 2451, in _power_off_instance
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/virt/libvirt/driver.py", line 2884, in power_off
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/virt/libvirt/driver.py", line 2812, in _clean_shutdown
ERROR oslo_messaging.rpc.server File "build/bdist.linux-x86_64/egg/nova/virt/libvirt/guest.py", line 610, in shutdown
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
ERROR oslo_messaging.rpc.server result = proxy_call(self._autowrap, f, *args, **kwargs)
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
ERROR oslo_messaging.rpc.server rv = execute(f, *args, **kwargs)
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
ERROR oslo_messaging.rpc.server six.reraise(c, e, tb)
ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
ERROR oslo_messaging.rpc.server rv = meth(*args, **kwargs)
ERROR oslo_messaging.rpc.server File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2455, in shutdown
ERROR oslo_messaging.rpc.server if ret == -1: raise libvirtError ('virDomainShutdown() failed', dom=self)
ERROR oslo_messaging.rpc.server libvirtError: Requested operation is not valid: domain is not running
ERROR oslo_messaging.rpc.server

Qiu Fossen (fossen123) on 2020-02-25
Changed in tacker:
assignee: nobody → Qiu Fossen (fossen123)
Brin Zhang (zhangbailin) on 2020-03-09
affects: tacker → nova
Brin Zhang (zhangbailin) wrote :

The same behavior occurs if the server is booted from an image: https://opendev.org/openstack/nova/src/branch/master/nova/compute/manager.py#L6325-L6326

Brin Zhang (zhangbailin) on 2020-03-10
Changed in nova:
status: New → Confirmed
status: Confirmed → In Progress
Lee Yarwood (lyarwood) wrote :

Can you edit your initial comment and use the bug template listed below (and on the bug creation page):

!!!!!!!!!!!!!!!!!!!!!!!!! READ THIS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Each bug report needs to provide a minimum of information;
without it we are not able to address the issue you observed.
It is crucial for other developers to have this information.
You can use the template below, which asks for this information.

"Request for Feature Enhancements" (RFEs) are collected:
* with the blueprint process [1][2][3] (if you're a developer) OR
* via the liaison with the operator (ops) community [4].
This means "wishlist" bugs won't get any attention anymore.

You can ask in the #openstack-nova IRC channel on freenode, if you have questions about this.

References:
[1] https://blueprints.launchpad.net/nova/
[2] https://github.com/openstack/nova-specs
[3] https://wiki.openstack.org/wiki/Blueprints
[4] http://lists.openstack.org/pipermail/openstack-operators/2016-March/010007.html

!!!!!!!!!!!!!!!!!!!!!!!!! READ THIS !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Description
===========
Some prose which explains more in detail what this bug report is
about. If the headline of this report is descriptive enough, skip
this section.

Steps to reproduce
==================
A chronological list of steps which triggers the
issue you noticed:
* I did X
* then I did Y
* then I did Z
A list of openstack client commands (with correct argument values)
would be the most descriptive example. To get more information use:

    $ nova --debug <command> <arg1> <arg2=value>
    or
    $ openstack --debug <command> <arg1> <arg2=value>

Expected result
===============
After the execution of the steps above, what should have
happened if the issue wasn't present?

Actual result
=============
What happened instead of the expected result?
What did the issue look like?

Environment
===========
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/

   If this is from a distro please provide
       $ dpkg -l | grep nova
       or
       $ rpm -ql | grep nova
   If this is from git, please provide
       $ git log -1

2. Which hypervisor did you use?
   (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
   What's the version of that?

3. Which storage type did you use?
   (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?

4. Which networking type did you use?
   (For example: nova-network, Neutron with OpenVSwitch, ...)

Logs & Configs
==============
The tool *sosreport* has support for some OpenStack projects.
It's worth having a look at it. For example, if you want to collect
the logs of a compute node you would execute:

   $ sudo sosreport -o openstack_nova --batch

on that compute node. Attach the logs to this bug report. Please
consider that these logs need to be collected in "DEBUG" mode.

For tips on reporting VMware virt driver bugs, please see this doc: https://docs.openstack.org/nova/latest/admin/configuration/hypervisor-vmware.html#troubleshooting

Lee Yarwood (lyarwood) wrote :

FWIW I don't think there is a bug with shelve offloading paused instances, at least for Libvirt.

We have the following tempest test for this at present:

https://github.com/openstack/tempest/blob/baecb1e674c42782fe95b30d379ae815b7ca880c/tempest/api/compute/servers/test_server_actions.py#L669-L681

Which hypervisors are failing here?

Brin Zhang (zhangbailin) wrote :

This was hit in the Rocky release; the hypervisor is Libvirt + KVM. Yes, we should give more details for this bug; I will respin the report and paste more of the error log.

Qiu Fossen (fossen123) on 2020-04-11
description: updated
Sylvain Bauza (sylvain-bauza) wrote :

I'll close this bug to keep our bug tracking correct. Feel free to mark this bug as a duplicate if you created another bug.

Changed in nova:
status: In Progress → Invalid
Qiu Fossen (fossen123) wrote :

Why? Is this not a bug?

Changed in nova:
status: Invalid → In Progress
sunhao (suha9102) wrote :

I also encountered the same problem. Because the operation is asynchronous, a success message is returned after the shelve call, but an exception has actually occurred and the instance's status is still paused. I also think this is a bug.
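The behavior described above can be sketched as follows (assumption: a deliberately simplified model of nova's fire-and-forget RPC cast, not actual nova code): the API returns as soon as the request is cast, so the caller sees success even though the real work later fails on the compute node.

```python
# Simplified model: the API "casts" the shelve request and returns 202
# immediately; the failure happens later on the compute node, so the
# caller sees success while the instance stays PAUSED.
import threading


def compute_shelve(instance):
    # Stand-in for the compute manager's shelve path: the clean
    # shutdown of a paused guest raises, the error is only logged,
    # and the status is never updated.
    try:
        if instance["power_state"] == "paused":
            raise RuntimeError(
                "Requested operation is not valid: domain is not running")
        instance["status"] = "SHELVED_OFFLOADED"
    except RuntimeError:
        pass  # logged on the compute node; never returned to the API caller


def api_shelve(instance):
    worker = threading.Thread(target=compute_shelve, args=(instance,))
    worker.start()  # fire-and-forget cast
    return 202, worker  # HTTP 202 Accepted, before the work completes
```

This is why the client gets a success response while `openstack server show` later still reports the instance as paused.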
