Live migrations complete but occasionally fail to update the OpenStack database

Bug #1849154 reported by Ryan Farrell
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

[Description]
Occasionally, when evacuating VMs off nova-compute hosts ahead of host reboots, a VM's migration is reported as complete in the migration list, but queries to the OpenStack API, such as 'openstack server show $uuid', still report the host and hypervisor hostname unchanged. The only indication that something is wrong is that power_state is NOSTATE. Running 'sudo virsh list --all | grep $instance_name' on the new host confirms that the instance has in fact migrated and is running there.
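The check described above can be sketched in Python. This is only an illustration, not part of nova: the helper names parse_virsh_list and looks_stale are hypothetical, the dictionary keys are assumed to match the field names printed by the server show command for an admin user, and the virsh text is assumed to use the usual three-column "Id Name State" layout. Verify both against your own deployment before relying on it.

```python
# Sketch only: helper names are hypothetical, and the server keys are assumed
# to match what the API prints for an admin user on your deployment.

def parse_virsh_list(output):
    """Return {domain_name: state} parsed from `virsh list --all` text."""
    domains = {}
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Skip blank lines, the dashed rule, and the "Id Name State" header.
        if len(parts) < 3 or parts[0] == "Id":
            continue
        domains[parts[1]] = parts[2].strip()
    return domains


def looks_stale(server, local_host, virsh_output):
    """True if the domain runs on local_host but the API record disagrees.

    `server` is a dict of fields as reported by the API; `local_host` is the
    name of the host where `virsh list --all` was run.
    """
    domains = parse_virsh_list(virsh_output)
    name = server["OS-EXT-SRV-ATTR:instance_name"]
    running_here = domains.get(name) == "running"
    host_mismatch = server["OS-EXT-SRV-ATTR:host"] != local_host
    no_state = server["OS-EXT-STS:power_state"] == "NOSTATE"
    return running_here and (host_mismatch or no_state)
```

Run against the destination host, a True result suggests the symptom reported here: libvirt has the domain running locally while the API record still points at the source host or shows NOSTATE.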

In order to resolve this issue we perform a direct database edit such as:

'update instances
set host="$newhost", node="$newhost.domain", progress="0"
where uuid="" and deleted="0";'

* In one case, the 'progress' value was stuck at 99, and I needed to reset it to 0 in the database as well.
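For illustration, the same workaround can be expressed as a parameterized update with two guards: it refuses an empty uuid and rolls back unless exactly one row changes. This is a sketch under stated assumptions, not a supported procedure. It uses Python's sqlite3 module with a hypothetical stand-in table (make_demo_db) holding only the columns named above; the real nova database has many more columns and is normally reached through a different driver, whose placeholder syntax may differ from sqlite's `?`.

```python
import sqlite3


def make_demo_db():
    """Build a hypothetical stand-in table, not the real nova schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances "
        "(uuid TEXT, host TEXT, node TEXT, progress INTEGER, deleted INTEGER)"
    )
    conn.execute(
        "INSERT INTO instances VALUES "
        "('demo-uuid', 'compute1', 'compute1.example', 99, 0)"
    )
    conn.commit()
    return conn


def fix_instance_host(conn, uuid, new_host, new_node):
    """Point one non-deleted instance row at its new host and reset progress."""
    if not uuid:
        # An empty uuid would match nothing, or the wrong rows elsewhere.
        raise ValueError("refusing to update without an instance uuid")
    cur = conn.execute(
        "UPDATE instances SET host = ?, node = ?, progress = 0 "
        "WHERE uuid = ? AND deleted = 0",
        (new_host, new_node, uuid),
    )
    if cur.rowcount != 1:
        # Anything other than exactly one changed row is unexpected: undo it.
        conn.rollback()
        raise RuntimeError("expected 1 row, matched %d" % cur.rowcount)
    conn.commit()
    return cur.rowcount
```

Checking the row count before committing is a design choice for this sketch: a manual edit like this one should touch exactly one instance, so any other result is treated as an error.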

[Expected]
The live migration is expected to complete, and the instance's record in the OpenStack database should correctly reflect the name of the new host and the instance's power state.

[Impact]
Instances found in power state NOSTATE are blocked from performing certain actions, and they do not recover from this state on their own.

[Environment]
OpenStack Queens; Nova 17.0.10
libvirtd/virsh: 4.0.0
ceph: 12.2.8
neutron-openvswitch: 12.0.5

[Logs]
In this particular set of logs (sosreports from the live migration source and destination hosts), the instance that was in error had uuid 67f328d0-cb5e-416a-9af4-c6e47e68a1e0.

Ryan Farrell (whereisrysmind) wrote :

Logs uploaded to <email address hidden>:lp1849154.tar.gz

Matt Riedemann (mriedem)
tags: added: live-migration
Balazs Gibizer (balazs-gibizer) wrote :

@Ryan: I don't know how to access the logs you uploaded. Could you help me here or on IRC in #openstack-nova (on Freenode)? My nick on IRC is gibi.

Changed in nova:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired