Comment 17 for bug 1788006

Matt Riedemann (mriedem) wrote :

Looking at the libvirtd logs, I'm seeing:

2018-08-20 12:52:07.630+0000: 27153: debug : qemuProcessHandleResume:807 : Transitioned guest instance-00000001 out of paused into resumed state

2018-08-20 12:52:07.631+0000: 27153: debug : qemuProcessHandleStop:755 : Transitioned guest instance-00000001 to paused state, reason unknown

I think ^ is where things go wrong and we don't recover.

There are errors in the qemu domain log:

http://logs.openstack.org/74/591074/4/check/neutron-tempest-plugin-designate-scenario/d02f171/controller/logs/libvirt/qemu/instance-00000001_log.txt.gz

KVM: entry failed, hardware error 0x0

If you look at the node providers that these failures show up on, they are primarily OVH:

  Value                  Count (of 82 events)
  1. ovh-bhs1            48
  2. ovh-gra1            32
  3. limestone-regionone  2

And the root cause is that the job is using virt_type=kvm rather than qemu:

http://logs.openstack.org/74/591074/4/check/neutron-tempest-plugin-designate-scenario/d02f171/controller/logs/etc/nova/nova-cpu_conf.txt.gz

[libvirt]
live_migration_uri = qemu+ssh://stack@%s/system
cpu_mode = none
virt_type = kvm

Running kvm nested inside these cloud providers' VMs is a known issue, so that's definitely the bug.
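As a hypothetical illustration (not part of the job or of nova itself), a minimal Python sketch that parses the [libvirt] section shown above and flags the problematic setting:

```python
# Hypothetical sanity check: parse a nova.conf-style [libvirt] section
# and warn when virt_type = kvm is set, which fails with
# "KVM: entry failed, hardware error 0x0" on nested-virt cloud nodes.
import configparser

# Inlined copy of the [libvirt] section from nova-cpu.conf above.
SAMPLE = """\
[libvirt]
live_migration_uri = qemu+ssh://stack@%s/system
cpu_mode = none
virt_type = kvm
"""

# interpolation=None so the literal %s in live_migration_uri is left alone;
# the default interpolation would reject it.
cfg = configparser.ConfigParser(interpolation=None)
cfg.read_string(SAMPLE)

virt_type = cfg.get("libvirt", "virt_type", fallback="kvm")
print(virt_type)  # kvm
if virt_type == "kvm":
    print("warning: virt_type=kvm is unreliable on nested-virt nodes; "
          "consider virt_type=qemu")
```

This only mirrors the pasted config; on a real node the fix is to set virt_type=qemu in the job's nova configuration.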