Systemd freezing execution in VMs on manually running virsh list

Bug #1650238 reported by Jiří Stránský
This bug affects 2 people
Affects: tripleo-quickstart
Status: Expired
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

This could well be a libvirt/qemu bug. Reporting it mostly to have some info in case someone else hits it too. Or perhaps someone might find a workaround.

Some ways of invoking `virsh list` seem to make the command hang and the VMs stop working properly.

Executing as root:

sudo -u stack virsh list --all

hangs the command, and in an undercloud VM this is broadcast:

systemd[1]: Caught <BUS>, dumped core as pid
systemd[1]: Freezing execution.

(I've already lost the exact broadcast, but reconstructed the above from googling.) After this, pretty much everything stops working in the undercloud VM; I guess the same would affect the overcloud.

The symptom when trying to ssh from the host looks like this:

ssh -F /root/wd-containers/ssh.config.ansible undercloud
Warning: Permanently added 'dell-t5810ws-rdo-10.tpb.lab.eng.brq.redhat.com,10.40.128.52' (ECDSA) to the list of known hosts.
ssh_exchange_identification: Connection closed by remote host

Jiří Stránský (jistr) wrote :

Adding -i to the sudo command seems to prevent virsh list from hanging. Not sure yet if it prevents the machines from freezing too, will redeploy :)

Jiří Stránský (jistr) wrote :

Changed the title to make it more apparent that this doesn't block quickstart deployments; the freeze happens when running virsh list manually.

summary: - Systemd freezing execution in VMs on virsh list
+ Systemd freezing execution in VMs on manually running virsh list
Changed in tripleo-quickstart:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for tripleo-quickstart because there has been no activity for 60 days.]

Changed in tripleo-quickstart:
status: Incomplete → Expired
Oliver Walsh (owalsh) wrote :

The root cause appears to be that virsh is connecting via different sockets. It has been discussed previously here:
https://www.redhat.com/archives/libvirt-users/2016-March/msg00056.html

And there is this patch in oooq that attempts to work around it, although I wonder if it makes matters worse:
https://github.com/openstack/tripleo-quickstart/commit/f21d7a6d48ac21087c9433e75dc217b85349d8d7
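
To make the "different sockets" idea concrete, here is a rough sketch of how the session socket path might be derived from the environment. It assumes the commonly described lookup order for libvirt session connections ($XDG_RUNTIME_DIR first, then the user's cache directory); that order is an assumption here, not something confirmed in this report, and `session_socket_path` is a hypothetical helper, not part of libvirt or tripleo-quickstart.

```shell
#!/bin/sh
# Hypothetical helper: print the session socket path a libvirt client would
# be expected to look for in the current environment.
# ASSUMPTION (not confirmed by this bug report): the lookup order is
#   $XDG_RUNTIME_DIR/libvirt/libvirt-sock, else
#   ${XDG_CACHE_HOME:-$HOME/.cache}/libvirt/libvirt-sock
session_socket_path() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ]; then
        # A runtime dir is set (typical for a full login session).
        printf '%s/libvirt/libvirt-sock\n' "$XDG_RUNTIME_DIR"
    else
        # No runtime dir: fall back to the user's cache directory.
        printf '%s/libvirt/libvirt-sock\n' "${XDG_CACHE_HOME:-$HOME/.cache}"
    fi
}

# Manual checks one could run as root on the host to compare environments:
#   sudo -u stack env | grep -E '^(HOME|XDG_)'
#   sudo -i -u stack env | grep -E '^(HOME|XDG_)'
#   sudo -i -u stack virsh uri
```

If the two `sudo` invocations report different values for these variables, the helper would print different paths for each, which would be consistent with the client looking for (or spawning) a second session daemon on another socket.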
