nova-compute doesn't reconnect to libvirtd
Bug #1411278 reported by Peter Sabaini
This bug affects 5 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
nova (Ubuntu) | Confirmed | Medium | Unassigned |
nova-compute (Juju Charms Collection) | Invalid | Undecided | Unassigned |
Bug Description
I've found nova-compute disabled on a prod system with this in the log:
2015-01-09 20:14:53.906 26500 WARNING nova.virt.
And this in libvirt.log:
2015-01-09 20:14:53.647+0000: 6646: error : netcfStateClean
However, libvirtd seems to operate normally now. After restarting nova-compute it connected successfully to libvirtd.
Shouldn't nova-compute try to reconnect automatically?
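To make the question concrete, here is a minimal sketch of the kind of reconnect loop being asked about. It is not nova's code and makes no claim about how nova-compute handles its libvirt connection. The function name `connect_with_retry` and all of its parameters are hypothetical, and the connection itself is passed in as a callable so the sketch runs without libvirt installed. With libvirt-python one would presumably pass something like `lambda: libvirt.open("qemu:///system")`.

```python
import time


def connect_with_retry(connect, attempts=5, initial_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    `connect` is any zero-argument callable that returns a connection
    object or raises on failure (hypothetical interface, for illustration).
    """
    delay = initial_delay
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception as exc:  # with libvirt-python: libvirt.libvirtError
            last_error = exc
            if attempt == attempts:
                break
            sleep(delay)
            # Exponential backoff, capped so a long outage does not
            # push the wait time up without bound.
            delay = min(delay * 2, max_delay)
    raise ConnectionError(
        "gave up after %d attempts: %s" % (attempts, last_error))
```

A service using something like this would call it again whenever the existing connection is found to be dead, rather than staying disabled until restarted by hand.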
On Trusty with the 3.16 LTS-U kernel and Icehouse 2014.1.3, after some time (< 1 day), instances become unreachable and cannot be deleted, stopped, or started. Restarting libvirt-bin and nova-compute on the affected compute node appears to restore the ability to perform operations on the instances. I have 6 identical compute nodes doing the same thing, fully updated as of 2015 Apr 1. [Nope, not an April Fools' bug.]
# Tried to nova delete an instance, here is the
# instance's nova show output, while libvirt on the
# compute node is logging "netcfStateCleanup" errors:
http://paste.ubuntu.com/10718343/
# logs at/near the time of the crime
libvirt: http://paste.ubuntu.com/10718361/
nova compute: http://paste.ubuntu.com/10718443/
syslog: http://paste.ubuntu.com/10718490/
# workaround-ish steps
sudo service libvirt-bin stop
sudo service nova-compute stop
sudo service libvirt-bin start
sudo service nova-compute start
I am then able to delete, stop, and start instances on the affected compute node for a while. The issue reappears within hours, and even more quickly if I create and destroy ~20 new instances. There are no crash dumps around.
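Since the issue keeps coming back, the restart sequence above could be wrapped in a small helper. This is only a sketch of the same four `service` calls in the same order. The names `WORKAROUND_STEPS`, `workaround_commands` and `run_workaround` are made up for illustration, and by default it only prints the commands instead of running them.

```python
import subprocess

# Same order as the manual workaround: stop both services, then bring
# libvirt-bin back before nova-compute.
WORKAROUND_STEPS = [
    ("libvirt-bin", "stop"),
    ("nova-compute", "stop"),
    ("libvirt-bin", "start"),
    ("nova-compute", "start"),
]


def workaround_commands(steps=WORKAROUND_STEPS):
    """Build the argv list for each step of the workaround."""
    return [["sudo", "service", name, action] for name, action in steps]


def run_workaround(dry_run=True, runner=subprocess.check_call):
    """Print the commands (dry_run=True) or execute them in order.

    `runner` is injectable so the sequence can be exercised without
    touching real services.
    """
    commands = workaround_commands()
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            runner(cmd)  # check_call raises if a step fails
    return commands
```

Calling `run_workaround()` prints the four commands; `run_workaround(dry_run=False)` would actually restart the services, so it only makes sense on an affected compute node.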
# version info
ubuntu@fat-machine:~$ dpkg-query --show *libvirt* *nova*
libvirt-bin 1.2.2-0ubuntu13.1.9
libvirt0 1.2.2-0ubuntu13.1.9
nova-common 1:2014.1.3-0ubuntu2
nova-compute 1:2014.1.3-0ubuntu2
nova-compute-hypervisor
nova-compute-kvm 1:2014.1.3-0ubuntu2
nova-compute-libvirt 1:2014.1.3-0ubuntu2
python-libvirt 1.2.2-0ubuntu2
python-nova 1:2014.1.3-0ubuntu2
python-novaclient 1:2.17.0-0ubuntu1
python2.7-nova
ubuntu@fat-machine:~$ uname -a
Linux fat-machine 3.16.0-33-generic #44~14.04.1-Ubuntu SMP Fri Mar 13 10:33:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
# resources
ubuntu@fat-machine:~$ free -m
total used free shared buffers cached
Mem: 48289 4545 43743 1 232 899
-/+ buffers/cache: 3413 44875
Swap: 8191 0 8191
ubuntu@fat-machine:~$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 459G 16G 420G 4% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 24G 4.0K 24G 1% /dev
tmpfs 4.8G 724K 4.8G 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 24G 72K 24G 1% /run/shm
none 100M 0 100M 0% /run/user
/dev/sdb1 465G 847M 464G 1% /var/lib/ceph/osd/ceph-2
ubuntu@fat-machine:~$ uptime
13:55:57 up 10:25, 1 user, load average: 0.04, 0.04, 0.05