libvirt-bin crashes / refuses to restart if cgmanager is restarted

Bug #1397130 reported by Don Bowman
This bug affects 6 people
Affects            Status         Importance   Assigned to   Milestone
libvirt (Ubuntu)   Fix Released   High         Unassigned
Utopic             Fix Released   High         Unassigned

Bug Description

========================
Impact: libvirt hangs
Fix: mutex libvirt's access to cgmanager
Test case: see script in #14
Regression potential: if this is done wrongly, it could cause a deadlock. No non-cgmanager codepaths are affected.
========================

See bug 1367702 for reference. As requested, I am opening a new ticket with instructions to reproduce.

This is on 14.10 server, libvirt-bin 1.2.8-0ubuntu11.1

As per 1367702, this is not using LXC (which you used in your attempt). This is running bare-metal, no container, no hypervisor. Each VM below is started from OpenStack nova-compute (this node is a compute-only node).

don@nubo-5:~$ sudo service cgmanager restart
cgmanager stop/waiting
cgmanager start/running, process 22588
don@nubo-5:~$ virsh list
 Id Name State
----------------------------------------------------
 2 instance-000015de running
 3 instance-000015df running
 4 instance-000015e0 running
 5 instance-000015e1 running
 6 instance-000015e2 running
 7 instance-000015e3 running
 8 instance-000015e4 running
 9 instance-000015e5 running
 10 instance-000015e6 running
 11 instance-000015e7 running
 12 instance-000015e8 running
 13 instance-000015e9 running
 14 instance-000015ea running
 15 instance-000015eb running
 16 instance-000015ec running
 17 instance-000015ed running
 18 instance-000015ee running
 19 instance-000015ef running
 20 instance-000015f0 running
 21 instance-000015f1 running
 22 instance-000015f2 running
 23 instance-000015f3 running
 24 instance-000015f4 running
 25 instance-000015f5 running
 26 instance-000015f6 running
 27 instance-000015f7 running
 28 instance-000015f8 running
 29 instance-000015f9 running
 30 instance-000015fa running
 31 instance-000015fb running
 32 instance-000015fc running
 33 instance-000015fd running
 34 instance-000015fe running
 35 instance-000015ff running
 36 instance-00001600 running

don@nubo-5:~$ sudo service libvirt-bin restart
libvirt-bin stop/waiting
libvirt-bin start/running, process 22751
don@nubo-5:~$ virsh list
error: failed to connect to the hypervisor
error: no valid connection
error: Cannot recv data: Connection reset by peer

If I then run libvirtd manually:

root@nubo-5:~# libvirtd -v
2014-11-27 22:38:18.066+0000: 26422: info : libvirt version: 1.2.8, package: 1.2.8-0ubuntu11.1
2014-11-27 22:38:18.066+0000: 26422: info : virNetlinkEventServiceStart:521 : starting netlink event service with protocol 0
2014-11-27 22:38:18.066+0000: 26422: info : virNetlinkEventServiceStart:521 : starting netlink event service with protocol 15
2014-11-27 22:38:18.073+0000: 26433: info : dnsmasqCapsSetFromBuffer:685 : dnsmasq version is 2.71, --bind-dynamic is present, SO_BINDTODEVICE is in use
2014-11-27 22:38:18.074+0000: 26433: info : networkReloadFirewallRules:1778 : Reloading iptables rules
2014-11-27 22:38:18.074+0000: 26433: info : networkRefreshDaemons:1750 : Refreshing network daemons
2014-11-27 22:38:18.198+0000: 26433: info : virFirewallApplyGroup:844 : Starting transaction for 0x7f15e40e7110 flags=0
2014-11-27 22:38:18.198+0000: 26433: info : virFirewallApplyRule:785 : Applying rule '/sbin/iptables --version'
2014-11-27 22:38:18.207+0000: 26433: info : libxlDriverShouldLoad:241 : Disabling driver as /proc/xen/capabilities does not exist
2014-11-27 22:38:18.250+0000: 26433: info : virDomainObjListLoadAllConfigs:18944 : Scanning for configs in /var/run/libvirt/qemu
2014-11-27 22:38:18.256+0000: 26433: info : virDomainObjListLoadAllConfigs:18968 : Loading config file 'instance-000015fd.xml'

 ...
2014-11-27 22:38:18.385+0000: 26441: error : cgm_dbus_connect:76 : cgmanager: Error pinging manager: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
process 26422: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the application was supposed to call dbus_connection_close(), since this is a private connection.
2014-11-27 22:38:18.387+0000: 26439: warning : cg_detect_placement:561 : Failed to get cgroup path for cpu
2014-11-27 22:38:18.392+0000: 26445: error : cgm_dbus_connect:76 : cgmanager: Error pinging manager: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

cgm ping returns true, so cgmanager is presumably ok.

Sometimes, when running libvirtd -v manually, it segfaults instead of hitting an assertion.
Sometimes the assertion is different:
(null):cgmanager-client.c:1015: Assertion failed in cgmanager_get_pid_cgroup_sync: proxy != NULL

Segmentation fault (core dumped)

or
(null):alloc.c:315: Assertion failed in nih_free: ptr != NULL
(null):alloc.c:315: Assertion failed in nih_free: ptr != NULL
Segmentation fault (core dumped)
---
ApportVersion: 2.14.7-0ubuntu8
Architecture: amd64
DistroRelease: Ubuntu 14.10
Package: libvirt (not installed)
ProcCmdline: BOOT_IMAGE=/boot/vmlinuz-3.16.0-25-generic root=UUID=a58668fa-f6db-4941-84eb-c89e102971e1 ro splash quiet vt.handoff=7
ProcEnviron:
 LANGUAGE=en_CA:en
 TERM=screen
 PATH=(custom, no user)
 LANG=en_CA.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 3.16.0-25.33-generic 3.16.7
Tags: utopic utopic
Uname: Linux 3.16.0-25-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: Upgraded to utopic on 2014-10-19 (39 days ago)
UserGroups:

_MarkForUpload: True
modified.conffile..etc.apparmor.d.abstractions.libvirt.qemu: [modified]
modified.conffile..etc.apparmor.d.usr.sbin.libvirtd: [modified]
mtime.conffile..etc.apparmor.d.abstractions.libvirt.qemu: 2014-10-23T03:29:38.231519
mtime.conffile..etc.apparmor.d.usr.sbin.libvirtd: 2014-10-23T03:18:18.057906

Revision history for this message
Don Bowman (donbowman) wrote : KernLog.txt

apport information

tags: added: apport-collected utopic
description: updated
Revision history for this message
Don Bowman (donbowman) wrote : RelatedPackageVersions.txt

apport information

Revision history for this message
Don Bowman (donbowman) wrote :

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `libvirtd -v'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fa266277d27 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fa266277d27 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fa266279418 in __GI_abort () at abort.c:89
#2 0x00007fa263fde309 in nih_free () from /lib/x86_64-linux-gnu/libnih.so.1
#3 0x00007fa26688f862 in cgm_dbus_connect () at /build/buildd/libvirt-1.2.8/./src/util/cgmanager.c:78
#4 0x00007fa266887d5c in virCgroupAvailable () at /build/buildd/libvirt-1.2.8/./src/util/vircgroup.c:135
#5 0x00007fa256db5e85 in qemuConnectCgroup (driver=<optimized out>, vm=0x7fa25028d540) at /build/buildd/libvirt-1.2.8/./src/qemu/qemu_cgroup.c:776
#6 0x00007fa256dca447 in qemuProcessReconnect (opaque=0x7d3e, opaque@entry=0x7fa25032de10) at /build/buildd/libvirt-1.2.8/./src/qemu/qemu_process.c:3322
#7 0x00007fa2668d994e in virThreadHelper (data=<optimized out>) at /build/buildd/libvirt-1.2.8/./src/util/virthread.c:197
#8 0x00007fa26660e0a5 in start_thread (arg=0x7fa245ffb700) at pthread_create.c:309
#9 0x00007fa26633b84d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Revision history for this message
Don Bowman (donbowman) wrote :

(gdb) bt
#0 0x00007f9fd650ad27 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f9fd650c418 in __GI_abort () at abort.c:89
#2 0x00007f9fd448f538 in cgmanager_get_pid_cgroup_sync () from /lib/x86_64-linux-gnu/libcgmanager.so.0
#3 0x00007f9fd6b2302b in cgm_controller_exists (controller=0x7f9fd6ca766b "name=systemd") at /build/buildd/libvirt-1.2.8/./src/util/cgmanager.c:296
#4 0x00007f9fd6b1aa3b in cg_get_cgroups (group=<optimized out>) at /build/buildd/libvirt-1.2.8/./src/util/vircgroup.c:374
#5 virCgroupDetectMounts (group=0x12f) at /build/buildd/libvirt-1.2.8/./src/util/vircgroup.c:412
#6 0x00007f9fd6b1bc53 in virCgroupDetect (parent=<optimized out>, path=<optimized out>, controllers=<optimized out>, pid=<optimized out>,
    group=<optimized out>) at /build/buildd/libvirt-1.2.8/./src/util/vircgroup.c:729
#7 virCgroupNew (pid=303, path=0x7f9fd6cf4dd8 "", parent=0x0, controllers=-1, group=0x7f9fbcff9700)
    at /build/buildd/libvirt-1.2.8/./src/util/vircgroup.c:1263
#8 0x00007f9fd6b1bec7 in virCgroupNewDetectMachine (name=0x7f9fc0279100 "instance-000015f8", drivername=0x7f9fc70c8ffb "qemu", pid=303,
    partition=0x7f9fc026c1a0 "/machine", controllers=-1124100352, group=0x7f9fc027cb50) at /build/buildd/libvirt-1.2.8/./src/util/vircgroup.c:1782
#9 0x00007f9fc7048ec5 in qemuConnectCgroup (driver=<optimized out>, vm=0x7f9fc027e080) at /build/buildd/libvirt-1.2.8/./src/qemu/qemu_cgroup.c:781
#10 0x00007f9fc705d447 in qemuProcessReconnect (opaque=0x12f, opaque@entry=0x7f9fc031b890) at /build/buildd/libvirt-1.2.8/./src/qemu/qemu_process.c:3322
#11 0x00007f9fd6b6c94e in virThreadHelper (data=<optimized out>) at /build/buildd/libvirt-1.2.8/./src/util/virthread.c:197
#12 0x00007f9fd68a10a5 in start_thread (arg=0x7f9fbcff9700) at pthread_create.c:309
#13 0x00007f9fd65ce84d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for reporting this bug. I can't seem to reproduce this on my utopic laptop. However, in your example you have > 30 instances running. Do you still get this when you have no instances running when you restart cgmanager and libvirtd?

Can you show the following info both before and after the restart/crash:

ps -ef | egrep -e '(cgmanager|cgproxy)'
cat /proc/self/mountinfo
tree /sys/fs/cgroup

Changed in libvirt (Ubuntu):
status: New → Incomplete
importance: Undecided → High
Revision history for this message
Don Bowman (donbowman) wrote :

[ 463.169884] init: cgmanager main process (510) killed by TERM signal
[ 468.885143] show_signal_msg: 1 callbacks suppressed
[ 468.885149] libvirtd[44515]: segfault at 0 ip 00007fb39575b90f sp 00007fb38241b8e0 error 4 in libvirt.so.0.1002.8[7fb3956f0000+2b3000]
[ 469.049895] init: libvirt-bin main process (44423) killed by SEGV signal
[ 469.049908] init: libvirt-bin main process ended, respawning
[ 494.421272] libvirtd[44683]: segfault at 0 ip 00007f65efd2e90f sp 00007f65dc9ee8e0 error 4 in libvirt.so.0.1002.8[7f65efcc3000+2b3000]
[ 494.614372] init: libvirt-bin main process (44584) killed by SEGV signal
[ 494.614387] init: libvirt-bin main process ended, respawning
[ 494.977478] libvirtd[2476]: segfault at 7f3b00000012 ip 00007f3b3f811ec7 sp 00007f3b2daa1650 error 4 in libc-2.19.so[7f3b3f791000+1ba000]
[ 495.125140] init: libvirt-bin main process (2414) killed by SEGV signal
[ 495.125153] init: libvirt-bin main process ended, respawning
[ 495.604876] init: libvirt-bin main process (2511) killed by ABRT signal
[ 495.604888] init: libvirt-bin main process ended, respawning
[ 495.946894] libvirtd[2744]: segfault at 0 ip 00007f27818c590f sp 00007f276fd888e0 error 4 in libvirt.so.0.1002.8[7f278185a000+2b3000]
[ 495.946912] libvirtd[2749]: segfault at 7f2700000012 ip 00007f27812f7ec7 sp 00007f276d583210 error 4 in libc-2.19.so[7f2781277000+1ba000]
[ 496.101145] init: libvirt-bin main process (2647) killed by SEGV signal
[ 496.101160] init: libvirt-bin main process ended, respawning
[ 496.403354] libvirtd[2911]: segfault at 10 ip 00007f55485bd5b6 sp 00007f55352ed850 error 4 in libdbus-1.so.3.8.7[7f5548598000+46000]
[ 496.524614] init: libvirt-bin main process (2821) killed by SEGV signal
[ 496.524628] init: libvirt-bin main process ended, respawning
[ 496.993885] init: libvirt-bin main process (2938) killed by ABRT signal
[ 496.993910] init: libvirt-bin main process ended, respawning
[ 497.363533] libvirtd[3270]: segfault at 0 ip 00007fddf3dd690f sp 00007fddd3ffe8e0 error 4 in libvirt.so.0.1002.8[7fddf3d6b000+2b3000]
[ 497.363551] libvirtd[3266]: segfault at 0 ip 00007fddf3dd690f sp 00007fdde22998e0 error 4 in libvirt.so.0.1002.8[7fddf3d6b000+2b3000]
[ 497.504828] init: libvirt-bin main process (3097) killed by SEGV signal
[ 497.504841] init: libvirt-bin main process ended, respawning
[ 497.989323] init: libvirt-bin main process (3296) killed by ABRT signal
[ 497.989339] init: libvirt-bin main process ended, respawning
[ 498.477920] init: libvirt-bin main process (3464) killed by ABRT signal
[ 498.477940] init: libvirt-bin main process ended, respawning
[ 498.787286] libvirtd[3711]: segfault at 7fd70000000a ip 00007fd7666cc232 sp 00007fd75515e500 error 4 in libc-2.19.so[7fd76664d000+1ba000]
[ 498.953864] init: libvirt-bin main process (3624) killed by SEGV signal
[ 498.953878] init: libvirt-bin main process ended, respawning
[ 499.262920] libvirtd[3910]: segfault at 10 ip 00007f34615465b6 sp 00007f344affc850 error 4 in libdbus-1.so.3.8.7[7f3461521000+46000]
[ 499.409498] init: libvirt-bin main process (3785) killed by SEGV signal
[ 499.409522] init: libvirt-bin respawning too fast, stopped

Revision history for this message
Don Bowman (donbowman) wrote :

Attached.
It didn't happen the first time [with 0 instances running].
I then started 1, and virsh hung [after my earlier restart?]
It's now dead.

this is a different blade, a 2x E5-2640 0 @ 2.50GHz

Revision history for this message
Don Bowman (donbowman) wrote :

the latest repro shows this stack trace if it helps.
Linux nubo-9 3.16.0-28-generic #37-Ubuntu SMP Mon Dec 8 17:15:28 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

root@nubo-9:/var/crash# dpkg -l |grep cgm
ii cgmanager 0.33-2 amd64 Central cgroup manager daemon
ii cgmanager-tests 0.33-2 all Central cgroup manager daemon (tests)
ii libcgmanager-dev:amd64 0.33-2 amd64 Central cgroup manager daemon (dev)
ii libcgmanager0:amd64 0.33-2 amd64 Central cgroup manager daemon (client library)

root@nubo-9:/var/crash# dpkg -l|grep libvirt
ii libvirt-bin 1.2.8-0ubuntu11.1 amd64 programs for the libvirt library
ii libvirt0 1.2.8-0ubuntu11.1 amd64 library for interfacing with different virtualization systems
hi nova-compute-libvirt 1:2014.2-0ubuntu1 all OpenStack Compute - compute node libvirt support
ii python-libvirt 1.2.8-0ubuntu2 amd64 libvirt Python bindings

# dpkg -l |grep dbus
ii dbus 1.8.8-1ubuntu2.1 amd64 simple interprocess messaging system (daemon and utilities)
ii dbus-1-dbg:amd64 1.8.8-1ubuntu2.1 amd64 simple interprocess messaging system (debug symbols)
ii dbus-x11 1.8.8-1ubuntu2.1 amd64 simple interprocess messaging system (X11 deps)
ii libdbus-1-3:amd64 1.8.8-1ubuntu2.1 amd64 simple interprocess messaging system (library)
ii libdbus-1-dev:amd64 1.8.8-1ubuntu2.1 amd64 simple interprocess messaging system (development headers)
ii libdbus-glib-1-2:amd64 0.102-1 amd64 simple interprocess messaging system (GLib-based shared library)
ii libnih-dbus-dev:amd64 1.0.3-4ubuntu26 amd64 NIH D-Bus Bindings Library (development files)
ii libnih-dbus1:amd64 1.0.3-4ubuntu26 amd64 NIH D-Bus Bindings Library
ii nih-dbus-tool 1.0.3-4ubuntu26 amd64 NIH D-Bus Binding Tool
ii python3-dbus 1.2.0-2build2 amd64 simple interprocess messaging system (Python 3 interface)

2014-12-12 05:27:01.331+0000: 35353: error : cgm_dbus_connect:76 : cgmanager: Error pinging manager: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
process 35331: The last reference on a connection was dropped without closing the connection. This is a bug in an application. See dbus_connection_unref() documentation for details.
Most likely, the applicatio...


Revision history for this message
Don Bowman (donbowman) wrote :

Another way to reproduce this is to start new instances quickly (e.g. create them back to back), and then restart libvirt-bin.

Sometimes just starting new instances quickly (without restarting libvirt-bin) is enough.

For example, for host X, the script below, run to rescue the old instances, will normally crash libvirt if the sleeps are commented out. With the sleeps present it normally does not crash.

#!/bin/bash
host=$1
j=0
for i in $(nova list --host $host --all-tenants | awk '/Shutdown/ {print $2}')
do
    j=$((j+1))
    if [ $j -eq 5 ]
    then
        echo "Wait a bit..."
# sleep 30
        j=0
    fi
    echo "Rescue $i"
    nova reboot --hard $i
# sleep 5
done

Revision history for this message
Don Bowman (donbowman) wrote :

reproduced with requested steps attached

Revision history for this message
Don Bowman (donbowman) wrote :

the attached script will reproduce this on utopic.

Revision history for this message
Don Bowman (donbowman) wrote :

the attached script will reproduce this on utopic. I did it on a virtual machine running under kvm, but it will also do it bare-metal.

#!/bin/bash

apt-get -qy install virtinst libvirt-bin

die() {
    echo "$*"
    exit 1
}

dir=$(mktemp -d "/tmp/repro.XXXXXX")
echo "rep files in $dir"
cd "$dir"
chgrp libvirtd .
chmod 775 .

[ -f cirros-0.3.3-x86_64-disk.img ] || wget http://download.cirros-cloud.net/0.3.3/cirros-0.3.3-x86_64-disk.img || die "Error fetching cirros image"

ninstances=10

for i in $(seq 0 $ninstances)
do
    cp -f cirros-0.3.3-x86_64-disk.img cirros-$i.qcow2
    chgrp libvirtd cirros-$i.qcow2
    chmod 775 cirros-$i.qcow2
done

for i in $(seq 0 $ninstances)
do
    virt-install -n instance-$i -r 256 --cpu host --description "instance-$i" --import --disk cirros-$i.qcow2 --os-type=linux --noautoconsole
done

service libvirt-bin restart

echo "This command should return instantly with $ninstances in the list"
virsh list

Revision history for this message
Don Bowman (donbowman) wrote :

script to reproduce (as attachment)

Revision history for this message
Don Bowman (donbowman) wrote :

So part of the issue is that the cgm_* functions are not thread-safe, but are called from various threads.
For example, if you start libvirtd with several instances already running, it will call cgm_dbus_connect for each of them, from separate threads.

There is a pair of global variables:
static NihDBusProxy *cgroup_manager = NULL;
bool cgm_running = false;

in src/util/cgmanager.c, and they cause this issue.

cgm_dbus_connect() overwrites cgroup_manager regardless of the state of cgm_running (and cgm_running is set at the end of it anyway).

Using pthread_once is probably part of the solution. The patch below partially corrects this.

--- libvirt-1.2.8.orig/src/util/cgmanager.c
+++ libvirt-1.2.8/src/util/cgmanager.c
@@ -40,10 +40,12 @@
 static NihDBusProxy *cgroup_manager = NULL;
 bool cgm_running = false;

+static pthread_once_t cgmanager_is_init = PTHREAD_ONCE_INIT;
+
 VIR_LOG_INIT("util.cgmanager");

 #define CGMANAGER_DBUS_SOCK "unix:path=/sys/fs/cgroup/cgmanager/sock"
-bool cgm_dbus_connect(void)
+static bool _cgm_dbus_connect(void)
 {
     DBusError dbus_error;
     DBusConnection *connection;
@@ -83,6 +85,11 @@ bool cgm_dbus_connect(void)
     return true;
 }

+bool cgm_dbus_connect(void)
+{
+ pthread_once(&cgmanager_is_init, _cgm_dbus_connect);
+}
+
 void cgm_dbus_disconnect(void)
 {
     if (cgroup_manager) {
--- libvirt-1.2.8.orig/src/util/cgmanager.h
+++ libvirt-1.2.8/src/util/cgmanager.h
@@ -30,6 +30,7 @@
 #include <nih/alloc.h>
 #include <nih/error.h>
 #include <nih/string.h>
+#include <pthread.h>

 extern bool cgm_running;

It's not clear to me that the underlying libs are all thread-safe (e.g. https://bugs.launchpad.net/ubuntu/+source/libnih/+bug/1294200)

There would remain the problem of what happens if cgmanager goes away: one thread will find out, but does it reconnect and fix the others?
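
For reference, pthread_once() expects an init routine of type void (*)(void), and the wrapper in the patch above falls off the end without returning a value. A minimal sketch of the same one-time-init idea that compiles cleanly could look like the following. The names do_connect, cgm_connect_once and cgm_dbus_connect_once are hypothetical, and do_connect is only a stand-in for the original cgm_dbus_connect() body. This is an illustration of the pthread_once pattern, not the patch that was attached or shipped, and it only addresses the one-time initialization race.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_once_t cgm_once = PTHREAD_ONCE_INIT;
static bool cgm_connected = false;   /* result of the single connect attempt */
static int connect_calls = 0;        /* instrumentation for this sketch only */

/* Hypothetical stand-in for the original cgm_dbus_connect() body. */
static bool do_connect(void)
{
    connect_calls++;
    return true;
}

/* pthread_once() takes a void (*)(void), so the result is stored in a static. */
static void cgm_connect_once(void)
{
    cgm_connected = do_connect();
}

/* Safe to call from any number of threads; do_connect() runs exactly once. */
bool cgm_dbus_connect_once(void)
{
    int rc = pthread_once(&cgm_once, cgm_connect_once);
    assert(rc == 0);
    (void)rc;
    return cgm_connected;
}
```

Every caller after the first simply gets the cached result, so a failed first attempt is never retried. That is one reason a once-only approach fits poorly when the connection is opened and closed around each action.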

Revision history for this message
Don Bowman (donbowman) wrote :

this patch to libvirt gets it going.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1397130] Re: libvirt-bin crashes / refuses to restart if cgmanager is restarted

Hi Don,

indeed the underlying libs (mainly the libnih-dbus ones) are not thread-safe.
We've had to serialize access to cgmanager-client in lxc for the same
reason.

The cgm_dbus_connect() happens once on every cgmanager action, not once
at libvirt startup. So doing pthread_once() is not going to do the right
thing here IIUC. Rather we should just take a pthread lock at connect
and drop it at disconnect.
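
A rough sketch of that locking scheme, under stated assumptions: the mutex is taken in the connect call and released in the disconnect call, so only one thread at a time can be inside a connect..disconnect window. The names cgm_dbus_connect_locked, cgm_dbus_disconnect_locked, fake_proxy and run_workers are hypothetical, and the D-Bus work is replaced by a pointer assignment. This is not the code that went into the libvirt package.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the NihDBusProxy global in cgmanager.c. */
static int fake_proxy;
static void *cgroup_manager = NULL;

/* One lock serializes every connect..disconnect window. */
static pthread_mutex_t cgm_mutex = PTHREAD_MUTEX_INITIALIZER;

static int overlaps = 0;   /* times a thread found a live connection: must stay 0 */
static int completed = 0;  /* finished connect..disconnect windows */

static bool cgm_dbus_connect_locked(void)
{
    pthread_mutex_lock(&cgm_mutex);
    if (cgroup_manager != NULL)
        overlaps++;
    /* Real code would open the D-Bus connection here and, on failure,
     * unlock the mutex before returning false. */
    cgroup_manager = &fake_proxy;
    return true;
}

static void cgm_dbus_disconnect_locked(void)
{
    cgroup_manager = NULL;
    completed++;
    pthread_mutex_unlock(&cgm_mutex);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        if (cgm_dbus_connect_locked()) {
            /* a cgmanager request would be issued here */
            cgm_dbus_disconnect_locked();
        }
    }
    return NULL;
}

/* Runs nthreads workers (at most 64) and returns the number of overlaps seen. */
int run_workers(int nthreads)
{
    pthread_t tids[64];

    if (nthreads < 1 || nthreads > 64)
        abort();
    overlaps = 0;
    completed = 0;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[i], NULL, worker, NULL) != 0)
            abort();
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return overlaps;
}
```

Holding the lock across the whole window keeps the global proxy pointer private to one thread at a time, at the cost of serializing all cgmanager requests. As the regression note in the description says, a path that connects and never disconnects would deadlock every other caller.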

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for the reproducer script. Note that I had to tweak it to sleep a second before the last virsh list, else it always failed because libvirt simply hadn't finished starting up yet.

I can reproduce the bug with stock utopic. With the libvirt package in ppa:~serge-hallyn/virt, I cannot.

Could you please test that package as well?

Revision history for this message
Don Bowman (donbowman) wrote :

I just tried, it did not immediately reproduce, will let it run for a bit.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "don-cgm.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Revision history for this message
Don Bowman (donbowman) wrote :

I ran the PPA package on 10 blades for a day in production. It has cycled through ~10K instances. So far so good; it appears to have solved this issue.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks Don, and thanks very much for nailing down the cause!

I'll get a fix out for vivid today and start the SRU process.

Changed in libvirt (Ubuntu):
status: Incomplete → In Progress
Changed in libvirt (Ubuntu Trusty):
importance: Undecided → High
Changed in libvirt (Ubuntu Utopic):
importance: Undecided → High
description: updated
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 1.2.8-0ubuntu18

---------------
libvirt (1.2.8-0ubuntu18) vivid; urgency=medium

  * mutex cgmanager actions (Thanks to Don Bowman for finding the cause)
    (LP: #1397130) (LP: #1367702)
 -- Serge Hallyn <email address hidden> Thu, 18 Dec 2014 13:28:03 -0600

Changed in libvirt (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libvirt (Ubuntu Trusty):
status: New → Confirmed
Changed in libvirt (Ubuntu Utopic):
status: New → Confirmed
Revision history for this message
Tony Link (tlink) wrote :

Any chance that this patch will make it into Utopic anytime soon?

Revision history for this message
Don Bowman (donbowman) wrote :

I added the ppa:
$ cat /etc/apt/sources.list.d/serge-hallyn-ubuntu-virt-utopic.list
deb http://ppa.launchpad.net/serge-hallyn/virt/ubuntu utopic main

whilst the sru process presumably continues.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

> Any chance that this patch will make it into Utopic anytime soon?

I will get this into utopic-proposed this week. Once accepted it should
sit there for about a week before going into utopic-updates (if the
fix is verified)

no longer affects: libvirt (Ubuntu Trusty)
Revision history for this message
Chris J Arges (arges) wrote : Please test proposed package

Hello Don, or anyone else affected,

Accepted libvirt into utopic-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/libvirt/1.2.8-0ubuntu11.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in libvirt (Ubuntu Utopic):
status: Confirmed → Fix Committed
tags: added: verification-needed
Revision history for this message
Don Bowman (donbowman) wrote :

tried 1.2.8-0ubuntu11.3. seems to resolve.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 1.2.8-0ubuntu11.3

---------------
libvirt (1.2.8-0ubuntu11.3) utopic-proposed; urgency=medium

  * apparmor libvirt-qemu template: allow reading charm-specific ceph config
    and allow reading under /tmp and /var/tmp (for SRU only) (LP: #1403648)
  * mutex cgmanager actions (Thanks to Don Bowman for finding the cause)
    (LP: #1397130) (LP: #1367702)
 -- Serge Hallyn <email address hidden> Tue, 06 Jan 2015 10:40:17 -0600

Changed in libvirt (Ubuntu Utopic):
status: Fix Committed → Fix Released
Revision history for this message
Scott Kitterman (kitterman) wrote : Update Released

The verification of the Stable Release Update for libvirt has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
