n-cpu seems to crash when running with libvirt 1.1.1 from ubuntu cloud archive
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | High | Unassigned |
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
libvirt (Ubuntu) | Fix Released | High | Unassigned |
Saucy | Fix Released | High | Unassigned |
Bug Description
impact
------
any concurrent use of libvirt may lock up libvirt
test case
---------
use libvirt concurrently, specifically nwfilter + createDomain calls, e.g. run devstack-gate against this
regression potential
-------
upstream stable branch update - should be low
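
For anyone who wants to exercise the test case above without a full devstack-gate run, the following is a minimal sketch, not the reproducer used for this bug. It runs several callables in parallel threads and reports which ones never returned within a timeout, which is how a lockup would show up. The helper names `run_concurrently` and `libvirt_tasks`, the connection URI, and the XML arguments are hypothetical; the libvirt-python calls used (`libvirt.open`, `nwfilterDefineXML`, `createXML`) exist, but whether this particular mix triggers the lockup is an assumption.

```python
import threading
import time


def run_concurrently(tasks, timeout=30.0):
    """Run each callable in its own daemon thread.

    Returns the names of the tasks that had not finished when the
    shared timeout expired (candidates for a lockup).
    """
    threads = []
    for name, fn in tasks.items():
        # Daemon threads so a hung call cannot keep the interpreter alive.
        t = threading.Thread(target=fn, name=name, daemon=True)
        t.start()
        threads.append((name, t))

    deadline = time.monotonic() + timeout
    stuck = []
    for name, t in threads:
        t.join(max(0.0, deadline - time.monotonic()))
        if t.is_alive():
            stuck.append(name)
    return stuck


def libvirt_tasks(uri, nwfilter_xml, domain_xml, repeat=20):
    """Build the task dict for a libvirt host (hypothetical helper).

    Assumes libvirt-python is installed; the XML documents are supplied
    by the caller and are not defined here.
    """
    import libvirt  # imported lazily so the harness works without it

    def define_filters():
        conn = libvirt.open(uri)
        for _ in range(repeat):
            conn.nwfilterDefineXML(nwfilter_xml)

    def create_domain():
        conn = libvirt.open(uri)
        conn.createXML(domain_xml, 0)

    return {"nwfilter": define_filters, "createDomain": create_domain}
```

A possible invocation would be `run_concurrently(libvirt_tasks("qemu:///system", filter_xml, domain_xml), timeout=60)`, where both XML strings are provided by the tester; a non-empty result means at least one call did not return in time.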
We experienced a series of jenkins rejects starting overnight on Saturday whose root cause has not yet been tracked down. However, they all have a few things in common:
1) they are the first attempt to use libvirt 1.0.6 from havana cloud archive for ubuntu precise
2) the failures are all related to guests not spawning correctly
3) the n-cpu log just stops about 1/2 way through the tempest log, making me suspect that we did something to either lock up or hard crash n-cpu
After that change went in, no devstack/tempest gating project managed to merge a change.
This needs more investigation, but I am creating this bug both to reverify against and to track down this issue.
Changed in nova: | |
importance: | Undecided → High |
Changed in nova: | |
status: | New → Confirmed |
summary: |
- n-cpu seems to crash when running with libvirt 1.0.6 from ubuntu cloud
+ n-cpu seems to crash when running with libvirt 1.1.1 from ubuntu cloud archive |
tags: | added: libvirt |
tags: | added: libvirt111 |
tags: |
added: libvirt1x removed: libvirt111 |
tags: |
added: libvirt-1x removed: libvirt1x |
tags: |
added: libvirt1x removed: libvirt-1x |
description: | updated |
description: | updated |
tags: |
added: verification-done removed: verification-needed |
Changed in nova: | |
status: | Confirmed → Fix Released |
status: | Fix Released → Confirmed |
status: | Confirmed → Fix Released |
Changed in cloud-archive: | |
status: | Confirmed → Fix Released |
On further investigation, this isn't an n-cpu crash; it's a hard lockup of the nova-compute process. At some point it just stops responding, and the process running under devstack can no longer be killed even with ^C. So there is some deadlock introduced by the new packages.