cleaning up an LXC instance fails

Bug #1044090 reported by David Kang
This bug affects 1 person

Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Derek Higgins
Milestone: 2012.2

Bug Description

Terminating an LXC instance succeeds, but it fails during cleanup.

Using libvirt 0.9.13.

In the log:
2012-08-30 13:35:11 INFO nova.virt.libvirt.driver [-] [instance: cde9e8e0-621b-440a-969b-2a787adac731] Instance destroyed successfully.
2012-08-30 13:35:11 ERROR nova.virt.libvirt.driver [req-381cac99-426a-44c0-83a7-e18060973cdb admin admin] [instance: cde9e8e0-621b-440a-969b-2a787adac731] Error from libvirt during saved instance removal. Code=3 Error=this function is not supported by the connection driver: virDomainHasManagedSaveImage

I think it fails at line 488 in the following code:

nova/virt/libvirt/driver.py
479     def _cleanup(self, instance, network_info, block_device_info):
480         try:
481             virt_dom = self._lookup_by_name(instance['name'])
482         except exception.NotFound:
483             virt_dom = None
484         if virt_dom:
485             try:
486                 # NOTE(derekh): we can switch to undefineFlags and
487                 # VIR_DOMAIN_UNDEFINE_MANAGED_SAVE once we require 0.9.4
488                 if virt_dom.hasManagedSaveImage(0):
489                     virt_dom.managedSaveRemove(0)
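
The failure is easy to reproduce outside of nova. A minimal sketch, assuming libvirt-python and an LXC domain reachable over the lxc:/// connection (the domain name below is illustrative): the LXC connection driver does not implement virDomainHasManagedSaveImage, so the call raises libvirtError with code 3 (VIR_ERR_NO_SUPPORT) instead of returning False.

    import libvirt

    conn = libvirt.open('lxc:///')
    dom = conn.lookupByName('instance-00000005')  # illustrative name
    try:
        # Unsupported by the LXC driver: raises libvirtError rather
        # than returning False.
        if dom.hasManagedSaveImage(0):
            dom.managedSaveRemove(0)
    except libvirt.libvirtError as e:
        print('libvirt error code:', e.get_error_code())  # 3 == VIR_ERR_NO_SUPPORT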

Revision history for this message
Pádraig Brady (p-draigbrady) wrote :

As a side note, I notice that we now require libvirt >= 0.9.6 since:
https://github.com/openstack/nova/commit/f28731c1
https://github.com/openstack/nova/commit/ee41673b

Derek Higgins (derekh)
Changed in nova:
assignee: nobody → Derek Higgins (derekh)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/12274

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/12275

Revision history for this message
Derek Higgins (derekh) wrote :

Hi David,
    Can you check whether this patch lets a VM be deleted on your system without an error?

thanks,
Derek.

Revision history for this message
Mark McLoughlin (markmc) wrote :
Changed in nova:
importance: Undecided → Medium
milestone: none → folsom-rc1
Revision history for this message
Mark McLoughlin (markmc) wrote :

Oh, I see - this is a new version of the patch.

Please upload the new version to Gerrit too, and mark it as Work In Progress if you want to wait for David's testing results.

Revision history for this message
David Kang (dkang) wrote :

 Hi Derek,

 I just tried your new patches.
(I couldn't test many times because my system is unstable;
I will test more once it stabilizes.)
So far, one piece of good news and one of bad news.
Good news first: the previous error is gone.
The bad news is that the rootfs of the LXC instance is not unmounted before rmtree() is called in nova/virt/libvirt/driver.py.
I've seen this problem in both Essex and Folsom; I think it is a separate bug.
It does not always happen, though.
I suspect there is a timing issue between unmount() and rmtree().
This bug eventually leads to "no free nbd device".
If you agree, I can report it as a new bug.

 After terminating instance i-00000005, I still see its rootfs mounted on /dev/nbd15:
$ mount
/dev/nbd15 on /usr/local/upstream-Aug-29/instances/instance-00000005/rootfs type ext2 (rw)

 Since the rootfs is not unmounted before rmtree() is called, rmtree() complains.
Here is the log of nova-compute:
2012-09-04 09:11:46 INFO nova.virt.libvirt.driver [-] [instance: 8e0b9d15-2c4b-40e7-a932-90c8d39d9657] Instance destroyed successfully.
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Attempting to grab semaphore "iptables" for method "_apply"... from (pid=10672) inner /usr/local/nova/nova/utils.py:708
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Got semaphore "iptables" for method "_apply"... from (pid=10672) inner /usr/local/nova/nova/utils.py:712
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Attempting to grab file lock "iptables" for method "_apply"... from (pid=10672) inner /usr/local/nova/nova/utils.py:716
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Got file lock "iptables" for method "_apply"... from (pid=10672) inner /usr/local/nova/nova/utils.py:724
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf iptables-save -c -t filter from (pid=10672) execute /usr/local/nova/nova/utils.py:176
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Result was 0 from (pid=10672) execute /usr/local/nova/nova/utils.py:191
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf iptables-restore -c from (pid=10672) execute /usr/local/nova/nova/utils.py:176
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Result was 0 from (pid=10672) execute /usr/local/nova/nova/utils.py:191
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf iptables-save -c -t nat from (pid=10672) execute /usr/local/nova/nova/utils.py:176
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 admin admin] Result was 0 from (pid=10672) execute /usr/local/nova/nova/utils.py:191
2012-09-04 09:11:46 DEBUG nova.utils [req-52c4813e-2ae8-4307-af31-158d896fe374 a...
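
For reference, a rootfs leaked like this can be released by hand before the instance directory is removed. A hedged sketch, assuming the mount point and nbd device shown in the mount output above (both illustrative for this sketch) and root privileges:

    import subprocess

    # Mount point and device taken from the mount output above.
    rootfs = '/usr/local/upstream-Aug-29/instances/instance-00000005/rootfs'

    subprocess.check_call(['umount', rootfs])                # unmount the container rootfs
    subprocess.check_call(['qemu-nbd', '-d', '/dev/nbd15'])  # disconnect, freeing the nbd device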


Revision history for this message
Pádraig Brady (p-draigbrady) wrote :

@David, please file comment 7 as a new bug, mentioning the version of nova you're using.

thanks!

Revision history for this message
David Kang (dkang) wrote : Re: [Bug 1044090] Re: cleaning up an LXC instance fails

 Sure.
I've done that.

 Thanks,
 David

----- Original Message -----
> @David, please file comment 7 as a new bug, mentioning the version of
> nova you're using.
>
> thanks!

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/12275
Committed: http://github.com/openstack/nova/commit/aeacea16a30f85dfb307803a4b46a602cabc8eb5
Submitter: Jenkins
Branch: master

commit aeacea16a30f85dfb307803a4b46a602cabc8eb5
Author: Derek Higgins <email address hidden>
Date: Tue Sep 4 11:50:36 2012 +0100

    Fixing call to hasManagedSaveImage

    Fixes bug #1044090

    hasManagedSaveImage is not implemented in the LXC libvirt driver, resulting
    in the following error when a vm is deleted: "Error from libvirt during saved
    instance removal. Code=3 Error=this function is not supported by the
    connection driver: virDomainHasManagedSaveImage"

    This commit replaces the use of hasManagedSaveImage, managedSaveRemove and
    undefine with undefineFlags, which does the work of all 3 calls and is
    implemented in libvirt versions > 0.9.4. We also fall back to calling
    undefine if undefineFlags raises an error.

    Change-Id: Ib8d451aeff7767f835c3c1aab99ee4ab5e299852
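
For clarity, the shape of the change the commit message describes is roughly the following; a sketch, not the exact merged diff (see the review link above for the real change):

    import libvirt

    def _undefine_domain(virt_dom):
        try:
            # One call replaces hasManagedSaveImage + managedSaveRemove
            # + undefine; implemented in libvirt > 0.9.4.
            virt_dom.undefineFlags(libvirt.VIR_DOMAIN_UNDEFINE_MANAGED_SAVE)
        except libvirt.libvirtError:
            # Fall back to plain undefine for connection drivers (such
            # as LXC here) or versions without undefineFlags support.
            virt_dom.undefine()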

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-rc1 → 2012.2