Rebuild VM fails when using LVM-type image

Bug #1282423 reported by lirenke
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Undecided
Vladik Romanovsky

Bug Description

I launched a VM with an LVM-type image, using a flavor that includes the "OS-FLV-EXT-DATA:ephemeral" flag with a size of 1G. The launch succeeded, and the "lvs" command shows that two logical volumes were created.
Then I called "nova rebuild $SERVERID $IMAGEID" to rebuild the instance with the same image. An error occurred: the VM's state went to ERROR, and at the same time both logical volumes disappeared.
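For context, LVM-backed instance disks are enabled in nova.conf roughly as follows. This is a hedged sketch: in recent nova releases the options live under the [libvirt] group as shown, though exact option names have varied across releases; the volume group name matches the "cinder-volumes" VG seen in the logs below.

```
[libvirt]
images_type = lvm
images_volume_group = cinder-volumes
```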

nova-compute's log is:

"""
2014-02-18 19:54:39.290 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Running cmd (subprocess): lvs -o lv_size --noheadings --units b --nosuffix /dev/cinder-volumes/instance-0000007c_disk.local execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:147
2014-02-18 19:54:39.626 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Running cmd (subprocess): dd bs=1048576 if=/dev/zero of=/dev/cinder-volumes/instance-0000007c_disk.local seek=0 count=1024 oflag=direct execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:147
2014-02-18 19:54:47.622 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Running cmd (subprocess): lvremove -f /dev/cinder-volumes/instance-0000007c_disk /dev/cinder-volumes/instance-0000007c_disk.local execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:147
2014-02-18 19:54:48.061 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Result was 5 execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:172
2014-02-18 19:54:48.066 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] ['lvremove', '-f', '/dev/cinder-volumes/instance-0000007c_disk', '/dev/cinder-volumes/instance-0000007c_disk.local'] failed. Retrying. execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:184
2014-02-18 19:54:49.081 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Running cmd (subprocess): lvremove -f /dev/cinder-volumes/instance-0000007c_disk /dev/cinder-volumes/instance-0000007c_disk.local execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:147
2014-02-18 19:54:49.279 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Result was 5 execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:172
2014-02-18 19:54:49.286 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] ['lvremove', '-f', '/dev/cinder-volumes/instance-0000007c_disk', '/dev/cinder-volumes/instance-0000007c_disk.local'] failed. Retrying. execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:184
2014-02-18 19:54:50.222 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Running cmd (subprocess): lvremove -f /dev/cinder-volumes/instance-0000007c_disk /dev/cinder-volumes/instance-0000007c_disk.local execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:147
2014-02-18 19:54:50.359 2368 DEBUG nova.openstack.common.processutils [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Result was 5 execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:172
2014-02-18 19:54:50.364 2368 ERROR nova.compute.manager [req-5e803aa8-329b-4536-941b-c6f91c71dcb0 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] Setting instance vm_state to ERROR
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] Traceback (most recent call last):
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/compute/manager.py", line 4967, in _error_out_instance_on_exception
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] yield
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2037, in rebuild_instance
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] block_device_info=block_device_info)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 835, in destroy
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] destroy_disks, context=context)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 935, in _cleanup
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] self._cleanup_lvm(instance)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 957, in _cleanup_lvm
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] libvirt_utils.remove_logical_volumes(*disks)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/virt/libvirt/utils.py", line 392, in remove_logical_volumes
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] execute(*lvremove, attempts=3, run_as_root=True)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/virt/libvirt/utils.py", line 50, in execute
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] return utils.execute(*args, **kwargs)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/utils.py", line 177, in execute
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] return processutils.execute(*cmd, **kwargs)
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] File "/usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py", line 178, in execute
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] cmd=' '.join(cmd))
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] ProcessExecutionError: Unexpected error while running command.
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] Command: lvremove -f /dev/cinder-volumes/instance-0000007c_disk /dev/cinder-volumes/instance-0000007c_disk.local
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] Exit code: 5
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] Stdout: ''
2014-02-18 19:54:50.364 2368 TRACE nova.compute.manager [instance: f9dfdb34-4b89-43e2-9b65-1a59d4705146] Stderr: ' One or more specified logical volume(s) not found.\n'
"""

In nova/virt/libvirt/utils.py:
"""
def remove_logical_volumes(*paths):
    """Remove one or more logical volume."""

    for path in paths:
        clear_logical_volume(path)

    if paths:
        lvremove = ('lvremove', '-f') + paths
        execute(*lvremove, attempts=3, run_as_root=True)
"""
It tries up to 3 times to remove both volumes (instance-0000007c_disk and instance-0000007c_disk.local) with a single lvremove command.
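The failure mode can be reproduced without LVM at all. The following is a minimal Python sketch (all names hypothetical, not nova code) that models lvremove semantics: remove every volume it can, exit non-zero if any path is open or missing. It shows that once the first attempt only partially removes the set, retrying the combined command can never succeed.

```python
# Minimal model of the retry bug (all names hypothetical, not nova code).
# fake_lvremove models `lvremove -f a b`: it removes every free volume
# and returns exit code 5 if any path was open or already gone.

volumes = {"disk": "free", "disk.local": "open"}  # .local is still held open

def fake_lvremove(*paths):
    rc = 0
    for p in paths:
        state = volumes.get(p)
        if state == "free":
            del volumes[p]           # successfully removed
        else:
            rc = 5                   # open or not found -> command fails
            if state == "open":
                volumes[p] = "free"  # assume the holder releases it before the retry
    return rc

# The original code retries the *combined* command up to 3 times:
results = [fake_lvremove("disk", "disk.local") for _ in range(3)]
print(results)   # [5, 5, 5] -> every attempt reports failure
print(volumes)   # {}        -> yet both volumes are actually gone
```

Each attempt fails for a different reason (first an open volume, then a missing one), which matches the pdb traces below.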

I then traced the problem in another instance when it happened again; the pdb results are shown below.

The 1st try:

"""
2014-02-18 22:33:48.579 24156 DEBUG nova.openstack.common.processutils [req-a97ce6d7-ee76-4349-9d89-7c2e7325f2d5 72d393e2d21d4d71837a4a57e36eb06c 99eced3f5f6d4a8089ec7a8f3e3b5f13] Result was 5 execute /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py:172
> /usr/local/lib/python2.7/site-packages/nova/openstack/common/processutils.py(173)execute()
-> if not ignore_exit_code and _returncode not in check_exit_code:
(Pdb) p cmd
['lvremove', '-f', '/dev/cinder-volumes/instance-0000007e_disk', '/dev/cinder-volumes/instance-0000007e_disk.local']
(Pdb) p result
(' Logical volume "instance-0000007e_disk" successfully removed\n', ' Can\'t remove open logical volume "instance-0000007e_disk.local"\n')
(Pdb) c
...
"""
This shows that removing the ephemeral disk failed, while removing the image disk succeeded.

The 2nd try:

"""
-> if not ignore_exit_code and _returncode not in check_exit_code:
(Pdb) p cmd
['lvremove', '-f', '/dev/cinder-volumes/instance-0000007e_disk', '/dev/cinder-volumes/instance-0000007e_disk.local']
(Pdb) p result
(' Logical volume "instance-0000007e_disk.local" successfully removed\n', ' One or more specified logical volume(s) not found.\n')
(Pdb) c
...
"""

The ephemeral disk is removed this time, but the image disk was already removed in the previous attempt, so the whole command still fails.

3rd try:

"""
-> if not ignore_exit_code and _returncode not in check_exit_code:
(Pdb) p cmd
['lvremove', '-f', '/dev/cinder-volumes/instance-0000007e_disk', '/dev/cinder-volumes/instance-0000007e_disk.local']
(Pdb) p result
('', ' One or more specified logical volume(s) not found.\n')
(Pdb)
"""

OpenStack considers remove_logical_volumes to have failed, even though both volumes have in fact been removed.

Retrying is useful for removing a volume that was still open, but when only some of the volumes are removed on a given attempt, the retries become useless: the paths that were already deleted make every subsequent lvremove exit non-zero, so remove_logical_volumes is bound to fail.

Revision history for this message
lirenke (lvhancy) wrote :

This problem does not happen in every rebuild operation.

tags: added: rebuild
Changed in nova:
assignee: nobody → Vladik Romanovsky (vladik-romanovsky)
tags: added: libvirt volumes
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/76954

Changed in nova:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/76954
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0192898e30035028b002f9aed3199be69a18efb3
Submitter: Jenkins
Branch: master

commit 0192898e30035028b002f9aed3199be69a18efb3
Author: Vladik Romanovsky <email address hidden>
Date: Mon Feb 24 09:36:34 2014 -0500

    libvirt: remove_logical_volumes should remove each separately

    Currently, remove_logical_volumes() is trying to remove
    all of the provided volumes path in a single lvremove command,
    while retrying it several times.
    This is causing the command to fail, when the deleted volumes
    are reappearing in the next iteration of the retry loop.

    Handling the removal of each volume separately
    and raising an exception for volumes which failed to be removed.
    This will increase the chance for individual volume to be removed.

    Closes-Bug: #1282423
    Change-Id: Ib25f5603929c3038716661a589a41116b4c48909
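The merged patch itself is not quoted in this report, but based on the commit message the fix can be sketched as follows. This is a self-contained, hypothetical illustration (reusing the toy lvremove model, not the actual nova change): each volume gets its own removal command with its own retries, so a retry only re-runs the path that actually failed, and failures are collected per volume.

```python
# Hypothetical sketch of per-volume removal (toy model, not nova code).
# try_remove models retrying `lvremove -f <path>` on a single path.

volumes = {"disk": "free", "disk.local": "open"}

def try_remove(path, attempts=3):
    """Retry removal of one path; return True once it succeeds."""
    for _ in range(attempts):
        state = volumes.get(path)
        if state == "free":
            del volumes[path]
            return True
        if state == "open":
            volumes[path] = "free"  # assume the holder releases it before the retry
    return False

def remove_logical_volumes(*paths):
    """Remove each volume separately, collecting per-path failures."""
    failed = [p for p in paths if not try_remove(p)]
    if failed:
        raise RuntimeError("Unable to remove volumes: %s" % failed)

remove_logical_volumes("disk", "disk.local")
print(volumes)   # {} -> both removed, no spurious failure
```

Because the image disk's lvremove succeeds on its first attempt and is never re-run, the retries for the still-open ephemeral disk no longer reference an already-deleted path.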

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-1 → 2014.2