VM evacuation is broken with shared storage if VM console.log is not owned by nova

Bug #1691831 reported by Cedric Brandily on 2017-05-18
This bug affects 1 person
Affects                    Status   Importance   Assigned to        Milestone
OpenStack Compute (nova)            High         Cedric Brandily
  Newton                            High         Stephen Finucane
  Ocata                             High         Stephen Finucane

Bug Description

On my Ocata deployment (with shared storage between my KVM hypervisors), the following workflow fails:
 * stop nova-compute on a KVM hypervisor
 * stop a VM on the KVM hypervisor using virsh destroy
 * evacuate the VM ... which fails with the stacktrace:

ERROR nova.compute.manager [req-dcb547e3-5f98-488e-8dbf-ad4453ce82ac 7e6f47b9c6cf4994bb38a2eb3ad6243f 920a4480938349eca2651c140ce33fdd - - -] [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] Setting instance vm_state to ERROR
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] Traceback (most recent call last):
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 6717, in _error_out_instance_on_exception
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] yield
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2751, in rebuild_instance
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] bdms, recreate, on_shared_storage, preserve_ephemeral)
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2795, in _do_rebuild_instance_with_claim
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] self._do_rebuild_instance(*args, **kwargs)
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2910, in _do_rebuild_instance
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] self._rebuild_default_impl(**kwargs)
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2673, in _rebuild_default_impl
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] block_device_info=new_block_device_info)
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2689, in spawn
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] self._ensure_console_log_for_instance(instance)
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 2961, in _ensure_console_log_for_instance
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] libvirt_utils.file_open(console_file, 'a').close()
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/utils.py", line 350, in file_open
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] return open(*args, **kwargs)
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf] IOError: [Errno 13] Permission denied: '/var/lib/nova/instances/a129ab8f-c224-4df2-8134-6716cfe89acf/console.log'
ERROR nova.compute.manager [instance: a129ab8f-c224-4df2-8134-6716cfe89acf]

After some investigation:

_ensure_console_log_for_instance[1] ensures that console.log exists. A change[2] updated this method so that it succeeds when the file exists but nova cannot open it, by ignoring EPERM errors (errno 1, "operation not permitted"). However, it should ignore EACCES errors (errno 13, "permission denied") instead.

 >>> open('/etc/shadow')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 IOError: [Errno 13] Permission denied: '/etc/shadow'

EACCES errors are raised when you cannot do something because of insufficient permissions, whereas EPERM errors are raised when you cannot do something at all (even with the root account).
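To double-check which symbolic name matches the errno seen in the stacktrace, a small helper like the one below can be used. This is only an illustrative sketch: `describe` is a hypothetical name, and the numeric values (1 and 13) are the ones quoted above for Linux.

```python
import errno
import os


def describe(err_no):
    """Return 'NAME (errno N): message' for a numeric errno value."""
    # errno.errorcode maps numeric values back to their symbolic names.
    name = errno.errorcode.get(err_no, "UNKNOWN")
    return "%s (errno %d): %s" % (name, err_no, os.strerror(err_no))


if __name__ == "__main__":
    # Print the two codes discussed in this report.
    for code in (errno.EPERM, errno.EACCES):
        print(describe(code))
```

Note that on Python 3 both codes surface as PermissionError, so the `errno` attribute of the exception is still what tells them apart.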

[1] nova.virt.libvirt.driver
[2] https://review.openstack.org/392643
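
The intended behaviour can be sketched as follows. This is a minimal standalone illustration, not the nova code itself: `ensure_console_log` and its `opener` parameter are hypothetical names, and the injectable opener exists only so the EACCES path can be exercised without a root-owned file.

```python
import errno
import os
import tempfile


def ensure_console_log(path, opener=open):
    """Make sure `path` exists, tolerating EACCES on an unreadable file.

    Returns True if the file could be opened (it is created if missing),
    False if opening failed with EACCES and the error was ignored.
    Any other error, including EPERM, is re-raised.
    """
    try:
        # Append mode creates a missing file and never truncates an existing one.
        opener(path, 'a').close()
        return True
    except IOError as exc:  # IOError is an alias of OSError on Python 3
        if exc.errno != errno.EACCES:
            raise
        # Assumption: the file exists but is owned by another user.
        return False


if __name__ == "__main__":
    # Normal case: the file is created inside a fresh temporary directory.
    workdir = tempfile.mkdtemp()
    print(ensure_console_log(os.path.join(workdir, "console.log")))
```

One caveat of this approach: EACCES is also raised when the file is missing and the directory is not writable, so ignoring it assumes the file already exists.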

Fix proposed to branch: master
Review: https://review.openstack.org/466088

Changed in nova:
assignee: nobody → Cedric Brandily (cbrandily)
status: New → In Progress
Changed in nova:
importance: Undecided → Low
tags: added: rebuild

Reviewed: https://review.openstack.org/466088
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3072b0afbc157eef5e72f191525296cfa2b014cb
Submitter: Jenkins
Branch: master

commit 3072b0afbc157eef5e72f191525296cfa2b014cb
Author: cedric.brandily <email address hidden>
Date: Thu May 18 21:26:09 2017 +0200

    Correct _ensure_console_log_for_instance implementation

    _ensure_console_log_for_instance[1] ensures VM console.log existence.

    A change[2] updated this method so that it succeeds if the file exists
    but nova cannot read it (typically happens when libvirt rewrites
    uid/gid) by ignoring EPERM errors.

    It seems the method should ignore EACCES errors instead. Indeed, EACCES
    errors are raised when an action is not permitted because of
    insufficient permissions, whereas EPERM errors are raised when an
    action is not permitted at all.

    [1] nova.virt.libvirt.driver
    [2] https://review.openstack.org/392643

    Closes-Bug: #1691831
    Change-Id: Ifc075a0fd91fc87651fcb306d6439be5369009b6

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem) on 2017-05-30
summary: - VM evacuation is broken with shared torage if VM console.log is not
+ VM evacuation is broken with shared storage if VM console.log is not
owned by nova
Matt Riedemann (mriedem) on 2017-05-30
Changed in nova:
importance: Low → High

Reviewed: https://review.openstack.org/469012
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=81838f1ae910383bf0992684e6b0ef30a6d943ba
Submitter: Jenkins
Branch: stable/ocata

commit 81838f1ae910383bf0992684e6b0ef30a6d943ba
Author: cedric.brandily <email address hidden>
Date: Thu May 18 21:26:09 2017 +0200

    Correct _ensure_console_log_for_instance implementation

    _ensure_console_log_for_instance[1] ensures VM console.log existence.

    A change[2] updated this method so that it succeeds if the file exists
    but nova cannot read it (typically happens when libvirt rewrites
    uid/gid) by ignoring EPERM errors.

    It seems the method should ignore EACCES errors instead. Indeed, EACCES
    errors are raised when an action is not permitted because of
    insufficient permissions, whereas EPERM errors are raised when an
    action is not permitted at all.

    [1] nova.virt.libvirt.driver
    [2] https://review.openstack.org/392643

    Closes-Bug: #1691831
    Change-Id: Ifc075a0fd91fc87651fcb306d6439be5369009b6
    (cherry picked from commit 3072b0afbc157eef5e72f191525296cfa2b014cb)

Reviewed: https://review.openstack.org/469013
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9d299ae50ea52c63811eba11ee974643c17079ad
Submitter: Jenkins
Branch: stable/newton

commit 9d299ae50ea52c63811eba11ee974643c17079ad
Author: cedric.brandily <email address hidden>
Date: Thu May 18 21:26:09 2017 +0200

    Correct _ensure_console_log_for_instance implementation

    _ensure_console_log_for_instance[1] ensures VM console.log existence.

    A change[2] updated this method so that it succeeds if the file exists
    but nova cannot read it (typically happens when libvirt rewrites
    uid/gid) by ignoring EPERM errors.

    It seems the method should ignore EACCES errors instead. Indeed, EACCES
    errors are raised when an action is not permitted because of
    insufficient permissions, whereas EPERM errors are raised when an
    action is not permitted at all.

    [1] nova.virt.libvirt.driver
    [2] https://review.openstack.org/392643

    Closes-Bug: #1691831
    Change-Id: Ifc075a0fd91fc87651fcb306d6439be5369009b6
    (cherry picked from commit 3072b0afbc157eef5e72f191525296cfa2b014cb)

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.

This issue was fixed in the openstack/nova 15.0.6 release.

This issue was fixed in the openstack/nova 14.0.8 release.
