libvirt: driver calls volume connect twice for every volume on boot

Bug #1359617 reported by Nikola Đipanov
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Thang Pham

Bug Description

Libvirt driver will attempt to connect the volume on the hipervisor twice for every volume provided to the instance when booting. If you examine the libvirt driver's spawn() method, both _get_guest_xml (by means of get_guest_storage_config) and _create_domain_and_network will call the _connect_volume method which works out the volume driver and then dispatches the connect logic.

This is especially bad in the iscsi volume driver case, where we do 2 rootwraped calls in the best case, one of which is the target rescan, that can in theory add and remove devices in the kernel.

I suspect that fixing this will make a number of races that have to do with the volume not being present when expected on the hypervisor, at least less likely to happen, in addition to making the boot process with volumes more performant.

An example of a race condition that may be caused or made worse by this is: https://bugs.launchpad.net/cinder/+bug/1357677

Thang Pham (thang-pham)
Changed in nova:
assignee: nobody → Thang Pham (thang-pham)
Revision history for this message
Daniel Berrange (berrange) wrote :

It seems the problem here is that we're overloading the 'connect' method to serve two jobs - do the host OS integration/setup tasks, and return the XML config. I wonder if this is a sign we need to change volume.py to be more like vif.py

By that I mean, when connecting VIFs we have separate 'plug' and 'get_config' methods - the former does the host OS setup and must only be called once and the latter is idempotent for getting XML config.

Revision history for this message
Thang Pham (thang-pham) wrote :

@Daniel Berrange: Sounds like a good plan. I will work on fixing it up, similar to the way vifs are handled, unless there are any objections from Nikola. Thanks.

Revision history for this message
Nikola Đipanov (ndipanov) wrote :

@Thang - I think it will be more difficult to get right than the VIF handling, but I completely agree that it is the path we should try to follow at first, so +1 on the approach from myself.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/116998

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/121965

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/116998
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1b45f4574c85288a6fd67a83ffd8c798d436875b
Submitter: Jenkins
Branch: master

commit 1b45f4574c85288a6fd67a83ffd8c798d436875b
Author: Thang Pham <email address hidden>
Date: Fri Aug 29 14:45:20 2014 -0400

    Split up libvirt volume's connect_volume method

    The connect_volume method in libvirt/volume.py does a couple of
    things. It does the host OS integration/setup tasks, and it returns
    the libvirt xml. In some cases, only the libvirt xml is required,
    and the setup tasks are not necessary.

    The following patch separates the setup tasks and libvirt xml return
    in connect_volume. A new method, called get_config, is added to just
    return the xml config, separate from the connect_volume method.
    The connect_volume method will continue to return the libvirt xml,
    so that existing functionality is not broken. This way, get_config
    can be used to get the libvirt xml without any side-effects, e.g.
    re-running shell commands to setup the volume.

    There will be a follow up patch, where libvirt's driver.py will
    switch over to use get_config and connect_volume will no longer
    need to return the libvirt xml.

    Change-Id: I5d300f8fe99f714d63d51197dc89f556c3a5d43b
    Related-Bug: #1359617

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/121965
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=36cfe6dce403a7dde6c5c12c8c6486f6048c9d35
Submitter: Jenkins
Branch: master

commit 36cfe6dce403a7dde6c5c12c8c6486f6048c9d35
Author: Thang Pham <email address hidden>
Date: Mon Sep 15 14:54:40 2014 -0400

    Make separate calls to libvirt volume

    This is a follow up patch to
    https://review.openstack.org/#/c/116998/, where libvirt
    volume's connect_volume method were split into get_config and
    connect_volume. The following patch changes libvirt driver
    to call the appropriate volume method, get_config or
    connect_volume, instead of the old monolithic call to
    connect_volume. connect_volume will no longer need to return
    the libvirt xml after this switch over. This will improve
    performance and eliminate possible race conditions.

    Change-Id: I0d0d7ba98c53ad7509c28bb4866c488c0a049d2f
    Closes-Bug: #1359617

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → kilo-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-1 → 2015.1.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.