libvirtd restart can sometimes cause multiple nova-compute connections

Bug #1240905 reported by Tom Hancock
This bug affects 2 people
Affects                    Status         Importance   Assigned to   Milestone
OpenStack Compute (nova)   Fix Released   Medium       Unassigned
Havana                     Fix Released   Undecided    Unassigned

Bug Description

The libvirt driver's get_connection() is not thread-safe when
libvirtd restarts while requests are arriving concurrently.

With the existing code, each request in turn calls get_connection,
finds the connection broken, tries to create a new one, blocks
for a while, and yields to the next thread, which does the same.
You end up with as many connections as there are incoming requests,
and only the last one is actually used. If enough requests arrive,
these connections can exhaust the client pool configured
for libvirtd.

One fix is to hold a lock while creating the connection.
Note that has_min_version calls _conn, which calls get_connection,
so the lock must not be held across the call to has_min_version().

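The locking idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the merged fix: `FakeConnection`, `Driver`, `slow_connect`, and `count_connections` are hypothetical names, plain `threading` stands in for whatever threading model nova-compute runs under, and no real libvirt call is made.

```python
import threading
import time


class FakeConnection:
    """Hypothetical stand-in for a libvirt connection (not the libvirt API)."""

    def __init__(self):
        self.alive = True


class Driver:
    """Hypothetical driver that caches a single shared connection."""

    def __init__(self, connect):
        self._connect = connect  # assumed factory that opens a new connection
        self._wrapped_conn = None
        self._conn_lock = threading.Lock()

    @staticmethod
    def _is_alive(conn):
        return conn is not None and conn.alive

    def get_connection(self):
        # Hold the lock across both the liveness check and the slow
        # reconnect, so concurrent callers wait for one new connection
        # instead of each opening their own.
        with self._conn_lock:
            if not self._is_alive(self._wrapped_conn):
                self._wrapped_conn = self._connect()
            conn = self._wrapped_conn
        # Anything that would call get_connection() again (a version
        # check, for example) has to run here, after the lock is released.
        return conn


def count_connections(num_threads=8):
    """Run concurrent get_connection() calls; return how many connections opened."""
    opened = []

    def slow_connect():
        time.sleep(0.05)  # simulate a connect that blocks for a while
        conn = FakeConnection()
        opened.append(conn)
        return conn

    driver = Driver(slow_connect)
    threads = [threading.Thread(target=driver.get_connection)
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(opened)
```

Calling `count_connections(8)` starts eight callers against a driver with no live connection. With the lock in place only the first caller should open a connection, and the rest reuse it once they acquire the lock.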
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/52401
Committed: http://github.com/openstack/nova/commit/b2e64e379835f57128e66f507438130eda716814
Submitter: Jenkins
Branch: master

commit b2e64e379835f57128e66f507438130eda716814
Author: Tom Hancock <email address hidden>
Date: Thu Oct 17 09:48:54 2013 +0000

    make libvirt driver get_connection thread-safe

    libvirt driver's get_connection is not thread safe in the
    presence of a libvirtd restart during concurrent incoming
    requests.

    With existing code each will in turn call get_connection,
    find the connection is broken, try to create new one, block
    for a while and yield to the next thread to do the same.
    You get as many connections as there are incoming requests
    and only the last one is used finally. If enough are incoming
    these connections can exhaust the client pool configured
    for libvirtd.
    One fix is to hold a lock while creating the connection.
    Note that has_min_version calls _conn which calls get_connection
    and thus the direct call to _has_min_version()

    Also added the exception text if it fails to register an event
    handler for lifecycle events.

    Change-Id: I090765802bfe443440f16722bc7c43b6280fe56a
    Fixes: bug #1240905

Changed in nova:
status: New → Fix Committed
Changed in nova:
importance: Undecided → Medium
tags: added: havana-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/havana)

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/54595

Changed in nova:
milestone: none → icehouse-1
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-1 → 2014.1
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/havana)

Reviewed: https://review.openstack.org/54595
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=98ab49bbb29890730ce544b785f1babff3e694e1
Submitter: Jenkins
Branch: stable/havana

commit 98ab49bbb29890730ce544b785f1babff3e694e1
Author: Tom Hancock <email address hidden>
Date: Thu Oct 17 09:48:54 2013 +0000

    make libvirt driver get_connection thread-safe

    libvirt driver's get_connection is not thread safe in the
    presence of a libvirtd restart during concurrent incoming
    requests.

    With existing code each will in turn call get_connection,
    find the connection is broken, try to create new one, block
    for a while and yield to the next thread to do the same.
    You get as many connections as there are incoming requests
    and only the last one is used finally. If enough are incoming
    these connections can exhaust the client pool configured
    for libvirtd.
    One fix is to hold a lock while creating the connection.
    Note that has_min_version calls _conn which calls get_connection
    and thus the direct call to _has_min_version()

    Also added the exception text if it fails to register an event
    handler for lifecycle events.

    Change-Id: I090765802bfe443440f16722bc7c43b6280fe56a
    Fixes: bug #1240905
    (cherry picked from commit b2e64e379835f57128e66f507438130eda716814)

tags: added: in-stable-havana