glance qpid notifier can hang under heavy load

Bug #1229042 reported by Attila Fazekas
Affects   Status         Importance   Assigned to       Milestone
Glance    Fix Released   Medium       Attila Fazekas
Grizzly   Fix Released   Medium       Flavio Percoco

Bug Description

Glance qpid notifier can hang under heavy image creation load.

This happens because of two issues:

- First, the qpid notifier instance can be called by multiple green threads concurrently; one thread may recreate the connection object while another thread is still working with it.
The connection object should be a local variable instead of an instance attribute, to avoid unwanted modification or replacement.

- Second, python-qpid uses a PipeWaiter with select.select. The select module is not monkey patched to be green thread friendly, which can cause hangs.
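
The first issue can be illustrated with a small, deterministic sketch. Everything in it is hypothetical: FakeConnection is a stand-in and not the python-qpid API, the notifier classes are not Glance's QpidStrategy, and each `yield` only marks a point where a green thread could be switched out. It assumes two callers interleave at those points.

```python
class FakeConnection:
    """Hypothetical stand-in for a qpid connection; not the python-qpid API."""

    def __init__(self, ident):
        self.ident = ident
        self.closed = False
        self.sent = []

    def send(self, message):
        if self.closed:
            raise RuntimeError("connection %d already closed" % self.ident)
        self.sent.append(message)

    def close(self):
        self.closed = True


class SharedAttributeNotifier:
    """Stores the connection on the instance, so any caller can replace it."""

    def __init__(self):
        self.created = []
        self.connection = None

    def _connect(self):
        conn = FakeConnection(len(self.created) + 1)
        self.created.append(conn)
        return conn

    def notify(self, message):
        # Each `yield` marks a point where a green thread could be switched out.
        self.connection = self._connect()
        yield
        self.connection.send(message)  # may be another caller's connection
        yield
        self.connection.close()  # may close another caller's connection


class LocalVariableNotifier(SharedAttributeNotifier):
    """Keeps the connection in a local variable, private to each call."""

    def notify(self, message):
        conn = self._connect()
        yield
        conn.send(message)
        yield
        conn.close()


def run_interleaved(notifier, messages):
    """Round-robin the notify() generators to mimic cooperative switching.

    Returns the ids of connections that were created but never closed.
    """
    pending = [notifier.notify(m) for m in messages]
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)
    return [c.ident for c in notifier.created if not c.closed]


leaked_shared = run_interleaved(SharedAttributeNotifier(), ["a", "b"])
leaked_local = run_interleaved(LocalVariableNotifier(), ["a", "b"])
```

With the shared attribute, the second caller overwrites the first caller's connection, so both messages go through the second connection and the first one is never closed. With a local variable, each call closes exactly the connection it opened.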

All other OpenStack components that use AMQP monkey patch the thread and select modules, but Glance does not.
It is usually good practice to make everything that can be made green thread/eventlet safe so, which makes the application more preemptive.

Cinder and Neutron make everything eventlet friendly; Nova excludes the 'os' module.
I would recommend making everything greenlet friendly.
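
A minimal sketch of the monkey patching recommended above, assuming eventlet is installed. The function name patch_for_green_threads is hypothetical, and whether the actual Glance change passes exactly these arguments is an assumption; the sketch patches only the two modules named in this report. In a real service the call would normally run as early as possible, before other modules import select or thread.

```python
def patch_for_green_threads():
    """Patch ``select`` and ``thread`` so blocking calls yield to eventlet.

    Returns True when both modules report as patched, or False when
    eventlet is not installed (hypothetical helper, for illustration only).
    """
    try:
        import eventlet
        from eventlet import patcher
    except ImportError:
        return False

    # Patch only the modules named in this report; calling
    # eventlet.monkey_patch() with no arguments would patch everything.
    eventlet.monkey_patch(select=True, thread=True)

    return bool(
        patcher.is_monkey_patched("select")
        and patcher.is_monkey_patched("thread")
    )
```

Patching everything, as Cinder and Neutron are said to do, would be the no-argument form of the same call.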

OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance (master)

Fix proposed to branch: master
Review: https://review.openstack.org/47786

Changed in glance:
status: New → Triaged
importance: Undecided → Medium
milestone: none → havana-rc1
Changed in glance:
assignee: nobody → Attila Fazekas (afazekas)
status: Triaged → In Progress
Attila Fazekas (afazekas) wrote :

A reproducer is attached.

Alan Pevec (apevec)
tags: added: grizzly-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix merged to glance (master)

Reviewed: https://review.openstack.org/47786
Committed: http://github.com/openstack/glance/commit/2e7aa761b6c2b31f4cbd9703ee19090b6757508a
Submitter: Jenkins
Branch: master

commit 2e7aa761b6c2b31f4cbd9703ee19090b6757508a
Author: Attila Fazekas <email address hidden>
Date: Mon Sep 23 08:44:37 2013 +0200

    Fixing glance-api hangs in the qpid notifier

    Glance-api could hang in the qpid notifier under heavy image creation load.

    The ``thread`` and ``select`` modules are used by python-qpid for managing
    the AMQP connection. When eventlet was not able to switch between threads
    because of them, this led to hangs and/or pipe(2) leaks.

    * Monkey patch the ``select`` and ``thread`` modules to be eventlet friendly
      in order to avoid hanging issues.

    * The reference to the connection object in the QpidStrategy
      was replaceable by a concurrent thread, which could cause various issues.
      Use only local variables for storing the connection object in order to
      avoid concurrency-unsafe manipulation.

    Fixing bug 1229042

    Change-Id: I8fa8c4f36892b96d406216cb3c64854a94ca9df7

Thierry Carrez (ttx)
Changed in glance:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in glance:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to glance (stable/grizzly)

Fix proposed to branch: stable/grizzly
Review: https://review.openstack.org/50258

OpenStack Infra (hudson-openstack) wrote : Fix merged to glance (stable/grizzly)

Reviewed: https://review.openstack.org/50258
Committed: http://github.com/openstack/glance/commit/9a557c8f54e8ac0db2f48bb95296ea2a9ef9a7bb
Submitter: Jenkins
Branch: stable/grizzly

commit 9a557c8f54e8ac0db2f48bb95296ea2a9ef9a7bb
Author: Attila Fazekas <email address hidden>
Date: Mon Sep 23 08:44:37 2013 +0200

    Fixing glance-api hangs in the qpid notifier

    Glance-api could hang in the qpid notifier under heavy image creation load.

    The ``thread`` and ``select`` modules are used by python-qpid for managing
    the AMQP connection. When eventlet was not able to switch between threads
    because of them, this led to hangs and/or pipe(2) leaks.

    * Monkey patch the ``select`` and ``thread`` modules to be eventlet friendly
      in order to avoid hanging issues.

    * The reference to the connection object in the QpidStrategy
      was replaceable by a concurrent thread, which could cause various issues.
      Use only local variables for storing the connection object in order to
      avoid concurrency-unsafe manipulation.

    Fixing bug 1229042

    Change-Id: I8fa8c4f36892b96d406216cb3c64854a94ca9df7
    (cherry picked from commit 2e7aa761b6c2b31f4cbd9703ee19090b6757508a)

tags: added: in-stable-grizzly
Thierry Carrez (ttx)
Changed in glance:
milestone: havana-rc1 → 2013.2
Alan Pevec (apevec)
tags: removed: grizzly-backport-potential in-stable-grizzly