Logging blocks on race condition under eventlet

Bug #1420788 reported by Alexander Makarov
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Identity (keystone)
Fix Released
Medium
Alexander Makarov
Icehouse
Fix Released
Medium
Unassigned
Juno
Fix Released
Medium
Boris Bobrov

Bug Description

Wrong initialization order makes logging block on race condition under eventlet.

bin/keystone-all launcher initialize logging first and after that does eventlet patching leaving logging system with generic thred.lock in critical sections what leads to infinite thread locks under high load.

Boris Bobrov (bbobrov)
Changed in keystone:
status: New → Confirmed
Changed in keystone:
assignee: nobody → Alexander Makarov (amakarov)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/155292

Revision history for this message
David Stanek (dstanek) wrote :

What exactly is the race condition? I'd like a little more info so that I can understand the patch/problem.

Revision history for this message
Boris Bobrov (bbobrov) wrote :

David,

Copying slightly modified Alexander's explanation from origin bug to Mirantis Openstack:

Python logging used in oslo.log consists of a bunch of logger objects for different modules.
Any logger object calls handle() method of every logging handler registered in the logger or any of it's parents.
All Handlers are derived from python logging.Handler class.
logging.Handler.handle() implemented as a critical section using a lock.
The lock used in logging.Handler.handle() is created by logging.Handler.createLock().
The lock is threading.RLock() which is reentrant allowing owner thread to attempt aquiring lock without blocking.
Thread is identified using _get_ident() call which is patched by eventlet.

If we patch threading late, _get_ident() gets patched and internals of RLock don't, causing deadlocks on greenthread switch.

Revision history for this message
Alexander Makarov (amakarov) wrote :

Boris, as we discovered this is not the root cause of the problem.

David, as I wrote in the bug description the issue is in initialization sequence:

- logging system initializes all it's stuff including loks (thread.lock) objects
- eventlet monkey-patching replaces thread module, including locks
But the locks, initialized before and linked to objects by reference are not replaced.

In outcome we have logging system using standard, not patched thread.lock objects in an eventlet environment.

Now the case:
here el - eventlet

el1: aquire lock
el1: begin writing log record. This is an IO operation so execution is passed to the next eventlet el2
el2: aquire lock
entire thread: "I'm waiting for lock to release, what do you want of me?"

The solution is to initialize logging after eventlet monkey-patching

Revision history for this message
Boris Bobrov (bbobrov) wrote :

> But the locks, initialized before and linked to objects by reference are not replaced.

And I say that in my comment ;)

Revision history for this message
Alexander Makarov (amakarov) wrote :

Boris, ok, I yield: let "the locks, initialized before" be "Internals of RLock" :) It's true.

Revision history for this message
Dolph Mathews (dolph) wrote :
Revision history for this message
Alexander Makarov (amakarov) wrote :

This shows that lock remains unpatched

import logging
logging.basicConfig()
logger = logging.getLogger()
import eventlet
eventlet.monkey_patch()
print logger.handlers[0].lock._RLock__block.__class__

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to keystone (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/159172

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (master)

Reviewed: https://review.openstack.org/154915
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=6fefe724695122754b9034c3f6434633877c4326
Submitter: Jenkins
Branch: master

commit 6fefe724695122754b9034c3f6434633877c4326
Author: Alexander Makarov <email address hidden>
Date: Wed Feb 11 18:42:32 2015 +0300

    Correct initialization order for logging to use eventlet locks

    oslo logging needs to be reconfigured after eventlet patching

    Change-Id: I717ecfd1c0f9b791a57f69325365144692f1e0b6
    Closes-Bug: 1420788

Changed in keystone:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/163387

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (stable/juno)

Reviewed: https://review.openstack.org/155292
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=3db10d06828b39e7dbdb2938a581526a78d2957b
Submitter: Jenkins
Branch: stable/juno

commit 3db10d06828b39e7dbdb2938a581526a78d2957b
Author: Boris Bobrov <email address hidden>
Date: Thu Feb 12 15:55:24 2015 +0300

    Correct initialization order for logging to use eventlet locks

    oslo logging needs to be reconfigured after eventlet patching

    This patch follows the approach suggested in change
    I717ecfd1c0f9b791a57f69325365144692f1e0b6

    Change-Id: I84cac685726bbcf1285c0f6d9172439aa186466a
    Closes-Bug: 1420788

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (stable/icehouse)

Change abandoned by Alan Pevec (<email address hidden>) on branch: stable/icehouse
Review: https://review.openstack.org/156649
Reason: Replaced by https://review.openstack.org/163387

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (stable/icehouse)

Reviewed: https://review.openstack.org/163387
Committed: https://git.openstack.org/cgit/openstack/keystone/commit/?id=d72e6faca06e9470498860f16e64ed161815bb44
Submitter: Jenkins
Branch: stable/icehouse

commit d72e6faca06e9470498860f16e64ed161815bb44
Author: Boris Bobrov <email address hidden>
Date: Thu Feb 12 15:55:24 2015 +0300

    Correct initialization order for logging to use eventlet locks

    oslo logging needs to be reconfigured after eventlet patching

    This patch follows the approach suggested in change
    I717ecfd1c0f9b791a57f69325365144692f1e0b6

    Change-Id: I84cac685726bbcf1285c0f6d9172439aa186466a
    Closes-Bug: 1420788
    (cherry picked from commit 3db10d06828b39e7dbdb2938a581526a78d2957b)

tags: added: in-stable-icehouse
Thierry Carrez (ttx)
Changed in keystone:
milestone: none → kilo-3
status: Fix Committed → Fix Released
Changed in keystone:
importance: Undecided → Medium
Thierry Carrez (ttx)
Changed in keystone:
milestone: kilo-3 → 2015.1.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on keystone (master)

Change abandoned by Morgan Fainberg (<email address hidden>) on branch: master
Review: https://review.openstack.org/159172
Reason: Administratively abandoning based on the conversations. We aren't going to continue with refactoring just to throw out the eventlet code next cycle.

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

So do I understand this correctly that the release fix for kilo was only a partial fix for the apache version of keystone and the wsgi is still broken and also won't get fixed?

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.