Activity log for bug #1380220

Date Who What changed Old value New value Message
2014-10-12 06:31:04 Roman Podoliaka bug added bug
2014-10-12 06:34:48 Roman Podoliaka nominated for series mos/5.1.x
2014-10-12 06:34:48 Roman Podoliaka bug task added mos/5.1.x
2014-10-12 06:34:55 Roman Podoliaka mos/5.1.x: milestone 5.1.2
2014-10-12 06:34:57 Roman Podoliaka mos/5.1.x: assignee Roman Podoliaka (rpodolyaka)
2014-10-12 06:34:59 Roman Podoliaka mos/5.1.x: importance Undecided High
2014-10-12 06:35:01 Roman Podoliaka mos/5.1.x: status New Triaged
2014-10-12 06:35:05 Roman Podoliaka mos: milestone 6.0
2014-10-12 10:22:19 Roman Podoliaka mos/5.1.x: status Triaged In Progress
2014-10-14 08:39:47 Igor Marnat tags messaging oslo messaging oslo scale
2014-12-01 08:46:01 Roman Podoliaka mos: importance High Medium
2014-12-01 08:46:03 Roman Podoliaka mos/5.1.x: importance High Medium
2014-12-01 08:46:31 Roman Podoliaka summary OpenStack services consume a lot of CPU time when oslo.messaging is used OpenStack services excessively poll socket events when oslo.messaging is used
2014-12-01 08:48:54 Roman Podoliaka description On a newly deployed cluster, after creating some load (e.g. running Rally scenarios), top shows that many of OpenStack services start to consume CPU time heavily: http://paste.openstack.org/show/120460/ This is caused by the fact those services are excessively polling open sockets (http://paste.openstack.org/show/120461/) using a very small timeout value (close to 0, while the eventlet default is 60). Further investigation shown that services which didn't use oslo.messaging were't affected. It turns out that CPython 2.6/2.7 implementation of condition variables plays badly with eventlet event loop. oslo.messaging has a place in the code (https://gerrit.mirantis.com/gitweb?p=openstack/oslo.messaging.git;a=blob;f=oslo/messaging/_drivers/impl_rabbit.py;h=dfed27851a36143e31448c77772e2a77597c94c6;hb=45d0e2742aa29c242f027de5edb54ba3db95cc33#l857) in which it tries to put the current thread into sleep until some condition is true passing a sane timeout value (24.0 s). Unfortunately, CPython provides its own implementation of conditional variables and doesn't use corresponding pthreads calls. In CPython 2.6/2.7 wait(timeout) for conditional variables is implemented as polling after a short sleep in a loop (https://github.com/akheron/cpython/blob/2.7/Lib/threading.py#L344-L369). Sleeps of 0.0005 to 0.05 seconds are the values passed to poll()/epoll_wait() in eventlet eventually, causing the process to wake up much more often than it really should (as there are no socket events to process). And user space <-> kernel space switches are expensive. FWIW, PyPy and CPython 3.2+ shouldn't have this bug, but their compatibility with eventlet is an open question. 
There must be at least two ways to fix this: 1) backport changes to thread.c and threading.py from CPython 3.2 to CPython 2.6/2.7, build and use custom packages 2) add a workaround to oslo.messaging (don't use a conditional variable in that particular place) The former might affect CPython stability and should be throughly tested, so the latter seems to be a 'good enough' work around for now. On a newly deployed cluster, after creating some load (e.g. running Rally scenarios), top shows that many of OpenStack services start to consume CPU time when they are *idle* (no user activity): http://paste.openstack.org/show/120460/ This is caused by the fact those services are excessively polling open sockets (http://paste.openstack.org/show/120461/) using a very small timeout value (close to 0, while the eventlet default is 60). Further investigation shown that services, which didn't use oslo.messaging, weren't affected. It turns out, that CPython 2.6/2.7 implementation of condition variables plays badly with eventlet event loop. oslo.messaging has a place in the code (https://gerrit.mirantis.com/gitweb?p=openstack/oslo.messaging.git;a=blob;f=oslo/messaging/_drivers/impl_rabbit.py;h=dfed27851a36143e31448c77772e2a77597c94c6;hb=45d0e2742aa29c242f027de5edb54ba3db95cc33#l857) in which it tries to put the current thread into sleep until some condition is true passing a sane timeout value (24.0 s). Unfortunately, CPython provides its own implementation of conditional variables and doesn't use corresponding pthreads calls. In CPython 2.6/2.7 wait(timeout) for conditional variables is implemented as polling after a short sleep in a loop (https://github.com/akheron/cpython/blob/2.7/Lib/threading.py#L344-L369). Sleeps of 0.0005 to 0.05 seconds are the values passed to poll()/epoll_wait() in eventlet eventually, causing the process to wake up much more often than it really should (as there are no socket events to process). And user space <-> kernel space switches are expensive. 
FWIW, PyPy and CPython 3.2+ shouldn't have this bug, but their compatibility with eventlet is an open question. There must be at least two ways to fix this: 1) backport changes to thread.c and threading.py from CPython 3.2 to CPython 2.6/2.7, build and use custom packages 2) add a workaround to oslo.messaging (don't use a conditional variable in that particular place) The former might affect CPython stability and should be throughly tested, so the latter seems to be a 'good enough' work around for now.
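
To illustrate the polling behaviour described above, here is a minimal sketch that replays the back-off schedule of a CPython 2.7-style timed wait against a simulated clock (no real sleeping). The function name simulate_wait_delays and its parameters are hypothetical, introduced only for this illustration. The 0.0005 s starting delay and the 0.05 s cap are the figures quoted in the description; the doubling step is an assumption based on the linked threading.py loop.

```python
def simulate_wait_delays(timeout, initial_delay=0.0005, max_delay=0.05):
    """Return the sleep durations a 2.7-style Condition.wait(timeout)
    would issue if the condition is never notified.

    Hypothetical helper: it mimics the sleep-then-poll loop using a
    simulated clock, so it runs instantly.
    """
    delays = []
    elapsed = 0.0          # simulated time spent waiting so far
    delay = initial_delay
    while True:
        remaining = timeout - elapsed
        if remaining <= 0:
            break          # timeout expired without a notification
        # Assumed back-off: double the delay, capped by max_delay and
        # by the time left until the timeout.
        delay = min(delay * 2, remaining, max_delay)
        delays.append(delay)
        elapsed += delay
    return delays


if __name__ == "__main__":
    delays = simulate_wait_delays(24.0)   # the timeout oslo.messaging passes
    print("first sleeps:", delays[:7])
    print("number of wakeups:", len(delays))
    print("longest sleep:", max(delays))
```

Under these assumptions the delay reaches the 0.05 s cap after a handful of iterations, so an idle 24 s wait turns into roughly 20 wakeups per second instead of a single blocking call. Under eventlet each of those short sleeps becomes a short poll()/epoll_wait() timeout, which matches the pattern seen in the strace paste.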
2014-12-01 16:09:22 Roman Podoliaka mos: milestone 6.0 6.0.1
2014-12-23 10:36:06 Dmitry Mescheryakov nominated for series mos/6.1.x
2014-12-23 10:36:06 Dmitry Mescheryakov bug task added mos/6.1.x
2014-12-23 10:36:06 Dmitry Mescheryakov nominated for series mos/6.0.x
2014-12-23 10:36:06 Dmitry Mescheryakov bug task added mos/6.0.x
2014-12-23 10:36:14 Dmitry Mescheryakov mos/6.1.x: status New Triaged
2014-12-23 10:36:16 Dmitry Mescheryakov mos/6.1.x: importance Undecided Medium
2014-12-23 10:36:32 Dmitry Mescheryakov mos/6.1.x: assignee Roman Podoliaka (rpodolyaka)
2014-12-23 10:36:37 Dmitry Mescheryakov mos/6.1.x: milestone 6.1
2014-12-23 10:36:41 Dmitry Mescheryakov mos/6.0.x: status Triaged Won't Fix
2014-12-23 10:36:46 Dmitry Mescheryakov mos/5.1.x: status In Progress Won't Fix
2014-12-23 11:03:09 Dmitry Mescheryakov mos: status Triaged Won't Fix
2015-01-05 21:32:07 Roman Podoliaka mos: milestone 6.0.1 6.1
2015-01-05 21:32:14 Roman Podoliaka mos: status Won't Fix Triaged
2015-02-13 09:19:00 Roman Podoliaka mos/6.1.x: assignee Roman Podoliaka (rpodolyaka)
2015-02-13 09:19:02 Roman Podoliaka mos/6.0.x: assignee Roman Podoliaka (rpodolyaka)
2015-02-13 09:19:04 Roman Podoliaka mos/5.1.x: assignee Roman Podoliaka (rpodolyaka)
2015-04-14 14:50:40 Dmitry Mescheryakov mos/5.1.x: assignee MOS Oslo (mos-oslo)
2015-04-14 14:50:49 Dmitry Mescheryakov mos/6.0.x: assignee MOS Oslo (mos-oslo)
2015-04-14 14:50:57 Dmitry Mescheryakov mos/6.1.x: assignee MOS Oslo (mos-oslo)
2015-04-14 14:51:01 Dmitry Mescheryakov mos/6.1.x: status Triaged Won't Fix
2015-04-14 14:51:09 Dmitry Mescheryakov mos: milestone 6.1 7.0
2015-08-17 11:40:28 Viktor Serhieiev mos: assignee MOS Oslo (mos-oslo) MOS QA Team (mos-qa)
2015-08-31 13:23:43 Nastya Urlapova mos: milestone 7.0 8.0
2016-02-03 12:51:57 Dmitry Pyzhov tags messaging oslo scale area-qa messaging oslo scale
2016-05-23 06:22:28 Andrey Epifanov tags area-qa messaging oslo scale area-qa ct1 customer-found messaging oslo scale support
2016-06-02 08:33:34 Timur Nurlygayanov nominated for series mos/9.0.x
2016-06-02 08:33:34 Timur Nurlygayanov bug task added mos/9.0.x
2016-06-02 08:33:34 Timur Nurlygayanov nominated for series mos/10.0.x
2016-06-02 08:33:34 Timur Nurlygayanov bug task added mos/10.0.x
2016-06-02 08:33:34 Timur Nurlygayanov nominated for series mos/8.0.x
2016-06-02 08:33:34 Timur Nurlygayanov bug task added mos/8.0.x
2016-06-02 08:33:42 Timur Nurlygayanov mos/8.0.x: status New Won't Fix
2016-06-02 08:33:46 Timur Nurlygayanov mos/9.0.x: status Triaged Won't Fix
2016-06-02 08:33:49 Timur Nurlygayanov mos/8.0.x: importance Undecided Medium
2016-06-02 08:33:55 Timur Nurlygayanov mos/8.0.x: milestone 8.0-updates
2016-06-02 08:34:03 Timur Nurlygayanov mos/8.0.x: status Won't Fix Triaged
2016-06-02 08:34:52 Timur Nurlygayanov mos/9.0.x: assignee MOS QA Team (mos-qa) Timur Nurlygayanov (tnurlygayanov)
2016-06-02 08:34:55 Timur Nurlygayanov mos/9.0.x: assignee Timur Nurlygayanov (tnurlygayanov)
2016-06-02 08:35:00 Timur Nurlygayanov mos/9.0.x: milestone 8.0 9.0
2016-06-02 08:35:04 Timur Nurlygayanov mos/9.0.x: milestone 9.0 9.0-updates
2016-06-02 08:35:08 Timur Nurlygayanov mos/9.0.x: status Won't Fix Confirmed
2016-06-02 08:35:11 Timur Nurlygayanov mos/8.0.x: status Triaged Confirmed
2016-06-02 08:35:13 Timur Nurlygayanov mos/10.0.x: status New Confirmed
2016-06-02 08:35:16 Timur Nurlygayanov mos/10.0.x: importance Undecided Medium
2016-06-02 08:35:20 Timur Nurlygayanov mos/10.0.x: milestone 10.0
2016-06-02 08:36:52 Timur Nurlygayanov mos/10.0.x: assignee MOS Oslo (mos-oslo)
2016-06-02 08:36:58 Timur Nurlygayanov mos/8.0.x: assignee MOS Oslo (mos-oslo)
2016-06-02 08:37:03 Timur Nurlygayanov mos/9.0.x: assignee MOS Oslo (mos-oslo)
2016-06-07 07:53:03 Dina Belova tags area-qa ct1 customer-found messaging oslo scale support area-qa ct1 customer-found messaging move-to-mu oslo scale support
2016-07-11 11:31:47 Dmitry Mescheryakov tags area-qa ct1 customer-found messaging move-to-mu oslo scale support 10.0-reviewed area-qa ct1 customer-found messaging move-to-mu oslo scale support
2016-09-02 14:01:43 Denis Meltsaykin mos/8.0.x: status Confirmed Won't Fix
2016-09-26 09:05:54 Andrew Kalach mos/9.x: status Confirmed Invalid
2016-09-26 09:06:24 Andrew Kalach mos/9.x: status Invalid Confirmed
2016-10-05 14:58:07 Michael Semenov mos/9.x: status Confirmed Invalid
2017-01-20 11:50:28 Dmitry Mescheryakov mos/9.x: status Invalid Fix Committed
2017-01-20 11:50:33 Dmitry Mescheryakov mos/9.x: milestone 9.1 9.2
2017-02-01 13:58:33 Michael Semenov mos/9.x: status Fix Committed Fix Released
2017-03-03 10:05:02 Dmitry Mescheryakov mos/10.0.x: status Confirmed Fix Committed