mongodb retry logic uses unpatched time.sleep to wait

Bug #1447599 reported by Chris Dent
This bug affects 1 person
Affects     Status        Importance  Assigned to  Milestone
Ceilometer  Fix Released  High        Chris Dent
Kilo        Fix Released  High        Chris Dent

Bug Description

The MongoDB retry handling added in https://review.openstack.org/#/c/122387/ sleeps for the retry_interval before retrying a command.

Most of the commands which use that retry code are running under eventlet with various modules monkeypatched.

However, the retry code uses time.sleep, which is _not_ monkeypatched.

This has an interesting impact which can be seen relatively easily in ceilometer-collector. If you start up the collector, then after a while shut down mongod and leave it shut down long enough for some data writes to be attempted, retry log messages will be produced. If you then restart mongod, the retry log messages carry on for quite a while, much longer than expected. Eventually the reconnects do happen.

If the time.sleep is changed to eventlet.greenthread.sleep then the behavior is more like what would be expected: The retries are delayed, but other green threads carry on, and when mongo is back up we see reconnects much sooner.
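To make the difference concrete, here is a minimal sketch of a retry helper whose sleep function can be swapped. It is not the actual ceilometer code: the name retry_call and the max_retries, sleep and exceptions parameters are hypothetical, and only retry_interval is taken from the description above.

```python
import time


def retry_call(func, max_retries=3, retry_interval=1.0, sleep=time.sleep,
               exceptions=(Exception,)):
    """Call func(), retrying on failure after waiting retry_interval seconds.

    Hypothetical helper for illustration only. The ``sleep`` argument
    defaults to the blocking time.sleep; a cooperative sleep can be
    passed in instead.
    """
    attempt = 0
    while True:
        try:
            return func()
        except exceptions:
            attempt += 1
            if attempt > max_retries:
                # Out of retries: let the last error propagate.
                raise
            # With an unpatched time.sleep this blocks the whole OS thread,
            # so no other green thread can run during the wait.
            sleep(retry_interval)
```

Under eventlet the caller could pass sleep=eventlet.sleep (assuming eventlet is installed), so the wait yields to other green threads instead of stalling them.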

Eoghan Glynn (eglynn)
Changed in ceilometer:
milestone: none → kilo-rc2
importance: Undecided → High
assignee: nobody → Chris Dent (chdent)
status: New → In Progress
Thierry Carrez (ttx)
Changed in ceilometer:
milestone: kilo-rc2 → none
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/176751

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/176801

Eoghan Glynn (eglynn)
Changed in ceilometer:
milestone: none → liberty-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/176751
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=11ade403b12c3941b43cd8f26b7d765541707c86
Submitter: Jenkins
Branch: master

commit 11ade403b12c3941b43cd8f26b7d765541707c86
Author: Chris Dent <email address hidden>
Date: Thu Apr 23 14:01:35 2015 +0000

    Have eventlet monkeypatch the time module

    Without this, mongod retry logic in the various services started as
    commands fails to behave as expected and does not reconnect as soon as
    the mongod service has returned to availability.

    In addition to the mongod sleep there are two other time.sleep calls
    that may be reached by this change. Review and discussion with others
    indicates that their behavior will continue to be correct with the
    monkeypatch in place.

    Change-Id: I4eca290acc3b06658951f070935ebb39936e13d3
    Closes-Bug: #1447599
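
For readers unfamiliar with the approach named in the commit title, the following is a rough sketch of asking eventlet to monkeypatch only the time module. It is not the code from the commit: the function name patch_time_module and the ticker demo are hypothetical, and the import is guarded so the sketch still loads where eventlet is not installed.

```python
import time


def patch_time_module():
    """Ask eventlet to replace time.sleep with its cooperative version.

    Returns True if the patch was applied, False if eventlet is missing.
    Hypothetical helper for illustration only.
    """
    try:
        import eventlet
    except ImportError:
        return False
    # Passing time=True patches only the time module. A real service would
    # normally patch sockets, threads and so on as well, early at startup.
    eventlet.monkey_patch(time=True)
    return True


if __name__ == "__main__":
    if patch_time_module():
        import eventlet

        def ticker():
            for i in range(3):
                print("other green thread still running", i)
                time.sleep(0.1)  # cooperative once the patch is in place

        worker = eventlet.spawn(ticker)
        time.sleep(0.5)  # yields to ticker instead of blocking the process
        worker.wait()
```

Once time is patched, existing time.sleep calls such as the one in the retry code yield to other green threads without being edited individually, which is why the other time.sleep call sites mentioned in the commit message needed review.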

Changed in ceilometer:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (stable/kilo)

Reviewed: https://review.openstack.org/176801
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=b0447ed8e7bee371bf7095c86e47d717abe89edc
Submitter: Jenkins
Branch: stable/kilo

commit b0447ed8e7bee371bf7095c86e47d717abe89edc
Author: Chris Dent <email address hidden>
Date: Thu Apr 23 14:01:35 2015 +0000

    Have eventlet monkeypatch the time module

    Without this, mongod retry logic in the various services started as
    commands fails to behave as expected and does not reconnect as soon as
    the mongod service has returned to availability.

    In addition to the mongod sleep there are two other time.sleep calls
    that may be reached by this change. Review and discussion with others
    indicates that their behavior will continue to be correct with the
    monkeypatch in place.

    Cherry-pick from https://review.openstack.org/#/c/176751/

    Change-Id: I4eca290acc3b06658951f070935ebb39936e13d3
    Closes-Bug: #1447599

OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/179290

OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/179290
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=b192f69f166317bbf048886f3a78d68fc89e64d1
Submitter: Jenkins
Branch: master

commit b0447ed8e7bee371bf7095c86e47d717abe89edc
Author: Chris Dent <email address hidden>
Date: Thu Apr 23 14:01:35 2015 +0000

    Have eventlet monkeypatch the time module

    Without this, mongod retry logic in the various services started as
    commands fails to behave as expected and does not reconnect as soon as
    the mongod service has returned to availability.

    In addition to the mongod sleep there are two other time.sleep calls
    that may be reached by this change. Review and discussion with others
    indicates that their behavior will continue to be correct with the
    monkeypatch in place.

    Cherry-pick from https://review.openstack.org/#/c/176751/

    Change-Id: I4eca290acc3b06658951f070935ebb39936e13d3
    Closes-Bug: #1447599

commit 2a0ecfa6e8246352376f463b3237017b529ca464
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Apr 23 02:12:27 2015 +0000

    Updated from global requirements

    Change-Id: Ia56c489de41738e318d00f04698a68a149bf3ac5

commit 2004c4cb563a8adb38b2498711ff8a24c2897e49
Author: Lan Qi song <email address hidden>
Date: Wed Apr 22 15:05:35 2015 +0800

    Fix valueerror when ceilometer-api start

    If change local language to other language(like ja), ceilometer-api
    will failed to start.

    This patch will remove i18 support from ceilometer/storage/__init__.py

    Change-Id: I8dc7ff0921d69e64a41e588207c97548513c99ed
    Closes-Bug: #1446983
    (cherry picked from commit f2ab05ac3dc95981ce183277a52e8c395a4cdbeb)

commit f5b994829bbd19794eba062ea79599f3b78a9286
Author: ZhiQiang Fan <email address hidden>
Date: Fri Apr 17 17:06:13 2015 +0800

    use message id to generate hbase unique key

    We currently can disable computing signature for metering message,
    but HBase still uses signature as a dimension to generate unique row
    key, this will lead to different samples treated as same one.

    Message id should be used in stead of message signature, because
    id will always exist and also unique.

    Change-Id: I47cdce59c9934573076cf609ce1a0c37aea75c44
    Closes-Bug: #1445227
    (cherry picked from commit 15a92ec1ef2ab3e6d1d187ad909a0d5800ee4a2e)

commit 446cd1440beda53e4216dc5da362fd74ded4f1df
Author: Mehdi Abaakouk <email address hidden>
Date: Wed Apr 15 14:31:24 2015 +0200

    gnocchi: fix typo in the aggregation endpoint

    Closes-bug: #1446201

    Change-Id: I92a727b518f77ad7149fc43a564adea5e8e97240
    (cherry picked from commit d1130e810d40ecd553a325fa5c32626be944625c)

commit dc6f4bfecbde54f663e96f485a948944223ef231
Author: Andreas Jaeger <email address hidden>
Date: Mon Apr 20 11:13:57 2015 +0200

    Release Import of Translations from Transifex

    Manual import of Translations from Transifex. This change also removes
    all po files that are less than 66 per cent translated since such
    partially translated files will not help users.

    This change needs to be don...

Thierry Carrez (ttx)
Changed in ceilometer:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ceilometer:
milestone: liberty-1 → 5.0.0