loopingcall: if a time drift to the future occurs, all timers will be blocked

Bug #1450438 reported by Nikola Đipanov
This bug affects 2 people
Affects        Status         Importance   Assigned to   Milestone
oslo.service   Fix Released   Undecided    Unassigned

Bug Description

Because loopingcall.py records time with time.time, a wall-clock reading that is not guaranteed to be monotonic, a clock drift into the future that is later corrected blocks all timers until the actual time reaches the moment of the original drift.

This can be pretty bad if the drift is significant. In Nova's case, all services use FixedIntervalLoopingCall for their heartbeat periodic tasks, so if the drift is on the order of several hours, no heartbeats will happen for that long.

DynamicLoopingCall is affected by this as well, but because it relies on eventlet, which also uses the non-monotonic time.time function for its internal timers.
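The stall can be reproduced on paper with a toy model. This is only a sketch under stated assumptions: FakeClock and seconds_until_due are hypothetical helpers written for this illustration, not part of loopingcall.py or eventlet. They model a scheduler that stores an absolute deadline and compares it against a clock.

```python
class FakeClock:
    """Hypothetical clock whose reading can be changed by hand."""

    def __init__(self, now):
        self.now = now

    def __call__(self):
        return self.now


def seconds_until_due(deadline, clock):
    """How long a scheduler comparing `deadline` to `clock()` still waits."""
    return max(0.0, deadline - clock())


INTERVAL = 10.0          # timer interval in seconds
DRIFT = 2 * 60 * 60.0    # two hours, in seconds

wall = FakeClock(now=1000.0)   # models time.time (can be stepped)
steady = FakeClock(now=50.0)   # models a monotonic clock (never stepped)

# The wall clock drifts into the future, and a timer is scheduled
# while the drifted reading is in effect.
wall.now += DRIFT
wall_deadline = wall() + INTERVAL
steady_deadline = steady() + INTERVAL

# The wall clock is corrected back; the monotonic clock is unaffected.
wall.now -= DRIFT

stalled_for = seconds_until_due(wall_deadline, wall)
steady_wait = seconds_until_due(steady_deadline, steady)
print(stalled_for, steady_wait)  # 7210.0 10.0
```

In this model the wall-clock deadline ends up the full drift plus one interval away, while the deadline taken from the unstepped clock is still one interval away.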

Solving this will require looping calls to start using a monotonic timer (for python 2.7 there is a monotonic package).

Also, all services that want to use timers and avoid this issue should do something like

  import eventlet.hubs
  import monotonic

  # Replace the hub's clock with a monotonic one
  hub = eventlet.hubs.get_hub()
  hub.clock = monotonic.monotonic

immediately after calling eventlet.monkey_patch()
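For the looping-call side, a minimal sketch of a fixed-interval loop that measures elapsed time with a monotonic clock might look like the following. It assumes Python 3.3 or later, where time.monotonic is in the standard library (on Python 2.7 the monotonic package mentioned above would fill the same role). run_fixed_interval is a hypothetical name used for illustration and is not the oslo.service implementation.

```python
import time


def run_fixed_interval(func, interval, iterations,
                       clock=time.monotonic, sleep=time.sleep):
    """Call `func` `iterations` times, aiming for `interval` seconds
    between the starts of consecutive calls.

    `clock` and `sleep` are injectable so the loop can be tested
    without real waiting; both parameters are illustrative.
    """
    for _ in range(iterations):
        start = clock()
        func()
        # Elapsed time comes from a monotonic clock, so stepping the
        # system clock cannot make it negative or huge.
        elapsed = clock() - start
        sleep(max(0.0, interval - elapsed))
```

A call such as run_fixed_interval(send_heartbeat, 10.0, 3), where send_heartbeat stands for any callable, would then sleep for the remainder of each interval regardless of changes to the wall clock.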

Sean Dague (sdague)
Changed in nova:
status: New → Confirmed
status: Confirmed → Triaged
importance: Undecided → Medium
Chang-Yi Lee (cy-lee)
Changed in nova:
assignee: nobody → inwinSTACK inc. (inwinstack)
Elena Ezhova (eezhova)
affects: oslo-incubator → oslo.service
Chang-Yi Lee (cy-lee)
Changed in nova:
assignee: inwinSTACK inc. (inwinstack) → nobody
Changed in nova:
assignee: nobody → Chung Chih, Hung (lyanchih)
Elena Ezhova (eezhova) wrote :

The fix for oslo.service was committed in review: https://review.openstack.org/#/c/190372/

Changed in oslo.service:
status: New → Fix Released
Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by lyanchih (<email address hidden>) on branch: master
Review: https://review.openstack.org/194073

Chung Chih, Hung (lyanchih) wrote :

The modules related to this bug have graduated to oslo.service.

Changed in oslo-incubator:
status: New → Fix Committed
Chung Chih, Hung (lyanchih) wrote :

This bug will be fixed by the following review:
https://review.openstack.org/#/c/192900/

Changed in nova:
assignee: lyanchih (lyanchih) → nobody
status: In Progress → Confirmed
Changed in nova:
status: Confirmed → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-2
status: Fix Committed → Fix Released
Diana Clarke (diana-clarke) wrote :

My apologies for the noise if I'm wrong, but this doesn't actually appear to be fixed in Liberty. Here are my notes showing that heartbeats stop with a status 'XXX' after turning the clock forward 2 hours and then turning it immediately back to the actual time.

http://paste.openstack.org/show/476318/

Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-2 → 12.0.0
Matt Riedemann (mriedem) wrote :

Removed nova and oslo-incubator because the only change was in oslo.service:

https://review.openstack.org/#/c/190372/

And the fix has been here since 0.1.0.

Nova 12.0.0 requires oslo.service>=0.7.0 so you should have the fix:

https://github.com/openstack/nova/blob/12.0.0/requirements.txt#L48

If that doesn't fix it, I'd open a new bug that refers to this one and includes the details, recreate scenario, and logs of what you're seeing.

no longer affects: oslo-incubator
no longer affects: nova