BackOffLoopingCall can get stuck in an infinite loop

Bug #1686159 reported by melanie witt
This bug affects 1 person
Affects       Status        Importance  Assigned to   Milestone
oslo.service  Fix Released  Medium      melanie witt

Bug Description

This was noticed while working on an intermittent test failure in Nova:

https://bugs.launchpad.net/nova/+bug/1683953

The test had been using a real backoff timer with the max timeout set low to verify the expected behavior in the event of a timeout. A recent change in Nova raised the timeout value from 0.1s to 1s, and the test began failing intermittently, with the entire test timing out (hitting the 160s overall test timeout). The test (or rather, the underlying driver being tested) uses a jitter value of 0.5.

The timer should have timed out within 1s, but it didn't. I found that, because of the jitter value of 0.5, if random.gauss() returns values < 0.5 often enough over time, self._interval can eventually walk to zero. Once it is stuck at zero, the timer is in an infinite loop, because breaking out of the loop requires self._error_time + idle (where idle is set from self._interval) to be > timeout, and with a zero interval that sum stops growing.
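A minimal sketch of the failure mode, for illustration only. It is a simplified model, not the actual oslo.service code: the update rule (interval multiplied by 2 * random.gauss(jitter, 0.1)), the standard deviation of 0.1, and the names walk_interval and can_time_out are assumptions made for this example.

```python
import random


def walk_interval(start=1.0, jitter=0.5, max_interval=10.0,
                  steps=100_000, seed=0):
    """Model repeated backoff steps and return the final interval.

    Assumed update rule (hypothetical simplification):
        interval = min(interval * 2 * gauss(jitter, 0.1), max_interval)
    With jitter=0.5 the multiplier averages 1.0, but the product of many
    such multipliers drifts downward until it underflows to zero.
    """
    rng = random.Random(seed)
    interval = start
    for _ in range(steps):
        interval = min(interval * 2 * rng.gauss(jitter, 0.1), max_interval)
    return interval


def can_time_out(error_time, interval, jitter_sample, timeout):
    """Model the exit check: error_time + idle must exceed timeout."""
    idle = interval * 2 * jitter_sample
    return error_time + idle > timeout


if __name__ == "__main__":
    final = walk_interval()
    print("final interval:", final)
    # With a zero interval, idle is zero, so the accumulated error time
    # can never get past the timeout and the loop never exits.
    print("can time out:", can_time_out(0.4, final, 0.5, 1.0))
```

In this model, zero is an absorbing state: any multiplier applied to a zero interval yields zero again, so the accumulated error time is frozen below the timeout.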

melanie witt (melwitt) wrote :
Changed in oslo.service:
assignee: nobody → melanie witt (melwitt)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.service 1.23.0

This issue was fixed in the openstack/oslo.service 1.23.0 release.

Changed in oslo.service:
status: In Progress → Fix Released
importance: Undecided → Medium