Comment 32 for bug 614853

Stefan Bader (smb) wrote :

SRU Justification:

Impact: When the scheduler looks for the busiest group, there are rare cases (apparently more likely on EC2) in which cpu_power is zero at the point where the code divides by it.

Fix: There is no real fix yet (which is why neither patch is upstream), but users have tested the first patch, which works around the issue by skipping the divide whenever cpu_power is actually zero.
The second patch is an optional companion to the first and should warn whenever cpu_power is set to zero by accident. While it is neither a bug fix nor strictly needed, I would like to add it too. That way we could potentially catch the real bug in real usage (which seems to be the only way to hit it, and only after an extended period of uptime) and then revert both changes once there is a proper fix.
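
As a rough illustration of the two ideas only (a userspace sketch, not the actual patches): the struct, the function names, the scaling constant and the choice to return 0 when the divide is skipped are all assumptions made for this example and are not taken from the kernel source.

```c
#include <stdio.h>

/* Hypothetical stand-in for the scheduler's per-group data;
 * this is not the kernel's real structure. */
struct group_stats {
    unsigned long load;       /* accumulated load of the group */
    unsigned long cpu_power;  /* expected to be non-zero */
};

/* Assumed scaling constant, chosen only for this sketch. */
#define POWER_SCALE 1024UL

/* Counts how often a zero cpu_power was stored. */
static int zero_power_warnings;

/* Idea of the second patch: complain when cpu_power is set to zero,
 * so the real culprit can be spotted in real usage. */
static void set_group_power(struct group_stats *g, unsigned long power)
{
    if (power == 0) {
        zero_power_warnings++;
        fprintf(stderr, "warning: cpu_power set to zero\n");
    }
    g->cpu_power = power;
}

/* Idea of the first patch: avoid the divide when cpu_power is zero.
 * Returning 0 here is an assumed fallback for the sketch. */
static unsigned long scaled_load(const struct group_stats *g)
{
    if (g->cpu_power == 0)
        return 0;
    return g->load * POWER_SCALE / g->cpu_power;
}
```

With this shape the guard only hides the symptom, while the warning records where the bad value came from, which is why both would be reverted together once the root cause is fixed.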

Testcase: The problem could not be reproduced in testing, but it has been reported to happen after around a week of uptime on production servers.
(Boot tested this approach to make sure it does not introduce obvious regressions by hitting the warning too often.)