Addition of leap second causes spuriously high CPU usage and futex lockups
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
base-files (Ubuntu) | Fix Released | Undecided | Unassigned |
base-files (Ubuntu Lucid) | Won't Fix | Undecided | Unassigned |
base-files (Ubuntu Natty) | Won't Fix | Undecided | Unassigned |
base-files (Ubuntu Oneiric) | Won't Fix | Undecided | Unassigned |
base-files (Ubuntu Precise) | Won't Fix | Undecided | Unassigned |
base-files (Ubuntu Quantal) | Won't Fix | Undecided | Unassigned |
linux (Ubuntu) | Fix Released | Medium | Brad Figg |
linux (Ubuntu Lucid) | Fix Released | Medium | Brad Figg |
linux (Ubuntu Natty) | Invalid | Medium | Brad Figg |
linux (Ubuntu Oneiric) | Fix Released | Medium | Brad Figg |
linux (Ubuntu Precise) | Fix Released | Medium | Brad Figg |
linux (Ubuntu Quantal) | Fix Released | Medium | Brad Figg |
Bug Description
[Impact]
Software that relies on fine-grained pthread timeouts will spin indefinitely and drive up system load following a leap second, because the kernel's idea of time has become desynced and all sub-1s timeouts fire immediately. MySQL and Java in particular are reported to be affected. The issue is transient: it goes away the first time the system is rebooted after the leap second, and it is expected to be fixed before the next leap second occurs. Nevertheless, admins have been caught off guard by this misbehavior, and in some cases may not have noticed the problem or may not know what to do about it, so we should help them along by resetting the kernel clock with a minimal-risk base-files update.
[Test Case]
1. Find a system that has been online, with mysqld or a Java-based process running, since before 2012-06-30.
2. Verify that one or more processes on the system are spinning in futex and driving up the system load.
3. Upgrade to the base-files package from -proposed.
4. Verify that the system load comes back down immediately.
5. A stress-test for leap-second handling has been provided at https:/
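
Step 2 of the test case can be checked with a few standard tools. The following is a minimal sketch, assuming a Linux system with `/proc/loadavg` and the procps version of `ps`; the function names `load_1min` and `top_cpu` are made up for illustration and are not part of any package.

```shell
#!/bin/sh
# Hypothetical helpers for eyeballing system load before and after the fix.

# Print the 1-minute load average.
# Reads /proc/loadavg by default; an alternate file can be passed as $1.
load_1min() {
    cut -d' ' -f1 "${1:-/proc/loadavg}"
}

# List the five processes currently using the most CPU
# (six lines of output: one header line plus five processes).
top_cpu() {
    ps -eo pcpu,pid,comm --sort=-pcpu | head -n 6
}
```

Running `load_1min; top_cpu` before and after step 3 gives a rough before/after comparison. Attaching `strace -p PID` to one of the busy processes should show whether it is looping on futex calls.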
[Regression potential]
No analysis has been done on the effect of resetting the date on applications that require a high-accuracy clock. While this fixes the problem with the pthreads interfaces, it may cause other problems for other software. Since the proposed fix is to reset the kernel's date to the current date, which is not atomic, there will be a slight skew of the clock backwards in time. ntp *should* fix this shortly thereafter for machines that have it enabled.
Also, because there's a single version check for each copy of the SRU, users whose applications are negatively affected by the running of this date command will also be negatively affected on each subsequent upgrade of the system, up to and including the quantal devel release.
As widely reported, the addition of the leap second on 2012-06-30 has
caused high CPU usage and futex lockups in many applications,
including JVMs and MySQL, as well as desktop apps like Firefox and
Thunderbird.
https:/
http://
https:/
We've seen this ourselves on the Canonical infrastructure on both
current Lucid and Precise kernels, i.e.
ii linux-image-
ii linux-image-
We can also confirm the 'date -s $(date)' workaround fixes the problem
without requiring a reboot.
Changed in linux (Ubuntu): | |
importance: | Undecided → Medium |
tags: | added: kernel-da-key lucid precise |
Changed in linux (Ubuntu): | |
assignee: | nobody → Canonical Kernel Team (canonical-kernel-team) |
importance: | Medium → Undecided |
status: | New → Triaged |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Precise): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Oneiric): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Natty): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Lucid): | |
importance: | Undecided → Medium |
Changed in linux (Ubuntu Quantal): | |
assignee: | Canonical Kernel Team (canonical-kernel-team) → Andy Whitcroft (apw) |
Changed in linux (Ubuntu Quantal): | |
assignee: | Andy Whitcroft (apw) → Brad Figg (brad-figg) |
status: | Triaged → In Progress |
Changed in linux (Ubuntu Precise): | |
status: | New → Confirmed |
status: | Confirmed → Triaged |
Changed in linux (Ubuntu Oneiric): | |
status: | New → Triaged |
Changed in linux (Ubuntu Natty): | |
status: | New → Triaged |
Changed in linux (Ubuntu Lucid): | |
status: | New → Triaged |
tags: | added: kernel-key |
Changed in linux (Ubuntu Lucid): | |
assignee: | nobody → Brad Figg (brad-figg) |
Changed in linux (Ubuntu Natty): | |
assignee: | nobody → Brad Figg (brad-figg) |
Changed in linux (Ubuntu Oneiric): | |
assignee: | nobody → Brad Figg (brad-figg) |
Changed in linux (Ubuntu Precise): | |
assignee: | nobody → Brad Figg (brad-figg) |
description: | updated |
description: | updated |
Changed in linux (Ubuntu Lucid): | |
status: | Triaged → Fix Committed |
tags: | removed: kernel-key |
For the record, those looking for a runtime workaround might prefer:
date -u -s "$(date -u -R)"
The extra switches are to avoid locales and ambiguous timezones getting in your way, and the quoting is, well, for proper quoting. :P
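
For anyone who wants to script this, here is a minimal sketch of the same workaround wrapped in a function. It assumes GNU `date` (for `-R` and `-s`); the names `utc_stamp`, `reset_clock` and the `APPLY` variable are hypothetical and chosen only for this example. By default it only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: these function names are illustrative, not from any package.

# Print the current time in UTC in RFC 2822 format.
utc_stamp() {
    date -u -R
}

# Set the kernel clock to the time it already reports.
# Prints the command by default; set APPLY=1 (as root) to actually run it.
reset_clock() {
    stamp=$(utc_stamp)
    if [ "${APPLY:-0}" = "1" ]; then
        date -u -s "$stamp"
    else
        printf 'date -u -s "%s"\n' "$stamp"
    fi
}
```

Calling `reset_clock` on its own shows the exact command; `APPLY=1 reset_clock` as root performs the reset. Since the timestamp is captured and then written back in two steps, the small backwards skew mentioned under regression potential still applies.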