NULL pointer dereference when offlining CPU on DL385p Gen8
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Fix Released | Undecided | Unassigned | ||
Quantal | Won't Fix | Medium | Leann Ogasawara | ||
Raring | Fix Released | Undecided | Unassigned |
Bug Description
== Quantal SRU Justification ==
When we try to offline a CPU on a DL385p Gen8 with an Abu Dhabi processor, we see a NULL pointer dereference kernel oops:
https:/
The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference.
== Test Case ==
With 1P on the same system, everything works fine.
The command to offline a CPU is:
echo "0" > /sys/devices/
== Investigation ==
There are actually a few commits in Raring which prevent the Oops from occurring. HP originally pointed us to the following commits to resolve the Oops:
http://
http://
However, commit bf4d1b5d "cpuidle: support multiple drivers" is clearly aimed at ARM platforms. Its commit message states: "This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. There is no impact for other platforms if the option is disabled. The option defaults to 'n'." And indeed in Raring, CONFIG_
12c46e70e44728c
bf4d1b5ddb78f86
1aef40e288acfb3
3af272ab75c7a0c
cf31cd1a0c692a1
1b0a0e9a15b976d
0aeb9cac6f8a6fc
13dd52f11a04e61
41682032715c2c9
42f67f2acab2b71
ed953472d181e1d
a77de28662adea3
6e797a078824b30
Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue:
commit ac3ebafa81af76d
Author: Daniel Lezcano <email address hidden>
Date: Mon Feb 4 22:44:43 2013 +0000
ACPI / idle: remove usage of the statedata
However, this change seems more of a band-aid than a fix for the root cause: it removes all remaining references to state_usage and thus avoids any pointer dereferencing altogether, but it clearly doesn't address the fact that the pointer was NULL to begin with.
Going back to a closer examination of "cpuidle: support multiple drivers": even though it claims to have no impact on other platforms when the option is disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_
== Fix ==
Add device error checking to avoid the NULL pointer dereference kernel Oops when offlining CPUs (see the patch attached to the bug report). We have tested this patch against a DL385p Gen8 and confirm we no longer experience a kernel oops when offlining CPUs.
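To illustrate the kind of check described above, here is a minimal user-space sketch of guarding a device pointer before dereferencing it. It is not the attached patch: `struct idle_device`, `idle_devices`, `lookup_idle_device`, and `enter_idle_state` are hypothetical names invented for this example, and the assumption that the lookup yields NULL for a CPU with no registered device is ours.

```c
#include <stddef.h>
#include <errno.h>

/* Hypothetical stand-in for a per-CPU idle device; these names are
 * illustrative only and do not come from the kernel or the patch. */
struct idle_device {
    int registered;
    int last_state;
};

#define MAX_CPUS 4

/* Per-CPU device table; an entry is NULL when no device is registered
 * for that CPU (assumed here to be the state that triggers the oops). */
static struct idle_device *idle_devices[MAX_CPUS];

/* Return the device for a CPU, or NULL if there is none. */
static struct idle_device *lookup_idle_device(unsigned int cpu)
{
    if (cpu >= MAX_CPUS)
        return NULL;
    return idle_devices[cpu];
}

/* Guarded access: check the pointer before using it and return an
 * error code instead of dereferencing NULL. */
static int enter_idle_state(unsigned int cpu, int state)
{
    struct idle_device *dev = lookup_idle_device(cpu);

    if (!dev)
        return -ENODEV;

    dev->last_state = state;
    return 0;
}
```

With this shape, a caller that hits a CPU without a device gets `-ENODEV` back and can bail out cleanly, rather than faulting inside the idle path.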
Changed in linux (Ubuntu Raring): | |
status: | New → Fix Released |
Changed in linux (Ubuntu Quantal): | |
assignee: | nobody → Leann Ogasawara (leannogasawara) |
importance: | Undecided → Medium |
status: | New → In Progress |
description: | updated |
Changed in linux (Ubuntu Quantal): | |
status: | In Progress → Fix Committed |
tags: | added: verification-done-quantal removed: verification-needed-quantal |
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed' to 'verification-done'.
If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!