Activity log for bug #1168452

Date Who What changed Old value New value Message
2013-04-12 15:49:13 Leann Ogasawara bug added bug
2013-04-12 15:49:29 Leann Ogasawara nominated for series Ubuntu Quantal
2013-04-12 15:49:29 Leann Ogasawara bug task added linux (Ubuntu Quantal)
2013-04-12 15:49:29 Leann Ogasawara nominated for series Ubuntu Raring
2013-04-12 15:49:29 Leann Ogasawara bug task added linux (Ubuntu Raring)
2013-04-12 15:49:36 Leann Ogasawara linux (Ubuntu Raring): status New Fix Released
2013-04-12 15:49:43 Leann Ogasawara linux (Ubuntu Quantal): importance Undecided Medium
2013-04-12 15:49:43 Leann Ogasawara linux (Ubuntu Quantal): status New In Progress
2013-04-12 15:49:43 Leann Ogasawara linux (Ubuntu Quantal): assignee Leann Ogasawara (leannogasawara)
2013-04-12 15:50:06 Leann Ogasawara description == Quantal SRU Justification == When we try to offline a CPU on DL380 Gen8 with Abu Dhabi process then we see a NULL pointer dereference kernel oops: https://bugs.launchpad.net/hp/+bug/1132956/+attachment/3546712/+files/kernel_panic.jpg The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference. == Test Case == With 1P on the same system everything works fine. command to offline a cpu is echo "0" > /sys/devices/system/cpu/cpu12/online == Investigation == There are actually a few commits in Raring which prevent the Oops from occuring. HP originally pointed us to the following commits to resolve the Oops: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=13dd52f11a04e616900f565d6a1e5138e58d579f However, when examining the commit bf4d1b5d "cpuidle: support multiple drivers", it's clearly aimed at arm platforms...This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. There is no impact for other platforms if the option is disabled. The option defaults to 'n'. And indeed in Raring, CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set for amd64. So I was highly skeptical that this was the proper fix we wanted. 
Also, in order to cleanly cherry-pick the above into Quantal, one would have to also pick up the following set: 12c46e70e44728c58f96409414ff06e90e1354f0 [Config] UBUNTU: Enable CONFIG_CPU_IDLE_MULTIPLE_DRIVERS bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 cpuidle: support multiple drivers 1aef40e288acfb3cc28ff77528b34ef66683bed6 cpuidle / sysfs: change function parameter 3af272ab75c7a0c7fa5ae5507724d961f7e7718b cpuidle: fix error handling in __cpuidle_register_device cf31cd1a0c692a1445c80756055875088fa29982 ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug 1b0a0e9a15b976d91f3b5ae619c6a8964c2818eb cpuidle: add checks to avoid NULL pointer dereference 0aeb9cac6f8a6fc68acfb07d30b62ad6106a6384 cpuidle: remove unused hrtimer_peek_ahead_timers() call 13dd52f11a04e616900f565d6a1e5138e58d579f cpuidle: prepare the cpuidle core to handle multiple drivers 41682032715c2c969357c81391a442a24dd1c2c2 cpuidle: move driver checking within the lock section 42f67f2acab2b7179c0d1ab234869e391448dfa6 cpuidle: move driver's refcount to cpuidle ed953472d181e1d149f17d85d82de9634db296c3 cpuidle: rename function name "__cpuidle_register_driver", v2 a77de28662adea391d8ed952e2b9c49b60193e8c cpuidle: remove some empty lines 6e797a078824b30afbfae6cc4b1c2b21c51761ef PM / cpuidle: Add driver reference counter Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue: commit ac3ebafa81af76d65e4fb45c6388f08e90ddcc6d Author: Daniel Lezcano <daniel.lezcano@linaro.org> Date: Mon Feb 4 22:44:43 2013 +0000 ACPI / idle: remove usage of the statedata However, this change seems more of a band-aid solution to the root cause because it removes all remaining references to state_usage and thus avoiding any pointer dereferencing all together, but clearly doesn't addressing the fact that the pointer was NULL to begin with. 
Going back to a closer examination of "cpuidle: support multiple drivers", even though it claims to have no other impact on other platforms if disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_get_driver() with a new call to cpuidle_get_cpu_driver(). This new call cpuidle_get_cpu_driver() adds a check if (!dev) return NULL which then propogates back up to eventually return -ENODEV. It is this additional error checking which is missing from the Quantal code base. == Fix == Add device error checking to avoid NULL pointer dereference kernel Oops when offlining CPU's. == Quantal SRU Justification == When we try to offline a CPU on DL380 Gen8 with Abu Dhabi processor then we see a NULL pointer dereference kernel oops: https://bugs.launchpad.net/hp/+bug/1132956/+attachment/3546712/+files/kernel_panic.jpg The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference. == Test Case == With 1P on the same system everything works fine. command to offline a cpu is echo "0" > /sys/devices/system/cpu/cpu12/online == Investigation == There are actually a few commits in Raring which prevent the Oops from occuring. HP originally pointed us to the following commits to resolve the Oops: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=13dd52f11a04e616900f565d6a1e5138e58d579f However, when examining the commit bf4d1b5d "cpuidle: support multiple drivers", it's clearly aimed at arm platforms...This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. There is no impact for other platforms if the option is disabled. The option defaults to 'n'. And indeed in Raring, CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set for amd64. 
So I was highly skeptical that this was the proper fix we wanted. Also, in order to cleanly cherry-pick the above into Quantal, one would have to also pick up the following set: 12c46e70e44728c58f96409414ff06e90e1354f0 [Config] UBUNTU: Enable CONFIG_CPU_IDLE_MULTIPLE_DRIVERS bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 cpuidle: support multiple drivers 1aef40e288acfb3cc28ff77528b34ef66683bed6 cpuidle / sysfs: change function parameter 3af272ab75c7a0c7fa5ae5507724d961f7e7718b cpuidle: fix error handling in __cpuidle_register_device cf31cd1a0c692a1445c80756055875088fa29982 ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug 1b0a0e9a15b976d91f3b5ae619c6a8964c2818eb cpuidle: add checks to avoid NULL pointer dereference 0aeb9cac6f8a6fc68acfb07d30b62ad6106a6384 cpuidle: remove unused hrtimer_peek_ahead_timers() call 13dd52f11a04e616900f565d6a1e5138e58d579f cpuidle: prepare the cpuidle core to handle multiple drivers 41682032715c2c969357c81391a442a24dd1c2c2 cpuidle: move driver checking within the lock section 42f67f2acab2b7179c0d1ab234869e391448dfa6 cpuidle: move driver's refcount to cpuidle ed953472d181e1d149f17d85d82de9634db296c3 cpuidle: rename function name "__cpuidle_register_driver", v2 a77de28662adea391d8ed952e2b9c49b60193e8c cpuidle: remove some empty lines 6e797a078824b30afbfae6cc4b1c2b21c51761ef PM / cpuidle: Add driver reference counter Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue: commit ac3ebafa81af76d65e4fb45c6388f08e90ddcc6d Author: Daniel Lezcano <daniel.lezcano@linaro.org> Date: Mon Feb 4 22:44:43 2013 +0000     ACPI / idle: remove usage of the statedata However, this change seems more of a band-aid solution to the root cause because it removes all remaining references to state_usage and thus avoiding any pointer dereferencing all together, but clearly doesn't addressing the fact that the pointer was NULL to begin with. 
Going back to a closer examination of "cpuidle: support multiple drivers", even though it claims to have no other impact on other platforms if disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_get_driver() with a new call to cpuidle_get_cpu_driver(). This new call cpuidle_get_cpu_driver() adds a check if (!dev) return NULL which then propogates back up to eventually return -ENODEV. It is this additional error checking which is missing from the Quantal code base. == Fix == Add device error checking to avoid NULL pointer dereference kernel Oops when offlining CPU's.
2013-04-12 15:52:55 Leann Ogasawara attachment added 0001-UBUNTU-SAUCE-cpuidle-Fix-NULL-pointer-dereference-wh.patch https://bugs.launchpad.net/ubuntu/quantal/+source/linux/+bug/1168452/+attachment/3642243/+files/0001-UBUNTU-SAUCE-cpuidle-Fix-NULL-pointer-dereference-wh.patch
2013-04-12 15:55:45 Leann Ogasawara description == Quantal SRU Justification == When we try to offline a CPU on DL380 Gen8 with Abu Dhabi processor then we see a NULL pointer dereference kernel oops: https://bugs.launchpad.net/hp/+bug/1132956/+attachment/3546712/+files/kernel_panic.jpg The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference. == Test Case == With 1P on the same system everything works fine. command to offline a cpu is echo "0" > /sys/devices/system/cpu/cpu12/online == Investigation == There are actually a few commits in Raring which prevent the Oops from occuring. HP originally pointed us to the following commits to resolve the Oops: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=13dd52f11a04e616900f565d6a1e5138e58d579f However, when examining the commit bf4d1b5d "cpuidle: support multiple drivers", it's clearly aimed at arm platforms...This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. There is no impact for other platforms if the option is disabled. The option defaults to 'n'. And indeed in Raring, CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set for amd64. So I was highly skeptical that this was the proper fix we wanted. 
Also, in order to cleanly cherry-pick the above into Quantal, one would have to also pick up the following set: 12c46e70e44728c58f96409414ff06e90e1354f0 [Config] UBUNTU: Enable CONFIG_CPU_IDLE_MULTIPLE_DRIVERS bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 cpuidle: support multiple drivers 1aef40e288acfb3cc28ff77528b34ef66683bed6 cpuidle / sysfs: change function parameter 3af272ab75c7a0c7fa5ae5507724d961f7e7718b cpuidle: fix error handling in __cpuidle_register_device cf31cd1a0c692a1445c80756055875088fa29982 ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug 1b0a0e9a15b976d91f3b5ae619c6a8964c2818eb cpuidle: add checks to avoid NULL pointer dereference 0aeb9cac6f8a6fc68acfb07d30b62ad6106a6384 cpuidle: remove unused hrtimer_peek_ahead_timers() call 13dd52f11a04e616900f565d6a1e5138e58d579f cpuidle: prepare the cpuidle core to handle multiple drivers 41682032715c2c969357c81391a442a24dd1c2c2 cpuidle: move driver checking within the lock section 42f67f2acab2b7179c0d1ab234869e391448dfa6 cpuidle: move driver's refcount to cpuidle ed953472d181e1d149f17d85d82de9634db296c3 cpuidle: rename function name "__cpuidle_register_driver", v2 a77de28662adea391d8ed952e2b9c49b60193e8c cpuidle: remove some empty lines 6e797a078824b30afbfae6cc4b1c2b21c51761ef PM / cpuidle: Add driver reference counter Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue: commit ac3ebafa81af76d65e4fb45c6388f08e90ddcc6d Author: Daniel Lezcano <daniel.lezcano@linaro.org> Date: Mon Feb 4 22:44:43 2013 +0000     ACPI / idle: remove usage of the statedata However, this change seems more of a band-aid solution to the root cause because it removes all remaining references to state_usage and thus avoiding any pointer dereferencing all together, but clearly doesn't addressing the fact that the pointer was NULL to begin with. 
Going back to a closer examination of "cpuidle: support multiple drivers", even though it claims to have no other impact on other platforms if disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_get_driver() with a new call to cpuidle_get_cpu_driver(). This new call cpuidle_get_cpu_driver() adds a check if (!dev) return NULL which then propogates back up to eventually return -ENODEV. It is this additional error checking which is missing from the Quantal code base. == Fix == Add device error checking to avoid NULL pointer dereference kernel Oops when offlining CPU's. == Quantal SRU Justification == When we try to offline a CPU on DL380 Gen8 with Abu Dhabi processor then we see a NULL pointer dereference kernel oops: https://bugs.launchpad.net/hp/+bug/1132956/+attachment/3546712/+files/kernel_panic.jpg The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference. == Test Case == With 1P on the same system everything works fine. command to offline a cpu is echo "0" > /sys/devices/system/cpu/cpu12/online == Investigation == There are actually a few commits in Raring which prevent the Oops from occuring. HP originally pointed us to the following commits to resolve the Oops: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=13dd52f11a04e616900f565d6a1e5138e58d579f However, when examining the commit bf4d1b5d "cpuidle: support multiple drivers", it's clearly aimed at arm platforms...This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. There is no impact for other platforms if the option is disabled. The option defaults to 'n'. And indeed in Raring, CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set for amd64. 
So I was highly skeptical that this was the proper fix we wanted. Also, in order to cleanly cherry-pick the above into Quantal, one would have to also pick up the following set: 12c46e70e44728c58f96409414ff06e90e1354f0 [Config] UBUNTU: Enable CONFIG_CPU_IDLE_MULTIPLE_DRIVERS bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 cpuidle: support multiple drivers 1aef40e288acfb3cc28ff77528b34ef66683bed6 cpuidle / sysfs: change function parameter 3af272ab75c7a0c7fa5ae5507724d961f7e7718b cpuidle: fix error handling in __cpuidle_register_device cf31cd1a0c692a1445c80756055875088fa29982 ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug 1b0a0e9a15b976d91f3b5ae619c6a8964c2818eb cpuidle: add checks to avoid NULL pointer dereference 0aeb9cac6f8a6fc68acfb07d30b62ad6106a6384 cpuidle: remove unused hrtimer_peek_ahead_timers() call 13dd52f11a04e616900f565d6a1e5138e58d579f cpuidle: prepare the cpuidle core to handle multiple drivers 41682032715c2c969357c81391a442a24dd1c2c2 cpuidle: move driver checking within the lock section 42f67f2acab2b7179c0d1ab234869e391448dfa6 cpuidle: move driver's refcount to cpuidle ed953472d181e1d149f17d85d82de9634db296c3 cpuidle: rename function name "__cpuidle_register_driver", v2 a77de28662adea391d8ed952e2b9c49b60193e8c cpuidle: remove some empty lines 6e797a078824b30afbfae6cc4b1c2b21c51761ef PM / cpuidle: Add driver reference counter Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue: commit ac3ebafa81af76d65e4fb45c6388f08e90ddcc6d Author: Daniel Lezcano <daniel.lezcano@linaro.org> Date: Mon Feb 4 22:44:43 2013 +0000     ACPI / idle: remove usage of the statedata However, this change seems more of a band-aid solution to the root cause because it removes all remaining references to state_usage and thus avoiding any pointer dereferencing all together, but clearly doesn't addressing the fact that the pointer was NULL to begin with. 
Going back to a closer examination of "cpuidle: support multiple drivers", even though it claims to have no other impact on other platforms if disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_get_driver() with a new call to cpuidle_get_cpu_driver(). This new call cpuidle_get_cpu_driver() adds a check if (!dev) return NULL which then propogates back up to eventually return -ENODEV. It is this additional error checking which is missing from the Quantal code base. == Fix == Add device error checking to avoid NULL pointer dereference kernel Oops when offlining CPU's (See patch attached to bug report). We have tested this patch against a DL380 Gen8 and confirm we no longer experience a kernel oops when offlining CPU's.
2013-04-12 15:56:57 Leann Ogasawara summary NULL pointer dereference when offlining CPU on DL380 Gen8 NULL pointer dereference when offlining CPU on DL385p Gen8
2013-04-12 15:57:23 Leann Ogasawara description == Quantal SRU Justification == When we try to offline a CPU on DL380 Gen8 with Abu Dhabi processor then we see a NULL pointer dereference kernel oops: https://bugs.launchpad.net/hp/+bug/1132956/+attachment/3546712/+files/kernel_panic.jpg The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference. == Test Case == With 1P on the same system everything works fine. command to offline a cpu is echo "0" > /sys/devices/system/cpu/cpu12/online == Investigation == There are actually a few commits in Raring which prevent the Oops from occuring. HP originally pointed us to the following commits to resolve the Oops: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=13dd52f11a04e616900f565d6a1e5138e58d579f However, when examining the commit bf4d1b5d "cpuidle: support multiple drivers", it's clearly aimed at arm platforms...This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. There is no impact for other platforms if the option is disabled. The option defaults to 'n'. And indeed in Raring, CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set for amd64. So I was highly skeptical that this was the proper fix we wanted. 
Also, in order to cleanly cherry-pick the above into Quantal, one would have to also pick up the following set: 12c46e70e44728c58f96409414ff06e90e1354f0 [Config] UBUNTU: Enable CONFIG_CPU_IDLE_MULTIPLE_DRIVERS bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 cpuidle: support multiple drivers 1aef40e288acfb3cc28ff77528b34ef66683bed6 cpuidle / sysfs: change function parameter 3af272ab75c7a0c7fa5ae5507724d961f7e7718b cpuidle: fix error handling in __cpuidle_register_device cf31cd1a0c692a1445c80756055875088fa29982 ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug 1b0a0e9a15b976d91f3b5ae619c6a8964c2818eb cpuidle: add checks to avoid NULL pointer dereference 0aeb9cac6f8a6fc68acfb07d30b62ad6106a6384 cpuidle: remove unused hrtimer_peek_ahead_timers() call 13dd52f11a04e616900f565d6a1e5138e58d579f cpuidle: prepare the cpuidle core to handle multiple drivers 41682032715c2c969357c81391a442a24dd1c2c2 cpuidle: move driver checking within the lock section 42f67f2acab2b7179c0d1ab234869e391448dfa6 cpuidle: move driver's refcount to cpuidle ed953472d181e1d149f17d85d82de9634db296c3 cpuidle: rename function name "__cpuidle_register_driver", v2 a77de28662adea391d8ed952e2b9c49b60193e8c cpuidle: remove some empty lines 6e797a078824b30afbfae6cc4b1c2b21c51761ef PM / cpuidle: Add driver reference counter Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue: commit ac3ebafa81af76d65e4fb45c6388f08e90ddcc6d Author: Daniel Lezcano <daniel.lezcano@linaro.org> Date: Mon Feb 4 22:44:43 2013 +0000     ACPI / idle: remove usage of the statedata However, this change seems more of a band-aid solution to the root cause because it removes all remaining references to state_usage and thus avoiding any pointer dereferencing all together, but clearly doesn't addressing the fact that the pointer was NULL to begin with. 
Going back to a closer examination of "cpuidle: support multiple drivers", even though it claims to have no other impact on other platforms if disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_get_driver() with a new call to cpuidle_get_cpu_driver(). This new call cpuidle_get_cpu_driver() adds a check if (!dev) return NULL which then propogates back up to eventually return -ENODEV. It is this additional error checking which is missing from the Quantal code base. == Fix == Add device error checking to avoid NULL pointer dereference kernel Oops when offlining CPU's (See patch attached to bug report). We have tested this patch against a DL380 Gen8 and confirm we no longer experience a kernel oops when offlining CPU's. == Quantal SRU Justification == When we try to offline a CPU on DL385p Gen8 with Abu Dhabi processor then we see a NULL pointer dereference kernel oops: https://bugs.launchpad.net/hp/+bug/1132956/+attachment/3546712/+files/kernel_panic.jpg The latest Raring kernel works fine on the same system. Upon investigation, we have identified a straightforward fix to check the device pointer and thus avoid the NULL pointer dereference. == Test Case == With 1P on the same system everything works fine. command to offline a cpu is echo "0" > /sys/devices/system/cpu/cpu12/online == Investigation == There are actually a few commits in Raring which prevent the Oops from occuring. HP originally pointed us to the following commits to resolve the Oops: http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-raring.git;a=commitdiff;h=13dd52f11a04e616900f565d6a1e5138e58d579f However, when examining the commit bf4d1b5d "cpuidle: support multiple drivers", it's clearly aimed at arm platforms...This option is configurable at compile time and should be enabled for the tegra3 and big.LITTLE architectures. 
There is no impact for other platforms if the option is disabled. The option defaults to 'n'. And indeed in Raring, CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set for amd64. So I was highly skeptical that this was the proper fix we wanted. Also, in order to cleanly cherry-pick the above into Quantal, one would have to also pick up the following set: 12c46e70e44728c58f96409414ff06e90e1354f0 [Config] UBUNTU: Enable CONFIG_CPU_IDLE_MULTIPLE_DRIVERS bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92 cpuidle: support multiple drivers 1aef40e288acfb3cc28ff77528b34ef66683bed6 cpuidle / sysfs: change function parameter 3af272ab75c7a0c7fa5ae5507724d961f7e7718b cpuidle: fix error handling in __cpuidle_register_device cf31cd1a0c692a1445c80756055875088fa29982 ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug 1b0a0e9a15b976d91f3b5ae619c6a8964c2818eb cpuidle: add checks to avoid NULL pointer dereference 0aeb9cac6f8a6fc68acfb07d30b62ad6106a6384 cpuidle: remove unused hrtimer_peek_ahead_timers() call 13dd52f11a04e616900f565d6a1e5138e58d579f cpuidle: prepare the cpuidle core to handle multiple drivers 41682032715c2c969357c81391a442a24dd1c2c2 cpuidle: move driver checking within the lock section 42f67f2acab2b7179c0d1ab234869e391448dfa6 cpuidle: move driver's refcount to cpuidle ed953472d181e1d149f17d85d82de9634db296c3 cpuidle: rename function name "__cpuidle_register_driver", v2 a77de28662adea391d8ed952e2b9c49b60193e8c cpuidle: remove some empty lines 6e797a078824b30afbfae6cc4b1c2b21c51761ef PM / cpuidle: Add driver reference counter Upon further examination of the Oops, I confirmed that the following upstream commit would also avoid the issue: commit ac3ebafa81af76d65e4fb45c6388f08e90ddcc6d Author: Daniel Lezcano <daniel.lezcano@linaro.org> Date: Mon Feb 4 22:44:43 2013 +0000     ACPI / idle: remove usage of the statedata However, this change seems more of a band-aid solution to the root cause because it removes all remaining references to state_usage and thus avoiding any 
pointer dereferencing all together, but clearly doesn't addressing the fact that the pointer was NULL to begin with. Going back to a closer examination of "cpuidle: support multiple drivers", even though it claims to have no other impact on other platforms if disabled, it clearly makes one change which does affect all platforms. It replaces all calls to cpuidle_get_driver() with a new call to cpuidle_get_cpu_driver(). This new call cpuidle_get_cpu_driver() adds a check if (!dev) return NULL which then propogates back up to eventually return -ENODEV. It is this additional error checking which is missing from the Quantal code base. == Fix == Add device error checking to avoid NULL pointer dereference kernel Oops when offlining CPU's (See patch attached to bug report). We have tested this patch against a DL385p Gen8 and confirm we no longer experience a kernel oops when offlining CPU's.
2013-04-12 15:58:17 Leann Ogasawara attachment removed 0001-UBUNTU-SAUCE-cpuidle-Fix-NULL-pointer-dereference-wh.patch https://bugs.launchpad.net/ubuntu/quantal/+source/linux/+bug/1168452/+attachment/3642243/+files/0001-UBUNTU-SAUCE-cpuidle-Fix-NULL-pointer-dereference-wh.patch
2013-04-12 15:58:46 Leann Ogasawara attachment added 0001-UBUNTU-SAUCE-cpuidle-Fix-NULL-pointer-dereference-wh.patch https://bugs.launchpad.net/ubuntu/quantal/+source/linux/+bug/1168452/+attachment/3642246/+files/0001-UBUNTU-SAUCE-cpuidle-Fix-NULL-pointer-dereference-wh.patch
2013-04-12 20:45:07 Leann Ogasawara linux (Ubuntu Quantal): status In Progress Fix Committed
2013-05-08 02:57:05 Launchpad Janitor branch linked lp:ubuntu/precise-proposed/linux-lts-quantal
2013-05-08 13:40:40 Brad Figg tags verification-needed-quantal
2013-05-11 03:19:40 Launchpad Janitor branch linked lp:ubuntu/quantal-proposed/linux-ti-omap4
2013-05-15 02:02:24 Narinder Gupta tags verification-needed-quantal verification-done-quantal
2014-09-15 20:21:18 Joseph Salisbury linux (Ubuntu Quantal): status Fix Committed Won't Fix
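
Note on the fix described in the description entries above: the Investigation text says that cpuidle_get_cpu_driver() adds an "if (!dev) return NULL" check, and that the NULL result propagates back up so the caller eventually returns -ENODEV. The following is a minimal userspace sketch of that pattern only. It is not the attached patch and not the kernel source: the struct layouts and the names idle_driver, idle_device, get_cpu_driver, enter_idle_state and run_example are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct idle_driver {
    const char *name;
};

struct idle_device {
    struct idle_driver *drv;
};

/* Mirrors the check quoted in the log: a missing device yields no driver. */
struct idle_driver *get_cpu_driver(struct idle_device *dev)
{
    if (!dev)
        return NULL;
    return dev->drv;
}

/*
 * Hypothetical caller: instead of dereferencing a NULL pointer, it reports
 * the missing device or driver as -ENODEV.
 */
int enter_idle_state(struct idle_device *dev)
{
    struct idle_driver *drv = get_cpu_driver(dev);

    if (!drv)
        return -ENODEV;
    return 0;
}

/* Small usage example covering the offlined-CPU case (dev == NULL). */
void run_example(void)
{
    struct idle_driver drv = { "example" };
    struct idle_device with_driver = { &drv };
    struct idle_device without_driver = { NULL };

    assert(enter_idle_state(NULL) == -ENODEV);
    assert(enter_idle_state(&without_driver) == -ENODEV);
    assert(enter_idle_state(&with_driver) == 0);
}
```

In this sketch the NULL check lives in the lookup helper, so every caller gets the same protection without repeating the test, which matches the log's observation that the replacement call affects all platforms.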