Ubuntu
linux package

Bug #1432837
Activity log

Activity log for bug #1432837

Date	Who	What changed	Old value	New value	Message
2015-03-16 21:05:25	Rafael David Tinoco	bug			added bug
2015-03-16 21:05:33	Rafael David Tinoco	tags		cts
2015-03-16 21:06:21	Rafael David Tinoco	linux (Ubuntu): assignee		Rafael David Tinoco (inaddy)
2015-03-16 21:06:24	Rafael David Tinoco	linux (Ubuntu): assignee	Rafael David Tinoco (inaddy)
2015-03-16 21:07:07	Rafael David Tinoco	linux (Ubuntu): status	New	Incomplete
2015-03-16 21:07:09	Rafael David Tinoco	linux (Ubuntu): status	Incomplete	Confirmed
2015-03-16 21:11:41	Rafael David Tinoco	summary	HP Proliant Servers should not have HPWDT module loaded automatically	HP Proliant Servers - Kernel Panic NMI - DL360 & DL380 - HPWDT module loaded
2015-03-16 21:11:49	Rafael David Tinoco	summary	HP Proliant Servers - Kernel Panic NMI - DL360 & DL380 - HPWDT module loaded	HP Proliant Servers - Kernel Panic - NMI - DL360 & DL380 - HPWDT module loaded
2015-03-16 21:14:56	Rafael David Tinoco	description	It was brought to me several situations where users where facing kernel panics when machine was apparently idling: Examples: PID: 0 TASK: ffffffff81c1a480 CPU: 0 COMMAND: "swapper/0" #0 [ffff88085fc05c88] machine_kexec at ffffffff8104eac2 #1 [ffff88085fc05cd8] crash_kexec at ffffffff810f26a3 #2 [ffff88085fc05da0] panic at ffffffff8175b3f2 #3 [ffff88085fc05e20] sched_clock at ffffffff8101c3b9 #4 [ffff88085fc05e30] nmi_handle at ffffffff810170e8 #5 [ffff88085fc05e90] io_check_error at ffffffff8101758e #6 [ffff88085fc05eb0] default_do_nmi at ffffffff810176a9 #7 [ffff88085fc05ed8] do_nmi at ffffffff810177d8 #8 [ffff88085fc05ef0] end_repeat_nmi at ffffffff8176da21 [exception RIP: native_safe_halt+6] RIP: ffffffff81055186 RSP: ffffffff81c03e90 RFLAGS: 00000246 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000246 RDX: ffffffff81c03e90 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81055186 R8: ffffffff81055186 R9: 0000000000000018 R10: ffffffff81c03e90 R11: 0000000000000246 R12: ffffffffffffffff R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <DOUBLEFAULT exception stack> --- #9 [ffffffff81c03e90] native_safe_halt at ffffffff81055186 #10 [ffffffff81c03e98] default_idle at ffffffff8101d37f #11 [ffffffff81c03eb8] arch_cpu_idle at ffffffff8101dcaf #12 [ffffffff81c03ec8] cpu_startup_entry at ffffffff810b5325 #13 [ffffffff81c03f40] rest_init at ffffffff81751a37 #14 [ffffffff81c03f50] start_kernel at ffffffff81d320b7 #15 [ffffffff81c03f90] x86_64_start_reservations at ffffffff81d315ee #16 [ffffffff81c03fa0] x86_64_start_kernel at ffffffff81d31733 OR PID: 0 TASK: ffffffff81c14440 CPU: 0 COMMAND: "swapper/0" #0 [ffff880fffa07c40] machine_kexec at ffffffff8104b391 #1 [ffff880fffa07cb0] crash_kexec at ffffffff810d5fb8 #2 [ffff880fffa07d80] panic at ffffffff81730335 #3 [ffff880fffa07e00] hpwdt_pretimeout at ffffffffa02378b5 [hpwdt] #4 
[ffff880fffa07e20] nmi_handle at ffffffff8174a76a #5 [ffff880fffa07ea0] default_do_nmi at ffffffff8174aacd #6 [ffff880fffa07ed0] do_nmi at ffffffff8174abe0 #7 [ffff880fffa07ef0] end_repeat_nmi at ffffffff81749c81 [exception RIP: intel_idle+204] RIP: ffffffff813f07ec RSP: ffffffff81c01d88 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffffffff81c01d88 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff813f07ec R8: ffffffff813f07ec R9: 0000000000000018 R10: ffffffff81c01d88 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000001c0d000 R14: ffffffff81c01fd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #8 [ffffffff81c01d88] intel_idle at ffffffff813f07ec #9 [ffffffff81c01dc0] cpuidle_enter_state at ffffffff815e76cf It turned out that after investigating all idling situations and diverse kernel dump files - where we had most of the CPUs either MWAITing and or "relaxing", we discovered that HPWDT was loaded and corosync was opening /dev/watchdog file, triggering the ILO watchdog timer and not updating frequently enough as ILO expected. As described in /etc/modprobe.d/blacklist-watchdog.conf: """ # Watchdog drivers should not be loaded automatically, but only if a # watchdog daemon is installed. """ We should blacklist module "hpwdt" by default for all Ubuntu versions.	It was brought to me several situations where users where facing kernel panics when machine was apparently idling (for some HP Proliant Servers like DL 360, DL 380). 
Examples: PID: 0 TASK: ffffffff81c1a480 CPU: 0 COMMAND: "swapper/0" #0 [ffff88085fc05c88] machine_kexec at ffffffff8104eac2 #1 [ffff88085fc05cd8] crash_kexec at ffffffff810f26a3 #2 [ffff88085fc05da0] panic at ffffffff8175b3f2 #3 [ffff88085fc05e20] sched_clock at ffffffff8101c3b9 #4 [ffff88085fc05e30] nmi_handle at ffffffff810170e8 #5 [ffff88085fc05e90] io_check_error at ffffffff8101758e #6 [ffff88085fc05eb0] default_do_nmi at ffffffff810176a9 #7 [ffff88085fc05ed8] do_nmi at ffffffff810177d8 #8 [ffff88085fc05ef0] end_repeat_nmi at ffffffff8176da21 [exception RIP: native_safe_halt+6] RIP: ffffffff81055186 RSP: ffffffff81c03e90 RFLAGS: 00000246 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000246 RDX: ffffffff81c03e90 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81055186 R8: ffffffff81055186 R9: 0000000000000018 R10: ffffffff81c03e90 R11: 0000000000000246 R12: ffffffffffffffff R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <DOUBLEFAULT exception stack> --- #9 [ffffffff81c03e90] native_safe_halt at ffffffff81055186 #10 [ffffffff81c03e98] default_idle at ffffffff8101d37f #11 [ffffffff81c03eb8] arch_cpu_idle at ffffffff8101dcaf #12 [ffffffff81c03ec8] cpu_startup_entry at ffffffff810b5325 #13 [ffffffff81c03f40] rest_init at ffffffff81751a37 #14 [ffffffff81c03f50] start_kernel at ffffffff81d320b7 #15 [ffffffff81c03f90] x86_64_start_reservations at ffffffff81d315ee #16 [ffffffff81c03fa0] x86_64_start_kernel at ffffffff81d31733 OR PID: 0 TASK: ffffffff81c14440 CPU: 0 COMMAND: "swapper/0" #0 [ffff880fffa07c40] machine_kexec at ffffffff8104b391 #1 [ffff880fffa07cb0] crash_kexec at ffffffff810d5fb8 #2 [ffff880fffa07d80] panic at ffffffff81730335 #3 [ffff880fffa07e00] hpwdt_pretimeout at ffffffffa02378b5 [hpwdt] #4 [ffff880fffa07e20] nmi_handle at ffffffff8174a76a #5 [ffff880fffa07ea0] default_do_nmi at ffffffff8174aacd #6 [ffff880fffa07ed0] do_nmi at ffffffff8174abe0 #7 
[ffff880fffa07ef0] end_repeat_nmi at ffffffff81749c81 [exception RIP: intel_idle+204] RIP: ffffffff813f07ec RSP: ffffffff81c01d88 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffffffff81c01d88 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff813f07ec R8: ffffffff813f07ec R9: 0000000000000018 R10: ffffffff81c01d88 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000001c0d000 R14: ffffffff81c01fd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #8 [ffffffff81c01d88] intel_idle at ffffffff813f07ec #9 [ffffffff81c01dc0] cpuidle_enter_state at ffffffff815e76cf It turned out that after investigating all idling situations and diverse kernel dump files - where we had most of the CPUs either MWAITing and or "relaxing", we discovered that HPWDT was loaded and corosync was opening /dev/watchdog file, triggering the ILO watchdog timer and not updating frequently enough as ILO expected. As described in /etc/modprobe.d/blacklist-watchdog.conf: """ # Watchdog drivers should not be loaded automatically, but only if a # watchdog daemon is installed. """ We should blacklist module "hpwdt" by default for all Ubuntu versions.
2015-03-16 21:16:54	Rafael David Tinoco	description	It was brought to me several situations where users where facing kernel panics when machine was apparently idling (for some HP Proliant Servers like DL 360, DL 380). Examples: PID: 0 TASK: ffffffff81c1a480 CPU: 0 COMMAND: "swapper/0" #0 [ffff88085fc05c88] machine_kexec at ffffffff8104eac2 #1 [ffff88085fc05cd8] crash_kexec at ffffffff810f26a3 #2 [ffff88085fc05da0] panic at ffffffff8175b3f2 #3 [ffff88085fc05e20] sched_clock at ffffffff8101c3b9 #4 [ffff88085fc05e30] nmi_handle at ffffffff810170e8 #5 [ffff88085fc05e90] io_check_error at ffffffff8101758e #6 [ffff88085fc05eb0] default_do_nmi at ffffffff810176a9 #7 [ffff88085fc05ed8] do_nmi at ffffffff810177d8 #8 [ffff88085fc05ef0] end_repeat_nmi at ffffffff8176da21 [exception RIP: native_safe_halt+6] RIP: ffffffff81055186 RSP: ffffffff81c03e90 RFLAGS: 00000246 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000246 RDX: ffffffff81c03e90 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81055186 R8: ffffffff81055186 R9: 0000000000000018 R10: ffffffff81c03e90 R11: 0000000000000246 R12: ffffffffffffffff R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <DOUBLEFAULT exception stack> --- #9 [ffffffff81c03e90] native_safe_halt at ffffffff81055186 #10 [ffffffff81c03e98] default_idle at ffffffff8101d37f #11 [ffffffff81c03eb8] arch_cpu_idle at ffffffff8101dcaf #12 [ffffffff81c03ec8] cpu_startup_entry at ffffffff810b5325 #13 [ffffffff81c03f40] rest_init at ffffffff81751a37 #14 [ffffffff81c03f50] start_kernel at ffffffff81d320b7 #15 [ffffffff81c03f90] x86_64_start_reservations at ffffffff81d315ee #16 [ffffffff81c03fa0] x86_64_start_kernel at ffffffff81d31733 OR PID: 0 TASK: ffffffff81c14440 CPU: 0 COMMAND: "swapper/0" #0 [ffff880fffa07c40] machine_kexec at ffffffff8104b391 #1 [ffff880fffa07cb0] crash_kexec at ffffffff810d5fb8 #2 [ffff880fffa07d80] panic at ffffffff81730335 #3 [ffff880fffa07e00] 
hpwdt_pretimeout at ffffffffa02378b5 [hpwdt] #4 [ffff880fffa07e20] nmi_handle at ffffffff8174a76a #5 [ffff880fffa07ea0] default_do_nmi at ffffffff8174aacd #6 [ffff880fffa07ed0] do_nmi at ffffffff8174abe0 #7 [ffff880fffa07ef0] end_repeat_nmi at ffffffff81749c81 [exception RIP: intel_idle+204] RIP: ffffffff813f07ec RSP: ffffffff81c01d88 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffffffff81c01d88 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff813f07ec R8: ffffffff813f07ec R9: 0000000000000018 R10: ffffffff81c01d88 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000001c0d000 R14: ffffffff81c01fd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #8 [ffffffff81c01d88] intel_idle at ffffffff813f07ec #9 [ffffffff81c01dc0] cpuidle_enter_state at ffffffff815e76cf It turned out that after investigating all idling situations and diverse kernel dump files - where we had most of the CPUs either MWAITing and or "relaxing", we discovered that HPWDT was loaded and corosync was opening /dev/watchdog file, triggering the ILO watchdog timer and not updating frequently enough as ILO expected. As described in /etc/modprobe.d/blacklist-watchdog.conf: """ # Watchdog drivers should not be loaded automatically, but only if a # watchdog daemon is installed. """ We should blacklist module "hpwdt" by default for all Ubuntu versions.	It was brought to me several situations where users where facing kernel panics when machine was apparently idling (for some HP Proliant Servers like DL 360, DL 380). 
ILO: "76 CriticalSystem Error03/12/2015 12:4203/12/2015 12:072 An Unrecoverable System Error (NMI) has occurred (System error code 0x0000002B, 0x00000000)" Examples: PID: 0 TASK: ffffffff81c1a480 CPU: 0 COMMAND: "swapper/0" #0 [ffff88085fc05c88] machine_kexec at ffffffff8104eac2 #1 [ffff88085fc05cd8] crash_kexec at ffffffff810f26a3 #2 [ffff88085fc05da0] panic at ffffffff8175b3f2 #3 [ffff88085fc05e20] sched_clock at ffffffff8101c3b9 #4 [ffff88085fc05e30] nmi_handle at ffffffff810170e8 #5 [ffff88085fc05e90] io_check_error at ffffffff8101758e #6 [ffff88085fc05eb0] default_do_nmi at ffffffff810176a9 #7 [ffff88085fc05ed8] do_nmi at ffffffff810177d8 #8 [ffff88085fc05ef0] end_repeat_nmi at ffffffff8176da21 [exception RIP: native_safe_halt+6] RIP: ffffffff81055186 RSP: ffffffff81c03e90 RFLAGS: 00000246 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000246 RDX: ffffffff81c03e90 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff81055186 R8: ffffffff81055186 R9: 0000000000000018 R10: ffffffff81c03e90 R11: 0000000000000246 R12: ffffffffffffffff R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <DOUBLEFAULT exception stack> --- #9 [ffffffff81c03e90] native_safe_halt at ffffffff81055186 #10 [ffffffff81c03e98] default_idle at ffffffff8101d37f #11 [ffffffff81c03eb8] arch_cpu_idle at ffffffff8101dcaf #12 [ffffffff81c03ec8] cpu_startup_entry at ffffffff810b5325 #13 [ffffffff81c03f40] rest_init at ffffffff81751a37 #14 [ffffffff81c03f50] start_kernel at ffffffff81d320b7 #15 [ffffffff81c03f90] x86_64_start_reservations at ffffffff81d315ee #16 [ffffffff81c03fa0] x86_64_start_kernel at ffffffff81d31733 OR PID: 0 TASK: ffffffff81c14440 CPU: 0 COMMAND: "swapper/0" #0 [ffff880fffa07c40] machine_kexec at ffffffff8104b391 #1 [ffff880fffa07cb0] crash_kexec at ffffffff810d5fb8 #2 [ffff880fffa07d80] panic at ffffffff81730335 #3 [ffff880fffa07e00] hpwdt_pretimeout at ffffffffa02378b5 [hpwdt] #4 [ffff880fffa07e20] 
nmi_handle at ffffffff8174a76a #5 [ffff880fffa07ea0] default_do_nmi at ffffffff8174aacd #6 [ffff880fffa07ed0] do_nmi at ffffffff8174abe0 #7 [ffff880fffa07ef0] end_repeat_nmi at ffffffff81749c81 [exception RIP: intel_idle+204] RIP: ffffffff813f07ec RSP: ffffffff81c01d88 RFLAGS: 00000046 RAX: 0000000000000010 RBX: 0000000000000010 RCX: 0000000000000046 RDX: ffffffff81c01d88 RSI: 0000000000000018 RDI: 0000000000000001 RBP: ffffffff813f07ec R8: ffffffff813f07ec R9: 0000000000000018 R10: ffffffff81c01d88 R11: 0000000000000046 R12: ffffffffffffffff R13: 0000000001c0d000 R14: ffffffff81c01fd8 R15: 0000000000000000 ORIG_RAX: 0000000000000000 CS: 0010 SS: 0018 --- <NMI exception stack> --- #8 [ffffffff81c01d88] intel_idle at ffffffff813f07ec #9 [ffffffff81c01dc0] cpuidle_enter_state at ffffffff815e76cf It turned out that after investigating all idling situations and diverse kernel dump files - where we had most of the CPUs either MWAITing and or "relaxing", we discovered that HPWDT was loaded and corosync was opening /dev/watchdog file, triggering the ILO watchdog timer and not updating frequently enough as ILO expected. As described in /etc/modprobe.d/blacklist-watchdog.conf: """ # Watchdog drivers should not be loaded automatically, but only if a # watchdog daemon is installed. """ We should blacklist module "hpwdt" by default for all Ubuntu versions.
2015-03-17 11:06:17	Andy Whitcroft	linux (Ubuntu): status	Confirmed	In Progress
2015-03-17 11:06:19	Andy Whitcroft	linux (Ubuntu): importance	Undecided	High
2015-03-17 11:06:21	Andy Whitcroft	linux (Ubuntu): assignee		Andy Whitcroft (apw)
2015-03-17 11:06:23	Andy Whitcroft	linux (Ubuntu): milestone		ubuntu-15.03
2015-03-17 11:09:45	Adam Conrad	nominated for series		Ubuntu Utopic
2015-03-17 11:09:45	Adam Conrad	bug task added		linux (Ubuntu Utopic)
2015-03-17 11:09:45	Adam Conrad	nominated for series		Ubuntu Trusty
2015-03-17 11:09:45	Adam Conrad	bug task added		linux (Ubuntu Trusty)
2015-03-17 11:09:45	Adam Conrad	nominated for series		Ubuntu Precise
2015-03-17 11:09:45	Adam Conrad	bug task added		linux (Ubuntu Precise)
2015-03-17 12:16:22	Andy Whitcroft	linux (Ubuntu Precise): status	New	In Progress
2015-03-17 12:16:24	Andy Whitcroft	linux (Ubuntu Trusty): status	New	In Progress
2015-03-17 12:16:26	Andy Whitcroft	linux (Ubuntu Utopic): status	New	In Progress
2015-03-17 12:16:29	Andy Whitcroft	linux (Ubuntu Precise): importance	Undecided	High
2015-03-17 12:16:31	Andy Whitcroft	linux (Ubuntu Trusty): importance	Undecided	High
2015-03-17 12:16:33	Andy Whitcroft	linux (Ubuntu Utopic): importance	Undecided	High
2015-03-17 12:16:36	Andy Whitcroft	linux (Ubuntu Precise): assignee		Andy Whitcroft (apw)
2015-03-17 12:16:38	Andy Whitcroft	linux (Ubuntu Trusty): assignee		Andy Whitcroft (apw)
2015-03-17 12:16:40	Andy Whitcroft	linux (Ubuntu Utopic): assignee		Andy Whitcroft (apw)
2015-03-17 12:18:32	Andy Whitcroft	linux (Ubuntu): status	In Progress	Fix Committed
2015-03-17 15:54:01	Micheal Waltz	bug			added subscriber Micheal Waltz
2015-03-18 13:42:33	Brad Figg	linux (Ubuntu Utopic): status	In Progress	Fix Committed
2015-03-18 13:42:37	Brad Figg	linux (Ubuntu Trusty): status	In Progress	Fix Committed
2015-03-18 13:42:40	Brad Figg	linux (Ubuntu Precise): status	In Progress	Fix Committed
2015-03-24 02:28:01	Launchpad Janitor	linux (Ubuntu): status	Fix Committed	Fix Released
2015-03-26 03:49:45	Dave Leaver	bug			added subscriber Dave Leaver
2015-03-26 17:39:40	Brad Figg	tags	cts	cts verification-needed-precise
2015-03-26 17:39:54	Brad Figg	tags	cts verification-needed-precise	cts verification-needed-precise verification-needed-trusty
2015-03-26 17:40:29	Brad Figg	tags	cts verification-needed-precise verification-needed-trusty	cts verification-needed-precise verification-needed-trusty verification-needed-utopic
2015-04-07 15:18:55	Rafael David Tinoco	tags	cts verification-needed-precise verification-needed-trusty verification-needed-utopic	cts verification-done
2015-04-08 15:40:29	Launchpad Janitor	linux (Ubuntu Trusty): status	Fix Committed	Fix Released
2015-04-08 15:40:29	Launchpad Janitor	cve linked		2015-1421
2015-04-08 15:40:29	Launchpad Janitor	cve linked		2015-1465
2015-04-08 15:40:29	Launchpad Janitor	cve linked		2015-1593
2015-04-08 15:40:29	Launchpad Janitor	cve linked		2015-2041
2015-04-08 15:40:29	Launchpad Janitor	cve linked		2015-2042
2015-04-08 16:13:01	Launchpad Janitor	linux (Ubuntu Precise): status	Fix Committed	Fix Released
2015-04-09 03:41:06	Launchpad Janitor	linux (Ubuntu Utopic): status	Fix Committed	Fix Released
2015-05-04 14:40:09	Rafael David Tinoco	marked as duplicate		1417580
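
The description recorded above proposes blacklisting the "hpwdt" module so that it is not loaded automatically. As a rough sketch only, and not part of the actual fix, the following Python script checks whether a module is currently loaded and whether a modprobe.d file blacklists it. The helper names (loaded_modules, blacklisted_modules, read_text) are hypothetical, the parser handles only plain "blacklist <module>" lines, and it does not treat "-" and "_" as equivalent the way modprobe does.

```python
import glob
import re


def loaded_modules(proc_modules_text):
    """Return the module names found in /proc/modules-style text.

    Each non-empty line starts with the module name, followed by
    size, use count and other fields that are ignored here.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def blacklisted_modules(conf_text):
    """Return module names listed on 'blacklist <module>' lines.

    Simplification (assumption): everything after '#' is treated as a
    comment, and only the plain 'blacklist' directive is recognised.
    """
    names = set()
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()
        match = re.fullmatch(r"blacklist\s+(\S+)", line)
        if match:
            names.add(match.group(1))
    return names


def read_text(path):
    """Read a file, returning an empty string if it cannot be opened."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    module = "hpwdt"
    loaded = module in loaded_modules(read_text("/proc/modules"))
    conf = "\n".join(
        read_text(path) for path in sorted(glob.glob("/etc/modprobe.d/*.conf"))
    )
    listed = module in blacklisted_modules(conf)
    print(f"{module}: loaded={loaded} blacklisted={listed}")
```

On an affected machine, a result of loaded=True and blacklisted=False would match the situation the description reports. The script only reads files and changes nothing on the system.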
