2012-03-01 06:55:35 |
Philipp Morger |
bug |
|
|
added bug |
2012-03-01 07:00:08 |
Brad Figg |
linux (Ubuntu): status |
New |
Incomplete |
|
2012-03-01 07:00:10 |
Brad Figg |
tags |
|
natty |
|
2012-03-01 14:57:44 |
Phoenix |
bug |
|
|
added subscriber Phoenix |
2012-03-01 14:59:37 |
Phoenix |
tags |
natty |
apport-collected natty |
|
2012-03-01 14:59:41 |
Phoenix |
attachment added |
|
BootDmesg.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797476/+files/BootDmesg.txt |
|
2012-03-01 14:59:42 |
Phoenix |
attachment added |
|
CurrentDmesg.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797477/+files/CurrentDmesg.txt |
|
2012-03-01 14:59:44 |
Phoenix |
attachment added |
|
Lspci.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797478/+files/Lspci.txt |
|
2012-03-01 14:59:46 |
Phoenix |
attachment added |
|
Lsusb.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797479/+files/Lsusb.txt |
|
2012-03-01 14:59:48 |
Phoenix |
attachment added |
|
ProcCpuinfo.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797480/+files/ProcCpuinfo.txt |
|
2012-03-01 14:59:50 |
Phoenix |
attachment added |
|
ProcInterrupts.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797481/+files/ProcInterrupts.txt |
|
2012-03-01 14:59:51 |
Phoenix |
attachment added |
|
ProcModules.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797482/+files/ProcModules.txt |
|
2012-03-01 14:59:53 |
Phoenix |
attachment added |
|
UdevDb.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797483/+files/UdevDb.txt |
|
2012-03-01 14:59:55 |
Phoenix |
attachment added |
|
UdevLog.txt https://bugs.edge.launchpad.net/bugs/943815/+attachment/2797484/+files/UdevLog.txt |
|
2012-03-01 15:00:16 |
Phoenix |
linux (Ubuntu): status |
Incomplete |
Confirmed |
|
2012-03-01 16:01:30 |
Joseph Salisbury |
linux (Ubuntu): importance |
Undecided |
Medium |
|
2012-03-01 16:04:17 |
Joseph Salisbury |
tags |
apport-collected natty |
apport-collected lucid needs-upstream-testing |
|
2012-03-01 16:04:27 |
Joseph Salisbury |
linux (Ubuntu): status |
Confirmed |
Incomplete |
|
2012-03-01 19:00:57 |
Phoenix |
attachment added |
|
Screenshot at 2012-01-18 14:48:27.png https://bugs.launchpad.net/ubuntu/+source/linux/+bug/943815/+attachment/2798263/+files/Screenshot%20at%202012-01-18%2014%3A48%3A27.png |
|
2012-03-02 15:50:32 |
Herton R. Krzesinski |
bug watch added |
|
http://bugzilla.kernel.org/show_bug.cgi?id=27142 |
|
2012-03-02 15:53:07 |
Herton R. Krzesinski |
summary |
Slow System Crash due to Kernel Problem |
panic in task_rq_lock (race with concurrent semtimedop() timeouts and IPC_RMID) |
|
2012-03-02 15:53:37 |
Herton R. Krzesinski |
nominated for series |
|
Ubuntu Natty |
|
2012-03-02 15:53:37 |
Herton R. Krzesinski |
bug task added |
|
linux (Ubuntu Natty) |
|
2012-03-02 15:53:49 |
Herton R. Krzesinski |
linux (Ubuntu Natty): importance |
Undecided |
Medium |
|
2012-03-02 15:53:54 |
Herton R. Krzesinski |
linux (Ubuntu Natty): status |
New |
In Progress |
|
2012-03-02 15:53:58 |
Herton R. Krzesinski |
linux (Ubuntu Natty): assignee |
|
Herton R. Krzesinski (herton) |
|
2012-03-02 15:54:05 |
Herton R. Krzesinski |
linux (Ubuntu): status |
Incomplete |
Fix Released |
|
2012-03-02 16:10:21 |
Herton R. Krzesinski |
description |
When logged in I saw:
unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
unity kernel: [669168.475971] Stack:
unity kernel: [669168.476634] Call Trace:
unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
unity kernel: [669168.479444] CR2: 00000000801f0f1d
In the log:
Mar 1 06:25:04 unity apache2[14216]: [Thu Mar 01 06:25:04 2012] [notice] SIGUSR1 received. Doing graceful restart
Mar 1 06:25:04 unity kernel: [669168.471999] BUG: unable to handle kernel paging request at 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.472131] IP: [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.472229] PGD 0
Mar 1 06:25:04 unity kernel: [669168.472312] Oops: 0000 [#1] SMP
Mar 1 06:25:04 unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Mar 1 06:25:04 unity kernel: [669168.472508] CPU 7
Mar 1 06:25:04 unity kernel: [669168.472545] Modules linked in: ipt_MASQUERADE iptable_nat kvm_intel kvm ip6t_LOG xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT ipt_LOG xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables radeon nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 ipmi_devintf nf_defrag_ipv4 ipmi_watchdog nf_conntrack_ftp psmouse nf_conntrack ttm drm_kms_helper ipmi_si drm iptable_filter serio_raw joydev i5400_edac edac_core ipmi_poweroff ip_tables ioatdma ipmi_msghandler i5k_amb lp i2c_algo_bit x_tables bridge stp parport shpchp usbhid hid usb_storage uas igb arcmsr dca
Mar 1 06:25:04 unity kernel: [669168.474703]
Mar 1 06:25:04 unity kernel: [669168.474756] Pid: 1832, comm: apache2 Not tainted 2.6.38-10-server #46~lucid1-Ubuntu Supermicro X7DWU/X7DWU
Mar 1 06:25:04 unity kernel: [669168.475004] RIP: 0010:[<ffffffff81051aba>] [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.475114] RSP: 0018:ffff88040c10fdc8 EFLAGS: 00010082
Mar 1 06:25:04 unity kernel: [669168.475171] RAX: 00000000801f0f05 RBX: 0000000000013cc0 RCX: 0000000000000002
Mar 1 06:25:04 unity kernel: [669168.475245] RDX: 0000000000000282 RSI: ffff88040c10fe20 RDI: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475320] RBP: ffff88040c10fde8 R08: 0000000000989680 R09: 000000000000028b
Mar 1 06:25:04 unity kernel: [669168.475393] R10: 0000000000007bea R11: 0000000000000001 R12: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475467] R13: ffff88040c10fe20 R14: 0000000000013cc0 R15: 0000000000000007
Mar 1 06:25:04 unity kernel: [669168.475542] FS: 00007f5589d03740(0000) GS:ffff8800cfdc0000(0000) knlGS:0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475617] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 1 06:25:04 unity kernel: [669168.475674] CR2: 00000000801f0f1d CR3: 000000040eb35000 CR4: 00000000000026e0
Mar 1 06:25:04 unity kernel: [669168.475748] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475821] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 1 06:25:04 unity kernel: [669168.475895] Process apache2 (pid: 1832, threadinfo ffff88040c10e000, task ffff88040c1f2dc0)
Mar 1 06:25:04 unity kernel: [669168.475971] Stack:
Mar 1 06:25:04 unity kernel: [669168.476022] 00007f558925f8f0 ffff88040f155ec8 000000000000000f 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.476225] ffff88040c10fe58 ffffffff8105f6dc ffff88040c10fe28 0000000700000286
Mar 1 06:25:04 unity kernel: [669168.476429] 0000000000000003 0000000181a4d7f0 ffff8804015d9850 0000000000000282
Mar 1 06:25:04 unity kernel: [669168.476634] Call Trace:
Mar 1 06:25:04 unity kernel: [669168.476689] [<ffffffff8105f6dc>] try_to_wake_up+0x3c/0x410
Mar 1 06:25:04 unity kernel: [669168.476747] [<ffffffff8105fb05>] wake_up_process+0x15/0x20
Mar 1 06:25:04 unity kernel: [669168.476806] [<ffffffff8126a590>] freeary+0x1e0/0x220
Mar 1 06:25:04 unity kernel: [669168.476863] [<ffffffff8126b610>] T.607+0xb0/0x100
Mar 1 06:25:04 unity kernel: [669168.476921] [<ffffffff81164055>] ? vfs_write+0x125/0x190
Mar 1 06:25:04 unity kernel: [669168.476979] [<ffffffff8126b6c9>] sys_semctl+0x69/0xa0
Mar 1 06:25:04 unity kernel: [669168.477036] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
Mar 1 06:25:04 unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
Mar 1 06:25:04 unity kernel: [669168.479300] RIP [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.479391] RSP <ffff88040c10fdc8>
Mar 1 06:25:04 unity kernel: [669168.479444] CR2: 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.479497] ---[ end trace b2b87cfb63915f6c ]---
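A note for readers decoding the oops above: the marked bytes `<8b> 40 18` in the Code line are the x86-64 encoding of `mov 0x18(%rax),%eax`, so the faulting address should be RAX plus 0x18. The sketch below checks that against the logged register values. The helper name `parse_registers` is hypothetical and not part of any kernel tool; the two input lines are copied from the log with the syslog prefix trimmed.

```python
import re

# Two register lines copied from the oops above (syslog prefix trimmed).
OOPS = """\
[669168.475171] RAX: 00000000801f0f05 RBX: 0000000000013cc0 RCX: 0000000000000002
[669168.475674] CR2: 00000000801f0f1d CR3: 000000040eb35000 CR4: 00000000000026e0
"""


def parse_registers(text):
    """Return {register name: integer value} for every 'NAME: <16 hex digits>' pair."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS)

# Displacement byte of the faulting instruction '<8b> 40 18'.
DISPLACEMENT = 0x18

fault_address = regs["RAX"] + DISPLACEMENT
print(hex(fault_address), fault_address == regs["CR2"])
```

The sum equals CR2, and the value in RAX is not a kernel address (those start with ffff on x86-64), which is consistent with the kernel dereferencing a stale pointer rather than computing a slightly wrong offset.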
This happens quite often. The only remedy is to sync the filesystem and power cycle: a normal reboot does not work, so I have to pull the plug (or rather press the reset button, or do the same via MagicKey).
Furthermore, Apache no longer answers in this state and cannot be stopped; it goes zombie.
The system is still accessible, except for Apache, which cannot be brought back to life.
I can't say whether it is a memory issue, but note that this is a server with ECC FB-DIMM memory. I will have to run a memory check at some point, but nothing in this regard has shown up in the logs of the daughter board.
Some System info:
Distributor ID: Ubuntu
Description: Ubuntu 10.04.4 LTS
Release: 10.04
Codename: lucid
*-memory
description: System Memory
physical id: 16
slot: System board or motherboard
size: 16GiB
*-bank:0
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 0
serial: 182194CD
slot: DIMM1A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:1
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 1
slot: DIMM1B
clock: 800MHz (1.2ns)
*-bank:2
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 2
serial: 1921B1FA
slot: DIMM2A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:3
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 3
slot: DIMM2B
clock: 800MHz (1.2ns)
*-bank:4
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 4
serial: 16213E76
slot: DIMM3A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:5
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 5
slot: DIMM3B
clock: 800MHz (1.2ns)
*-bank:6
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 6
serial: 1721CBB7
slot: DIMM4A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:7
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 7
slot: DIMM4B
clock: 800MHz (1.2ns)
*-cpu:0
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 4
bus info: cpu@0
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU1
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 6
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 7
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back
*-cpu:1
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 5
bus info: cpu@1
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU2
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 8
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 9
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back |
SRU justification
=================
Impact
------
Kernel crash, due to race explained in upstream bug report: https://bugzilla.kernel.org/show_bug.cgi?id=27142
In practice likely to happen on a highly loaded webserver
Fix
---
Upstream commit d694ad62bf539dbb20a0899ac2a954555f9e4a83
Testcase
--------
https://bugzilla.kernel.org/attachment.cgi?id=66162
I'll attach to this bug as well.
- Build with gcc -o timedrm timedrm.cpp -lpthread
- Run with "test 250", sometimes you have to run more than one time to get the oops, but it's very easy to get the crash.
---------------------------------------------------------------------------------------
When logged in I saw:
unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
unity kernel: [669168.475971] Stack:
unity kernel: [669168.476634] Call Trace:
unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
unity kernel: [669168.479444] CR2: 00000000801f0f1d
In the log:
Mar 1 06:25:04 unity apache2[14216]: [Thu Mar 01 06:25:04 2012] [notice] SIGUSR1 received. Doing graceful restart
Mar 1 06:25:04 unity kernel: [669168.471999] BUG: unable to handle kernel paging request at 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.472131] IP: [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.472229] PGD 0
Mar 1 06:25:04 unity kernel: [669168.472312] Oops: 0000 [#1] SMP
Mar 1 06:25:04 unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Mar 1 06:25:04 unity kernel: [669168.472508] CPU 7
Mar 1 06:25:04 unity kernel: [669168.472545] Modules linked in: ipt_MASQUERADE iptable_nat kvm_intel kvm ip6t_LOG xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT ipt_LOG xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables radeon nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 ipmi_devintf nf_defrag_ipv4 ipmi_watchdog nf_conntrack_ftp psmouse nf_conntrack ttm drm_kms_helper ipmi_si drm iptable_filter serio_raw joydev i5400_edac edac_core ipmi_poweroff ip_tables ioatdma ipmi_msghandler i5k_amb lp i2c_algo_bit x_tables bridge stp parport shpchp usbhid hid usb_storage uas igb arcmsr dca
Mar 1 06:25:04 unity kernel: [669168.474703]
Mar 1 06:25:04 unity kernel: [669168.474756] Pid: 1832, comm: apache2 Not tainted 2.6.38-10-server #46~lucid1-Ubuntu Supermicro X7DWU/X7DWU
Mar 1 06:25:04 unity kernel: [669168.475004] RIP: 0010:[<ffffffff81051aba>] [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.475114] RSP: 0018:ffff88040c10fdc8 EFLAGS: 00010082
Mar 1 06:25:04 unity kernel: [669168.475171] RAX: 00000000801f0f05 RBX: 0000000000013cc0 RCX: 0000000000000002
Mar 1 06:25:04 unity kernel: [669168.475245] RDX: 0000000000000282 RSI: ffff88040c10fe20 RDI: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475320] RBP: ffff88040c10fde8 R08: 0000000000989680 R09: 000000000000028b
Mar 1 06:25:04 unity kernel: [669168.475393] R10: 0000000000007bea R11: 0000000000000001 R12: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475467] R13: ffff88040c10fe20 R14: 0000000000013cc0 R15: 0000000000000007
Mar 1 06:25:04 unity kernel: [669168.475542] FS: 00007f5589d03740(0000) GS:ffff8800cfdc0000(0000) knlGS:0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475617] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 1 06:25:04 unity kernel: [669168.475674] CR2: 00000000801f0f1d CR3: 000000040eb35000 CR4: 00000000000026e0
Mar 1 06:25:04 unity kernel: [669168.475748] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475821] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 1 06:25:04 unity kernel: [669168.475895] Process apache2 (pid: 1832, threadinfo ffff88040c10e000, task ffff88040c1f2dc0)
Mar 1 06:25:04 unity kernel: [669168.475971] Stack:
Mar 1 06:25:04 unity kernel: [669168.476022] 00007f558925f8f0 ffff88040f155ec8 000000000000000f 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.476225] ffff88040c10fe58 ffffffff8105f6dc ffff88040c10fe28 0000000700000286
Mar 1 06:25:04 unity kernel: [669168.476429] 0000000000000003 0000000181a4d7f0 ffff8804015d9850 0000000000000282
Mar 1 06:25:04 unity kernel: [669168.476634] Call Trace:
Mar 1 06:25:04 unity kernel: [669168.476689] [<ffffffff8105f6dc>] try_to_wake_up+0x3c/0x410
Mar 1 06:25:04 unity kernel: [669168.476747] [<ffffffff8105fb05>] wake_up_process+0x15/0x20
Mar 1 06:25:04 unity kernel: [669168.476806] [<ffffffff8126a590>] freeary+0x1e0/0x220
Mar 1 06:25:04 unity kernel: [669168.476863] [<ffffffff8126b610>] T.607+0xb0/0x100
Mar 1 06:25:04 unity kernel: [669168.476921] [<ffffffff81164055>] ? vfs_write+0x125/0x190
Mar 1 06:25:04 unity kernel: [669168.476979] [<ffffffff8126b6c9>] sys_semctl+0x69/0xa0
Mar 1 06:25:04 unity kernel: [669168.477036] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
Mar 1 06:25:04 unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
Mar 1 06:25:04 unity kernel: [669168.479300] RIP [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.479391] RSP <ffff88040c10fdc8>
Mar 1 06:25:04 unity kernel: [669168.479444] CR2: 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.479497] ---[ end trace b2b87cfb63915f6c ]---
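To relate the trace above to the bug summary, here is a small sketch that extracts the function names from the Call Trace lines. `call_chain` is a hypothetical helper written for this note; frames prefixed with `?` are skipped because the kernel marks those entries as unreliable.

```python
import re

# Call Trace lines copied from the oops above (syslog prefix trimmed).
TRACE = """\
[669168.476689] [<ffffffff8105f6dc>] try_to_wake_up+0x3c/0x410
[669168.476747] [<ffffffff8105fb05>] wake_up_process+0x15/0x20
[669168.476806] [<ffffffff8126a590>] freeary+0x1e0/0x220
[669168.476863] [<ffffffff8126b610>] T.607+0xb0/0x100
[669168.476921] [<ffffffff81164055>] ? vfs_write+0x125/0x190
[669168.476979] [<ffffffff8126b6c9>] sys_semctl+0x69/0xa0
[669168.477036] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
"""

# One frame: [<address>] optional '? ' marker, then symbol+offset/size.
FRAME = re.compile(r"\[<[0-9a-f]{16}>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_chain(text):
    """Return the reliable frames, outermost caller first."""
    frames = [m.group(2) for m in FRAME.finditer(text) if not m.group(1)]
    return list(reversed(frames))


print(" -> ".join(call_chain(TRACE)))
```

The chain runs from the semctl system call through freeary into the wakeup path, which fits the IPC_RMID side of the race named in the summary. The crash itself is one level further down, in task_rq_lock (the RIP).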
This happens quite often. The only remedy is to sync the filesystem and power cycle: a normal reboot does not work, so I have to pull the plug (or rather press the reset button, or do the same via MagicKey).
Furthermore, Apache no longer answers in this state and cannot be stopped; it goes zombie.
The system is still accessible, except for Apache, which cannot be brought back to life.
I can't say whether it is a memory issue, but note that this is a server with ECC FB-DIMM memory. I will have to run a memory check at some point, but nothing in this regard has shown up in the logs of the daughter board.
Some System info:
Distributor ID: Ubuntu
Description: Ubuntu 10.04.4 LTS
Release: 10.04
Codename: lucid
*-memory
description: System Memory
physical id: 16
slot: System board or motherboard
size: 16GiB
*-bank:0
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 0
serial: 182194CD
slot: DIMM1A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:1
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 1
slot: DIMM1B
clock: 800MHz (1.2ns)
*-bank:2
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 2
serial: 1921B1FA
slot: DIMM2A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:3
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 3
slot: DIMM2B
clock: 800MHz (1.2ns)
*-bank:4
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 4
serial: 16213E76
slot: DIMM3A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:5
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 5
slot: DIMM3B
clock: 800MHz (1.2ns)
*-bank:6
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 6
serial: 1721CBB7
slot: DIMM4A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:7
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 7
slot: DIMM4B
clock: 800MHz (1.2ns)
*-cpu:0
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 4
bus info: cpu@0
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU1
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 6
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 7
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back
*-cpu:1
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 5
bus info: cpu@1
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU2
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 8
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 9
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back |
|
2012-03-02 16:11:36 |
Herton R. Krzesinski |
attachment added |
|
Testcase https://bugs.launchpad.net/ubuntu/+source/linux/+bug/943815/+attachment/2801657/+files/timedrm.cpp |
|
2012-03-02 16:13:35 |
Herton R. Krzesinski |
attachment added |
|
0001-ipc-sem.c-fix-race-with-concurrent-semtimedop-timeou.patch https://bugs.launchpad.net/ubuntu/+source/linux/+bug/943815/+attachment/2801660/+files/0001-ipc-sem.c-fix-race-with-concurrent-semtimedop-timeou.patch |
|
2012-03-02 16:51:10 |
Herton R. Krzesinski |
description |
SRU justification
=================
Impact
------
Kernel crash, due to race explained in upstream bug report: https://bugzilla.kernel.org/show_bug.cgi?id=27142
In practice likely to happen on a highly loaded webserver
Fix
---
Upstream commit d694ad62bf539dbb20a0899ac2a954555f9e4a83
Testcase
--------
https://bugzilla.kernel.org/attachment.cgi?id=66162
I'll attach to this bug as well.
- Build with gcc -o timedrm timedrm.cpp -lpthread
- Run with "test 250", sometimes you have to run more than one time to get the oops, but it's very easy to get the crash.
---------------------------------------------------------------------------------------
When logged in I saw:
unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
unity kernel: [669168.475971] Stack:
unity kernel: [669168.476634] Call Trace:
unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
unity kernel: [669168.479444] CR2: 00000000801f0f1d
In the log:
Mar 1 06:25:04 unity apache2[14216]: [Thu Mar 01 06:25:04 2012] [notice] SIGUSR1 received. Doing graceful restart
Mar 1 06:25:04 unity kernel: [669168.471999] BUG: unable to handle kernel paging request at 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.472131] IP: [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.472229] PGD 0
Mar 1 06:25:04 unity kernel: [669168.472312] Oops: 0000 [#1] SMP
Mar 1 06:25:04 unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Mar 1 06:25:04 unity kernel: [669168.472508] CPU 7
Mar 1 06:25:04 unity kernel: [669168.472545] Modules linked in: ipt_MASQUERADE iptable_nat kvm_intel kvm ip6t_LOG xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT ipt_LOG xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables radeon nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 ipmi_devintf nf_defrag_ipv4 ipmi_watchdog nf_conntrack_ftp psmouse nf_conntrack ttm drm_kms_helper ipmi_si drm iptable_filter serio_raw joydev i5400_edac edac_core ipmi_poweroff ip_tables ioatdma ipmi_msghandler i5k_amb lp i2c_algo_bit x_tables bridge stp parport shpchp usbhid hid usb_storage uas igb arcmsr dca
Mar 1 06:25:04 unity kernel: [669168.474703]
Mar 1 06:25:04 unity kernel: [669168.474756] Pid: 1832, comm: apache2 Not tainted 2.6.38-10-server #46~lucid1-Ubuntu Supermicro X7DWU/X7DWU
Mar 1 06:25:04 unity kernel: [669168.475004] RIP: 0010:[<ffffffff81051aba>] [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.475114] RSP: 0018:ffff88040c10fdc8 EFLAGS: 00010082
Mar 1 06:25:04 unity kernel: [669168.475171] RAX: 00000000801f0f05 RBX: 0000000000013cc0 RCX: 0000000000000002
Mar 1 06:25:04 unity kernel: [669168.475245] RDX: 0000000000000282 RSI: ffff88040c10fe20 RDI: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475320] RBP: ffff88040c10fde8 R08: 0000000000989680 R09: 000000000000028b
Mar 1 06:25:04 unity kernel: [669168.475393] R10: 0000000000007bea R11: 0000000000000001 R12: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475467] R13: ffff88040c10fe20 R14: 0000000000013cc0 R15: 0000000000000007
Mar 1 06:25:04 unity kernel: [669168.475542] FS: 00007f5589d03740(0000) GS:ffff8800cfdc0000(0000) knlGS:0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475617] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 1 06:25:04 unity kernel: [669168.475674] CR2: 00000000801f0f1d CR3: 000000040eb35000 CR4: 00000000000026e0
Mar 1 06:25:04 unity kernel: [669168.475748] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475821] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 1 06:25:04 unity kernel: [669168.475895] Process apache2 (pid: 1832, threadinfo ffff88040c10e000, task ffff88040c1f2dc0)
Mar 1 06:25:04 unity kernel: [669168.475971] Stack:
Mar 1 06:25:04 unity kernel: [669168.476022] 00007f558925f8f0 ffff88040f155ec8 000000000000000f 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.476225] ffff88040c10fe58 ffffffff8105f6dc ffff88040c10fe28 0000000700000286
Mar 1 06:25:04 unity kernel: [669168.476429] 0000000000000003 0000000181a4d7f0 ffff8804015d9850 0000000000000282
Mar 1 06:25:04 unity kernel: [669168.476634] Call Trace:
Mar 1 06:25:04 unity kernel: [669168.476689] [<ffffffff8105f6dc>] try_to_wake_up+0x3c/0x410
Mar 1 06:25:04 unity kernel: [669168.476747] [<ffffffff8105fb05>] wake_up_process+0x15/0x20
Mar 1 06:25:04 unity kernel: [669168.476806] [<ffffffff8126a590>] freeary+0x1e0/0x220
Mar 1 06:25:04 unity kernel: [669168.476863] [<ffffffff8126b610>] T.607+0xb0/0x100
Mar 1 06:25:04 unity kernel: [669168.476921] [<ffffffff81164055>] ? vfs_write+0x125/0x190
Mar 1 06:25:04 unity kernel: [669168.476979] [<ffffffff8126b6c9>] sys_semctl+0x69/0xa0
Mar 1 06:25:04 unity kernel: [669168.477036] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
Mar 1 06:25:04 unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
Mar 1 06:25:04 unity kernel: [669168.479300] RIP [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.479391] RSP <ffff88040c10fdc8>
Mar 1 06:25:04 unity kernel: [669168.479444] CR2: 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.479497] ---[ end trace b2b87cfb63915f6c ]---
This happens quite often. The only remedy is to sync the filesystem and power cycle: a normal reboot does not work, so I have to pull the plug (or rather press the reset button, or do the same via MagicKey).
Furthermore, Apache no longer answers in this state and cannot be stopped; it goes zombie.
The system is still accessible, except for Apache, which cannot be brought back to life.
I can't say whether it is a memory issue, but note that this is a server with ECC FB-DIMM memory. I will have to run a memory check at some point, but nothing in this regard has shown up in the logs of the daughter board.
Some System info:
Distributor ID: Ubuntu
Description: Ubuntu 10.04.4 LTS
Release: 10.04
Codename: lucid
*-memory
description: System Memory
physical id: 16
slot: System board or motherboard
size: 16GiB
*-bank:0
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 0
serial: 182194CD
slot: DIMM1A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:1
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 1
slot: DIMM1B
clock: 800MHz (1.2ns)
*-bank:2
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 2
serial: 1921B1FA
slot: DIMM2A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:3
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 3
slot: DIMM2B
clock: 800MHz (1.2ns)
*-bank:4
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 4
serial: 16213E76
slot: DIMM3A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:5
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 5
slot: DIMM3B
clock: 800MHz (1.2ns)
*-bank:6
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 6
serial: 1721CBB7
slot: DIMM4A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:7
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 7
slot: DIMM4B
clock: 800MHz (1.2ns)
*-cpu:0
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 4
bus info: cpu@0
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU1
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 6
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 7
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back
*-cpu:1
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 5
bus info: cpu@1
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU2
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 8
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 9
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back |
SRU justification
=================
Impact
------
Kernel crash, due to race explained in upstream bug report: https://bugzilla.kernel.org/show_bug.cgi?id=27142
In practice likely to happen on a highly loaded webserver
Fix
---
Upstream commit d694ad62bf539dbb20a0899ac2a954555f9e4a83
Testcase
--------
https://bugzilla.kernel.org/attachment.cgi?id=66162
It's attached to this bug as well.
- Build with gcc -o timedrm timedrm.cpp -lpthread
- Run with ./timedrm 250, sometimes you have to run more than one time to get the oops, but it's very easy to get the crash.
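Since the testcase may need more than one run, the repetition can be scripted. The following is only a sketch: `run_repeatedly` is a hypothetical helper, not part of the attached testcase, and it assumes nothing about how timedrm exits. It simply stops at the first non-zero exit code. The demo call uses a harmless command in place of the testcase.

```python
import subprocess
import sys


def run_repeatedly(command, attempts):
    """Run `command` up to `attempts` times and return the exit codes seen.

    Stops early at the first non-zero exit code (a negative code means the
    process was killed by a signal).
    """
    codes = []
    for _ in range(attempts):
        codes.append(subprocess.run(command).returncode)
        if codes[-1] != 0:
            break
    return codes


# Harmless demo: a no-op Python process stands in for the testcase.
print(run_repeatedly([sys.executable, "-c", "pass"], 3))
```

For the testcase itself the command would be `["./timedrm", "250"]`. On an affected kernel a successful reproduction crashes the machine, so this belongs on a test system only.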
---------------------------------------------------------------------------------------
When logged in I saw:
unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
unity kernel: [669168.475971] Stack:
unity kernel: [669168.476634] Call Trace:
unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
unity kernel: [669168.479444] CR2: 00000000801f0f1d
In the log:
Mar 1 06:25:04 unity apache2[14216]: [Thu Mar 01 06:25:04 2012] [notice] SIGUSR1 received. Doing graceful restart
Mar 1 06:25:04 unity kernel: [669168.471999] BUG: unable to handle kernel paging request at 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.472131] IP: [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.472229] PGD 0
Mar 1 06:25:04 unity kernel: [669168.472312] Oops: 0000 [#1] SMP
Mar 1 06:25:04 unity kernel: [669168.472431] last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
Mar 1 06:25:04 unity kernel: [669168.472508] CPU 7
Mar 1 06:25:04 unity kernel: [669168.472545] Modules linked in: ipt_MASQUERADE iptable_nat kvm_intel kvm ip6t_LOG xt_hl nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT ipt_LOG xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables radeon nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4 ipmi_devintf nf_defrag_ipv4 ipmi_watchdog nf_conntrack_ftp psmouse nf_conntrack ttm drm_kms_helper ipmi_si drm iptable_filter serio_raw joydev i5400_edac edac_core ipmi_poweroff ip_tables ioatdma ipmi_msghandler i5k_amb lp i2c_algo_bit x_tables bridge stp parport shpchp usbhid hid usb_storage uas igb arcmsr dca
Mar 1 06:25:04 unity kernel: [669168.474703]
Mar 1 06:25:04 unity kernel: [669168.474756] Pid: 1832, comm: apache2 Not tainted 2.6.38-10-server #46~lucid1-Ubuntu Supermicro X7DWU/X7DWU
Mar 1 06:25:04 unity kernel: [669168.475004] RIP: 0010:[<ffffffff81051aba>] [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.475114] RSP: 0018:ffff88040c10fdc8 EFLAGS: 00010082
Mar 1 06:25:04 unity kernel: [669168.475171] RAX: 00000000801f0f05 RBX: 0000000000013cc0 RCX: 0000000000000002
Mar 1 06:25:04 unity kernel: [669168.475245] RDX: 0000000000000282 RSI: ffff88040c10fe20 RDI: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475320] RBP: ffff88040c10fde8 R08: 0000000000989680 R09: 000000000000028b
Mar 1 06:25:04 unity kernel: [669168.475393] R10: 0000000000007bea R11: 0000000000000001 R12: 00007f558925f8f0
Mar 1 06:25:04 unity kernel: [669168.475467] R13: ffff88040c10fe20 R14: 0000000000013cc0 R15: 0000000000000007
Mar 1 06:25:04 unity kernel: [669168.475542] FS: 00007f5589d03740(0000) GS:ffff8800cfdc0000(0000) knlGS:0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475617] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 1 06:25:04 unity kernel: [669168.475674] CR2: 00000000801f0f1d CR3: 000000040eb35000 CR4: 00000000000026e0
Mar 1 06:25:04 unity kernel: [669168.475748] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.475821] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 1 06:25:04 unity kernel: [669168.475895] Process apache2 (pid: 1832, threadinfo ffff88040c10e000, task ffff88040c1f2dc0)
Mar 1 06:25:04 unity kernel: [669168.475971] Stack:
Mar 1 06:25:04 unity kernel: [669168.476022] 00007f558925f8f0 ffff88040f155ec8 000000000000000f 0000000000000000
Mar 1 06:25:04 unity kernel: [669168.476225] ffff88040c10fe58 ffffffff8105f6dc ffff88040c10fe28 0000000700000286
Mar 1 06:25:04 unity kernel: [669168.476429] 0000000000000003 0000000181a4d7f0 ffff8804015d9850 0000000000000282
Mar 1 06:25:04 unity kernel: [669168.476634] Call Trace:
Mar 1 06:25:04 unity kernel: [669168.476689] [<ffffffff8105f6dc>] try_to_wake_up+0x3c/0x410
Mar 1 06:25:04 unity kernel: [669168.476747] [<ffffffff8105fb05>] wake_up_process+0x15/0x20
Mar 1 06:25:04 unity kernel: [669168.476806] [<ffffffff8126a590>] freeary+0x1e0/0x220
Mar 1 06:25:04 unity kernel: [669168.476863] [<ffffffff8126b610>] T.607+0xb0/0x100
Mar 1 06:25:04 unity kernel: [669168.476921] [<ffffffff81164055>] ? vfs_write+0x125/0x190
Mar 1 06:25:04 unity kernel: [669168.476979] [<ffffffff8126b6c9>] sys_semctl+0x69/0xa0
Mar 1 06:25:04 unity kernel: [669168.477036] [<ffffffff8100c042>] system_call_fastpath+0x16/0x1b
Mar 1 06:25:04 unity kernel: [669168.477094] Code: 00 48 c7 c3 c0 3c 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de <8b> 40 18 4c 03 34 c5 00 4b ac 81 4c 89 f7 e8 03 36 58 00 49 8b
Mar 1 06:25:04 unity kernel: [669168.479300] RIP [<ffffffff81051aba>] task_rq_lock+0x4a/0xa0
Mar 1 06:25:04 unity kernel: [669168.479391] RSP <ffff88040c10fdc8>
Mar 1 06:25:04 unity kernel: [669168.479444] CR2: 00000000801f0f1d
Mar 1 06:25:04 unity kernel: [669168.479497] ---[ end trace b2b87cfb63915f6c ]---
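One way to read the oops above, offered as a sketch rather than a diagnosis: the marked bytes `<8b> 40 18` in the Code: line decode on x86-64 as a 32-bit load from RAX + 0x18, so the faulting address in CR2 should equal RAX plus 0x18. The register values below are copied from the log. The helper names and the "upper canonical half" threshold used to judge whether a value looks like a kernel pointer are assumptions made for illustration only.

```python
# Register values copied verbatim from the oops in this report.
RAX = 0x00000000801F0F05
CR2 = 0x00000000801F0F1D

# "<8b> 40 18" is a load from [RAX + 0x18]; 0x18 is the displacement byte.
DISPLACEMENT = 0x18


def fault_address(base: int, displacement: int) -> int:
    """Return the effective address of a base + displacement access (64-bit wrap)."""
    return (base + displacement) & 0xFFFFFFFFFFFFFFFF


def looks_like_kernel_pointer(value: int) -> bool:
    """Illustrative check: x86-64 kernel addresses sit in the upper canonical half."""
    return value >= 0xFFFF800000000000


if __name__ == "__main__":
    addr = fault_address(RAX, DISPLACEMENT)
    print(f"computed fault address: {addr:#018x}")
    print(f"matches CR2: {addr == CR2}")
    print(f"RAX looks like a kernel pointer: {looks_like_kernel_pointer(RAX)}")
```

The computed address matches CR2, and RAX does not look like a kernel pointer (compare RSP, ffff88040c10fdc8). That would be consistent with task_rq_lock following a stale or overwritten pointer, as in the race described in the upstream report, but this is an inference from the register dump and not something the log states.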
This happens QUITE OFTEN. The only solution is to sync the filesystems and power cycle (read: I can't reboot, I have to pull the plug, i.e. push the reset button or do the same via MagicKey).
Furthermore, in this case Apache no longer answers and cannot be stopped; it goes zombie.
The system is still accessible, except for Apache, and Apache can't be brought back to life.
I can't say whether it is a memory issue, but note that this is a server with ECC FB-DIMM memory. I will have to run a memory check at some point, but so far nothing in this regard has shown up in the logs of the daughter board.
Some System info:
Distributor ID: Ubuntu
Description: Ubuntu 10.04.4 LTS
Release: 10.04
Codename: lucid
*-memory
description: System Memory
physical id: 16
slot: System board or motherboard
size: 16GiB
*-bank:0
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 0
serial: 182194CD
slot: DIMM1A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:1
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 1
slot: DIMM1B
clock: 800MHz (1.2ns)
*-bank:2
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 2
serial: 1921B1FA
slot: DIMM2A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:3
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 3
slot: DIMM2B
clock: 800MHz (1.2ns)
*-bank:4
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 4
serial: 16213E76
slot: DIMM3A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:5
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 5
slot: DIMM3B
clock: 800MHz (1.2ns)
*-bank:6
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns)
vendor: 9801
physical id: 6
serial: 1721CBB7
slot: DIMM4A
size: 4GiB
width: 64 bits
clock: 800MHz (1.2ns)
*-bank:7
description: DIMM DDR2 FB-DIMM Synchronous 800 MHz (1.2 ns) [empty]
physical id: 7
slot: DIMM4B
clock: 800MHz (1.2ns)
*-cpu:0
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 4
bus info: cpu@0
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU1
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 6
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 7
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back
*-cpu:1
description: CPU
product: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
vendor: Intel Corp.
physical id: 5
bus info: cpu@1
version: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz
slot: LGA771/CPU2
size: 3GHz
width: 64 bits
clock: 1600MHz
capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx x86-64 constant_tsc arch_perfmon pebs bts
rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
*-cache:0
description: L1 cache
physical id: 8
slot: L1 Cache
size: 16KiB
capacity: 16KiB
capabilities: asynchronous internal write-back
*-cache:1
description: L2 cache
physical id: 9
slot: L2 Cache
size: 12MiB
capabilities: burst internal write-back |
|
2012-03-02 21:34:51 |
Tim Gardner |
linux (Ubuntu Natty): status |
In Progress |
Fix Committed |
|
2012-03-07 15:47:08 |
Herton R. Krzesinski |
tags |
apport-collected lucid needs-upstream-testing |
apport-collected lucid needs-upstream-testing verification-needed-natty |
|
2012-03-09 06:15:21 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/lucid-proposed/linux-lts-backport-natty |
|
2012-03-09 18:03:16 |
Herton R. Krzesinski |
tags |
apport-collected lucid needs-upstream-testing verification-needed-natty |
apport-collected lucid needs-upstream-testing verification-done-natty verification-needed-natty |
|
2012-03-09 18:03:26 |
Herton R. Krzesinski |
tags |
apport-collected lucid needs-upstream-testing verification-done-natty verification-needed-natty |
apport-collected lucid needs-upstream-testing verification-done-natty |
|
2012-03-22 14:50:12 |
Launchpad Janitor |
linux (Ubuntu Natty): status |
Fix Committed |
Fix Released |
|
2012-03-22 14:50:12 |
Launchpad Janitor |
cve linked |
|
2011-4347 |
|