SMP kernel fails to boot most of the time
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | Undecided | Unassigned |
Bug Description
I've had this problem for a long time now, on 8.10, 9.04, and now 9.10, and it hasn't been solved. As you can see from my output, I have a quad-core AMD processor. However, most of the time the machine fails to boot. I've turned off the quiet kernel option so I can see what happens: if the kernel brings up more than one core (but fewer than all four), the boot fails. I get a "Not responding" message when booting, and after the kernel message along the lines of "booting x cores (yyyyy.y BogoMIPS)" the boot hangs (where 1 < x < 4). I will try to capture the exact output, but since I don't have access to a shell when this happens, I can't easily grab it.
The data attached to this ticket was collected on a clean, one-core boot. I passed these options to the kernel: noapic nolapic acpi=noirq pci=noirq, but that disabled SMP (which is why I was able to boot). When I upgraded this system from Jaunty to Karmic, it booted all four cores fine, so I know it can work. I'm pretty sure it's a hardware bug, because the motherboard manufacturer doesn't explicitly advertise that this motherboard supports this processor. My own fault for not paying close attention to that. I'm looking for a workaround, perhaps some kernel boot line magic to get this working every time.
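To confirm whether a given boot actually ran with the workaround options listed above, the kernel command line can be checked for them. The following is a minimal sketch, not a diagnostic tool: the helper names are made up for illustration, and the sample command line is hypothetical (the real one in this report is truncated). On a running Linux system the actual command line can be read from `/proc/cmdline`.

```python
# Options named in this report as the single-core workaround.
WORKAROUND_OPTIONS = ("noapic", "nolapic", "acpi=noirq", "pci=noirq")


def workaround_options_present(cmdline):
    """Return the workaround options that appear as whole tokens in cmdline."""
    tokens = cmdline.split()
    return [opt for opt in WORKAROUND_OPTIONS if opt in tokens]


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, for illustration only.
example = "root=/dev/sda1 ro noapic nolapic acpi=noirq pci=noirq"
print(workaround_options_present(example))
```

On the affected machine, `workaround_options_present(read_cmdline())` would list whichever of the four options were in effect for the current boot.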
Thanks in advance!
ProblemType: Bug
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/
/dev/snd/
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
Card hw:0 'NVidia'/'HDA NVidia at 0xfe020000 irq 11'
Mixer name : 'Nvidia MCP78 HDMI'
Components : 'HDA:10ec0888,
Controls : 40
Simple ctrls : 20
Card1.Amixer.info:
Card hw:1 'U0x46d0x9a5'/'USB Device 0x46d:0x9a5 at usb-0000:00:02.1-5, high speed'
Mixer name : 'USB Mixer'
Components : 'USB046d:09a5'
Controls : 2
Simple ctrls : 1
Card1.Amixer.
Simple mixer control 'Mic',0
Capabilities: cvolume cvolume-joined cswitch cswitch-joined
Capture channels: Mono
Limits: Capture 0 - 3072
Mono: Capture 0 [0%] [23.00dB] [on]
Date: Sun Jan 31 15:15:03 2010
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=
IwConfig:
lo no wireless extensions.
eth0 no wireless extensions.
MachineType: Shuttle Inc SN78S
NonfreeKernelMo
Package: linux-image-
ProcCmdLine: root=UUID=
ProcEnviron:
LANG=en_US.UTF-8
SHELL=/usr/bin/zsh
ProcVersionSign
RelatedPackageV
linux-
linux-firmware 1.25
RfKill:
SourcePackage: linux
Uname: Linux 2.6.31-17-generic x86_64
WpaSupplicantLog:
XsessionErrors:
(gnome-
(polkit-
(nautilus:2826): Eel-CRITICAL **: eel_preferences
dmi.bios.date: 11/06/2008
dmi.bios.vendor: Phoenix Technologies, LTD
dmi.bios.version: 6.00 PG
dmi.board.name: FN78S
dmi.board.vendor: Shuttle Inc
dmi.board.version: V10
dmi.chassis.type: 3
dmi.chassis.vendor: Shuttle Inc
dmi.chassis.
dmi.modalias: dmi:bvnPhoenixT
dmi.product.name: SN78S
dmi.product.
dmi.sys.vendor: Shuttle Inc
description: updated
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Here's a (retyped) snippet of the kernel boot log when it fails to activate all four cores. This is incomplete, and it's just what remains on the screen when it hangs on boot:
[ 0.010000] CPU 1/0x1 -> Node 0
[ 0.010000] CPU: Physical Processor ID: 0
[ 0.010000] CPU: Processor Core ID: 1
[ 0.010000] mce: CPU supports 6 MCE banks
[ 0.010000] x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
[ 0.180554] CPU1: AMD Phenom(tm) 9950 Quad-Core Processor stepping 03
[ 0.182409] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
[ 0.190055] Booting processor 2 APIC 0x2 ip 0x6000
[ 0.010000] Initializing CPU#2
[ 0.010000] Calibrating delay using timer specific routine.. 5174.78 BogoMIPS (lpj=25873929)
[ 0.010000] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 0.010000] CPU: L2 Cache: 512K (64 bytes/line)
[ 0.010000] CPU: 2/0x2 -> Node 0
[ 0.010000] CPU: Physical Processor ID: 0
[ 0.010000] CPU: Processor Core ID: 3
[ 0.010000] mce: CPU supports 6 MCE banks
[ 0.010000] x86 PAT enabled: cpu 2, old 0x7040600060406, new 0x7010600070106
[ 0.350647] CPU2: AMD Phenom(tm) 9950 Quad-Core Processor stepping 03
[ 0.351196] checking TSC synchronization [CPU#0 -> CPU#2]: passed.
[ 0.360051] Booting processor 3 APIC 0x3 ip 0x6000
[ 5.804396] Not responding
[ 5.804518] Brought up 3 CPUs
[ 5.804576] Total of 3 processors activated (15550.23 BogoMIPS)
At this point, the system just hangs and never continues. Sometimes I get the dreaded "Not responding" message on the second core; most of the time when that happens, it continues to boot in single-core mode. If it stops on the third or fourth core, it never boots. If I don't get the "Not responding" message, it boots rather quickly, and all four cores are brought up.
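When a log like the one above can be captured (for example over a serial console or from a photo retyped by hand), the failing core can be picked out mechanically. This is a rough sketch that assumes the message wording shown in this report ("Booting processor N APIC", "Not responding", "Brought up N CPUs"); other kernel versions may word these lines differently, and the function name is made up for illustration.

```python
import re

BOOTING = re.compile(r"Booting processor (\d+) APIC")
BROUGHT_UP = re.compile(r"Brought up (\d+) CPUs")


def summarize_smp_boot(log_text):
    """Return (processors that did not respond, CPUs brought up or None)."""
    stalled = []
    last_booting = None
    brought_up = None
    for line in log_text.splitlines():
        match = BOOTING.search(line)
        if match:
            # Remember which processor the kernel is currently starting.
            last_booting = int(match.group(1))
            continue
        if "Not responding" in line and last_booting is not None:
            # Attribute the failure to the most recently started processor.
            stalled.append(last_booting)
            continue
        match = BROUGHT_UP.search(line)
        if match:
            brought_up = int(match.group(1))
    return stalled, brought_up


# Excerpt of the log lines quoted in this report.
sample = """\
[ 0.190055] Booting processor 2 APIC 0x2 ip 0x6000
[ 0.351196] checking TSC synchronization [CPU#0 -> CPU#2]: passed.
[ 0.360051] Booting processor 3 APIC 0x3 ip 0x6000
[ 5.804396] Not responding
[ 5.804518] Brought up 3 CPUs
"""
print(summarize_smp_boot(sample))  # ([3], 3)
```

For the excerpt above, the result says processor 3 (the fourth core) did not respond and the kernel reported three CPUs brought up, which matches the hang described.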