Kernel Oops - unable to handle kernel NULL pointer dereference; EIP is at mptsas_probe_expander_phys+0x72/0x610 [mptsas]
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Fix Released | High | Stefan Bader |
Bug Description
Binary package hint: linux-source-2.6.24
Installing hardy alpha 6 on a Dell PowerEdge M600, the installer kernel panics when modprobing mptsas:
[ 176.439736] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000010
[ 176.448254] printing eip: f89cf622 *pde = 00000000
[ 176.453140] Oops: 0000 [#1] SMP
[ 176.456375] Modules linked in: sg mptsas mptscsih mptbase scsi_transport_sas af_packet sr_mod cdrom sd_mod rsrc_nonstatic pcmcia_core usbserial usbhid hid usbkbd fan usb_storage scsi_mod libusual ehci_hcd thermal evdev psmouse uhci_hcd bnx2 usbcore processor
[ 176.479455]
[ 176.480939] Pid: 10954, comm: modprobe Not tainted (2.6.24-12-generic #1)
[ 176.487704] EIP: 0060:[<f89cf622>] EFLAGS: 00010246 CPU: 7
[ 176.493174] EIP is at mptsas_
[ 176.499508] EAX: 00000010 EBX: df993bc0 ECX: f6e29df4 EDX: 00000288
[ 176.505755] ESI: df900800 EDI: dfa5f800 EBP: 00000000 ESP: f6e29d34
[ 176.512001] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 176.517381] Process modprobe (pid: 10954, ti=f6e28000 task=f6819680 task.ti=f6e28000)
[ 176.525013] Stack: 0000ffff 61747300 00656c6c 00002aca f6e29df4 df900800 00000001 00000000
[ 176.533428] 00000000 00000000 00100100 00200200 00000000 00200200 ffff124f f89e0360
[ 176.541845] df900800 f7c8e000 00000000 70646f6d 65626f72 61747300 00656c6c 00002aca
[ 176.550265] Call Trace:
[ 176.552885] [<f89e0360>] mpt_timer_
[ 176.558713] [<f89d17af>] mptsas_
[ 176.564279] [<c01d47c3>] sysfs_create_
[ 176.569414] [<c0223526>] pci_device_
[ 176.574372] [<c027eab8>] driver_
[ 176.579680] [<c0212290>] kobject_
[ 176.584901] [<c027ed2e>] __driver_
[ 176.589774] [<c027deeb>] bus_for_
[ 176.594733] [<c027e936>] driver_
[ 176.599431] [<c027ec90>] __driver_
[ 176.604215] [<c027e26a>] bus_add_
[ 176.606754] [<c02236d6>] __pci_register_
[ 176.611638] [<f884a0c2>] mptsas_
[ 176.617041] [<c01516c6>] sys_init_
[ 176.622363] [<c0105442>] syscall_
[ 176.622367] =======
[ 176.622367] Code: 89 44 24 20 74 1e 8b 43 0c e8 8b aa 7b c7 89 d8 e8 84 aa 7b c7 8b 44 24 20 81 c4 8c 00 00 00 5b 5e 5f 5d c3 8b 43 0c 8b 4c 24 10 <0f> b7 00 89 01 8b 44 24 14 05 3c 05 00 00 89 44 24 18 e8 e7 8a
[ 176.622379] EIP: [<f89cf622>] mptsas_
[ 176.622386] ---[ end trace d5327034c75fc0a7 ]---
[ 176.688271] Intel ISA PCIC probe: not found.
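For anyone reading the trace above: the byte in angle brackets on the Code: line marks where EIP pointed when the fault occurred. Here it begins the sequence 0f b7 00, which decodes on x86 as movzwl (%eax),%eax, a 16-bit read through EAX, and EAX holds 00000010, the same value as the faulting address. The following is a minimal Python sketch of pulling that out of the log text; the function names are hypothetical helpers written for this report, not part of any kernel tooling.

```python
import re

# Lines copied from the oops above.
CODE_LINE = (
    "[ 176.622367] Code: 89 44 24 20 74 1e 8b 43 0c e8 8b aa 7b c7 89 d8 "
    "e8 84 aa 7b c7 8b 44 24 20 81 c4 8c 00 00 00 5b 5e 5f 5d c3 8b 43 0c "
    "8b 4c 24 10 <0f> b7 00 89 01 8b 44 24 14 05 3c 05 00 00 89 44 24 18 "
    "e8 e7 8a"
)
REG_LINES = [
    "[ 176.499508] EAX: 00000010 EBX: df993bc0 ECX: f6e29df4 EDX: 00000288",
    "[ 176.505755] ESI: df900800 EDI: dfa5f800 EBP: 00000000 ESP: f6e29d34",
]
FAULT_ADDR = 0x00000010  # from the "BUG: unable to handle ..." line


def faulting_bytes(code_line):
    """Return (offset, bytes) where bytes start at the <xx>-marked byte."""
    tokens = code_line.split("Code:", 1)[1].split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            tail = [tok.strip("<>")] + tokens[i + 1:]
            return i, bytes(int(t, 16) for t in tail)
    raise ValueError("no <xx> marker found in Code: line")


def registers_matching(fault_addr, reg_lines):
    """Return the names of registers whose value equals the fault address."""
    regs = {}
    for line in reg_lines:
        for name, val in re.findall(r"\b(E[A-Z]{2}): ([0-9a-f]{8})", line):
            regs[name] = int(val, 16)
    return sorted(n for n, v in regs.items() if v == fault_addr)


offset, tail = faulting_bytes(CODE_LINE)
print("marker offset in dump:", offset)
print("first bytes at EIP:", tail[:3].hex(" "))
print("registers equal to fault address:", registers_matching(FAULT_ADDR, REG_LINES))
```

This only inspects the text of the log; mapping the address back to a source line would still need the module's debug symbols.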
It looks like this is a known issue and is fixed in 2.6.25-rc5. See Bugzilla for details: http://
I've attached the Bugzilla patch, just for additional info, but I haven't tried applying this to 2.6.24 myself to see if it works.
I'm guessing hardy is frozen at 2.6.24 - any chance of this fix being backported?
Originally reported as bug #204328, however I realized I filed it against linux-meta, rather than linux-source-
Changed in linux: assignee: nobody → stefan-bader-canonical
Changed in linux: status: Triaged → In Progress
Changed in linux: status: In Progress → Fix Committed
A few more data points on this. I consistently get this oops on our new Dell PowerEdge M600, but do not get it on a Dell PowerEdge 1955. Both use a controller from the same LSI SAS1068 family, but in different variants and revisions:
Dell M600: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08)
Dell 1955: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
I applied the upstream patch to the 2.6.24-12 sources and recompiled the module. The patch applied at a slightly different location, but it applied and compiled cleanly. I then swapped in the patched module after the "Loading Additional Components" step, but before the "Configuring the Clock" step. With the patched module, the installer is able to load mptsas and the installation proceeds normally. Of course, the unpatched module still gets installed in /target, so I have to swap the patched one in there as well before the reboot so that the machine does not panic at boot. :-)
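Since the unpatched module lands in /target, it is easy to reboot with the wrong file in place. Below is a minimal sketch of a check to run before rebooting, which compares the patched module against the installed copy by SHA-256. It assumes Python is available wherever the check is run (the installer environment itself may not provide it). The function names and both example paths are hypothetical, so substitute the real locations.

```python
import hashlib


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def module_is_patched(patched_path, installed_path):
    """True if the installed module is byte-identical to the patched build."""
    return sha256_of(patched_path) == sha256_of(installed_path)


# Example call (hypothetical paths, adjust to the actual layout):
#   module_is_patched("/tmp/mptsas.ko", "/target/path/to/mptsas.ko")
```

A byte-for-byte comparison is deliberately strict: any rebuild of the module, even from identical sources, may hash differently, so the check is only meaningful against the exact file that was swapped in.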
I'm not sure what the next steps are to help the kernel team get this patched, but I'm happy to provide any additional information or assistance. We'd love to see this make it into hardy final, since we're trying to move our servers to the LTS release.
I've attached the actual patch I used on the hardy linux 2.6.24-12 sources this morning.