Comment 7 for bug 140032

Pavel Zheltouhov (pwlnw) wrote : Re: IOCStatus(0x004b): SCSI IOC Terminated

VMware Server 2.0 on an Intrepid 8.10 host, kernel 2.6.27-9-server x86_64; the guest runs the same release and kernel.
I installed the linux-image-virtual and linux-virtual packages in the guest.
I can reproduce this bug reliably under heavy disk load.
It seems we need an option in the mptscsih driver, or in the filesystems, that turns off timeouts and verification.
Patching the kernel in the linux-image-virtual package would help too.
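
Until such an option exists, the closest knob I know of is the per-device SCSI command timeout that the kernel exposes in sysfs at /sys/block/<dev>/device/timeout (value in seconds). Below is a minimal sketch that raises it. This does not turn timeouts off, and I have not confirmed that it avoids this bug. The function name, the sysfs_root parameter (there only so the code can be tried against a scratch directory), and the value 180 in the usage note are my own illustrative assumptions.

```python
from pathlib import Path


def set_scsi_timeout(device: str, seconds: int, sysfs_root: str = "/sys") -> int:
    """Set the SCSI command timeout for one disk and return the previous value.

    device     -- block device name such as "sdb" (no /dev/ prefix)
    seconds    -- new timeout in seconds
    sysfs_root -- hypothetical override of the sysfs mount point, for testing
    """
    path = Path(sysfs_root) / "block" / device / "device" / "timeout"
    old = int(path.read_text().strip())  # current timeout in seconds
    path.write_text(f"{seconds}\n")      # needs root on a real system
    return old
```

Called as root, for example set_scsi_timeout("sdb", 180), it changes the timeout for that one disk only. The setting is lost on reboot, so it would have to be reapplied from an init script or a udev rule.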

Here is my syslog.

Sometimes everything goes fine and the driver resets successfully:

Jan 24 19:37:22 uadb kernel: [17989.370267] mptscsih: ioc0: attempting task abort! (sc=ffff88006ac123c0)
Jan 24 19:37:22 uadb kernel: [17989.370268] sd 2:0:2:0: [sdc] CDB: Write(10): 2a 00 01 43 d0 37 00 04 00 00
Jan 24 19:37:22 uadb kernel: [17989.370273] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac123c0)
Jan 24 19:37:22 uadb kernel: [17989.370292] mptscsih: ioc0: attempting task abort! (sc=ffff88006ac12640)
Jan 24 19:37:22 uadb kernel: [17989.370294] sd 2:0:2:0: [sdc] CDB: Write(10): 2a 00 01 43 d4 37 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17989.370298] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac12640)
Jan 24 19:37:22 uadb kernel: [17989.370318] mptscsih: ioc0: attempting task abort! (sc=ffff88006ac12c80)
Jan 24 19:37:22 uadb kernel: [17989.370319] sd 2:0:2:0: [sdc] CDB: Write(10): 2a 00 01 43 d4 3f 00 00 90 00
Jan 24 19:37:22 uadb kernel: [17989.370324] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac12c80)
Jan 24 19:37:22 uadb kernel: [17989.370344] mptscsih: ioc0: attempting task abort! (sc=ffff88003788d140)
Jan 24 19:37:22 uadb kernel: [17989.370346] sd 2:0:1:0: [sdb] CDB: Read(10): 28 00 00 be b5 17 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17991.772561] mptbase: ioc0: Initiating recovery
Jan 24 19:37:22 uadb kernel: [17993.342132] mptscsih: ioc0: Issue of TaskMgmt failed!
Jan 24 19:37:22 uadb kernel: [17993.342224] mptscsih: ioc0: task abort: FAILED (sc=ffff88003788d140)
Jan 24 19:37:22 uadb kernel: [17993.342226] mptscsih: ioc0: attempting task abort! (sc=ffff88003788da00)
Jan 24 19:37:22 uadb kernel: [17993.342229] sd 2:0:1:0: [sdb] CDB: Read(10): 28 00 00 be b5 2f 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17993.342235] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88003788da00)
Jan 24 19:37:22 uadb kernel: [17993.342362] mptscsih: ioc0: attempting target reset! (sc=ffff88003788d140)
Jan 24 19:37:22 uadb kernel: [17993.342364] sd 2:0:1:0: [sdb] CDB: Read(10): 28 00 00 be b5 17 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17993.342828] scsi target2:0:0: Beginning Domain Validation
Jan 24 19:37:22 uadb kernel: [17993.620049] mptscsih: ioc0: target reset: SUCCESS (sc=ffff88003788d140)

But under really heavy load the recovery fails and the device is taken offline:

Jan 24 20:51:49 uadb kernel: [22458.461947] mptscsih: ioc0: bus reset: FAILED (sc=ffff880068b81280)
Jan 24 20:51:49 uadb kernel: [22458.461958] mptscsih: ioc0: attempting host reset! (sc=ffff880068b81280)
Jan 24 20:51:49 uadb kernel: [22458.461974] mptbase: ioc0: Initiating recovery
Jan 24 20:51:49 uadb kernel: [22459.601222] sd 2:0:1:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 1, sc=ffff880068b818c0, mf = ffff880174583e80, idx=3c
Jan 24 20:51:49 uadb kernel: [22459.601236] sd 2:0:1:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 1, sc=ffff880068b81dc0, mf = ffff880174584d20, idx=63
Jan 24 20:51:49 uadb kernel: [22459.601241] sd 2:0:1:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 1, sc=ffff880068b81a00, mf = ffff880174584e40, idx=66
Jan 24 20:51:49 uadb kernel: [22459.601253] sd 2:0:1:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 1, sc=ffff880068b81280, mf = ffff880174585620, idx=7b
Jan 24 20:51:49 uadb kernel: [22460.000277] mptscsih: ioc0: host reset: SUCCESS (sc=ffff880068b81280)
Jan 24 20:51:49 uadb kernel: [22460.000286] sd 2:0:1:0: Device offlined - not ready after error recovery
Jan 24 20:51:49 uadb kernel: [22460.000288] sd 2:0:1:0: Device offlined - not ready after error recovery
Jan 24 20:51:49 uadb kernel: [22460.000289] sd 2:0:1:0: Device offlined - not ready after error recovery
Jan 24 20:51:49 uadb kernel: [22460.000290] sd 2:0:1:0: Device offlined - not ready after error recovery
Jan 24 20:51:49 uadb kernel: [22460.000309] sd 2:0:1:0: rejecting I/O to offline device
Jan 24 20:51:49 uadb kernel: [22460.010086] Buffer I/O error on device sdb1, logical block 2844192
Jan 24 20:51:49 uadb kernel: [22460.010086] lost page write due to I/O error on sdb1
Jan 24 20:51:49 uadb kernel: [22460.010086] Buffer I/O error on device sdb1, logical block 2844193
Jan 24 20:51:49 uadb kernel: [22460.010086] sd 2:0:1:0: rejecting I/O to offline device
Jan 24 20:51:49 uadb last message repeated 107 times
Jan 24 20:51:49 uadb kernel: [22460.713791] sd 2:0:1:0: rejecting I/O to offline device

....
Jan 24 20:51:49 uadb kernel: [22460.762970] sd 2:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
Jan 24 20:51:49 uadb kernel: [22460.762982] end_request: I/O error, dev sdb, sector 18966103
Jan 24 20:51:49 uadb kernel: [22460.763012] Aborting journal on device sdb1.
Jan 24 20:51:49 uadb kernel: [22460.763018] sd 2:0:1:0: rejecting I/O to offline device
Jan 24 20:51:49 uadb kernel: [22460.763022] sd 2:0:1:0: rejecting I/O to offline device
Jan 24 20:51:49 uadb kernel: [22460.763724] sd 2:0:1:0: rejecting I/O to offline device
Jan 24 20:51:49 uadb kernel: [22460.764010] sd 2:0:1:0: rejecting I/O to offline device
Jan 24 20:51:49 uadb kernel: [22460.764034] ext3_abort called.
Jan 24 20:51:49 uadb kernel: [22460.764039] EXT3-fs error (device sdb1): ext3_journal_start_sb: Detected aborted journal
Jan 24 20:51:49 uadb kernel: [22460.764044] Remounting filesystem read-only
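
To compare the two cases without reading every line, here is a rough sketch that tallies the outcome of each recovery step (task abort, target reset, bus reset, host reset) and collects the devices that were offlined. The regular expressions are written only against the message formats shown in the excerpts above. Other kernel versions may word these messages differently, and the function name summarize is my own.

```python
import re
from collections import Counter

# Matches result lines such as:
#   mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac123c0)
# The "attempting ..." lines do not match, so each step is counted once.
RESULT_RE = re.compile(
    r"mptscsih: (ioc\d+): (task abort|target reset|bus reset|host reset): "
    r"(SUCCESS|FAILED) \(sc=([0-9a-f]+)\)"
)

# Matches: sd 2:0:1:0: Device offlined - not ready after error recovery
OFFLINE_RE = re.compile(r"sd (\d+:\d+:\d+:\d+): Device offlined")


def summarize(lines):
    """Return (Counter of (step, result) pairs, set of offlined SCSI addresses)."""
    results = Counter()
    offlined = set()
    for line in lines:
        m = RESULT_RE.search(line)
        if m:
            results[(m.group(2), m.group(3))] += 1
            continue
        m = OFFLINE_RE.search(line)
        if m:
            offlined.add(m.group(1))
    return results, offlined
```

Feeding it the lines of a syslog file, for example summarize(open("/var/log/syslog")), shows at a glance whether error recovery stopped at a task abort or target reset, or escalated to a host reset and an offlined disk.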