Activity log for bug #1559609

Date Who What changed Old value New value Message
2016-03-20 01:04:50 Martin Koniczek bug added bug
2016-03-20 01:30:09 Brad Figg linux (Ubuntu): status New Incomplete
2016-03-20 01:45:33 Martin Koniczek linux (Ubuntu): status Incomplete Confirmed
2016-03-20 17:43:08 Andy Whitcroft linux (Ubuntu): importance Undecided High
2016-03-20 17:43:10 Andy Whitcroft linux (Ubuntu): assignee Andy Whitcroft (apw)
2016-03-20 17:43:15 Andy Whitcroft linux (Ubuntu): milestone ubuntu-16.03
2016-03-20 17:47:20 Andy Whitcroft description

  Old value:

    tested the latest xenial iso on a file server featuring an ARC-1882ix-24 RAID controller, and got weird timeout issues, followed by complete loss of access to anything connected to the RAID controller.

    The timeouts occur after a random amount of uptime (sometimes minutes, sometimes days), for example:

    kernel: [1665409.969229] arcmsr2: abort device command of scsi id = 0 lun = 1
    kernel: [1665411.727535] arcmsr2: scsi id = 0 lun = 1 ccb = '0xffff884fe008e780' poll command abort successfully
    kernel: [1665411.727885] arcmsr2: abort device command of scsi id = 0 lun = 1
    kernel: [1665411.727898] arcmsr2: abort device command of scsi id = 0 lun = 1
    kernel: [1665413.138235] arcmsr2: scsi id = 0 lun = 1 ccb = '0xffff884fe0012300' poll command abort successfully
    ...
    kernel: [1665445.804546] arcmsr: executing bus reset eh.....num_resets = 2, num_aborts = 146
    kernel: [1665455.851353] arcmsr2: pCCB ='0xffff884fe002a700' isr got aborted command
    kernel: [1665455.851366] arcmsr2: pCCB ='0xffff884fe01c0a00' isr got aborted command
    kernel: [1665455.851373] arcmsr2: isr get an illegal ccb command #011#011#011#011done acb = '0xffff884fe0b8c798'ccb = '0xffff884fe00e9680' ccbacb = '0xffff884fe0b8c798' startdone = 0x0 ccboutstandingcount = -1
    kernel: [1665455.851378] arcmsr2: isr get an illegal ccb command #011#011#011#011done acb = '0xffff884fe0b8c798'ccb = '0xffff884fe0070280' ccbacb = '0xffff884fe0b8c798' startdone = 0x0 ccboutstandingcount = -1
    ...
    kernel: [1665455.852655] sd 2:0:0:3: [sdd] Medium access timeout failure. Offlining disk!
    kernel: [1665455.890032] sd 2:0:0:4: [sde] Medium access timeout failure. Offlining disk!
    kernel: [1665455.926613] sd 2:0:0:1: [sdb] Medium access timeout failure. Offlining disk!
    kernel: [1665455.963288] sd 2:0:0:2: [sdc] Medium access timeout failure. Offlining disk!

    some digging revealed that mainline 4.4 as well as xenial's 4.4.0-14-generic still feature an old, buggy arcmsr driver v1.30.00.04-20140919, which claims to "supports" the 1882, but does not really...

    Areca seems to have managed to get a fixed driver into mainline 4.5 (version v1.30.00.22-20151126), and it seems to be a small patch on arcmsr.h and a large one on arcmsr_hba.c, and upon a first glance, I didn't see anything 4.5-specific in the code:

    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/drivers/scsi/arcmsr/arcmsr.h?id=v4.5&id2=v4.4
    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/drivers/scsi/arcmsr/arcmsr_hba.c?id=v4.5&id2=v4.4

    Note that we are using v1.30.0X.21-20151016 (as provided by Areca.com.tw) on productive 14.04.4 LTS servers featuring ARC1882 controllers, so chances are good that version 22 (as included in 4.5 mainline) to work well.

    This would not only allow ARC-188x controllers to work properly with Xenial out-of-the-box, it should also add support for the (somewhat popular?) ARC-1203 series

  New value:

    tested the latest xenial iso on a file server featuring an ARC-1882ix-24 RAID controller, and got weird timeout issues, followed by complete loss of access to anything connected to the RAID controller.

    The timeouts occur after a random amount of uptime (sometimes minutes, sometimes days), for example:

    kernel: [1665409.969229] arcmsr2: abort device command of scsi id = 0 lun = 1
    kernel: [1665411.727535] arcmsr2: scsi id = 0 lun = 1 ccb = '0xffff884fe008e780' poll command abort successfully
    kernel: [1665411.727885] arcmsr2: abort device command of scsi id = 0 lun = 1
    kernel: [1665411.727898] arcmsr2: abort device command of scsi id = 0 lun = 1
    kernel: [1665413.138235] arcmsr2: scsi id = 0 lun = 1 ccb = '0xffff884fe0012300' poll command abort successfully
    ...
    kernel: [1665445.804546] arcmsr: executing bus reset eh.....num_resets = 2, num_aborts = 146
    kernel: [1665455.851353] arcmsr2: pCCB ='0xffff884fe002a700' isr got aborted command
    kernel: [1665455.851366] arcmsr2: pCCB ='0xffff884fe01c0a00' isr got aborted command
    kernel: [1665455.851373] arcmsr2: isr get an illegal ccb command #011#011#011#011done acb = '0xffff884fe0b8c798'ccb = '0xffff884fe00e9680' ccbacb = '0xffff884fe0b8c798' startdone = 0x0 ccboutstandingcount = -1
    kernel: [1665455.851378] arcmsr2: isr get an illegal ccb command #011#011#011#011done acb = '0xffff884fe0b8c798'ccb = '0xffff884fe0070280' ccbacb = '0xffff884fe0b8c798' startdone = 0x0 ccboutstandingcount = -1
    ...
    kernel: [1665455.852655] sd 2:0:0:3: [sdd] Medium access timeout failure. Offlining disk!
    kernel: [1665455.890032] sd 2:0:0:4: [sde] Medium access timeout failure. Offlining disk!
    kernel: [1665455.926613] sd 2:0:0:1: [sdb] Medium access timeout failure. Offlining disk!
    kernel: [1665455.963288] sd 2:0:0:2: [sdc] Medium access timeout failure. Offlining disk!

    some digging revealed that mainline 4.4 as well as xenial's 4.4.0-14-generic still feature an old, buggy arcmsr driver v1.30.00.04-20140919, which claims to "supports" the 1882, but does not really...

    Areca seems to have managed to get a fixed driver into mainline 4.5 (version v1.30.00.22-20151126), and it seems to be a small patch on arcmsr.h and a large one on arcmsr_hba.c, and upon a first glance, I didn't see anything 4.5-specific in the code:

    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/drivers/scsi/arcmsr/arcmsr.h?id=v4.5&id2=v4.4
    https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/diff/drivers/scsi/arcmsr/arcmsr_hba.c?id=v4.5&id2=v4.4

    Note that we are using v1.30.0X.21-20151016 (as provided by Areca.com.tw) on productive 14.04.4 LTS servers featuring ARC1882 controllers, so chances are good that version 22 (as included in 4.5 mainline) to work well.

    This would not only allow ARC-188x controllers to work properly with Xenial out-of-the-box, it should also add support for the (somewhat popular?) ARC-1203 series

    === Kernel-Description: update arcmsr to version v1.30.00.22-20151126 to fix card timeouts

2016-03-21 16:25:57 Tim Gardner nominated for series Ubuntu Xenial
2016-03-21 16:25:57 Tim Gardner bug task added linux (Ubuntu Xenial)
2016-03-21 16:27:25 Tim Gardner linux (Ubuntu Xenial): status Confirmed Fix Committed
2016-03-21 16:27:25 Tim Gardner linux (Ubuntu Xenial): assignee Andy Whitcroft (apw) Tim Gardner (timg-tpi)
2016-03-29 16:43:09 Launchpad Janitor linux (Ubuntu Xenial): status Fix Committed Fix Released
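
The kernel log excerpt quoted in the description ends with the sd layer taking four disks offline. For anyone checking their own logs for the same symptom, the following is a minimal sketch in Python that pulls those "Offlining disk!" events out of log text. The function name `offlined_disks`, the regular expression, and the `SAMPLE` constant are illustrative assumptions and not part of the bug report or of any arcmsr tooling; the sample lines are copied from the description.

```python
import re

# Matches the sd-layer message quoted in the bug description, e.g.
#   kernel: [1665455.852655] sd 2:0:0:3: [sdd] Medium access timeout failure. Offlining disk!
# The pattern is an assumption based only on those quoted lines.
OFFLINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+sd\s+(?P<hctl>\d+:\d+:\d+:\d+):\s+"
    r"\[(?P<dev>\w+)\]\s+Medium access timeout failure\. Offlining disk!"
)


def offlined_disks(log_text):
    """Return (timestamp, host:channel:target:lun, device) per offlined disk."""
    return [
        (float(m.group("ts")), m.group("hctl"), m.group("dev"))
        for m in OFFLINE_RE.finditer(log_text)
    ]


# Sample input: the four lines quoted in the description.
SAMPLE = """\
kernel: [1665455.852655] sd 2:0:0:3: [sdd] Medium access timeout failure. Offlining disk!
kernel: [1665455.890032] sd 2:0:0:4: [sde] Medium access timeout failure. Offlining disk!
kernel: [1665455.926613] sd 2:0:0:1: [sdb] Medium access timeout failure. Offlining disk!
kernel: [1665455.963288] sd 2:0:0:2: [sdc] Medium access timeout failure. Offlining disk!
"""

if __name__ == "__main__":
    for ts, hctl, dev in offlined_disks(SAMPLE):
        print(f"{ts:.6f} {hctl} {dev}")
```

A hit from this scan only shows that disks were offlined; on its own it does not confirm that the arcmsr driver version discussed in this bug is the cause.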