Comment 0 for bug 1682644

bugproxy (bugproxy) wrote :

---Problem Description---
IPR driver causes multipath to fail paths and leaves IO stuck on Medium Errors

This problem is resolved by the following patch, which has been accepted upstream and is scheduled for 4.11.
The detailed problem description and resolution are described in the commit message.

> scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION
> http://git.kernel.org/?p=linux/kernel/git/jejb/scsi.git;a=commit;h=785a470496d8e0a32e3d39f376984eb2c98ca5b3
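For readers unfamiliar with the terms in the patch title, here is a minimal sketch of how a SCSI command result word combines a host byte (such as DID_PASSTHROUGH) with a status byte (such as CHECK CONDITION). It is illustrative only and not taken from the ipr driver or the patch; the constant values are the ones from the Linux SCSI headers, and decode_result is a hypothetical helper name.

```python
# Illustrative sketch only; not code from the ipr driver or the patch.
# Constant values as defined in the Linux SCSI headers.
DID_OK = 0x00
DID_PASSTHROUGH = 0x0A
SAM_STAT_CHECK_CONDITION = 0x02


def decode_result(result: int) -> dict:
    """Extract the host byte (bits 16-23) and status byte (bits 0-7)."""
    return {
        "host": (result >> 16) & 0xFF,
        "status": result & 0xFF,
    }


# CHECK CONDITION with the host byte named in the patch title.
with_passthrough = (DID_PASSTHROUGH << 16) | SAM_STAT_CHECK_CONDITION
# The same status with the host byte left at DID_OK.
without_passthrough = (DID_OK << 16) | SAM_STAT_CHECK_CONDITION

print(decode_result(with_passthrough))
print(decode_result(without_passthrough))
```

The two result words differ only in the host byte; the commit message explains why that difference matters for error handling.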

Please apply to 17.04 and 16.04.

The business justification for the SRU is:

Clients with a dual-controller, multipathed IPR configuration whose disks eventually develop failing sectors will experience an I/O hang once the drive reports a Medium Error. This can hang an application or even the root filesystem (whatever is doing I/O to the failing drive), potentially hanging the whole system.

Thanks.

---Additional Hardware Info---
Dual (IPR) controller setup, multipath enabled

---Steps to Reproduce---
1) Use a disk with bad sectors (or force such a condition via internal/special tools)
2) Multipath that disk
3) Run IO to the multipath device on the bad sectors
4) Both paths will be failed, and IO is stuck due to queue_if_no_path (enabled by default for IPR)
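
Step 3 could be driven by something like the sketch below, which reads a range of sectors one at a time and records those that fail with EIO. This is an assumption about how one might exercise the bad sectors, not the tool used in the original report; probe_sectors is a hypothetical helper, the 512-byte sector size is assumed, and buffered reads may be served from the page cache rather than the disk.

```python
import errno
import os


def probe_sectors(path, first_lba, count, sector_size=512):
    """Read `count` sectors one at a time; return the LBAs that failed with EIO."""
    failed = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for lba in range(first_lba, first_lba + count):
            try:
                os.pread(fd, sector_size, lba * sector_size)
            except OSError as exc:
                # Only I/O errors are recorded; anything else is unexpected.
                if exc.errno != errno.EIO:
                    raise
                failed.append(lba)
    finally:
        os.close(fd)
    return failed
```

For example, probe_sectors("/dev/mapper/mpatha", 0, 8) would probe the first eight sectors of a multipath device, where the device name and range are placeholders. On an affected kernel with queue_if_no_path enabled, the call is expected to block on the bad sector instead of returning.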
