hisi_sas: Failures during host reset
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux (Ubuntu) | Fix Released | Undecided | dann frazier | | |
Bionic | Fix Released | Undecided | dann frazier | | |
Bug Description
[Impact]
When error handling progresses to host reset, several issues may prevent the system from recovering, therefore requiring a system power cycle.
[Test Case]
$ iozone -a &
$ while :; do sudo sg_reset --device /dev/sda; sleep 5; done
[Fix]
There is a race between device removal and host reset which can cause the driver to hang, preventing the user from accessing attached devices until a reboot. Fix this by adding locking around the critical path.
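The locking idea can be illustrated with a small user-space sketch. This is not the hisi_sas patch: `struct sketch_host`, `sketch_device_remove()` and `sketch_host_reset()` are hypothetical names, and a pthread mutex stands in for whatever kernel primitive the driver actually uses. The point is only that removal and reset take the same lock, so neither can observe the other half done.

```c
#include <pthread.h>
#include <stdbool.h>

#define SKETCH_MAX_DEVICES 8

/* Hypothetical stand-in for per-host driver state (not the real driver). */
struct sketch_host {
    pthread_mutex_t lock;              /* serializes host reset and removal */
    bool present[SKETCH_MAX_DEVICES];  /* which device slots are populated */
    int resets_done;                   /* number of completed host resets */
};

void sketch_host_init(struct sketch_host *h)
{
    pthread_mutex_init(&h->lock, NULL);
    for (int i = 0; i < SKETCH_MAX_DEVICES; i++)
        h->present[i] = false;
    h->resets_done = 0;
}

/* Remove one device. Returns true if a device was actually removed. */
bool sketch_device_remove(struct sketch_host *h, int idx)
{
    bool removed = false;

    if (idx < 0 || idx >= SKETCH_MAX_DEVICES)
        return false;

    pthread_mutex_lock(&h->lock);
    if (h->present[idx]) {
        h->present[idx] = false;
        removed = true;
    }
    pthread_mutex_unlock(&h->lock);
    return removed;
}

/* Reset the host. Returns how many devices were present during the reset. */
int sketch_host_reset(struct sketch_host *h)
{
    int seen = 0;

    pthread_mutex_lock(&h->lock);
    /* The device table cannot change while the lock is held. */
    for (int i = 0; i < SKETCH_MAX_DEVICES; i++)
        if (h->present[i])
            seen++;
    h->resets_done++;
    pthread_mutex_unlock(&h->lock);
    return seen;
}
```

A caller that runs `sketch_device_remove()` on one thread and `sketch_host_reset()` on another would see the reset count either the old or the new device set, never a partially removed one.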
After a soft host reset, commands may be sent to the device before the hardware is ready to receive them. This can result in additional errors when the user accesses the device. Fix this by blocking commands until the hardware has been reinitialized.
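As a rough illustration of blocking commands during a reset, the sketch below gates submission on a "reset in progress" flag. All names (`struct sketch_hba`, `sketch_queue_command()`, `SKETCH_BUSY`) are hypothetical, and a C11 atomic flag is assumed in place of the driver's real mechanism; the actual patch may block or requeue commands differently.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical return codes for the sketch. */
enum sketch_status { SKETCH_OK = 0, SKETCH_BUSY = 1 };

/* Hypothetical controller state (not the real driver structure). */
struct sketch_hba {
    atomic_bool reset_in_progress;  /* set for the whole reset window */
    int commands_sent;              /* commands that reached the hardware */
};

void sketch_hba_init(struct sketch_hba *hba)
{
    atomic_init(&hba->reset_in_progress, false);
    hba->commands_sent = 0;
}

/* Called before the hardware is torn down for a soft reset. */
void sketch_reset_begin(struct sketch_hba *hba)
{
    atomic_store(&hba->reset_in_progress, true);
}

/* Called only after the hardware has been fully reinitialized. */
void sketch_reset_end(struct sketch_hba *hba)
{
    atomic_store(&hba->reset_in_progress, false);
}

/* Submit one command; refuse it while a reset is in progress. */
enum sketch_status sketch_queue_command(struct sketch_hba *hba)
{
    if (atomic_load(&hba->reset_in_progress))
        return SKETCH_BUSY;  /* caller is expected to retry later */

    hba->commands_sent++;
    return SKETCH_OK;
}
```

Returning a busy status rather than an error lets the caller retry once the reset completes, instead of surfacing a failure to the user.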
Stale PHY events may still get processed by the driver after reset. This can cause e.g. ports to be detached because an old pre-reset "phy down" event gets processed, causing the user to lose access to attached devices. Fix this by filtering out pre-reset PHY events.
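One common way to filter stale events is to stamp each event with a reset generation counter and drop events whose stamp is older than the current generation. The sketch below assumes that approach; it is not taken from the hisi_sas patch, and `struct sketch_ctrl`, `struct sketch_phy_event` and the helper functions are hypothetical names.

```c
#include <stdbool.h>

#define SKETCH_NUM_PHYS 8

/* Hypothetical PHY event, stamped with the generation it was raised in. */
struct sketch_phy_event {
    int phy_id;
    bool link_up;
    unsigned int generation;
};

/* Hypothetical controller state (not the real driver structure). */
struct sketch_ctrl {
    unsigned int generation;        /* incremented on every host reset */
    bool link_up[SKETCH_NUM_PHYS];  /* current link state per PHY */
};

/* Create an event carrying the controller's current generation. */
struct sketch_phy_event sketch_make_event(const struct sketch_ctrl *c,
                                          int phy_id, bool link_up)
{
    struct sketch_phy_event ev = { phy_id, link_up, c->generation };
    return ev;
}

/* A host reset starts a new generation; older events become stale. */
void sketch_ctrl_reset(struct sketch_ctrl *c)
{
    c->generation++;
}

/* Apply an event. Returns false if it was dropped as invalid or stale. */
bool sketch_handle_event(struct sketch_ctrl *c,
                         const struct sketch_phy_event *ev)
{
    if (ev->phy_id < 0 || ev->phy_id >= SKETCH_NUM_PHYS)
        return false;
    if (ev->generation != c->generation)
        return false;  /* raised before the last reset: ignore */

    c->link_up[ev->phy_id] = ev->link_up;
    return true;
}
```

With this scheme, a "phy down" event queued before a reset is rejected afterwards, so the port stays attached; the same event raised after the reset is applied normally.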
Resource starvation can occur after a "clear nexus ha" reset. Fix this by releasing those resources during the reset.
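The report does not say which resources are involved, so the following sketch assumes a fixed pool of command slots purely for illustration. `struct sketch_slot_pool` and its helpers are hypothetical. It shows how slots left allocated across a reset would exhaust the pool, and how releasing them as part of the reset avoids that.

```c
#include <stdbool.h>

#define SKETCH_SLOTS 4

/* Hypothetical fixed-size pool of command slots. */
struct sketch_slot_pool {
    bool in_use[SKETCH_SLOTS];
};

/* Allocate a free slot. Returns its index, or -1 if the pool is exhausted. */
int sketch_slot_alloc(struct sketch_slot_pool *p)
{
    for (int i = 0; i < SKETCH_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Count the slots that are currently allocated. */
int sketch_slots_used(const struct sketch_slot_pool *p)
{
    int used = 0;

    for (int i = 0; i < SKETCH_SLOTS; i++)
        if (p->in_use[i])
            used++;
    return used;
}

/*
 * Reset handler for the sketch: commands that were in flight will never
 * complete, so their slots are released here instead of being leaked.
 */
void sketch_reset_release_all(struct sketch_slot_pool *p)
{
    for (int i = 0; i < SKETCH_SLOTS; i++)
        p->in_use[i] = false;
}
```

Without the release step, every reset that interrupts in-flight commands would permanently shrink the pool until `sketch_slot_alloc()` always returns -1.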
[Regression Risk]
The required fixes are localized to the hisi_sas driver. This driver is only used by two platforms supported by Ubuntu: HiSilicon D05 and HiSilicon D06. We will directly verify these fixes on those platforms.
Changed in linux (Ubuntu):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu):
assignee: nobody → dann frazier (dannf)
Changed in linux (Ubuntu Bionic):
assignee: nobody → dann frazier (dannf)
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu):
status: In Progress → Fix Released
This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.
If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.
See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!