[sas-1126]scsi: hisi_sas: Fix the conflict between device gone and host reset
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kunpeng920 | Fix Released | Undecided | Unassigned |
Ubuntu-18.04 | Fix Released | Undecided | Ike Panhc |
Ubuntu-18.04-hwe | Fix Released | Undecided | Ike Panhc |
Ubuntu-19.04 | Fix Released | Undecided | Ike Panhc |
Ubuntu-19.10 | Fix Released | Undecided | Ike Panhc |
Ubuntu-20.04 | Fix Released | Undecided | Unassigned |
Upstream-kernel | Fix Released | Undecided | Unassigned |
linux (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Ike Panhc |
Disco | Fix Released | Undecided | Ike Panhc |
Eoan | Fix Released | Undecided | Ike Panhc |
Focal | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
Some SAS devices are lost during recovery.
[Test Case]
No known test case; run a stress test on a SAS device.
[Fix]
e74006edd0d4 scsi: hisi_sas: Fix the conflict between device gone and host reset
[Regression Risk]
Patch restricted to hisi_sas driver.
"[Steps to Reproduce]
1. Close all the PHYs;
2. Inject error;
3. Open one PHY;
[Actual Results]
Some disks are lost
[Expected Results]
No disks are lost
[Reproducibility]
occasionally
[Additional information]
Hardware: D06 CS
Firmware: NA+I59
Kernel: NA
[Resolution]
When a SAS disk is initialized, the driver sends a TMF IO to clear the disk.
If that TMF IO is interrupted by another operation, such as a controller reset
triggered by a HW RAS event, the TMF IO times out and the device is eventually
marked as gone. The log looks as follows:
hisi_sas_v3_hw 0000:74:02.0: dev[240:1] found
...
hisi_sas_v3_hw 0000:74:02.0: controller resetting...
hisi_sas_v3_hw 0000:74:02.0: phyup: phy7 link_rate=10(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy0 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy2 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy3 link_rate=9(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy6 link_rate=10(sata)
hisi_sas_v3_hw 0000:74:02.0: phyup: phy5 link_rate=11
hisi_sas_v3_hw 0000:74:02.0: phyup: phy4 link_rate=11
hisi_sas_v3_hw 0000:74:02.0: controller reset complete
hisi_sas_v3_hw 0000:74:02.0: abort tmf: TMF task timeout and not done
hisi_sas_v3_hw 0000:74:02.0: dev[240:1] is gone
sas: driver on host 0000:74:02.0 cannot handle device 5000c500a75a860d, error:5
To improve reliability, retry the TMF IO up to 3 times for SAS disks, the
same as softreset does."
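The bounded-retry idea described in the resolution can be sketched in plain userspace C. This is only an illustration of the pattern, not the code from commit e74006edd0d4: `fake_disk`, `issue_tmf`, `init_disk_with_retry`, and `TMF_MAX_RETRIES` are hypothetical names invented here, and the "controller reset" is simulated by a counter that makes the first few TMF attempts time out.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: mirrors the "up to 3 times" limit stated in the resolution. */
#define TMF_MAX_RETRIES 3

enum tmf_result { TMF_OK = 0, TMF_TIMEOUT = -1 };

/* Hypothetical stand-in for a SAS disk being initialized. */
struct fake_disk {
    int resets_pending; /* number of TMF attempts a simulated reset will break */
    int attempts;       /* how many TMFs were actually issued */
};

/* Hypothetical TMF issue routine: times out while a reset is in flight. */
enum tmf_result issue_tmf(struct fake_disk *d)
{
    d->attempts++;
    if (d->resets_pending > 0) {
        d->resets_pending--;
        return TMF_TIMEOUT;
    }
    return TMF_OK;
}

/*
 * Retry the TMF up to TMF_MAX_RETRIES times instead of giving up (and
 * dropping the device) on the first timeout.
 */
enum tmf_result init_disk_with_retry(struct fake_disk *d)
{
    enum tmf_result rc = TMF_TIMEOUT;
    int retry = TMF_MAX_RETRIES;

    while (retry-- > 0) {
        rc = issue_tmf(d);
        if (rc == TMF_OK)
            break;
        fprintf(stderr, "TMF timed out, %d retries left\n", retry);
    }
    return rc;
}
```

With this sketch, a disk whose first two TMFs are broken by a reset still initializes on the third attempt, while a disk whose TMF fails three times in a row is reported as failed, which corresponds to the "device is gone" path in the log above.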
Changed in kunpeng920:
status: Incomplete → New
Changed in kunpeng920:
status: New → Triaged
Changed in linux (Ubuntu Bionic):
status: New → Triaged
Changed in linux (Ubuntu Disco):
status: New → Triaged
Changed in linux (Ubuntu Eoan):
status: New → Triaged
Changed in linux (Ubuntu Focal):
status: New → Fix Released
Changed in linux (Ubuntu Bionic):
assignee: nobody → Ike Panhc (ikepanhc)
status: Triaged → In Progress
Changed in linux (Ubuntu Disco):
assignee: nobody → Ike Panhc (ikepanhc)
status: Triaged → In Progress
Changed in linux (Ubuntu Eoan):
assignee: nobody → Ike Panhc (ikepanhc)
status: Triaged → In Progress
description: updated
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Disco):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Eoan):
status: In Progress → Fix Committed
Changed in kunpeng920:
status: In Progress → Fix Committed
Changed in kunpeng920:
status: Fix Committed → Fix Released
Could you explain how to inject an error into the SAS driver, please?