[hns3-0114]net: hns3: log and clear hardware error after reset complete

Bug #1859564 reported by Fred Kimmy
Affects           Status        Importance  Assigned to   Milestone
kunpeng920        Fix Released  Undecided   Unassigned
Ubuntu-18.04      Won't Fix     Undecided   Taihsiang Ho
Ubuntu-18.04-hwe  Fix Released  Undecided   Unassigned
Ubuntu-20.04      Fix Released  Undecided   Unassigned
Upstream-kernel   Fix Released  Undecided   Unassigned

Bug Description

[Bug Description]
While the device is resetting, the CMDQ service may be stopped until
the reset completes. If a new RAS error occurs during this window, the
driver will not be able to clear the RAS source.

[Steps to Reproduce]
1. Inject a RAS error and trigger a global reset at the same time

[Actual Results]
RAS error cannot be cleared

[Expected Results]
RAS error can be cleared

[Reproducibility]
Always

[Additional information]
Hardware: D06
Firmware: NA
Kernel: NA

[Resolution]
This patch fixes it by clearing the RAS source after the reset completes.

4fdd0bca6152 net: hns3: log and clear hardware error after reset complete
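
The idea behind the fix can be illustrated with a minimal sketch. This is not the hns3 driver code: the struct and function names below (fake_dev, clear_hw_error, on_hw_error, on_reset_done) are hypothetical and only model the behaviour described above, assuming that clearing the error source requires the command queue and therefore has to be deferred while a reset is in progress.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified device state; not the real driver structure. */
struct fake_dev {
	bool resetting;      /* command queue is unavailable while set */
	bool hw_err_pending; /* a RAS error was raised but not yet cleared */
	int errors_cleared;  /* number of successful clears, for illustration */
};

/* Clearing the error source needs the command queue, so it fails during reset. */
static bool clear_hw_error(struct fake_dev *dev)
{
	if (dev->resetting)
		return false;

	if (dev->hw_err_pending) {
		printf("hw error logged and cleared\n");
		dev->hw_err_pending = false;
		dev->errors_cleared++;
	}
	return true;
}

/* Called when a RAS error is reported; defers the clear if a reset is running. */
static void on_hw_error(struct fake_dev *dev)
{
	dev->hw_err_pending = true;
	if (!clear_hw_error(dev))
		printf("reset in progress, deferring error handling\n");
}

/* Called once the reset has finished and the command queue is usable again. */
static void on_reset_done(struct fake_dev *dev)
{
	dev->resetting = false;
	clear_hw_error(dev); /* handle any error that arrived during the reset */
}
```

In this model, calling on_hw_error() while resetting is set leaves the error pending, and the later call to on_reset_done() logs and clears it, which mirrors the expected result stated above.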

Ike Panhc (ikepanhc)
description: updated
Ike Panhc (ikepanhc)
tags: added: ikeradar
Ike Panhc (ikepanhc)
Changed in kunpeng920:
status: New → In Progress
Taihsiang Ho (tai271828)
tags: added: tairadar
Ike Panhc (ikepanhc)
tags: removed: ikeradar
Taihsiang Ho (tai271828) wrote :

The upstream commit uses the function "hclge_handle_all_hns_hw_errors", which is not implemented in the Bionic kernel source.

Taihsiang Ho (tai271828) wrote :

hclge_handle_all_hns_hw_errors shows up in the v5.3 upstream kernel for the first time.

4fdd0bca6152 shows up in the v5.5 upstream kernel for the first time.

However, the bionic stable channel update mainly pulls upstream patches via v4.14 and v4.19, which are the nearest upstream longterm kernels. Backporting this patch (4fdd0bca6152) would require a lot of extra patches beyond v4.14 and v4.19, and it would only provide a logging enhancement rather than fix a malfunction. To avoid the risk of device malfunction that such a backport would carry, I suggest marking this as "Won't Fix" for bionic.

Ike Panhc (ikepanhc)
Changed in kunpeng920:
status: In Progress → Fix Committed
Taihsiang Ho (tai271828) wrote :

Already in bionic-hwe since Ubuntu-hwe-5.3.0-40.32_18.04.1.

Taihsiang Ho (tai271828) wrote :

For focal:

$ lp_master_next_tags "net: hns3: log and clear hardware error after reset complete"
33dc5e5d22dd
Ubuntu-5.4-5.4.0-10.13
Ubuntu-5.4-5.4.0-11.14
Ubuntu-5.4-5.4.0-12.15
Ubuntu-5.4-5.4.0-13.16
Ubuntu-5.4-5.4.0-14.17
Ubuntu-5.4.0-15.18
Ubuntu-5.4.0-16.19
Ubuntu-5.4.0-17.20
Ubuntu-5.4.0-17.21

tags: removed: tairadar
Ike Panhc (ikepanhc)
Changed in kunpeng920:
status: Fix Committed → Fix Released