【hns-0416】net: hns3: fix kernel crash when unload VF while it is being reset

Bug #1969268 reported by Fred Kimmy
Affects           Status        Importance  Assigned to  Milestone
kunpeng920        Fix Released  Undecided   Ike Panhc
Ubuntu-20.04-hwe  Fix Released  Undecided   Ike Panhc

Bug Description

[Bug Description]
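When VLANs are fully configured for a VF and the VF is then unloaded
while a reset is being triggered on the PF, the kernel crashes because
the IRQ has already been uninitialized. In the trace below, the VF reset
service task reaches hclgevf_uninit_msi through hclgevf_reset_rebuild
and hits the BUG in free_msi_irqs.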

[78895.670263] ------------[ cut here ]------------
[78895.674867] kernel BUG at drivers/pci/msi.c:378!
[78895.679467] Internal error: Oops - BUG: 0 [#1] SMP
[78895.684239] Modules linked in: binfmt_misc hns_roce_hw_v2 ib_uverbs ib_umad ib_core hclgevf 8021q garp mrp stp llc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif joydev input_leds arm_spe_pmu efi_pstore hisi_sec2 authenc hisi_hpre hisi_zip hisi_qm uacce hisi_dma uio_pdrv_genirq uio hisi_trng_v2 acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler cppc_cpufreq sch_fq_codel ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure realtek hid_generic hibmc_drm drm_vram_helper drm_ttm_helper ttm i2c_algo_bit drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops hisi_sas_v3_hw cec sha256_arm64 hns3 rc_core usbhid hisi_sas_main sha1_ce hclge hid ixgbe drm libsas xhci_pci hnae3 xhci_pci_renesas ahci megaraid_sas scsi_transport_sas xfrm_algo mdio spi_dw_mmio gpio_dwapb spi_dw aes_neon_bs
[78895.684337] aes_neon_blk aes_ce_blk crypto_simd cryptd aes_ce_cipher [last unloaded: ib_core]
[78895.778896] CPU: 8 PID: 1671944 Comm: kworker/8:1 Not tainted 5.11.0-27-generic #29~20.04.1-Ubuntu
[78895.787813] Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V5.B211.01 11/10/2021
[78895.796645] Workqueue: hclgevf hclgevf_service_task [hclgevf]
[78895.802374] pstate: 00400009 (nzcv daif +PAN -UAO -TCO BTYPE=--)
[78895.808354] pc : free_msi_irqs+0x188/0x1a8
[78895.812433] lr : free_msi_irqs+0x178/0x1a8
[78895.816511] sp : ffff800033eabba0
[78895.819811] x29: ffff800033eabba0 x28: 0000000000020c00
[78895.825100] x27: 00000000000000e1 x26: 000047c8bdaf3151
[78895.830388] x25: ffff0041c20e0430 x24: ffff0041c20e03f8
[78895.835677] x23: 0000000000027008 x22: ffff0040a1a48000
[78895.840965] x21: ffff0040a1a482f0 x20: 0000000000000000
[78895.846254] x19: ffff0040feac9580 x18: 0000000000000000
[78895.851542] x17: 0000000000000000 x16: ffffdef63b555010
[78895.856830] x15: 0000000000000000 x14: 0000000000000000
[78895.862117] x13: ffffffffffffff00 x12: ffffffffffffffff
[78895.867406] x11: 0000000000000040 x10: ffffdef63cd680a0
[78895.872694] x9 : ffffdef63aec8464 x8 : ffff20400745c250
[78895.877982] x7 : 0000000000000000 x6 : 0000000000000000
[78895.883270] x5 : ffff20400745c228 x4 : ffff20400745c298
[78895.888558] x3 : 0000000000000000 x2 : 0000000000000000
[78895.893845] x1 : 0000000000000549 x0 : 0000000000000001
[78895.899133] Call trace:
[78895.901569] free_msi_irqs+0x188/0x1a8
[78895.905300] pci_disable_msix+0xec/0x118
[78895.909207] pci_free_irq_vectors+0x20/0x38
[78895.913371] hclgevf_uninit_msi+0x48/0x60 [hclgevf]
[78895.918231] hclgevf_reset_rebuild+0x178/0x398 [hclgevf]
[78895.923520] hclgevf_reset_service_task+0x354/0x600 [hclgevf]
[78895.929242] hclgevf_service_task+0x1b8/0x2e0 [hclgevf]
[78895.934445] process_one_work+0x1fc/0x4d0
[78895.938437] worker_thread+0x148/0x510
[78895.942170] kthread+0xf4/0x120
[78895.945297] ret_from_fork+0x10/0x18
[78895.948859] Code: 72001c1f 54ffff00 a90363f7 f90023f9 (d4210000)
[78895.954925] ---[ end trace d83c0afbdc9d99a2 ]---
[78895.960560] ------------[ cut here ]------------

[Steps to Reproduce]
1) echo 1 > /sys/class/net/<ethx>/device/sriov_numvfs
2) ip link set dev <vf_name> up
3) configure the full vlan_id range 0~4094
ip link add dev <vf_name>.<vlan_id> link <vf_name> type vlan id <vlan_id>
ifconfig <vf_name>.<vlan_id> up
4) enable the VF spoofchk function
ip link set <pf_name> vf <vf_id> spoofchk on
5) reset the PF and remove the VF
echo 1 > /sys/class/net/<pf_name>/device/reset
echo 0 > /sys/class/net/<pf_name>/device/sriov_numvfs
6) reset the PF and re-create the VF
echo 1 > /sys/class/net/<pf_name>/device/reset
echo 1 > /sys/class/net/<pf_name>/device/sriov_numvfs
7) repeat step 3
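
Step 3 is tedious to type by hand, so a small loop helps. The following is a minimal sketch, not part of the original report: the helper names run and config_vlans and the DRY_RUN variable are hypothetical, and `ip link set ... up` is used in place of the ifconfig call above. The ID range defaults to the 0~4094 given in step 3.

```shell
#!/bin/sh
# Sketch of step 3: create and bring up VLAN sub-interfaces on a VF.

# run: execute a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# config_vlans <vf_name> [first_id] [last_id]
# Defaults cover the full range from the report (0~4094).
config_vlans() {
    vf=$1
    first=${2:-0}
    last=${3:-4094}
    id=$first
    while [ "$id" -le "$last" ]; do
        # Same "ip link add" form as in step 3 of the report.
        run ip link add dev "$vf.$id" link "$vf" type vlan id "$id"
        # Replaces "ifconfig <vf_name>.<vlan_id> up" (assumption: iproute2 only).
        run ip link set dev "$vf.$id" up
        id=$((id + 1))
    done
}
```

Running `DRY_RUN=1 config_vlans <vf_name> 1 3` prints the commands without touching the system; without DRY_RUN the function needs root and a real VF netdev.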

[Actual Results]
call trace

[Expected Results]
no call trace
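
To tell the two results apart without reading the whole log, the kernel log can be searched for the signature from the trace in the bug description. This is a rough sketch: the function name has_hclgevf_msi_crash is hypothetical, and the match deliberately omits the source line number (378) since that may differ between kernel builds.

```shell
#!/bin/sh
# Reads a kernel log on stdin; succeeds only if both the MSI BUG line and
# the hclgevf_uninit_msi frame from the reported trace are present.
has_hclgevf_msi_crash() {
    log=$(cat)
    printf '%s\n' "$log" | grep -F -q 'kernel BUG at drivers/pci/msi.c' &&
        printf '%s\n' "$log" | grep -F -q 'hclgevf_uninit_msi'
}
```

Typical use would be `dmesg | has_hclgevf_msi_crash && echo "call trace found"`.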

[Reproducibility]

[Additional information]
(Firmware version, kernel version, affected hardware, etc. if required2022011211577):
OS:Ubuntu 20.04.3 LTS
DRV(driver version):Linux tx 5.11.0-27-generic #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:08 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

[Resolution]

net: hns3: fix kernel crash when unload VF while it is being reset

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/drivers/net/ethernet/hisilicon?id=e140c7983e3054be0652bf914f4454f16c5520b0

Ike Panhc (ikepanhc) wrote :

e140c7983e30 2021-11-10 14:20:43 +0000 net: hns3: fix kernel crash when unload VF while it is being reset

This patch was merged into the mainline kernel in v5.16. I will try to backport it to Ubuntu kernels.

Changed in kunpeng920:
assignee: nobody → Ike Panhc (ikepanhc)
status: New → In Progress
Ike Panhc (ikepanhc) wrote :

This patch has been included in the Ubuntu 5.15 kernel since 5.15.0-12.12. Once the focal HWE kernel rolls to 5.15, this issue will be fixed.

Changed in kunpeng920:
status: In Progress → Fix Committed
Ike Panhc (ikepanhc) wrote :

The focal HWE kernel has now rolled to 5.15.

$ rmadison linux-generic-hwe-20.04 | grep focal-updates
 linux-generic-hwe-20.04 | 5.13.0.1026.29~20.04.1 | focal-updates | riscv64
 linux-generic-hwe-20.04 | 5.15.0.41.44~20.04.13 | focal-updates | amd64, arm64, armhf, ppc64el, s390x
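
On a given machine, the running kernel can be compared against the first fixed Ubuntu build mentioned above (5.15.0-12). The sketch below is a heuristic for Ubuntu-style release strings only: kernel_has_fix is a hypothetical name, `sort -V` assumes GNU coreutils, and kernels that received the fix through a separate backport would not be recognized.

```shell
#!/bin/sh
# kernel_has_fix <release>
# <release> is a string as printed by `uname -r`, e.g. 5.15.0-41-generic.
# Returns 0 if it is at or above 5.15.0-12, 1 if below, 2 if unparseable.
kernel_has_fix() {
    # Keep only the "<version>-<abi>" prefix, dropping the flavour suffix.
    rel=$(printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9.]*-[0-9][0-9]*\).*/\1/p')
    [ -n "$rel" ] || return 2
    first_fixed="5.15.0-12"
    # Version-sort both strings; the fix is present if first_fixed sorts lowest.
    lowest=$(printf '%s\n%s\n' "$first_fixed" "$rel" | sort -V | head -n 1)
    [ "$lowest" = "$first_fixed" ]
}
```

For example, `kernel_has_fix "$(uname -r)" && echo "fix included"`.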

Changed in kunpeng920:
status: Fix Committed → Fix Released