Activity log for bug #1987287

Date Who What changed Old value New value Message
2022-08-22 12:29:22 bugproxy bug added bug
2022-08-22 12:29:24 bugproxy tags architecture-s3903164 bugnameltc-199463 severity-high targetmilestone-inin---
2022-08-22 12:29:26 bugproxy attachment added Backport of upstream commit to 5.4.0-124.140 https://bugs.launchpad.net/bugs/1987287/+attachment/5610710/+files/0001-net-mlx5-Avoid-processing-commands-before-cmdif-is-r.patch
2022-08-22 12:29:27 bugproxy ubuntu: assignee Skipper Bug Screeners (skipper-screen-team)
2022-08-22 12:29:33 bugproxy affects ubuntu linux (Ubuntu)
2022-08-22 14:42:47 Frank Heimes bug task added ubuntu-z-systems
2022-08-22 14:43:05 Frank Heimes ubuntu-z-systems: assignee Skipper Bug Screeners (skipper-screen-team)
2022-08-22 14:43:10 Frank Heimes ubuntu-z-systems: importance Undecided High
2022-08-22 14:43:52 Frank Heimes nominated for series Ubuntu Focal
2022-08-22 14:43:52 Frank Heimes bug task added linux (Ubuntu Focal)
2022-08-22 14:44:04 Frank Heimes linux (Ubuntu): status New Invalid
2022-08-22 14:44:17 Frank Heimes linux (Ubuntu Focal): assignee Skipper Bug Screeners (skipper-screen-team)
2022-08-22 14:44:21 Frank Heimes linux (Ubuntu): assignee Skipper Bug Screeners (skipper-screen-team)
2022-08-31 10:02:51 Frank Heimes description We recently got a bug report for systems running Ubuntu 20.04 that were crashing with backtraces pointing at the mlx5 driver's handling of mlx5_ethtool_get_link_ksettings() when this is called through the sysfs (going through ethtool might have different checks). I managed to find a reliable way to reproduce the issue that I believe isn't tied to IBM Z at all. The procedure to reproduce is as follows. I created a script to read the sysfs attributes for the link's speed and duplex mode in a loop: #!/usr/bin/env bash if [ $# -lt 1 ]; then echo "Usage: $0 <netif>" exit 1 fi while true; do cat /sys/class/net/$1/duplex > /dev/null cat /sys/class/net/$1/speed > /dev/null done Executed with: # ./script.sh enP10p0s0 I ran this in one bash session and then in another one I triggered a PCI reset with the follwoing command where one needs to replace <dev> with the PCI address of the NIC: echo 1 > /sys/bus/pci/devices/<dev>/reset Then first I got a lot of the following messages: mlx5_core 0010:00:00.0 enP16p0s0: mlx5e_ethtool_get_link_ksettings: query port ptys failed: -5 And then as the mlx5 driver's recovery kicks in the oops as below: [ 659.103947] mlx5_core 0010:00:00.0: wait vital counter value 0x7b399f after 1 iterations [ 659.103947] mlx5_core 0010:00:00.0: mlx5_pci_resume was called [ 659.103966] mlx5_core 0010:00:00.0: firmware version: 14.32.1010 [ 659.104169] Unable to handle kernel pointer dereference in virtual kernel address space [ 659.104171] Failing address: 0000000000000000 TEID: 0000000000000483 [ 659.104172] Fault in home space mode while using kernel ASCE. [ 659.104173] AS:000000003d29c007 R3:00000000fffd0007 S:00000000fffd5800 P:000000000000003d [ 659.104200] Oops: 0004 ilc:2 [#1] SMP [ 659.104202] Modules linked in: s390_trng ism smc pnet chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear mlx5_ib dm_service_time pkey zcrypt crc32_vx_s390 ib_uverbs ghash_s390 ib_core qeth_l2 prng aes_s390 des_s390 nvme libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common mlx5_core tls mlxfw ptp nvme_core pps_core dasd_eckd_mod dasd_mod zfcp scsi_transport_fc qeth qdio ccwgroup scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 659.104232] CPU: 6 PID: 438216 Comm: cat Not tainted 5.4.0-124-generic #140-Ubuntu [ 659.104233] Hardware name: IBM 3931 XYZ XXXX (LPAR) [ 659.104234] Krnl PSW : 0404c00180000000 000000003bfa661e (__queue_work+0xfe/0x520) [ 659.104241] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 659.104242] Krnl GPRS: 000000003c291570 0000000000000000 0000000000000007 000000007fffffff [ 659.104243] 00000000e2fe46e0 0000000fffffffe0 0000000000000006 000000003d039588 [ 659.104244] 0000000000000000 0000000000000000 00000000e2fe46e0 00000000bfb3e000 [ 659.104245] 00000000e194c400 000003e007d6fb78 000000003bfa6602 000003e007d6f860 [ 659.104251] Krnl Code: 000000003bfa6612: a77400e5 brc 7,000000003bfa67dc 000000003bfa6616: 582003ac l %r2,940 #000000003bfa661a: a7180000 lhi %r1,0 >000000003bfa661e: ba129000 cs %r1,%r2,0(%r9) 000000003bfa6622: a77401a7 brc 7,000000003bfa6970 000000003bfa6626: e310b0180012 lt %r1,24(%r11) 000000003bfa662c: a78400ff brc 8,000000003bfa682a 000000003bfa6630: c0040000004b brcl 0,000000003bfa66c6 [ 659.104261] Call Trace: [ 659.104263] ([<0000000000000000>] 0x0) [ 659.104265] [<000000003bfa6aa2>] queue_work_on+0x62/0x70 [ 659.104329] [<000003ff80a2920a>] cmd_exec+0x4ea/0x840 [mlx5_core] [ 659.104349] [<000003ff80a29680>] mlx5_cmd_exec+0x40/0x70 [mlx5_core] [ 659.104369] [<000003ff80a334a8>] mlx5_core_access_reg+0x108/0x150 [mlx5_core] [ 659.104387] [<000003ff80a3354e>] mlx5_query_port_ptys+0x5e/0x70 [mlx5_core] [ 659.104407] [<000003ff80a5b928>] mlx5e_ethtool_get_link_ksettings+0x58/0x460 [mlx5_core] [ 659.104410] [<000000003c662a68>] duplex_show+0x78/0xe0 [ 659.104414] [<000000003c538ecc>] dev_attr_show+0x2c/0x70 [ 659.104417] [<000000003c293386>] sysfs_kf_seq_show+0xa6/0x150 [ 659.104420] [<000000003c207470>] seq_read+0xe0/0x4f0 [ 659.104422] [<000000003c1d4cf4>] vfs_read+0x94/0x160 [ 659.104423] [<000000003c1d4ea8>] ksys_read+0x68/0x100 [ 659.104426] [<000000003c7f2034>] system_call+0xd8/0x2c8 [ 659.104427] Last Breaking-Event-Address: [ 659.104428] [<000000003bfa65d6>] __queue_work+0xb6/0x520 [ 659.104430] ---[ end trace 9fc1a6358b456876 ]--- Digging into the code and git history I found the following upstream commit added in v5.7 which besides being part of the 5.6.x stable patches somehow didn't make it into the 5.4.x stable queue nor Ubuntu 20.04, possibly because there is a (trivial) merge conflict: commit f7936ddd35d8b849daf0372770c7c9dbe7910fca Author: Eran Ben Elisha <eranbe@mellanox.com> Date: Thu Mar 19 21:43:13 2020 +0200 net/mlx5: Avoid processing commands before cmdif is ready When driver is reloading during recovery flow, it can't get new commands till command interface is up again. Otherwise we may get to null pointer trying to access non initialized command structures. Add cmdif state to avoid processing commands while cmdif is not ready. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> With a quick backport onto 5.4.0-124.140 (patch attached below) the issue is gone and the system no longer crashes but instead recovers successfully. I believe the crash we saw is then exactly the null pointer access mentioned in the commit description. == Comment: #4 - Niklas Schnelle - 2022-08-22 06:41:41 == There was only a trivial merge conflict where the context of the struct decleration changed. SRU Justification: ------------------ [Impact] * If the mlx5 driver is reloading while the recovery flow is happening, and if it receives new commands before the command interface is up again, this can lead to null pointer that tries to access non- initialized command structures. * So it's required to avoid processing commands before the command interface is up again. * This is accomplished by a new cmdif state that helps to avoid processing commands while cmdif is not ready. [Fix] * backport of f7936ddd35d8 f7936ddd35d8b849daf0372770c7c9dbe7910fca "net/mlx5: Avoid processing commands before cmdif is ready" [Test Plan] * An Ubuntu Server for s390x 18.04 or 20.04 LPAR or z/VM installation is needed that has Mellanox cards (RoCE Express 2.1) assigned, configured and enabled and that runs a 5.4 kernel (on bionic hwe-5.4). * Now trigger a recovery (guess that can be done at the Support Element) and reload the driver at the same time. * Make sure the module/driver mlx5 is loaded and in use (otherwise it can't be removed/unloaded). * Now remove/unload the module with: sudo modprobe -r mlx5 and (re-)load it again with: sudo modprobe mlx5 * Due to the lack of RoCE Express 2.1 hardware, IBM needs to do the verification. [Where problems could occur] * In case there is an issue with 'cmdif' it might not have the correct interface state, which: - either might lead to the fact that commands are not properly blocked and the situation is similar like before - or the commands may get always blocked, which render the hardware useless - or might block in wrong situation, which will cause unexpected issues and broken behavior. * Since the patch got upstream accepted with v5.7-rc7 it's not new to the kernel, was already part of groovy (and above) and is therefor already in use by newer Ubuntu releases. [Other Info] * Since the patch is upstream since v5.7-rc7, it's already included in jammy and kinetic. * Since the upstream patch incl. the line: Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") it looks to me that it was forgotten to mark the patch for upstream stable updates. __________ We recently got a bug report for systems running Ubuntu 20.04 that were crashing with backtraces pointing at the mlx5 driver's handling of mlx5_ethtool_get_link_ksettings() when this is called through the sysfs (going through ethtool might have different checks). I managed to find a reliable way to reproduce the issue that I believe isn't tied to IBM Z at all. The procedure to reproduce is as follows. I created a script to read the sysfs attributes for the link's speed and duplex mode in a loop: #!/usr/bin/env bash if [ $# -lt 1 ]; then         echo "Usage: $0 <netif>"         exit 1 fi while true; do         cat /sys/class/net/$1/duplex > /dev/null         cat /sys/class/net/$1/speed > /dev/null done Executed with: # ./script.sh enP10p0s0 I ran this in one bash session and then in another one I triggered a PCI reset with the follwoing command where one needs to replace <dev> with the PCI address of the NIC: echo 1 > /sys/bus/pci/devices/<dev>/reset Then first I got a lot of the following messages:  mlx5_core 0010:00:00.0 enP16p0s0: mlx5e_ethtool_get_link_ksettings: query port ptys failed: -5 And then as the mlx5 driver's recovery kicks in the oops as below: [ 659.103947] mlx5_core 0010:00:00.0: wait vital counter value 0x7b399f after 1 iterations [ 659.103947] mlx5_core 0010:00:00.0: mlx5_pci_resume was called [ 659.103966] mlx5_core 0010:00:00.0: firmware version: 14.32.1010 [ 659.104169] Unable to handle kernel pointer dereference in virtual kernel address space [ 659.104171] Failing address: 0000000000000000 TEID: 0000000000000483 [ 659.104172] Fault in home space mode while using kernel ASCE. [ 659.104173] AS:000000003d29c007 R3:00000000fffd0007 S:00000000fffd5800 P:000000000000003d [ 659.104200] Oops: 0004 ilc:2 [#1] SMP [ 659.104202] Modules linked in: s390_trng ism smc pnet chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear mlx5_ib dm_service_time pkey zcrypt crc32_vx_s390 ib_uverbs ghash_s390 ib_core qeth_l2 prng aes_s390 des_s390 nvme libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common mlx5_core tls mlxfw ptp nvme_core pps_core dasd_eckd_mod dasd_mod zfcp scsi_transport_fc qeth qdio ccwgroup scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 659.104232] CPU: 6 PID: 438216 Comm: cat Not tainted 5.4.0-124-generic #140-Ubuntu [ 659.104233] Hardware name: IBM 3931 XYZ XXXX (LPAR) [ 659.104234] Krnl PSW : 0404c00180000000 000000003bfa661e (__queue_work+0xfe/0x520) [ 659.104241] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 659.104242] Krnl GPRS: 000000003c291570 0000000000000000 0000000000000007 000000007fffffff [ 659.104243] 00000000e2fe46e0 0000000fffffffe0 0000000000000006 000000003d039588 [ 659.104244] 0000000000000000 0000000000000000 00000000e2fe46e0 00000000bfb3e000 [ 659.104245] 00000000e194c400 000003e007d6fb78 000000003bfa6602 000003e007d6f860 [ 659.104251] Krnl Code: 000000003bfa6612: a77400e5 brc 7,000000003bfa67dc                           000000003bfa6616: 582003ac l %r2,940                          #000000003bfa661a: a7180000 lhi %r1,0                          >000000003bfa661e: ba129000 cs %r1,%r2,0(%r9)                           000000003bfa6622: a77401a7 brc 7,000000003bfa6970                           000000003bfa6626: e310b0180012 lt %r1,24(%r11)                           000000003bfa662c: a78400ff brc 8,000000003bfa682a                           000000003bfa6630: c0040000004b brcl 0,000000003bfa66c6 [ 659.104261] Call Trace: [ 659.104263] ([<0000000000000000>] 0x0) [ 659.104265] [<000000003bfa6aa2>] queue_work_on+0x62/0x70 [ 659.104329] [<000003ff80a2920a>] cmd_exec+0x4ea/0x840 [mlx5_core] [ 659.104349] [<000003ff80a29680>] mlx5_cmd_exec+0x40/0x70 [mlx5_core] [ 659.104369] [<000003ff80a334a8>] mlx5_core_access_reg+0x108/0x150 [mlx5_core] [ 659.104387] [<000003ff80a3354e>] mlx5_query_port_ptys+0x5e/0x70 [mlx5_core] [ 659.104407] [<000003ff80a5b928>] mlx5e_ethtool_get_link_ksettings+0x58/0x460 [mlx5_core] [ 659.104410] [<000000003c662a68>] duplex_show+0x78/0xe0 [ 659.104414] [<000000003c538ecc>] dev_attr_show+0x2c/0x70 [ 659.104417] [<000000003c293386>] sysfs_kf_seq_show+0xa6/0x150 [ 659.104420] [<000000003c207470>] seq_read+0xe0/0x4f0 [ 659.104422] [<000000003c1d4cf4>] vfs_read+0x94/0x160 [ 659.104423] [<000000003c1d4ea8>] ksys_read+0x68/0x100 [ 659.104426] [<000000003c7f2034>] system_call+0xd8/0x2c8 [ 659.104427] Last Breaking-Event-Address: [ 659.104428] [<000000003bfa65d6>] __queue_work+0xb6/0x520 [ 659.104430] ---[ end trace 9fc1a6358b456876 ]--- Digging into the code and git history I found the following upstream commit added in v5.7 which besides being part of the 5.6.x stable patches somehow didn't make it into the 5.4.x stable queue nor Ubuntu 20.04, possibly because there is a (trivial) merge conflict: commit f7936ddd35d8b849daf0372770c7c9dbe7910fca Author: Eran Ben Elisha <eranbe@mellanox.com> Date: Thu Mar 19 21:43:13 2020 +0200     net/mlx5: Avoid processing commands before cmdif is ready     When driver is reloading during recovery flow, it can't get new commands     till command interface is up again. Otherwise we may get to null pointer     trying to access non initialized command structures.     Add cmdif state to avoid processing commands while cmdif is not ready.     Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")     Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>     Signed-off-by: Moshe Shemesh <moshe@mellanox.com>     Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> With a quick backport onto 5.4.0-124.140 (patch attached below) the issue is gone and the system no longer crashes but instead recovers successfully. I believe the crash we saw is then exactly the null pointer access mentioned in the commit description. == Comment: #4 - Niklas Schnelle - 2022-08-22 06:41:41 == There was only a trivial merge conflict where the context of the struct decleration changed.
2022-09-01 13:46:47 Frank Heimes description SRU Justification: ------------------ [Impact] * If the mlx5 driver is reloading while the recovery flow is happening, and if it receives new commands before the command interface is up again, this can lead to null pointer that tries to access non- initialized command structures. * So it's required to avoid processing commands before the command interface is up again. * This is accomplished by a new cmdif state that helps to avoid processing commands while cmdif is not ready. [Fix] * backport of f7936ddd35d8 f7936ddd35d8b849daf0372770c7c9dbe7910fca "net/mlx5: Avoid processing commands before cmdif is ready" [Test Plan] * An Ubuntu Server for s390x 18.04 or 20.04 LPAR or z/VM installation is needed that has Mellanox cards (RoCE Express 2.1) assigned, configured and enabled and that runs a 5.4 kernel (on bionic hwe-5.4). * Now trigger a recovery (guess that can be done at the Support Element) and reload the driver at the same time. * Make sure the module/driver mlx5 is loaded and in use (otherwise it can't be removed/unloaded). * Now remove/unload the module with: sudo modprobe -r mlx5 and (re-)load it again with: sudo modprobe mlx5 * Due to the lack of RoCE Express 2.1 hardware, IBM needs to do the verification. [Where problems could occur] * In case there is an issue with 'cmdif' it might not have the correct interface state, which: - either might lead to the fact that commands are not properly blocked and the situation is similar like before - or the commands may get always blocked, which render the hardware useless - or might block in wrong situation, which will cause unexpected issues and broken behavior. * Since the patch got upstream accepted with v5.7-rc7 it's not new to the kernel, was already part of groovy (and above) and is therefor already in use by newer Ubuntu releases. [Other Info] * Since the patch is upstream since v5.7-rc7, it's already included in jammy and kinetic. * Since the upstream patch incl. the line: Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") it looks to me that it was forgotten to mark the patch for upstream stable updates. __________ We recently got a bug report for systems running Ubuntu 20.04 that were crashing with backtraces pointing at the mlx5 driver's handling of mlx5_ethtool_get_link_ksettings() when this is called through the sysfs (going through ethtool might have different checks). I managed to find a reliable way to reproduce the issue that I believe isn't tied to IBM Z at all. The procedure to reproduce is as follows. I created a script to read the sysfs attributes for the link's speed and duplex mode in a loop: #!/usr/bin/env bash if [ $# -lt 1 ]; then         echo "Usage: $0 <netif>"         exit 1 fi while true; do         cat /sys/class/net/$1/duplex > /dev/null         cat /sys/class/net/$1/speed > /dev/null done Executed with: # ./script.sh enP10p0s0 I ran this in one bash session and then in another one I triggered a PCI reset with the follwoing command where one needs to replace <dev> with the PCI address of the NIC: echo 1 > /sys/bus/pci/devices/<dev>/reset Then first I got a lot of the following messages:  mlx5_core 0010:00:00.0 enP16p0s0: mlx5e_ethtool_get_link_ksettings: query port ptys failed: -5 And then as the mlx5 driver's recovery kicks in the oops as below: [ 659.103947] mlx5_core 0010:00:00.0: wait vital counter value 0x7b399f after 1 iterations [ 659.103947] mlx5_core 0010:00:00.0: mlx5_pci_resume was called [ 659.103966] mlx5_core 0010:00:00.0: firmware version: 14.32.1010 [ 659.104169] Unable to handle kernel pointer dereference in virtual kernel address space [ 659.104171] Failing address: 0000000000000000 TEID: 0000000000000483 [ 659.104172] Fault in home space mode while using kernel ASCE. [ 659.104173] AS:000000003d29c007 R3:00000000fffd0007 S:00000000fffd5800 P:000000000000003d [ 659.104200] Oops: 0004 ilc:2 [#1] SMP [ 659.104202] Modules linked in: s390_trng ism smc pnet chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear mlx5_ib dm_service_time pkey zcrypt crc32_vx_s390 ib_uverbs ghash_s390 ib_core qeth_l2 prng aes_s390 des_s390 nvme libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common mlx5_core tls mlxfw ptp nvme_core pps_core dasd_eckd_mod dasd_mod zfcp scsi_transport_fc qeth qdio ccwgroup scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 659.104232] CPU: 6 PID: 438216 Comm: cat Not tainted 5.4.0-124-generic #140-Ubuntu [ 659.104233] Hardware name: IBM 3931 XYZ XXXX (LPAR) [ 659.104234] Krnl PSW : 0404c00180000000 000000003bfa661e (__queue_work+0xfe/0x520) [ 659.104241] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 659.104242] Krnl GPRS: 000000003c291570 0000000000000000 0000000000000007 000000007fffffff [ 659.104243] 00000000e2fe46e0 0000000fffffffe0 0000000000000006 000000003d039588 [ 659.104244] 0000000000000000 0000000000000000 00000000e2fe46e0 00000000bfb3e000 [ 659.104245] 00000000e194c400 000003e007d6fb78 000000003bfa6602 000003e007d6f860 [ 659.104251] Krnl Code: 000000003bfa6612: a77400e5 brc 7,000000003bfa67dc                           000000003bfa6616: 582003ac l %r2,940                          #000000003bfa661a: a7180000 lhi %r1,0                          >000000003bfa661e: ba129000 cs %r1,%r2,0(%r9)                           000000003bfa6622: a77401a7 brc 7,000000003bfa6970                           000000003bfa6626: e310b0180012 lt %r1,24(%r11)                           000000003bfa662c: a78400ff brc 8,000000003bfa682a                           000000003bfa6630: c0040000004b brcl 0,000000003bfa66c6 [ 659.104261] Call Trace: [ 659.104263] ([<0000000000000000>] 0x0) [ 659.104265] [<000000003bfa6aa2>] queue_work_on+0x62/0x70 [ 659.104329] [<000003ff80a2920a>] cmd_exec+0x4ea/0x840 [mlx5_core] [ 659.104349] [<000003ff80a29680>] mlx5_cmd_exec+0x40/0x70 [mlx5_core] [ 659.104369] [<000003ff80a334a8>] mlx5_core_access_reg+0x108/0x150 [mlx5_core] [ 659.104387] [<000003ff80a3354e>] mlx5_query_port_ptys+0x5e/0x70 [mlx5_core] [ 659.104407] [<000003ff80a5b928>] mlx5e_ethtool_get_link_ksettings+0x58/0x460 [mlx5_core] [ 659.104410] [<000000003c662a68>] duplex_show+0x78/0xe0 [ 659.104414] [<000000003c538ecc>] dev_attr_show+0x2c/0x70 [ 659.104417] [<000000003c293386>] sysfs_kf_seq_show+0xa6/0x150 [ 659.104420] [<000000003c207470>] seq_read+0xe0/0x4f0 [ 659.104422] [<000000003c1d4cf4>] vfs_read+0x94/0x160 [ 659.104423] [<000000003c1d4ea8>] ksys_read+0x68/0x100 [ 659.104426] [<000000003c7f2034>] system_call+0xd8/0x2c8 [ 659.104427] Last Breaking-Event-Address: [ 659.104428] [<000000003bfa65d6>] __queue_work+0xb6/0x520 [ 659.104430] ---[ end trace 9fc1a6358b456876 ]--- Digging into the code and git history I found the following upstream commit added in v5.7 which besides being part of the 5.6.x stable patches somehow didn't make it into the 5.4.x stable queue nor Ubuntu 20.04, possibly because there is a (trivial) merge conflict: commit f7936ddd35d8b849daf0372770c7c9dbe7910fca Author: Eran Ben Elisha <eranbe@mellanox.com> Date: Thu Mar 19 21:43:13 2020 +0200     net/mlx5: Avoid processing commands before cmdif is ready     When driver is reloading during recovery flow, it can't get new commands     till command interface is up again. Otherwise we may get to null pointer     trying to access non initialized command structures.     Add cmdif state to avoid processing commands while cmdif is not ready.     Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")     Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>     Signed-off-by: Moshe Shemesh <moshe@mellanox.com>     Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> With a quick backport onto 5.4.0-124.140 (patch attached below) the issue is gone and the system no longer crashes but instead recovers successfully. I believe the crash we saw is then exactly the null pointer access mentioned in the commit description. == Comment: #4 - Niklas Schnelle - 2022-08-22 06:41:41 == There was only a trivial merge conflict where the context of the struct decleration changed. SRU Justification: ------------------ [Impact]  * If the mlx5 driver is reloading while the recovery flow is happening,    and if it receives new commands before the command interface is up    again, this can lead to null pointer that tries to access non-    initialized command structures.  * So it's required to avoid processing commands before the command    interface is up again.  * This is accomplished by a new cmdif state that helps to avoid    processing commands while cmdif is not ready. [Fix]  * backport of f7936ddd35d8 f7936ddd35d8b849daf0372770c7c9dbe7910fca "net/mlx5: Avoid processing commands before cmdif is ready" [Test Plan]  * An Ubuntu Server for s390x 18.04 or 20.04 LPAR or z/VM installation    is needed that has Mellanox cards (RoCE Express 2.1) assigned,    configured and enabled and that runs a 5.4 kernel (on bionic hwe-5.4).  * Now trigger a recovery (guess that can be done at the Support Element)    and reload the driver at the same time.  * Make sure the module/driver mlx5 is loaded and in use    (otherwise it can't be removed/unloaded).  * Now remove/unload the module with:    sudo modprobe -r mlx5    and (re-)load it again with:    sudo modprobe mlx5  * Due to the lack of RoCE Express 2.1 hardware,    IBM needs to do the verification. [Where problems could occur]  * In case there is an issue with 'cmdif' it might not have the correct    interface state, which:    - either might lead to the fact that commands are not properly blocked      and the situation is similar like before    - or the commands may get always blocked,      which render the hardware useless    - or might block in wrong situation,      which will cause unexpected issues and broken behavior.  * Since the patch got upstream accepted with v5.7-rc7 it's    not new to the kernel, was already part of groovy (and above)    and is therefor already in use by newer Ubuntu releases. [Other Info]  * Since the patch is upstream since v5.7-rc7,    it's already included in jammy and kinetic.  * Since the upstream patch incl. the line:    Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox    Connect-IB adapters") it looks to me that it was forgotten    to mark the patch for upstream stable updates. * Such SRUs for focal's 5.4 will automatically land in bionic's hwe-5.4, too. But since this was especially requested for bionic's hwe-5.4, I wanted to mention this here. __________ We recently got a bug report for systems running Ubuntu 20.04 that were crashing with backtraces pointing at the mlx5 driver's handling of mlx5_ethtool_get_link_ksettings() when this is called through the sysfs (going through ethtool might have different checks). I managed to find a reliable way to reproduce the issue that I believe isn't tied to IBM Z at all. The procedure to reproduce is as follows. I created a script to read the sysfs attributes for the link's speed and duplex mode in a loop: #!/usr/bin/env bash if [ $# -lt 1 ]; then         echo "Usage: $0 <netif>"         exit 1 fi while true; do         cat /sys/class/net/$1/duplex > /dev/null         cat /sys/class/net/$1/speed > /dev/null done Executed with: # ./script.sh enP10p0s0 I ran this in one bash session and then in another one I triggered a PCI reset with the follwoing command where one needs to replace <dev> with the PCI address of the NIC: echo 1 > /sys/bus/pci/devices/<dev>/reset Then first I got a lot of the following messages:  mlx5_core 0010:00:00.0 enP16p0s0: mlx5e_ethtool_get_link_ksettings: query port ptys failed: -5 And then as the mlx5 driver's recovery kicks in the oops as below: [ 659.103947] mlx5_core 0010:00:00.0: wait vital counter value 0x7b399f after 1 iterations [ 659.103947] mlx5_core 0010:00:00.0: mlx5_pci_resume was called [ 659.103966] mlx5_core 0010:00:00.0: firmware version: 14.32.1010 [ 659.104169] Unable to handle kernel pointer dereference in virtual kernel address space [ 659.104171] Failing address: 0000000000000000 TEID: 0000000000000483 [ 659.104172] Fault in home space mode while using kernel ASCE. [ 659.104173] AS:000000003d29c007 R3:00000000fffd0007 S:00000000fffd5800 P:000000000000003d [ 659.104200] Oops: 0004 ilc:2 [#1] SMP [ 659.104202] Modules linked in: s390_trng ism smc pnet chsc_sch eadm_sch vfio_ccw vfio_mdev mdev vfio_iommu_type1 vfio sch_fq_codel drm drm_panel_orientation_quirks i2c_core ip_tables x_tables btrfs zstd_compress zlib_deflate raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear mlx5_ib dm_service_time pkey zcrypt crc32_vx_s390 ib_uverbs ghash_s390 ib_core qeth_l2 prng aes_s390 des_s390 nvme libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common mlx5_core tls mlxfw ptp nvme_core pps_core dasd_eckd_mod dasd_mod zfcp scsi_transport_fc qeth qdio ccwgroup scsi_dh_emc scsi_dh_rdac scsi_dh_alua dm_multipath [ 659.104232] CPU: 6 PID: 438216 Comm: cat Not tainted 5.4.0-124-generic #140-Ubuntu [ 659.104233] Hardware name: IBM 3931 XYZ XXXX (LPAR) [ 659.104234] Krnl PSW : 0404c00180000000 000000003bfa661e (__queue_work+0xfe/0x520) [ 659.104241] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 659.104242] Krnl GPRS: 000000003c291570 0000000000000000 0000000000000007 000000007fffffff [ 659.104243] 00000000e2fe46e0 0000000fffffffe0 0000000000000006 000000003d039588 [ 659.104244] 0000000000000000 0000000000000000 00000000e2fe46e0 00000000bfb3e000 [ 659.104245] 00000000e194c400 000003e007d6fb78 000000003bfa6602 000003e007d6f860 [ 659.104251] Krnl Code: 000000003bfa6612: a77400e5 brc 7,000000003bfa67dc                           000000003bfa6616: 582003ac l %r2,940                          #000000003bfa661a: a7180000 lhi %r1,0                          >000000003bfa661e: ba129000 cs %r1,%r2,0(%r9)                           000000003bfa6622: a77401a7 brc 7,000000003bfa6970                           000000003bfa6626: e310b0180012 lt %r1,24(%r11)                           000000003bfa662c: a78400ff brc 8,000000003bfa682a                           000000003bfa6630: c0040000004b brcl 0,000000003bfa66c6 [ 659.104261] Call Trace: [ 659.104263] ([<0000000000000000>] 0x0) [ 659.104265] [<000000003bfa6aa2>] queue_work_on+0x62/0x70 [ 659.104329] [<000003ff80a2920a>] cmd_exec+0x4ea/0x840 [mlx5_core] [ 659.104349] [<000003ff80a29680>] mlx5_cmd_exec+0x40/0x70 [mlx5_core] [ 659.104369] [<000003ff80a334a8>] mlx5_core_access_reg+0x108/0x150 [mlx5_core] [ 659.104387] [<000003ff80a3354e>] mlx5_query_port_ptys+0x5e/0x70 [mlx5_core] [ 659.104407] [<000003ff80a5b928>] mlx5e_ethtool_get_link_ksettings+0x58/0x460 [mlx5_core] [ 659.104410] [<000000003c662a68>] duplex_show+0x78/0xe0 [ 659.104414] [<000000003c538ecc>] dev_attr_show+0x2c/0x70 [ 659.104417] [<000000003c293386>] sysfs_kf_seq_show+0xa6/0x150 [ 659.104420] [<000000003c207470>] seq_read+0xe0/0x4f0 [ 659.104422] [<000000003c1d4cf4>] vfs_read+0x94/0x160 [ 659.104423] [<000000003c1d4ea8>] ksys_read+0x68/0x100 [ 659.104426] [<000000003c7f2034>] system_call+0xd8/0x2c8 [ 659.104427] Last Breaking-Event-Address: [ 659.104428] [<000000003bfa65d6>] __queue_work+0xb6/0x520 [ 659.104430] ---[ end trace 9fc1a6358b456876 ]--- Digging into the code and git history I found the following upstream commit added in v5.7 which besides being part of the 5.6.x stable patches somehow didn't make it into the 5.4.x stable queue nor Ubuntu 20.04, possibly because there is a (trivial) merge conflict: commit f7936ddd35d8b849daf0372770c7c9dbe7910fca Author: Eran Ben Elisha <eranbe@mellanox.com> Date: Thu Mar 19 21:43:13 2020 +0200     net/mlx5: Avoid processing commands before cmdif is ready     When driver is reloading during recovery flow, it can't get new commands     till command interface is up again. Otherwise we may get to null pointer     trying to access non initialized command structures.     Add cmdif state to avoid processing commands while cmdif is not ready.     Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")     Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>     Signed-off-by: Moshe Shemesh <moshe@mellanox.com>     Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> With a quick backport onto 5.4.0-124.140 (patch attached below) the issue is gone and the system no longer crashes but instead recovers successfully. I believe the crash we saw is then exactly the null pointer access mentioned in the commit description. == Comment: #4 - Niklas Schnelle - 2022-08-22 06:41:41 == There was only a trivial merge conflict where the context of the struct decleration changed.
2022-09-01 17:02:28 Frank Heimes linux (Ubuntu Focal): status New In Progress
2022-09-01 17:02:31 Frank Heimes ubuntu-z-systems: status New In Progress
2022-09-01 17:02:46 Frank Heimes linux (Ubuntu Focal): assignee Skipper Bug Screeners (skipper-screen-team) Canonical Kernel Team (canonical-kernel-team)
2022-09-14 14:08:10 Stefan Bader linux (Ubuntu Focal): importance Undecided Medium
2022-09-14 14:08:22 Stefan Bader linux (Ubuntu Focal): status In Progress Fix Committed
2022-09-14 15:02:20 Frank Heimes ubuntu-z-systems: status In Progress Fix Committed
2022-10-10 08:39:38 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2022-10-10 08:39:38 Launchpad Janitor cve linked 2022-3176
2022-10-10 09:00:13 Frank Heimes ubuntu-z-systems: status Fix Committed Fix Released
2022-10-10 14:20:08 bugproxy tags architecture-s3903164 bugnameltc-199463 severity-high targetmilestone-inin--- architecture-s3903164 bugnameltc-199463 severity-high targetmilestone-inin2004
2022-12-15 13:51:42 Ubuntu Kernel Bot tags architecture-s3903164 bugnameltc-199463 severity-high targetmilestone-inin2004 architecture-s3903164 bugnameltc-199463 kernel-spammed-focal-linux-xilinx-zynqmp severity-high targetmilestone-inin2004 verification-needed-focal
2022-12-15 14:21:00 Frank Heimes tags architecture-s3903164 bugnameltc-199463 kernel-spammed-focal-linux-xilinx-zynqmp severity-high targetmilestone-inin2004 verification-needed-focal architecture-s3903164 bugnameltc-199463 kernel-spammed-focal-linux-xilinx-zynqmp severity-high targetmilestone-inin2004 verification-done-focal