Activity log for bug #1903682

Date Who What changed Old value New value Message
2020-11-10 10:59:26 bugproxy bug added bug
2020-11-10 10:59:29 bugproxy tags architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010
2020-11-10 10:59:30 bugproxy ubuntu: assignee Skipper Bug Screeners (skipper-screen-team)
2020-11-10 10:59:33 bugproxy affects ubuntu linux (Ubuntu)
2020-11-10 11:01:45 Frank Heimes bug task added ubuntu-z-systems
2020-11-10 11:02:16 Frank Heimes ubuntu-z-systems: importance Undecided Medium
2020-11-10 11:11:48 Frank Heimes ubuntu-z-systems: assignee Skipper Bug Screeners (skipper-screen-team)
2020-11-10 11:11:52 Frank Heimes linux (Ubuntu): assignee Skipper Bug Screeners (skipper-screen-team) Frank Heimes (fheimes)
2020-11-10 11:11:57 Frank Heimes ubuntu-z-systems: status New Triaged
2020-11-10 11:27:18 Frank Heimes nominated for series Ubuntu Focal
2020-11-10 11:27:18 Frank Heimes bug task added linux (Ubuntu Focal)
2020-11-10 11:27:18 Frank Heimes nominated for series Ubuntu Hirsute
2020-11-10 11:27:18 Frank Heimes bug task added linux (Ubuntu Hirsute)
2020-11-10 11:27:18 Frank Heimes nominated for series Ubuntu Groovy
2020-11-10 11:27:18 Frank Heimes bug task added linux (Ubuntu Groovy)
2020-11-10 11:27:31 Frank Heimes linux (Ubuntu Hirsute): assignee Frank Heimes (fheimes)
2020-11-10 11:27:48 Frank Heimes linux (Ubuntu Focal): assignee Frank Heimes (fheimes)
2020-11-10 11:34:41 Frank Heimes linux (Ubuntu Hirsute): status New Fix Committed
2020-11-10 11:34:53 Frank Heimes linux (Ubuntu Groovy): status New Triaged
2020-11-12 09:12:49 Frank Heimes summary [UBUNTU 20.10] NULL pointer dereference when configuring multi-function with devfn != 0 before devfn == 0 NULL pointer dereference when configuring multi-function with devfn != 0 before devfn == 0
2020-11-12 09:47:54 Frank Heimes description
SRU Justification:
==================

[Impact]
* While handling multifunction devices in zPCI, the UID of the PCI function with function number 0 (which always exists according to the PCI spec) is taken as the domain number.
* Therefore, if functions with a function number larger than 0 are hot plugged before function 0, they need to be held in standby before the domain and bus are created.
* This has been tested during development of this feature using a patched QEMU and in DPM, but unfortunately never in classic/traditional HMC mode.
* On a classic/traditional mode machine with a multi-function device, after hot plugging ("Reassign I/O Path") only the FID of the second port to the LPAR, any additional hotplug (and even just deconfiguring a PCI device) will hang, and a hotplug makes the entire Linux instance unresponsive.
* The reason for this is a NULL pointer dereference when configuring a multi-function device with devfn != 0 before devfn == 0.
* This issue was introduced with the topology-aware PCI enumeration code.

[Fix]
* 0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 0b2ca2c7d0c9 "s390/pci: fix hot-plug of PCI function missing bus"

[Test Case]
* IBM Z or LinuxONE hardware, equipped with hot-pluggable, multi-function PCIe cards (for example RoCE Express 2 adapters) in classic/traditional mode.
* An Ubuntu OS running in an LPAR with a kernel that includes the topology-aware PCI enumeration code (for example 20.04.1 without further updates, or the 20.10 GA kernel).
* Now, on a system that is in classic/traditional mode, hot plug ("Reassign I/O Path") a multi-function device, but using the FID of the second port.

[Regression Potential]
* There is at least some regression risk, but I consider it low, because:
* Even if the modification is a single if statement (that spans two lines) in 'zpci_event_availability', it could harm the zPCI event management even more; in the worst case it could break hot plug not only for systems in classic/traditional mode, but also in DPM mode (making the system hang) or for all ports.
* In such a case no enabling / disabling of devices would be possible.
* But the fix is very simple and straightforward: it checks zdev->zbus->bus for being NULL and in that case breaks out instead of calling the PCI common code pci_scan_single_device() with the NULL pointer.
* PCIe devices are usually more optional devices on s390x (compared to CCW and OSA devices for network), and this affects the zPCI subsystem only, which is unique to s390x.

[Other]
* The patch was accepted upstream with kernel v5.10-rc3, hence it will land sooner or later in Hirsute.
* The patch has also been tagged for the upstream stable v5.8 series, hence it will land in Groovy (based on the kernel team's regular 'Groovy update: v5.8.x upstream stable release' LP bug).
* Hence requesting this kernel SRU for Focal only, since Ubuntu releases older than Focal do not have the topology-aware zPCI enumeration code.

__________

Background:
When handling multifunction devices in zPCI we take the UID of the PCI function with function number 0 (that always exists according to the PCI spec) as domain number. Therefore when hot plugging functions with function number larger than 0 before function 0, we need to hold these in standby before creating the domain and bus. This has been tested during feature development using a patched QEMU and with DPM but never in Classic Mode.

Reproduction:
This issue was introduced with the Topology aware PCI Enumeration code, so test with a Linux supporting that feature, e.g. Upstream, Devel Driver etc. On a Classic Mode machine with a multi-function device, hot plug ("Reassign I/O Path") only the FID of the second port to the LPAR.

Symptom:
After this, any additional hotplug and even just deconfiguring a PCI device will hang. A hotplug makes the entire Linux instance unresponsive.

Analysis:
The problem occurs in Classic Mode but not with previous testing, as the LPAR hypervisor does hot plug/Reassign I/O Path as a two step process:

1. zPCI event with PEC 0x0302 to plug the zPCI function in Standby
2. zPCI event with PEC 0x0301 to configure the zPCI function

For the first event we create the zdev in clp_add_pci_device() in Standby, which is all fine so far. The problem then occurs in step 2, as we then find the existing zdev and try to configure it. This however does not work, as the PCI bus is not yet created (as we still don't know the UID of function 0 that will become its domain). The bus pointer zdev->zbus->bus is thus still NULL but will be accessed by common code, which inevitably results in disaster including the above mentioned hang and (possibly) the below RCU stall:

[ 689.724703] rcu: INFO: rcu_sched self-detected stall on CPU
[ 689.724712] rcu: 16-....: (42004 ticks this GP) idle=6ee/1/0x4000000000000002 softirq=1234/1234 fqs=14001
[ 689.724742] (t=42006 jiffies g=89 q=3770)
[ 689.724743] Task dump for CPU 16:
[ 689.724745] task:kmcheck state:R running task stack: 0 pid: 205 ppid: 2 flags:0x00000004
[ 689.724747] Call Trace:
[ 689.724757] [<0000000ccde0b5c4>] show_stack+0x8c/0xd8
[ 689.724762] [<0000000ccd0dabc4>] sched_show_task.part.0+0xe4/0x110
[ 689.724764] [<0000000ccde0ea5e>] rcu_dump_cpu_stacks+0xde/0x120
[ 689.724767] [<0000000ccd1465c6>] print_cpu_stall+0x266/0x330
[ 689.724768] [<0000000ccd14a428>] rcu_sched_clock_irq+0x618/0x670
[ 689.724771] [<0000000ccd15cd7a>] update_process_times+0xba/0xf0
[ 689.724775] [<0000000ccd1766fa>] tick_sched_timer+0x9a/0x220
[ 689.724777] [<0000000ccd15d962>] __hrtimer_run_queues+0x182/0x3a0
[ 689.724779] [<0000000ccd1602f8>] hrtimer_interrupt+0x138/0x450
[ 689.724782] [<0000000ccd0451c0>] do_IRQ+0x90/0xa0
[ 689.724784] [<0000000ccde2be96>] ext_int_handler+0x17e/0x184
[ 689.724790] [<0000000ccd9f373e>] pci_get_slot+0x5e/0xa0
[ 689.724794] [<0000000ccd9dc182>] pci_scan_single_device+0x32/0x2a0
[ 689.724797] [<0000000ccd0868f2>] __zpci_event_availability+0x192/0x360
[ 689.724800] [<0000000ccdd40c16>] chsc_process_crw+0x2e6/0x300
[ 689.724802] [<0000000ccdd4b088>] crw_collect_info+0x2b8/0x320
[ 689.724804] [<0000000ccd0caf3a>] kthread+0x14a/0x170
[ 689.724805] [<0000000ccde2b814>] ret_from_fork+0x24/0x2c

The fix is very simple: we check zdev->zbus->bus for being NULL and in that case bail from the case 0x0301 before calling the PCI common code pci_scan_single_device() with the NULL pointer. The only subtlety is that we still need to do the zpci_enable_device(), because the code in arch/s390/pci/pci_bus.c assumes that it can immediately do a scan of all devfn != 0 PCI functions once PCI function 0 is found. It thereby mimics what happens when we only find the FID for a function with devfn != 0 in the CLP List PCI Functions.

This is implemented in the following upstream commit:

0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 s390/pci: fix hot-plug of PCI function missing bus

It is included in v5.10-rc3 and has been tagged for stable > v5.8, i.e. all upstream versions with the PCI enumeration changes. It also carries the appropriate Fixes tag.

I have verified that it cherry-picks cleanly on current focal master-next and expect it to cherry-pick cleanly on newer Ubuntu kernels too.
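The check described in the description above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the upstream patch: every `model_*` name below is a hypothetical stand-in for the kernel objects mentioned in the report (zdev, zbus, zpci_enable_device(), pci_scan_single_device()), and the real event handler does more than is shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the report. */
struct model_bus {
	int scanned_functions;
};

struct model_zbus {
	struct model_bus *bus;	/* stays NULL until function 0 is known */
};

enum model_state { MODEL_STANDBY, MODEL_CONFIGURED };

struct model_zdev {
	struct model_zbus *zbus;
	enum model_state state;
	int enabled;
};

enum model_result { MODEL_DEFERRED, MODEL_SCANNED };

/* Stand-in for zpci_enable_device(): only records that it ran. */
static void model_enable_device(struct model_zdev *zdev)
{
	zdev->enabled = 1;
}

/* Stand-in for pci_scan_single_device(): dereferences the bus pointer. */
static void model_scan_single_device(struct model_bus *bus)
{
	bus->scanned_functions++;
}

/* Models handling of a configure event (PEC 0x0301) for an existing zdev. */
static enum model_result model_configure_event(struct model_zdev *zdev)
{
	assert(zdev != NULL && zdev->zbus != NULL);

	zdev->state = MODEL_CONFIGURED;

	/* Enable the function even without a bus, so that a later scan
	 * triggered by the arrival of function 0 can pick it up. */
	model_enable_device(zdev);

	/* No bus yet: function 0 has not appeared, so skip the scan
	 * instead of passing a NULL bus to the common code. */
	if (zdev->zbus->bus == NULL)
		return MODEL_DEFERRED;

	model_scan_single_device(zdev->zbus->bus);
	return MODEL_SCANNED;
}
```

In this model, a configure event for a function whose bus does not exist yet returns MODEL_DEFERRED with the function already enabled. Repeating the call after the bus pointer has been set performs the scan.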
2020-11-12 09:48:33 Frank Heimes linux (Ubuntu Focal): status New In Progress
2020-11-12 09:48:41 Frank Heimes linux (Ubuntu Focal): assignee Frank Heimes (fheimes)
2020-11-12 09:48:57 Frank Heimes linux (Ubuntu Groovy): assignee Frank Heimes (fheimes)
2020-11-12 09:49:05 Frank Heimes linux (Ubuntu Hirsute): status Fix Committed Triaged
2020-11-12 09:49:10 Frank Heimes linux (Ubuntu Hirsute): assignee Frank Heimes (fheimes)
2020-11-12 15:34:48 Terry Rudd bug added subscriber Terry Rudd
2020-11-27 13:06:16 Stefan Bader linux (Ubuntu Focal): importance Undecided Medium
2020-11-27 13:06:16 Stefan Bader linux (Ubuntu Focal): status In Progress Fix Committed
2020-11-27 13:25:20 Frank Heimes ubuntu-z-systems: status Triaged In Progress
2020-11-30 23:47:45 Ubuntu Kernel Bot tags architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-needed-focal
2020-12-02 16:50:53 Frank Heimes linux (Ubuntu Groovy): status Triaged In Progress
2020-12-02 16:52:26 Frank Heimes description
SRU Justification:
==================

[Impact]
* While handling multifunction devices in zPCI, the UID of the PCI function with function number 0 (which always exists according to the PCI spec) is taken as the domain number.
* Therefore, if functions with a function number larger than 0 are hot plugged before function 0, they need to be held in standby before the domain and bus are created.
* This has been tested during development of this feature using a patched QEMU and in DPM, but unfortunately never in classic/traditional HMC mode.
* On a classic/traditional mode machine with a multi-function device, after hot plugging ("Reassign I/O Path") only the FID of the second port to the LPAR, any additional hotplug (and even just deconfiguring a PCI device) will hang, and a hotplug makes the entire Linux instance unresponsive.
* The reason for this is a NULL pointer dereference when configuring a multi-function device with devfn != 0 before devfn == 0.
* This issue was introduced with the topology-aware PCI enumeration code.

[Fix]
* 0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 0b2ca2c7d0c9 "s390/pci: fix hot-plug of PCI function missing bus"

[Test Case]
* IBM Z or LinuxONE hardware, equipped with hot-pluggable, multi-function PCIe cards (for example RoCE Express 2 adapters) in classic/traditional mode.
* An Ubuntu OS running in an LPAR with a kernel that includes the topology-aware PCI enumeration code (for example 20.04.1 without further updates, or the 20.10 GA kernel).
* Now, on a system that is in classic/traditional mode, hot plug ("Reassign I/O Path") a multi-function device, but using the FID of the second port.

[Regression Potential]
* There is at least some regression risk, but I consider it low, because:
* Even if the modification is a single if statement (that spans two lines) in 'zpci_event_availability', it could harm the zPCI event management even more; in the worst case it could break hot plug not only for systems in classic/traditional mode, but also in DPM mode (making the system hang) or for all ports.
* In such a case no enabling / disabling of devices would be possible.
* But the fix is very simple and straightforward: it checks zdev->zbus->bus for being NULL and in that case breaks out instead of calling the PCI common code pci_scan_single_device() with the NULL pointer.
* PCIe devices are usually more optional devices on s390x (compared to CCW and OSA devices for network), and this affects the zPCI subsystem only, which is unique to s390x.

[Other]
* The patch was accepted upstream with kernel v5.10-rc3, hence it will land sooner or later in Hirsute.
* It was initially planned to address Groovy via the 5.8 upstream stable update, and in fact the patch was already marked for this, but it didn't make it because 5.8 had already reached its EOL.
* Hence, in addition to the already submitted SRU for Focal, this is now a separate SRU for Groovy.

__________

Background:
When handling multifunction devices in zPCI we take the UID of the PCI function with function number 0 (that always exists according to the PCI spec) as domain number. Therefore when hot plugging functions with function number larger than 0 before function 0, we need to hold these in standby before creating the domain and bus. This has been tested during feature development using a patched QEMU and with DPM but never in Classic Mode.

Reproduction:
This issue was introduced with the Topology aware PCI Enumeration code, so test with a Linux supporting that feature, e.g. Upstream, Devel Driver etc. On a Classic Mode machine with a multi-function device, hot plug ("Reassign I/O Path") only the FID of the second port to the LPAR.

Symptom:
After this, any additional hotplug and even just deconfiguring a PCI device will hang. A hotplug makes the entire Linux instance unresponsive.

Analysis:
The problem occurs in Classic Mode but not with previous testing, as the LPAR hypervisor does hot plug/Reassign I/O Path as a two step process:

1. zPCI event with PEC 0x0302 to plug the zPCI function in Standby
2. zPCI event with PEC 0x0301 to configure the zPCI function

For the first event we create the zdev in clp_add_pci_device() in Standby, which is all fine so far. The problem then occurs in step 2, as we then find the existing zdev and try to configure it. This however does not work, as the PCI bus is not yet created (as we still don't know the UID of function 0 that will become its domain). The bus pointer zdev->zbus->bus is thus still NULL but will be accessed by common code, which inevitably results in disaster including the above mentioned hang and (possibly) the below RCU stall:

[ 689.724703] rcu: INFO: rcu_sched self-detected stall on CPU
[ 689.724712] rcu: 16-....: (42004 ticks this GP) idle=6ee/1/0x4000000000000002 softirq=1234/1234 fqs=14001
[ 689.724742] (t=42006 jiffies g=89 q=3770)
[ 689.724743] Task dump for CPU 16:
[ 689.724745] task:kmcheck state:R running task stack: 0 pid: 205 ppid: 2 flags:0x00000004
[ 689.724747] Call Trace:
[ 689.724757] [<0000000ccde0b5c4>] show_stack+0x8c/0xd8
[ 689.724762] [<0000000ccd0dabc4>] sched_show_task.part.0+0xe4/0x110
[ 689.724764] [<0000000ccde0ea5e>] rcu_dump_cpu_stacks+0xde/0x120
[ 689.724767] [<0000000ccd1465c6>] print_cpu_stall+0x266/0x330
[ 689.724768] [<0000000ccd14a428>] rcu_sched_clock_irq+0x618/0x670
[ 689.724771] [<0000000ccd15cd7a>] update_process_times+0xba/0xf0
[ 689.724775] [<0000000ccd1766fa>] tick_sched_timer+0x9a/0x220
[ 689.724777] [<0000000ccd15d962>] __hrtimer_run_queues+0x182/0x3a0
[ 689.724779] [<0000000ccd1602f8>] hrtimer_interrupt+0x138/0x450
[ 689.724782] [<0000000ccd0451c0>] do_IRQ+0x90/0xa0
[ 689.724784] [<0000000ccde2be96>] ext_int_handler+0x17e/0x184
[ 689.724790] [<0000000ccd9f373e>] pci_get_slot+0x5e/0xa0
[ 689.724794] [<0000000ccd9dc182>] pci_scan_single_device+0x32/0x2a0
[ 689.724797] [<0000000ccd0868f2>] __zpci_event_availability+0x192/0x360
[ 689.724800] [<0000000ccdd40c16>] chsc_process_crw+0x2e6/0x300
[ 689.724802] [<0000000ccdd4b088>] crw_collect_info+0x2b8/0x320
[ 689.724804] [<0000000ccd0caf3a>] kthread+0x14a/0x170
[ 689.724805] [<0000000ccde2b814>] ret_from_fork+0x24/0x2c

The fix is very simple: we check zdev->zbus->bus for being NULL and in that case bail from the case 0x0301 before calling the PCI common code pci_scan_single_device() with the NULL pointer. The only subtlety is that we still need to do the zpci_enable_device(), because the code in arch/s390/pci/pci_bus.c assumes that it can immediately do a scan of all devfn != 0 PCI functions once PCI function 0 is found. It thereby mimics what happens when we only find the FID for a function with devfn != 0 in the CLP List PCI Functions.

This is implemented in the following upstream commit:

0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 s390/pci: fix hot-plug of PCI function missing bus

It is included in v5.10-rc3 and has been tagged for stable > v5.8, i.e. all upstream versions with the PCI enumeration changes. It also carries the appropriate Fixes tag.

I have verified that it cherry-picks cleanly on current focal master-next and expect it to cherry-pick cleanly on newer Ubuntu kernels too.
2020-12-03 08:44:16 Stefan Bader linux (Ubuntu Groovy): importance Undecided Medium
2020-12-03 09:00:58 Frank Heimes linux (Ubuntu Groovy): assignee Frank Heimes (fheimes)
2020-12-08 15:59:37 Ian May linux (Ubuntu Groovy): status In Progress Fix Committed
2020-12-08 16:03:39 Frank Heimes ubuntu-z-systems: status In Progress Fix Committed
2020-12-16 23:56:30 Kelsey Steele tags architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-needed-focal architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-done-focal
2021-01-04 13:54:06 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2021-01-29 07:38:58 Ubuntu Kernel Bot tags architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-done-focal architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-done-focal verification-needed-groovy
2021-01-29 07:56:42 Frank Heimes linux (Ubuntu Hirsute): status Triaged Fix Committed
2021-01-29 12:10:03 Frank Heimes tags architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-done-focal verification-needed-groovy architecture-s39064 bugnameltc-189163 severity-medium targetmilestone-inin2010 verification-done-focal verification-done-groovy
2021-02-22 07:29:43 Frank Heimes linux (Ubuntu Hirsute): status Fix Committed Fix Released
2021-02-23 16:22:36 Launchpad Janitor linux (Ubuntu Groovy): status Fix Committed Fix Released
2021-02-23 16:22:36 Launchpad Janitor cve linked 2020-27777
2021-02-23 16:22:36 Launchpad Janitor cve linked 2020-28974
2021-02-23 17:10:22 Frank Heimes ubuntu-z-systems: status Fix Committed Fix Released