powerpc/powernv/pci: Work around races in PCI bridge enabling

Bug #1788549 reported by bugproxy on 2018-08-23
This bug affects 2 people
Affects                            Importance  Assigned to
The Ubuntu-power-systems project   Critical    Canonical Kernel Team
linux (Ubuntu)                     Critical    Joseph Salisbury
  Xenial                           Critical    Joseph Salisbury
  Bionic                           Critical    Joseph Salisbury
  Cosmic                           Critical    Joseph Salisbury

Bug Description

== Comment: #0 - Michael Ranweiler <email address hidden> - 2018-08-21 13:52:03 ==
+++ This bug was initially created as a clone of Bug #170766 +++

Please apply the following kernel patch:

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=db2173198b9513f7add8009f225afa1f1c79bcc6

powerpc/powernv/pci: Work around races in PCI bridge enabling

The generic code is racy when multiple children of a PCI bridge try to
enable it simultaneously.

This leads to drivers trying to access a device through a
not-yet-enabled bridge, and thus to EEH errors under various
circumstances when using parallel driver probing.
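
As an illustration of this kind of race, here is a minimal Python sketch of one possible interleaving. It is a toy model, not kernel code: the `BridgeState` class, its two fields, and the split of enabling into a software step and a hardware step are all assumptions made for the example, and the actual window in the generic PCI code may differ in detail.

```python
class BridgeState:
    """Hypothetical model of a bridge: software bookkeeping vs. hardware."""

    def __init__(self):
        self.enable_count = 0    # assumed software "already enabled" counter
        self.hw_enabled = False  # whether the bridge really forwards accesses


def begin_enable(bridge):
    """Step 1 of enabling: record the claim in software."""
    bridge.enable_count += 1


def finish_enable(bridge):
    """Step 2 of enabling: the hardware starts forwarding accesses."""
    bridge.hw_enabled = True


def looks_enabled(bridge):
    """What a second child sees if it checks the bookkeeping only."""
    return bridge.enable_count > 0


def access_through(bridge):
    """True if an access to a device behind the bridge would succeed."""
    return bridge.hw_enabled


# One bad interleaving of two children, A and B, probing in parallel.
bridge = BridgeState()
begin_enable(bridge)                    # A: starts enabling the bridge
b_skips_enable = looks_enabled(bridge)  # B: sees "enabled", skips enabling
b_access_ok = access_through(bridge)    # B: touches its device too early
finish_enable(bridge)                   # A: only now is the bridge really up
a_access_ok = access_through(bridge)    # A: its own access works

print(b_skips_enable, b_access_ok, a_access_ok)
```

In this model B's access goes through a bridge that is not yet enabled, which corresponds to the failing case described above.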

There is work going on to fix that properly in the PCI core but it
will take some time.

x86 gets away with it because (outside of hotplug), the BIOS enables
all the bridges at boot time.

This patch does the same thing on powernv by enabling all bridges that
have child devices at boot time, thus avoiding subsequent races. It's
suitable for backporting to stable and distros, while the proper PCI
fix will probably be significantly more invasive.
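
The approach described in this paragraph can be modelled in a few lines. The following Python sketch is a simplified userspace model, not the kernel patch itself: `Bus`, `Bridge`, and `enable_bridges` are hypothetical names chosen for the example, and the real change operates on the kernel's own PCI structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bridge:
    """Hypothetical stand-in for a PCI bridge device."""
    name: str
    enabled: bool = False


@dataclass
class Bus:
    """Hypothetical stand-in for a PCI bus.

    `bridge` is the bridge leading to this bus (None for a root bus),
    `devices` names the devices on the bus, and `children` holds the
    buses that sit behind bridges on this bus.
    """
    devices: List[str] = field(default_factory=list)
    bridge: Optional[Bridge] = None
    children: List["Bus"] = field(default_factory=list)


def enable_bridges(bus: Bus) -> None:
    """Enable the bridge above every bus that has devices, depth first."""
    if not bus.devices:
        return  # empty bus: nothing behind it needs the bridge
    if bus.bridge is not None:
        bus.bridge.enabled = True
    for child in bus.children:
        enable_bridges(child)


# Example: one bridge with two devices behind it, one with none.
used = Bridge("bridge-a")
unused = Bridge("bridge-b")
root = Bus(
    devices=["bridge-a", "bridge-b"],
    children=[Bus(devices=["nic0", "nic1"], bridge=used), Bus(bridge=unused)],
)
enable_bridges(root)  # run once, before any driver probes in parallel
print(used.enabled, unused.enabled)
```

Because the walk runs once before parallel probing starts, every driver in this model finds its upstream bridge already enabled and never has to enable it itself.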

Signed-off-by: Benjamin Herrenschmidt <email address hidden>
Cc: <email address hidden>
Signed-off-by: Michael Ellerman <email address hidden>

== Comment: #2 - Michael Ranweiler <email address hidden> - 2018-08-21 18:23:35 ==
The patch applies with some fuzz, and the backport also has to move from the pci macro back to dev_err, so we'll attach the backported patch.

bugproxy (bugproxy) on 2018-08-23
tags: added: architecture-ppc64le bugnameltc-170793 severity-critical targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → kernel-package (Ubuntu)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Critical
tags: added: triage-g
affects: kernel-package (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu Bionic):
status: New → In Progress
Changed in linux (Ubuntu Cosmic):
status: New → In Progress
Changed in linux (Ubuntu Bionic):
importance: Undecided → Critical
Changed in linux (Ubuntu Cosmic):
importance: Undecided → Critical
Changed in linux (Ubuntu Bionic):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Cosmic):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I built a test kernel with commit db2173198b95. The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1788549

Can you test this kernel and see if it resolves this bug?

Note about installing test kernels:
• If the test kernel is prior to 4.15 (Bionic), you need to install the linux-image and linux-image-extra .deb packages.
• If the test kernel is 4.15 (Bionic) or newer, you need to install the linux-modules, linux-modules-extra and linux-image-unsigned .deb packages.
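
The rule in the two bullets above can be written down as a small helper. This is only a sketch: the function name `packages_for_test_kernel` is made up for the example, and it assumes the version string starts with `major.minor`.

```python
from typing import List


def packages_for_test_kernel(version: str) -> List[str]:
    """Return the .deb package families to install for a test kernel.

    Assumes `version` begins with "major.minor", e.g. "4.15.0-33-generic".
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) < (4, 15):
        # Kernels prior to 4.15 (Bionic)
        return ["linux-image", "linux-image-extra"]
    # 4.15 (Bionic) or newer
    return ["linux-modules", "linux-modules-extra", "linux-image-unsigned"]


print(packages_for_test_kernel("4.4.0-134-generic"))
print(packages_for_test_kernel("4.15.0-33-generic"))
```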

Thanks in advance!

Changed in ubuntu-power-systems:
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Joseph Salisbury (jsalisbury)
Joseph Salisbury (jsalisbury) wrote :

I also built a Xenial test kernel with commit db2173198b9513f7add8009f225afa1f1c79bcc6, since it was requested in bug 1788850. That bug is now marked as a duplicate of this bug.

The Xenial test kernel is available from:
http://kernel.ubuntu.com/~jsalisbury/lp1788549/xenial

Manoj Iyer (manjo) wrote :

IBM, could you please verify the PPA test kernels provided by the kernel team, and report back here.

Manoj Iyer (manjo) on 2018-10-01
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Changed in ubuntu-power-systems:
status: In Progress → Incomplete
Andrew Cloke (andrew-cloke) wrote :

IBM is currently testing the test kernel from the PPA referred to in comment #2.

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
Changed in linux (Ubuntu Bionic):
status: In Progress → Incomplete
Changed in linux (Ubuntu Cosmic):
status: In Progress → Incomplete
Mike Ranweiler (mranweil) wrote :

I've tried this on two systems with several different tests. I can't recreate the original problem, which is intermittent and can require quite a few devices, but I did not see any regressions.

Changed in ubuntu-power-systems:
status: Incomplete → Triaged
Changed in linux (Ubuntu):
status: Incomplete → In Progress
Changed in linux (Ubuntu Xenial):
status: Incomplete → In Progress
Changed in linux (Ubuntu Bionic):
status: Incomplete → In Progress
Changed in linux (Ubuntu Cosmic):
status: Incomplete → In Progress
Changed in ubuntu-power-systems:
status: Triaged → In Progress
Joseph Salisbury (jsalisbury) wrote :

@Mike, are you comfortable enough with the testing you've done to submit an SRU request? Or would you like to perform some additional testing?

Changed in linux (Ubuntu Cosmic):
status: In Progress → Fix Released