[SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices results in a kernel panic.

Bug #1770254 reported by Manoj Iyer
This bug affects 1 person
Affects          Status         Importance   Assigned to             Milestone
linux (Ubuntu)   Fix Released   Critical     Canonical Kernel Team
Artful           Fix Released   Critical     Unassigned

Bug Description

[Impact]
Using vfio-pci on a combination of cn8xxx and some PCI devices results in a kernel panic. This is triggered by issuing a bus or a slot reset on the PCI device.

[Fix]
The following patches add checks that indicate the reset is not possible,
preventing the kernel panic.
357027786f35 PCI: Avoid bus reset if bridge itself is broken
822155100e58 PCI: Mark Cavium CN8xxx to avoid bus reset
33ba90aa4d44 PCI: Avoid slot reset if bridge itself is broken

These patches are already in Bionic. We need them in Artful so that xenial linux-hwe also has the fix. The platforms of interest were certified with Xenial and linux-hwe, so we are interested in fixing this only in Artful.

[Test]
- Artful Host with Artful/bionic guests
- Pass through a PCI device (vfio-pci) from Artful to bionic and make sure it is usable.
- The device should pass through and not cause a kernel panic.
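
The host-side part of this test can be scripted. The following is only a minimal sketch: the function names and the SYSFS_ROOT override are hypothetical helpers introduced here for illustration, and the check relies on the sysfs "driver" symlink of the device pointing at the bound driver.

```shell
#!/bin/sh
# Print the name of the driver bound to a PCI device by reading its
# sysfs "driver" symlink. SYSFS_ROOT is a hypothetical override used
# only for testing; it defaults to /sys.
pci_driver() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    # No symlink means no driver is bound (or the device does not exist).
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

# Succeed only when the device is currently bound to vfio-pci.
is_vfio_bound() {
    [ "$(pci_driver "$1")" = "vfio-pci" ]
}
```

For example, after detaching the device from the host, `is_vfio_bound 0002:01:00.1 && echo ok` should print "ok" if the device was handed to vfio-pci.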

A test kernel is available in ppa:manjo/lp1770254

[Regression Potential]
None.

Manoj Iyer (manjo)
description: updated
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1770254

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: artful
Manoj Iyer (manjo) wrote :

I was able to pass through a PCIe virtual function NIC from the host to the guest. No kernel panics were seen on the host.

-- Host with kernel patches --
ubuntu@seuss:~$ ethtool -i enP2p1s0f1
driver: thunder-nicvf
version: 1.0
firmware-version:
expansion-rom-version:
bus-info: 0002:01:00.1
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

ubuntu@seuss:~$ readlink /sys/bus/pci/devices/0002\:01\:00.1/driver
../../../../../../../bus/pci/drivers/thunder-nicvf
ubuntu@seuss:~$

ubuntu@seuss:~$ virsh nodedev-detach pci_0002_01_00_1
Device pci_0002_01_00_1 detached

ubuntu@seuss:~$ readlink /sys/bus/pci/devices/0002\:01\:00.1/driver
../../../../../../../bus/pci/drivers/vfio-pci
ubuntu@seuss:~$

-- Bionic Guest --

ubuntu@vm1:~$ lspci -k
00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
        Subsystem: Red Hat, Inc QEMU PCIe Host bridge
00:01.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:01.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:01.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:01.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:01.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
        Kernel driver in use: pcieport
        Kernel modules: shpchp
00:01.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
        Kernel driver in use: pcieport
        Kernel modules: shpchp
01:00.0 Ethernet controller: Red Hat, Inc Virtio network device (rev 01)
        Subsystem: Red Hat, Inc Virtio network device
        Kernel driver in use: virtio-pci
02:00.0 Communication controller: Red Hat, Inc Virtio console (rev 01)
        Subsystem: Red Hat, Inc Virtio console
        Kernel driver in use: virtio-pci
03:00.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
        Subsystem: Red Hat, Inc Virtio block device
        Kernel driver in use: virtio-pci
04:00.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
        Subsystem: Red Hat, Inc Virtio block device
        Kernel driver in use: virtio-pci
05:00.0 Ethernet controller: Cavium, Inc. THUNDERX Network Interface Controller virtual function (rev 09)
        Subsystem: Cavium, Inc. THUNDERX Network Interface Controller virtual function
        Kernel driver in use: thunder-nicvf
        Kernel modules: nicvf
ubuntu@vm1:~$

ubuntu@vm1:~$ ethtool -i enp5s0
driver: thunder-nicvf
version: 1.0
firmware-version:
expansion-rom-version:
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no
ubuntu@vm1:~$
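
The guest-side check can also be automated by parsing `lspci -k` output like the listing above. This is a rough sketch only: the function name is hypothetical, and it assumes the layout shown here, with device header lines starting in the first column and indented "Kernel driver in use:" lines below them.

```shell
#!/bin/sh
# Read `lspci -k` output on stdin and print the driver in use for the
# given slot (e.g. 05:00.0). Returns non-zero if the device has no
# "Kernel driver in use:" line or is not listed.
lspci_driver_in_use() {
    awk -v slot="$1" '
        # Device header lines are not indented; remember the current slot.
        /^[0-9a-f]/ { cur = $1 }
        # The driver name is the last field of the indented driver line.
        cur == slot && /Kernel driver in use:/ { print $NF; found = 1 }
        END { exit !found }
    '
}
```

In the guest this could be used as `lspci -k | lspci_driver_in_use 05:00.0`.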

description: updated
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Changed in linux (Ubuntu Artful):
status: New → In Progress
importance: Undecided → Critical
Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done within 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-artful
Manoj Iyer (manjo) wrote :

-- Host with -proposed kernel --

ubuntu@anuchin:~$ virsh nodedev-detach pci_0002_01_00_1
Device pci_0002_01_00_1 detached

ubuntu@anuchin:~$

ubuntu@anuchin:~$ readlink /sys/bus/pci/devices/0002\:01\:00.1/driver
../../../../../../../bus/pci/drivers/vfio-pci
ubuntu@anuchin:~$

-- Bionic Guest --
ubuntu@ubuntu-pcitest:~$ lspci -k
00:00.0 Host bridge: Red Hat, Inc. QEMU PCIe Host bridge
 Subsystem: Red Hat, Inc QEMU PCIe Host bridge
00:01.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
 Kernel driver in use: pcieport
 Kernel modules: shpchp
00:01.1 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
 Kernel driver in use: pcieport
 Kernel modules: shpchp
00:01.2 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
 Kernel driver in use: pcieport
 Kernel modules: shpchp
00:01.3 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
 Kernel driver in use: pcieport
 Kernel modules: shpchp
00:01.4 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
 Kernel driver in use: pcieport
 Kernel modules: shpchp
00:01.5 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
 Kernel driver in use: pcieport
 Kernel modules: shpchp
01:00.0 Ethernet controller: Red Hat, Inc Virtio network device (rev 01)
 Subsystem: Red Hat, Inc Virtio network device
 Kernel driver in use: virtio-pci
02:00.0 Communication controller: Red Hat, Inc Virtio console (rev 01)
 Subsystem: Red Hat, Inc Virtio console
 Kernel driver in use: virtio-pci
03:00.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
 Subsystem: Red Hat, Inc Virtio block device
 Kernel driver in use: virtio-pci
04:00.0 SCSI storage controller: Red Hat, Inc Virtio block device (rev 01)
 Subsystem: Red Hat, Inc Virtio block device
 Kernel driver in use: virtio-pci
05:00.0 Ethernet controller: Cavium, Inc. THUNDERX Network Interface Controller virtual function (rev 08)
 Subsystem: Cavium, Inc. THUNDERX Network Interface Controller virtual function
 Kernel modules: nicvf
ubuntu@ubuntu-pcitest:~$

tags: added: verification-done-artful
removed: verification-needed-artful
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.13.0-45.50

---------------
linux (4.13.0-45.50) artful; urgency=medium

  * linux: 4.13.0-45.50 -proposed tracker (LP: #1774124)

  * CVE-2018-3639 (x86)
    - SAUCE: Set generic SSBD feature for Intel cpus

linux (4.13.0-44.49) artful; urgency=medium

  * linux: 4.13.0-44.49 -proposed tracker (LP: #1772951)

  * CVE-2018-3639 (x86)
    - x86/cpu: Make alternative_msr_write work for 32-bit code
    - x86/cpu/AMD: Fix erratum 1076 (CPB bit)
    - x86/bugs: Fix the parameters alignment and missing void
    - KVM: SVM: Move spec control call after restore of GS
    - x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
    - x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
    - x86/cpufeatures: Disentangle SSBD enumeration
    - x86/cpufeatures: Add FEATURE_ZEN
    - x86/speculation: Handle HT correctly on AMD
    - x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
    - x86/speculation: Add virtualized speculative store bypass disable support
    - x86/speculation: Rework speculative_store_bypass_update()
    - x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
    - x86/bugs: Expose x86_spec_ctrl_base directly
    - x86/bugs: Remove x86_spec_ctrl_set()
    - x86/bugs: Rework spec_ctrl base and mask logic
    - x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
    - KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
    - x86/bugs: Rename SSBD_NO to SSB_NO
    - KVM: VMX: Expose SSBD properly to guests.

  * [Ubuntu 16.04] kernel: fix rwlock implementation (LP: #1761674)
    - SAUCE: (no-up) s390: fix rwlock implementation

  * CVE-2018-7492
    - rds: Fix NULL pointer dereference in __rds_rdma_map

  * CVE-2018-8781
    - drm: udl: Properly check framebuffer mmap offsets

  * fsnotify: Fix fsnotify_mark_connector race (LP: #1765564)
    - fsnotify: Fix fsnotify_mark_connector race

  * Kernel panic on boot (m1.small in cn-north-1) (LP: #1771679)
    - x86/xen: Reset VCPU0 info pointer after shared_info remap

  * Suspend to idle: Open lid didn't resume (LP: #1771542)
    - ACPI / PM: Do not reconfigure GPEs for suspend-to-idle

  * CVE-2018-1092
    - ext4: fail ext4_iget for root directory if unallocated

  * [SRU][Artful] using vfio-pci on a combination of cn8xxx and some PCI devices
    results in a kernel panic. (LP: #1770254)
    - PCI: Avoid bus reset if bridge itself is broken
    - PCI: Mark Cavium CN8xxx to avoid bus reset
    - PCI: Avoid slot reset if bridge itself is broken

  * Battery drains when laptop is off (shutdown) (LP: #1745646)
    - PCI / PM: Check device_may_wakeup() in pci_enable_wake()

  * perf record crash: refcount_inc assertion failed (LP: #1769027)
    - perf cgroup: Fix refcount usage
    - perf xyarray: Fix wrong processing when closing evsel fd

  * Dell Latitude 5490/5590 BIOS update 1.1.9 causes black screen at boot
    (LP: #1764194)
    - drm/i915/bios: filter out invalid DDC pins from VBT child devices

  * Fix an issue that some PCI devices get incorrectly suspended (LP: #1764684)
    - PCI / PM: Always check PME wakeup capability for runtime wakeup support

  * [SRU][Bionic/Artful] fix false positives in W...


Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
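
To check whether an Artful-series (4.13) host kernel is at least the fixed release 4.13.0-45.50 noted above, something like the following sketch may help. The function names are hypothetical, the comparison relies on GNU `sort -V`, and mapping the ABI part of `uname -r` (for example 4.13.0-45-generic) onto the package version is an assumption rather than an official rule.

```shell
#!/bin/sh
# True when version $1 is greater than or equal to version $2,
# using the version ordering of GNU sort (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the ABI part of a kernel release string against 4.13.0-45.
# With no argument, the running kernel (`uname -r`) is checked.
kernel_has_fix() {
    release=${1:-$(uname -r)}
    # Keep "4.13.0-45" from a string such as "4.13.0-45-generic".
    abi=$(printf '%s\n' "$release" | cut -d- -f1,2)
    version_ge "$abi" "4.13.0-45"
}
```

Running `kernel_has_fix && echo fixed` on the host gives a quick indication; the package changelog remains the authoritative record.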