amd-iommu: kernel BUG & lockup after shutting down KVM guest using PCI passthrough/PCIe bridge

Bug #1375266 reported by Marti
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Linux
Unknown
Unknown
linux (Ubuntu)
Fix Released
Medium
Unassigned
Trusty
Fix Released
Medium
Chris J Arges

Bug Description

SRU Justification:
[Impact]
When using a KVM VM and adding certain combinations of PCI/PCI-e devices an oops and freeze can occur when shutting down a VM.

[Fix]
Commit 9b29d3c6510407d91786c1cf9183ff4debb3473a which is upstream in 3.17-rc2 and in stable/3.16.y.
This fix changes how cleanup_domain detaches device, instead of using the list_for_each_entyr_safe macro, it just iterates through the devices and removes the first element.

[Test Case]
Create KVM VM with a specific configuration of PCI/PCIe devices, and shutdown the VM. See https://bugzilla.kernel.org/show_bug.cgi?id=81841 for details.

--

This kernel lockup bug was reported to and fixed upstream: https://bugzilla.kernel.org/show_bug.cgi?id=81841

Please backport the "stable" kernel patch to Ubuntu kernels (at least trusty, which I use in this setup):
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=9b29d3c6510407d91786c1cf9183ff4debb3473a

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-35-generic 3.13.0-35.62
ProcVersionSignature: Ubuntu 3.13.0-35.62-generic 3.13.11.6
Uname: Linux 3.13.0-35-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version k3.13.0-35-generic.
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.14.1-0ubuntu3.4
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC1', '/dev/snd/hwC1D0', '/dev/snd/pcmC1D0c', '/dev/snd/pcmC1D0p', '/dev/snd/pcmC1D1p', '/dev/snd/pcmC1D2c', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D3p', '/dev/snd/pcmC0D7p', '/dev/snd/pcmC0D8p', '/dev/snd/pcmC0D9p', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory: 'iw'
Card0.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card0.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
Card1.Amixer.info: Error: [Errno 2] No such file or directory: 'amixer'
Card1.Amixer.values: Error: [Errno 2] No such file or directory: 'amixer'
Date: Mon Sep 29 16:19:50 2014
HibernationDevice: RESUME=UUID=54378db4-8de4-45f0-b7cb-2d8e1097139d
InstallationDate: Installed on 2014-09-04 (25 days ago)
InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Release amd64 (20140416.2)
MachineType: To Be Filled By O.E.M. To Be Filled By O.E.M.
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/usr/bin/zsh
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-35-generic root=UUID=a8f2d0ae-f43e-47de-9ef9-99fed8c9c78e ro nomdmonddf nomdmonisw nomdmonddf nomdmonisw
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-35-generic N/A
 linux-backports-modules-3.13.0-35-generic N/A
 linux-firmware 1.127.7
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
WifiSyslog:

dmi.bios.date: 01/24/2014
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: P2.90
dmi.board.name: FM2A88X Extreme6+
dmi.board.vendor: ASRock
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrP2.90:bd01/24/2014:svnToBeFilledByO.E.M.:pnToBeFilledByO.E.M.:pvrToBeFilledByO.E.M.:rvnASRock:rnFM2A88XExtreme6+:rvr:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: To Be Filled By O.E.M.
dmi.product.version: To Be Filled By O.E.M.
dmi.sys.vendor: To Be Filled By O.E.M.

Revision history for this message
Marti (intgr) wrote :
Revision history for this message
Chris J Arges (arges) wrote :

This fix was already released in:
Ubuntu-3.16.0-17.23
Marking this Fix Released for Utopic.

Changed in linux (Ubuntu):
assignee: nobody → Chris J Arges (arges)
importance: Undecided → Medium
Changed in linux (Ubuntu Trusty):
importance: Undecided → Medium
status: New → In Progress
assignee: nobody → Chris J Arges (arges)
Changed in linux (Ubuntu):
status: New → In Progress
assignee: Chris J Arges (arges) → nobody
status: In Progress → Fix Released
Revision history for this message
Chris J Arges (arges) wrote :

Sent SRU to Ubuntu Kernel mailing list for inclusion in the 3.13 kernel.

description: updated
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Trusty):
status: In Progress → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-trusty' to 'verification-done-trusty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-trusty
Revision history for this message
Marti (intgr) wrote :

Wow, amazing what kind of bureaucracy getting a small fix in involves.

tags: added: verification-done-trusty
removed: verification-needed-trusty
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (10.4 KiB)

This bug was fixed in the package linux - 3.13.0-39.66

---------------
linux (3.13.0-39.66) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1386629

  [ Upstream Kernel Changes ]

  * KVM: x86: Check non-canonical addresses upon WRMSR
    - LP: #1384539
    - CVE-2014-3610
  * KVM: x86: Prevent host from panicking on shared MSR writes.
    - LP: #1384539
    - CVE-2014-3610
  * KVM: x86: Improve thread safety in pit
    - LP: #1384540
    - CVE-2014-3611
  * KVM: x86: Fix wrong masking on relative jump/call
    - LP: #1384545
    - CVE-2014-3647
  * KVM: x86: Warn if guest virtual address space is not 48-bits
    - LP: #1384545
    - CVE-2014-3647
  * KVM: x86: Emulator fixes for eip canonical checks on near branches
    - LP: #1384545
    - CVE-2014-3647
  * KVM: x86: emulating descriptor load misses long-mode case
    - LP: #1384545
    - CVE-2014-3647
  * KVM: x86: Handle errors when RIP is set during far jumps
    - LP: #1384545
    - CVE-2014-3647
  * kvm: vmx: handle invvpid vm exit gracefully
    - LP: #1384544
    - CVE-2014-3646
  * Input: synaptics - gate forcepad support by DMI check
    - LP: #1381815

linux (3.13.0-38.65) trusty; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1379244

  [ Andy Whitcroft ]

  * Revert "SAUCE: scsi: hyper-v storsvc switch up to SPC-3"
    - LP: #1354397
  * [Config] linux-image-extra is additive to linux-image
    - LP: #1375310
  * [Config] linux-image-extra postrm is not needed on purge
    - LP: #1375310

  [ Upstream Kernel Changes ]

  * Revert "KVM: x86: Increase the number of fixed MTRR regs to 10"
    - LP: #1377564
  * Revert "USB: option,zte_ev: move most ZTE CDMA devices to zte_ev"
    - LP: #1377564
  * aufs: bugfix, stop calling security_mmap_file() again
    - LP: #1371316
  * ipvs: fix ipv6 hook registration for local replies
    - LP: #1349768
  * Drivers: add blist flags
    - LP: #1354397
  * sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout
    - LP: #1354397
  * drm/i915/bdw: Add 42ms delay for IPS disable
    - LP: #1374389
  * drm/i915: add null render states for gen6, gen7 and gen8
    - LP: #1374389
  * drm/i915/bdw: 3D_CHICKEN3 has write mask bits
    - LP: #1374389
  * drm/i915/bdw: Disable idle DOP clock gating
    - LP: #1374389
  * drm/i915: call lpt_init_clock_gating on BDW too
    - LP: #1374389
  * drm/i915: shuffle panel code
    - LP: #1374389
  * drm/i915: extract backlight minimum brightness from VBT
    - LP: #1374389
  * drm/i915: respect the VBT minimum backlight brightness
    - LP: #1374389
  * drm/i915/bdw: Apply workarounds in render ring init function
    - LP: #1374389
  * drm/i915/bdw: Cleanup pre prod workarounds
    - LP: #1374389
  * drm/i915: Replace hardcoded cacheline size with macro
    - LP: #1374389
  * drm/i915: Refactor Broadwell PIPE_CONTROL emission into a helper.
    - LP: #1374389
  * drm/i915: Add the WaCsStallBeforeStateCacheInvalidate:bdw workaround.
    - LP: #1374389
  * drm/i915/bdw: Remove BDW preproduction W/As until C stepping.
    - LP: #1374389
  * mptfusion: enable no_write_same for vmware scsi disks
    - LP: #1371591
  * iommu/amd: Fix cleanup_domai...

Changed in linux (Ubuntu Trusty):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.