Instability in IVSHMEM after updating QEMU 4.2 -> 5.0

Bug #1901440 reported by Dave G
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
qemu (Ubuntu)
Undecided
Unassigned

Bug Description

Updating Ubuntu from 20.08 to 20.10 which updates QEMU from 4.2 to 5.0 results in the virtual machines freezing when the IVSHMEM interface is active. This workstation typically runs several windows 10 virtual machines that are accessed locally: two using the spice viewer and one that uses an passthrough assigned GPU accessed through a viewer called Looking Glass. Looking Glass uses the IVSHMEM device interface to pass captured frames from the windows virtual machine to the linux host for display by a viewer application.

This workstation was 100% stable under Ubuntu 20.08 (QEMU 4.2). It handled a variety of heavy loads all day it never froze or crashed. It became unstable under Ubuntu 20.10 (QEMU 5.0), seemingly triggered by high levels of SHM activity. I was able to reliably reproduce the problem when playing a video in the looking-glass vm while playing another video in a spice vm. Other scenarios would also trigger this problem less reliably, but this video playback scenario would trigger it after 3-5 minutes of playback.

The result of this new instability would manifest itself by all running vms on the host freezing but the host was not visibly effected. I could find no warnings or errors in any relevant system or QEMU logs. It wasn't just spice, when I accessed the gpu-passthru vm via directly assigned devices it was frozen, still outputting video of the last frame before the crash. All vms would have to be force-shutdown and the host rebooted to regain vm functionality. Just forcing shutdown and restarting a vm would result in showing 'running' status but it would be frozen and inaccessible until system reboot.

I suspect this is a QEMU host / kernel error for several reasons: Having to reboot the host, insensitivity to VM changes including virtio-win version, etc. I suspect it's related to IVSHMEM due to the correlation of the freeze to the looking-glass related activity.

This might be kernel / PCIe / power management related in some way. While experimenting to troubleshoot this issue I was able to trigger the error more quickly by disabling PCIe power management in the BIOS.

The system was 100% stable under QEMU 5.0 when not running the looking-glass vm, quite stable when the looking-glass vm was idle or lightly used, but appeared increasingly unstable as SHM activity increased.

Sorry if this is a bit anecdotal, this is my work machine and unfortunately today I was forced to rollback restore to Ubuntu 20.08 (QEMU 4.2) from backup so I can work on Monday. The system returned to 100% stability after returning to 20.08.

If requested I can restore the Ubuntu 20.10 snapshot to reproduce & gather information as directed.

Revision history for this message
Thomas Huth (th-huth) wrote :

Can you reproduce this with the latest upstream QEMU release (v5.1)? Or did you only try with the versions that ship with Ubuntu?

Revision history for this message
Dave G (player-001) wrote :

This problem occurred with the QEMU 5.0 version that was distributed with the Ubuntu 20.10 update.

Revision history for this message
Thomas Huth (th-huth) wrote :

Ok, so I'm moving this to the Ubuntu bug tracker.

affects: qemu → qemu (Ubuntu)
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi Dave,
there is no Ubuntu 20.08 I'd know of I assume you mean 20.04 (Focal Fossa) as that would have qemu 4.2.
We'd need to start sorting out if this is a kernel or qemu issue.
Since you are know rolled back to 20.04 you might give [1] a try that just bumps the core virt components but keeps everything else on 20.04.

Report back if that triggers the issue on Focal. If it does you can then go back to the packages of 21.04 one by one e.g. bios first, then libvirt, ... until we can pinpoint if it indeed is qemu or another package.
If instead it works with these builds then that implies we need to look at the kernel, in that case still speak up here please.

[1]: https://launchpad.net/~canonical-server/+archive/ubuntu/server-backports/

Changed in qemu (Ubuntu):
status: New → Incomplete
Revision history for this message
Dave G (player-001) wrote : Re: [Bug 1901440] Re: Instability in IVSHMEM after updating QEMU 4.2 -> 5.0
Download full text (4.3 KiB)

 Hi Christian,
Thanks of your interest. Yes, meant to say Ubuntu 20.04.1 LTS.
This weekend if I have the time I'm going to install 20.10 on a separate partition and experiment on that.
Would you happen to know if anyone else is reporting similar problems?
If I find anything interesting I'll report back on this email thread.
     On Tuesday, October 27, 2020, 08:30:51 AM PDT, Christian Ehrhardt  <email address hidden> wrote:

 Hi Dave,
there is no Ubuntu 20.08 I'd know of I assume you mean 20.04 (Focal Fossa) as that would have qemu 4.2.
We'd need to start sorting out if this is a kernel or qemu issue.
Since you are know rolled back to 20.04 you might give [1] a try that just bumps the core virt components but keeps everything else on 20.04.

Report back if that triggers the issue on Focal. If it does you can then go back to the packages of 21.04 one by one e.g. bios first, then libvirt, ... until we can pinpoint if it indeed is qemu or another package.
If instead it works with these builds then that implies we need to look at the kernel, in that case still speak up here please.

[1]: https://launchpad.net/~canonical-server/+archive/ubuntu/server-
backports/

** Changed in: qemu (Ubuntu)
      Status: New => Incomplete

--
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1901440

Title:
  Instability in IVSHMEM after updating QEMU 4.2 -> 5.0

Status in qemu package in Ubuntu:
  Incomplete

Bug description:
  Updating Ubuntu from 20.08 to 20.10 which updates QEMU from 4.2 to 5.0
  results in the virtual machines freezing when the IVSHMEM interface is
  active.  This workstation typically runs several windows 10 virtual
  machines that are accessed locally:  two using the spice viewer and
  one that uses an passthrough assigned GPU accessed through a viewer
  called Looking Glass.  Looking Glass uses the IVSHMEM device interface
  to pass captured frames from the windows virtual machine to the linux
  host for display by a viewer application.

  This workstation was 100% stable under Ubuntu 20.08 (QEMU 4.2).  It
  handled a variety of heavy loads all day it never froze or crashed.
  It became unstable under Ubuntu 20.10 (QEMU 5.0), seemingly triggered
  by high levels of SHM activity.  I was able to reliably reproduce the
  problem when playing a video in the looking-glass vm while playing
  another video in a spice vm.  Other scenarios would also trigger this
  problem less reliably, but this video playback scenario would trigger
  it after 3-5 minutes of playback.

  The result of this new instability would manifest itself by all
  running vms on the host freezing but the host was not visibly
  effected.  I could find no warnings or errors in any relevant system
  or QEMU logs.  It wasn't just spice, when I accessed the gpu-passthru
  vm via directly assigned devices it was frozen, still outputting video
  of the last frame before the crash.  All vms would have to be force-
  shutdown and the host rebooted to regain vm functionality.  Just
  forcing shutdown and restarting a vm would result in showing 'running'
  status but it would be frozen and inaccessible until sys...

Read more...

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for qemu (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu (Ubuntu):
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers