frequent 15-sec guest freeze with ubuntu 22.04 host and guest

Bug #1972914 reported by AlpineCarver
This bug affects 4 people
Affects                          Status        Importance  Assigned to  Milestone
linux (Ubuntu)                   Fix Released  Undecided   Unassigned
  Jammy                          Fix Released  Undecided   Unassigned
  Noble                          Fix Released  Undecided   Unassigned
xserver-xorg-video-qxl (Ubuntu)  Invalid       Undecided   Unassigned

Bug Description

I'm running a new installation of Ubuntu 22.04 Desktop on a Thinkpad T450s (core i5-5200, 2 cores / 4 vCPUs, 12 GB memory). Using the virt-manager GUI, I performed what I believe is a simple, plain-vanilla installation of an Ubuntu 22.04 guest running under qemu/kvm with 2 vCPUs and 4 GB memory.

I'm seeing very frequent, 15-second freezes of the guest. When it happens, the guest is completely unresponsive. After about 15 seconds, it works normally again, until the next freeze. The duration of the freeze appears to be the same every time. The guest isn't doing much - just open the calculator app and click number buttons. The freeze doesn't happen if I don't interact with the guest (just leave the system monitor running in the guest, so I can see that it's not frozen).

I observe the freeze when I have 2 of the 4 vCPUs dedicated to the guest. I do NOT observe it when I have only one vCPU dedicated to the guest.

Both the host and guest are using only a small fraction of the memory available to them. When the problem happens, the host indicates that CPU usage is very low across all vCPUs. The host appears to be operating normally when the guest is frozen.

I see the following pair of lines in the guest syslog every time the freeze occurs (and only when the freeze occurs):

May 10 13:48:40 qemu-jammy kernel: [ 144.259799] qxl 0000:00:01.0: object_init failed for (8298496, 0x00000001)

May 10 13:48:40 qemu-jammy kernel: [ 144.259819] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO

I don't see anything in the host syslog that correlates with the freeze.

If I choose "Virtio" in the Video drop-down in the virt-manager GUI, with "3D acceleration" UNchecked, the guest works fine, and the freeze never happens. Unfortunately, that loses fractional scaling, which is important to me. If I check "3D acceleration," the guest won't boot.

AlpineCarver (acarv) wrote :

Increasing video memory appears to have fixed my problem.

My virtual screen size is 1920x1080 and in virt-manager's Video panel / XML tab, vgamem was set to the default value of 16384. After increasing it to 65536, I'm no longer seeing the freeze (though with only a few minutes of testing, so far).
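For anyone who wants to check the numbers, here is a rough sketch of the arithmetic. It assumes 32-bit colour (4 bytes per pixel) and that vgamem is given in KiB, as in the libvirt XML; the function names are made up for illustration. It does not show how the qxl driver actually uses the memory, only how many whole frames of this size would fit.

```shell
#!/bin/bash
# Assumption: 32-bit colour, i.e. 4 bytes per pixel unless overridden.
# Size of one full frame in KiB.
frame_kib() {
    local width=$1 height=$2 bytes_per_pixel=${3:-4}
    echo $(( width * height * bytes_per_pixel / 1024 ))
}

# How many whole frames of the given size fit into vgamem (in KiB)?
frames_that_fit() {
    local vgamem_kib=$1 width=$2 height=$3
    echo $(( vgamem_kib / $(frame_kib "$width" "$height") ))
}

# Compare the default vgamem with the increased value from this comment.
for vgamem_kib in 16384 65536; do
    echo "vgamem=${vgamem_kib} KiB: $(frames_that_fit "$vgamem_kib" 1920 1080) whole 1920x1080 frame(s) of $(frame_kib 1920 1080) KiB each"
done
```

Under these assumptions one 1920x1080 frame is 8100 KiB (8294400 bytes), so 16384 KiB holds two whole frames and 65536 KiB holds eight. The failed allocation in the guest syslog above (8298496 bytes) is exactly one such frame plus 4096 bytes.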

I don't know enough about this software to tell whether this means QXL itself is not at fault. If so, this bug report can be closed, although the freeze is a confusing failure mode. It would be nice if the overall user experience could be improved somehow. Perhaps the Displays panel in GNOME Settings could detect when a requested resolution exceeds the available video memory and prevent it, preferably with a useful message to the user.

John Hartley (graphdrum) wrote :

Confirming this report on:

Ubuntu 22.04 Host with:
Lenovo Server
Dual 16 Core CPUs (with hyperthreading == 64 CPUS)
384GB RAM
Ubuntu 22.04
Libvirt 8.0.0
QEMU API 8.0.0
Hypervisor QEMU 6.2.0

Ubuntu 22.04 Desktop Guest with:
Q35 VM with OVMF
4 x CPU
8192MB RAM

Trying to run "snap Eclipse". UI performance is so bad that it is unusable.
I have tested the same software configuration with an Ubuntu 20.04 guest with only 2 CPUs/4096MB RAM, and performance is very snappy with no issues.

On my 22.04 guest I also have a lot of qxl errors in my kern.log:

sudo grep 'drm:qxl_alloc' kern.log
Sep 11 21:08:36 graphit kernel: [ 103.590807] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
Sep 11 21:11:09 graphit kernel: [ 256.682227] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
Sep 11 21:11:42 graphit kernel: [ 289.705678] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
Sep 11 21:12:00 graphit kernel: [ 307.625499] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
Sep 11 21:12:18 graphit kernel: [ 325.801386] [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO
...
...

I updated the libvirt vgamem value as per your finding and now have a workable VM again.

So while there is a workaround, I believe there should be a fix in libvirt/qemu to ensure that the VM is configured with a vgamem value that results in a working VM.
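To see which vgamem value a guest currently has before changing it, something like the following might help. It is only a sketch: extract_vgamem is a made-up helper name, the sample XML line is illustrative rather than copied from a real guest, and "jammy-guest" is a hypothetical domain name.

```shell
#!/bin/bash
# Print the first vgamem value (KiB) found in libvirt domain XML on stdin.
# Prints nothing if no vgamem attribute is present.
extract_vgamem() {
    sed -n "s/.*vgamem=['\"]\([0-9][0-9]*\)['\"].*/\1/p" | head -n 1
}

# Illustrative sample of a QXL <model> line (assumed values).
sample_xml="<model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>"
echo "sample vgamem: $(echo "$sample_xml" | extract_vgamem) KiB"

# Against a real guest (the domain name is hypothetical):
#   virsh dumpxml jammy-guest | extract_vgamem
```

The value can then be changed with virsh edit or in the XML tab of virt-manager's Video panel, as described in the first comment.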

I have posted my testing in detail here: https://tips.graphica.com.au/ubuntu-eclipse-snap-is-broken/

Thank you for posting the bug report.

Timo Juhani Lindfors (timo-lindfors) wrote :

I'm seeing this on Debian 12 without Wayland or Xorg simply by running

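#!/bin/bash
# Reproducer: flood a text console with output until the qxl driver
# logs an allocation failure. Typically needs root (chvt, /dev/tty3).

chvt 3
for j in $(seq 80); do
    echo "$(date) starting round $j"
    for i in $(seq 100); do
        dmesg > /dev/tty3
    done
    # Check after each round so a failure in the last round is not missed.
    if journalctl --boot | grep -q "failed to allocate VRAM BO"; then
        echo "bug was reproduced after $j tries"
        exit 1
    fi
done

echo "bug could not be reproduced"
exit 0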

This allowed me to run git bisect which identified the following commit:

commit 5a838e5d5825c85556011478abde708251cc0776 (refs/bisect/bad)
Author: Gerd Hoffmann <email address hidden>
Date: Thu Feb 4 15:57:10 2021 +0100

    drm/qxl: simplify qxl_fence_wait

    Now that we have the new release_event wait queue we can just
    use that in qxl_fence_wait() and simplify the code a lot.

    Signed-off-by: Gerd Hoffmann <email address hidden>
    Acked-by: Thomas Zimmermann <email address hidden>
    Link: http://patchwork<email address hidden>

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xserver-xorg-video-qxl (Ubuntu):
status: New → Confirmed
alienheartbeat (reckless-symmetry) wrote :

Same issue:

First there is one message like this in the journalctl output:
[drm:qxl_gem_object_create [qxl]] *ERROR* Failed to allocate GEM object (261100, 1, 4096, -12)
kernel: [drm:qxl_alloc_ioctl [qxl]] *ERROR* qxl_alloc_ioctl: failed to create gem ret=-12

followed by many messages like this:
kernel: [TTM] Buffer eviction failed
kernel: qxl 0000:00:01.0: object_init failed for (3149824, 0x00000001)
kernel: [drm:qxl_alloc_bo_reserved [qxl]] *ERROR* failed to allocate VRAM BO

and then
X connection to :0 broken (explicit kill or server shutdown).
The virt-viewer screen is frozen.

However I can ssh in and reboot.

The crash can happen if I am browsing or deleting a directory in the file manager,
or when the VM is idle.
I return to find it frozen.
It is independent of how much memory I allocate to the VM or to the QXL graphics driver.

Setup
The VMs are on a clean install of Xubuntu 22.04 with the -hwe kernel linux-image-6.5.0-28
running on 2 different Xubuntu 22.04 hosts, one with
the -hwe kernel linux-image-6.5.0-28-generic and one with the standard linux-image-5.15.0-105-generic.

Versions
qemu-system-x86 6.2, libvirt 8.0.0, libspice-vdagent 0.22.1, spice-client-gtk 0.39

It is *possible* that this kernel bug is the same:
https://<email address hidden>/T/

Changing video driver from QXL to VirtIO eliminates the problem.

Matthew Ruffell (mruffell) wrote :

Hi everyone,

$ cd Work/kernel/ubuntu-jammy/
~/Work/kernel/ubuntu-jammy$ git log --grep 'Revert "drm/qxl: simplify qxl_fence_wait"'
commit 1b146e3dc802253fd9a6e29e2d3b06d003fe9182
Author: Alex Constantino <email address hidden>
Date: Thu Apr 4 19:14:48 2024 +0100

    Revert "drm/qxl: simplify qxl_fence_wait"

...
~/Work/kernel/ubuntu-jammy$ git describe --contains 1b146e3dc802253fd9a6e29e2d3b06d003fe9182
Ubuntu-5.15.0-115.125~199
~/Work/kernel/ubuntu-jammy$ cd ..
~/Work/kernel$ cd ubuntu-noble/
~/Work/kernel/ubuntu-noble$ git log --grep 'Revert "drm/qxl: simplify qxl_fence_wait"' origin/master-next
commit ee451375fd8b767eb91721fa389b022f1582cb0f
Author: Alex Constantino <email address hidden>
Date: Thu Apr 4 19:14:48 2024 +0100

    Revert "drm/qxl: simplify qxl_fence_wait"
...
~/Work/kernel/ubuntu-noble$ git describe --contains ee451375fd8b767eb91721fa389b022f1582cb0f
Ubuntu-6.8.0-38.38~331

This has been fixed in 5.15.0-115-generic or later, and 6.8.0-38-generic or later.
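A quick way to check a kernel release string against those two versions is sketched below. The function name has_qxl_fix is made up for illustration, and it only knows the two series named above (5.15.0 and 6.8.0); any other series is reported as not covered rather than guessed. Since qxl is the guest's display driver, run it inside the guest.

```shell
#!/bin/bash
# Return 0 if the release contains the revert, 1 if it predates it,
# 2 if the series is not one of the two covered here.
has_qxl_fix() {
    local release=$1             # e.g. "5.15.0-115-generic"
    local series=${release%%-*}  # e.g. "5.15.0"
    local rest=${release#*-}     # e.g. "115-generic"
    local abi=${rest%%-*}        # e.g. "115"
    # Reject anything where the ABI number is missing or not numeric.
    case "$abi" in
        ''|*[!0-9]*) return 2 ;;
    esac
    case "$series" in
        5.15.0) [ "$abi" -ge 115 ] ;;
        6.8.0)  [ "$abi" -ge 38 ] ;;
        *)      return 2 ;;
    esac
}

status=0
has_qxl_fix "$(uname -r)" || status=$?
case "$status" in
    0) echo "$(uname -r): contains the qxl_fence_wait revert" ;;
    1) echo "$(uname -r): older than the fixed version, update the guest kernel" ;;
    *) echo "$(uname -r): series not covered by this check" ;;
esac
```

For example, has_qxl_fix 5.15.0-105-generic returns 1, and has_qxl_fix 6.5.0-28-generic returns 2 because the 6.5 series is not one of the two versions listed here.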

Let me know if you need any more help.

Thanks,
Matthew

Changed in xserver-xorg-video-qxl (Ubuntu):
status: Confirmed → Invalid
no longer affects: xserver-xorg-video-qxl (Ubuntu Jammy)
no longer affects: xserver-xorg-video-qxl (Ubuntu Noble)
Changed in linux (Ubuntu):
status: New → Fix Released
Changed in linux (Ubuntu Jammy):
status: New → Fix Released
Changed in linux (Ubuntu Noble):
status: New → Fix Released
Juerg Haefliger (juergh)
tags: added: kernel-daily-bug