null pointer access on migration resume of systemrescuecd boot menu with qxl-vga

Bug #1679126 reported by Matthew Stapleton
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Fix Released
Undecided
Unassigned

Bug Description

With qemu-2.8.0 up to 2.9.0-rc2 and git master (6954cdc), when resuming from a migration state file created from a VM suspended while showing the System Rescue CD 4.9.2 boot menu and using the QXL VGA device, I get a null point access in pixman_image_get_data called from qemu_spice_create_update (spice-display.c:215). When I added assert(ssd->mirror != NULL) above that line, assert failed. I don't get the crash when using standard VGA or cirrus-vga. I am using gcc-4.9.3 on Gentoo x86_64 with Intel i7-4700HQ CPU and kernel: 4.9.15-gentoo.

Here is the valgrind trace from the git version:
==2634== Thread 1:
==2634== Invalid read of size 4
==3516== at 0x65F3050: pixman_image_get_data (in /usr/lib64/libpixman-1.so.0.34.0)
==3516== by 0x6F0CEB: qemu_spice_create_update (spice-display.c:215)
==3516== by 0x6F1CC7: qemu_spice_display_refresh (spice-display.c:502)
==3516== by 0x58CF77: display_refresh (qxl.c:1948)
==3516== by 0x6E8084: do_safe_dpy_refresh (console.c:1591)
==3516== by 0x6E80D5: dpy_refresh (console.c:1604)
==3516== by 0x6E4508: gui_update (console.c:201)
==3516== by 0x81898E: timerlist_run_timers (qemu-timer.c:536)
==3516== by 0x8189D6: qemu_clock_run_timers (qemu-timer.c:547)
==3516== by 0x818D98: qemu_clock_run_all_timers (qemu-timer.c:662)
==3516== by 0x81952A: main_loop_wait (main-loop.c:514)
==3516== by 0x4ADD29: main_loop (vl.c:1898)

Minimal steps to reproduce:

Compile (debug compile flags are just so valgrind works, the crash occurs with non-debug compile flags as well):
CFLAGS="-g -O0" CXXFLAGS="-g -O0" ./configure --target-list=i386-softmmu,x86_64-softmmu
./configure
make

Start VM and leave it on the System Rescue CD graphical boot menu:
x86_64-softmmu/qemu-system-x86_64 -nodefaults -machine pc -drive file=systemrescuecd-x86-4.9.2.iso,if=none,id=cdrom-cd,readonly=on -device ide-cd,bus=ide.0,drive=cdrom-cd,bootindex=1 -device qxl-vga -monitor unix:monitor.sock,server,nowait -display gtk

Suspend VM and save state:
socat - unix:monitor.sock
  stop
  migrate "exec:cat > vm.state"
  quit

Attempt to resume VM (but this crashes):
x86_64-softmmu/qemu-system-x86_64 -nodefaults -machine pc -drive file=systemrescuecd-x86-4.9.2.iso,if=none,id=cdrom-cd,readonly=on -device ide-cd,bus=ide.0,drive=cdrom-cd,bootindex=1 -device qxl-vga -monitor unix:monitor.sock,server,nowait -display gtk -incoming exec:"cat vm.state"

Revision history for this message
Dr. David Alan Gilbert (dgilbert-h) wrote :

Yep, I can repeat this here on qemu head; crash at:

pixman_image_get_data (image=0x0) at pixman-image.c:845
845 if (image->type == BITS)

(gdb) p image
$1 = (pixman_image_t *) 0x0

Changed in qemu:
status: New → Confirmed
Revision history for this message
Dr. David Alan Gilbert (dgilbert-h) wrote :

I think this is actually anything that's in text mode grub; I've had a RHEL5 and 6 VM do it as well.

Thanks for reporting it.

Revision history for this message
elmarco (marcandre-lureau) wrote :

Interesting, the culprit is:

commit cd958edb1fae85d0c7d1e1acbff82d22724e8d64
Author: Marc-André Lureau <email address hidden>
Date: Fri Aug 26 13:47:11 2016 +0400

    console: skip same-size resize

    virtio-gpu does a set-scanout at each frame (it might be a driver
    regression). qemu_console_resize() recreate a surface even if the size
    didn't change, and this shows up in profiling reports because the
    surface is cleared. With this patch, I get a +15-20% glmark2
    improvement.

    Signed-off-by: Marc-André Lureau <email address hidden>
    Message-id: <email address hidden>
    Signed-off-by: Gerd Hoffmann <email address hidden>

diff --git a/ui/console.c b/ui/console.c
index 3940762851..394786b3c7 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -2101,6 +2101,13 @@ void qemu_console_resize(QemuConsole *s, int width, int height)
     DisplaySurface *surface;

     assert(s->console_type == GRAPHIC_CONSOLE);
+
+ if (s->surface &&
+ pixman_image_get_width(s->surface->image) == width &&
+ pixman_image_get_height(s->surface->image) == height) {
+ return;
+ }

Revision history for this message
Thomas Huth (th-huth) wrote :

The fix has apparently been included here:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=a703d3aef5991b72a5a45880e7491232b8032f09
... and has been released with QEMU v2.9 already.

Changed in qemu:
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.