qemu.git master -> qemu segfaults during tcp migration (and other modes when using MALLOC_PERTURB_=1)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
QEMU | Fix Released | Undecided | Unassigned |
Bug Description
Relevant qemu.git master commit:
24a6e7f4d91e9ed
When trying to migrate a VM using the TCP protocol, a segfault happened:
21:45:07 INFO | Running qemu command (reformatted):
/home/lmr/
-S \
-name 'virt-tests-vm1' \
-nodefaults \
-chardev socket,
-mon chardev=
-chardev socket,
-device isa-serial,
-chardev socket,
-device isa-debugcon,
-device ich9-usb-
-drive file='/
-device virtio-
-device virtio-
-netdev user,id=
-m 1024 \
-smp 2,maxcpus=
-cpu 'SandyBridge' \
-M pc \
-device usb-tablet,
-vnc :1 \
-vga std \
-rtc base=utc,
-boot order=cdn,
-enable-kvm \
-incoming tcp:0:5200
21:45:08 INFO | [qemu output] qemu-system-x86_64: -device usb-tablet,
21:45:08 DEBUG| VM appears to be alive with PID 2002
21:45:08 DEBUG| (monitor hmp1) Sending command 'info cpus'
21:45:08 DEBUG| (monitor hmp1) Response to 'info cpus'
21:45:08 DEBUG| (monitor hmp1) * CPU #0: pc=0x00000000ff
21:45:08 DEBUG| (monitor hmp1) CPU #1: pc=0x00000000ff
21:45:09 DEBUG| (monitor hmp1) Sending command 'cont'
21:45:09 INFO | Migrating to tcp:0:5200
21:45:09 DEBUG| (monitor hmp1) Sending command 'migrate -d tcp:0:5200'
21:45:10 WARNI| Could not find (qemu) prompt after command 'screendump /dev/shm/
21:45:10 WARNI| VM 'virt-tests-vm1' produced an invalid screendump
21:45:10 INFO | [qemu output] qemu: warning: error while loading state section id 3
21:45:10 INFO | [qemu output] load of migration failed
21:45:10 INFO | [qemu output] /bin/sh: line 1: 1867 Segmentation fault /home/lmr/
We missed these problems over the last couple of weeks because of issues in our test grid. The failure can be reproduced by running the default test set on virt-test. By default, virt-test does not set MALLOC_PERTURB_=1; when it is set, pretty much all migration modes fail.
Changed in qemu: status: Fix Committed → Fix Released
Problem fixed with this commit, recently pushed to master:
commit 7dda5dc82a776a39a7996020c188eb2a29187117
Author: Paolo Bonzini <email address hidden>
Date: Tue Apr 9 17:43:43 2013 +0200
migration: initialize RAM to zero
Using qemu_memalign only leaves the RAM zero by chance, because libc
will usually use mmap to satisfy our huge requests. But memory will
not be zero when using MALLOC_PERTURB_ with a nonzero value. In the
case of incoming migration, this breaks a recently-introduced
invariant (commit f1c7279, migration: do not sent zero pages in
bulk stage, 2013-03-26).
To fix this, use mmap ourselves to get a well-aligned, always zero
block for the RAM. Mmap-ed memory is easy to "trim" at the sides.
This also removes the need to do something special on valgrind
(see commit c2a8238a, Support running QEMU on Valgrind, 2011-10-31),
thus effectively reverts that patch.
Reviewed-by: Juan Quintela <email address hidden>
Signed-off-by: Paolo Bonzini <email address hidden>
Reviewed-by: Markus Armbruster <email address hidden>
Message-id: <email address hidden>
Signed-off-by: Anthony Liguori <email address hidden>
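For readers who want to see the idea from the commit message in isolation, here is a minimal sketch of the "mmap, then trim the sides" technique. It is not the code from the QEMU commit: the function name `alloc_zeroed_aligned` is hypothetical, and the sketch assumes a POSIX system with `MAP_ANONYMOUS` and an alignment that is a power of two no smaller than the page size. Anonymous private mappings are zero-filled by the kernel, so the result does not depend on malloc or on MALLOC_PERTURB_.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU's actual function): return a zero-filled
 * block of at least `size` bytes whose address is a multiple of `align`.
 * Returns NULL on failure. Release it with munmap(ptr, rounded_size).
 */
void *alloc_zeroed_aligned(size_t size, size_t align)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Assumption: align is a power of two and at least one page. */
    assert(align >= page && (align & (align - 1)) == 0);

    /* Round the request up to a whole number of pages. */
    size = (size + page - 1) & ~(page - 1);

    /* Over-allocate so that an aligned block of `size` bytes must fit. */
    size_t total = size + align - page;
    uint8_t *raw = mmap(NULL, total, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }

    /* Distance from the mapping start to the next aligned address. */
    size_t offset = (align - ((uintptr_t)raw & (align - 1))) & (align - 1);
    uint8_t *ptr = raw + offset;

    /* Trim the unused head and tail; both are whole pages. */
    if (offset > 0) {
        munmap(raw, offset);
    }
    size_t tail = total - offset - size;
    if (tail > 0) {
        munmap(ptr + size, tail);
    }
    return ptr;
}
```

Because the block comes straight from the kernel rather than from the malloc heap, allocator debugging settings such as MALLOC_PERTURB_ never touch its contents, which is the property the incoming-migration path relies on.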
I'll take the opportunity to also make MALLOC_PERTURB_=1 the default in virt-tests. This will help catch such regressions in the future.