Live Migration Causes Performance Issues
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
QEMU | Fix Released | Undecided | Unassigned |
qemu-kvm (Ubuntu) | Fix Released | High | Unassigned |
Precise | Fix Released | High | Chris J Arges |
Quantal | Invalid | High | Unassigned |
Raring | Invalid | High | Unassigned |
Saucy | Fix Released | High | Unassigned |
Bug Description
SRU Justification
[Impact]
* Users of QEMU who save their memory state using savevm/loadvm or migrate experience worse performance after the migration/loadvm. To work around these issues, VMs must be completely rebooted. Ideally we should be able to restore a VM's memory state and expect no performance issues.
[Test Case]
* savevm/loadvm:
- Create a VM and install a test suite such as lmbench.
- Get numbers right after boot and record them.
- Open up the qemu monitor and type the following:
stop
savevm 0
loadvm 0
c
- Measure performance and record numbers.
- Check whether the two sets of numbers are within the margin of error.
* migrate:
- Create VM, install lmbench, get numbers.
- Open up qemu monitor and type the following:
stop
migrate "exec:dd of=~/save.vm"
quit
- Start a new VM using qemu but add the following argument:
-incoming "exec:dd if=~/save.vm"
- Run performance test and compare.
If the measured performance is similar, the test case passes.
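The "compare numbers" step above can be automated. The following is a minimal sketch, not part of the original test case: it parses lmbench-style lines of the form shown later in this report and flags any test that got slower by more than a tolerance. The function names and the 5% tolerance are assumptions chosen for illustration; pick a margin that matches the noise you see between repeated runs.

```python
import re

# Matches lines such as "Simple syscall: 0.0527 microseconds".
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*(?P<value>[0-9.]+)\s+microseconds\s*$"
)


def parse_lmbench(text):
    """Return {test name: latency in microseconds} for every matching line."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            results[match.group("name").strip()] = float(match.group("value"))
    return results


def regressions(before, after, tolerance=0.05):
    """Return {test name: relative slowdown} for tests slower than tolerance.

    The 5% default is an assumed margin of error, not a value from the report.
    Only slowdowns are reported; tests that got faster are ignored.
    """
    slower = {}
    for name, base in before.items():
        if name in after:
            change = (after[name] - base) / base
            if change > tolerance:
                slower[name] = change
    return slower


# Numbers transcribed from the ubuntu-12.04 results in this report.
FIRST_BOOT = """\
Simple syscall: 0.0527 microseconds
Simple read: 0.1143 microseconds
Simple write: 0.0953 microseconds
Simple open/close: 1.0432 microseconds
"""

POST_MIGRATION = """\
Simple syscall: 0.0621 microseconds
Simple read: 0.2485 microseconds
Simple write: 0.2252 microseconds
Simple open/close: 1.4626 microseconds
"""

if __name__ == "__main__":
    slow = regressions(parse_lmbench(FIRST_BOOT), parse_lmbench(POST_MIGRATION))
    for name, change in sorted(slow.items()):
        print(f"{name}: {change:.1%} slower")
    print("FAIL" if slow else "PASS")
```

An empty result from `regressions` corresponds to passing the test case.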
[Regression Potential]
* The fix is a backport of two upstream patches:
ad0b5321f1f7972
211ea74022f5116
One patch allows QEMU to use transparent hugepages (THP) if they are enabled.
The other patch changes logic to not memset pages to zero when loading memory for the vm (on an incoming migration).
* I've also run the qa-regression-
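Since one of the backported patches depends on THP being enabled, it can be useful to check the host's THP state before and after a migration. The sketch below is an illustration rather than part of the fix: it reads the standard Linux sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, where the active mode is shown in square brackets, and the `AnonHugePages` line of `/proc/meminfo`. The helper names are hypothetical.

```python
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
MEMINFO = Path("/proc/meminfo")


def parse_thp_mode(contents):
    """Return the active mode from text like 'always [madvise] never'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def anon_hugepages_kb(meminfo_text):
    """Return the AnonHugePages value in kB from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("AnonHugePages:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    # Both files only exist on Linux hosts; skip quietly elsewhere.
    if THP_ENABLED.exists():
        print("THP mode:", parse_thp_mode(THP_ENABLED.read_text()))
    if MEMINFO.exists():
        print("AnonHugePages (kB):", anon_hugepages_kb(MEMINFO.read_text()))
```

`AnonHugePages` is a system-wide figure, so a value that stays near zero while a large guest is running only suggests, and does not prove, that the guest's memory is not backed by hugepages.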
[Additional Information]
Kernels from 3.2 onwards are affected, and all have the config: CONFIG_
--
I have 2 physical hosts running Ubuntu Precise. With 1.0+noroms-
I'm seeing a performance degradation after live migration on Precise, but not Lucid. These hosts are managed by libvirt (tested both 0.9.8-2ubuntu17 and 1.0.0-0ubuntu4) in conjunction with OpenNebula. I don't seem to have this problem with lucid guests (running a number of standard kernels, 3.2.5 mainline and backported linux-image-
I first noticed this problem with phoronix doing compilation tests, and then tried lmbench where even simple calls experience performance degradation.
I've attempted to post to the kvm mailing list, but so far the only suggestion was it may be related to transparent hugepages not being used after migration, but this didn't pan out. Someone else has a similar problem here - http://
qemu command line example: /usr/bin/kvm -name one-2 -S -M pc-1.2 -cpu Westmere -enable-kvm -m 73728 -smp 16,sockets=
Disk backend is LVM running on SAN via FC connection (using symlink from /var/lib/
ubuntu-12.04 - first boot
=======
Simple syscall: 0.0527 microseconds
Simple read: 0.1143 microseconds
Simple write: 0.0953 microseconds
Simple open/close: 1.0432 microseconds
Using phoronix pts/compilation
ImageMagick - 31.54s
Linux Kernel 3.1 - 43.91s
Mplayer - 30.49s
PHP - 22.25s
ubuntu-12.04 - post live migration
=======
Simple syscall: 0.0621 microseconds
Simple read: 0.2485 microseconds
Simple write: 0.2252 microseconds
Simple open/close: 1.4626 microseconds
Using phoronix pts/compilation
ImageMagick - 43.29s
Linux Kernel 3.1 - 76.67s
Mplayer - 45.41s
PHP - 29.1s
I don't have phoronix results for 10.04 handy, but they were within 1% of each other...
ubuntu-10.04 - first boot
=======
Simple syscall: 0.0524 microseconds
Simple read: 0.1135 microseconds
Simple write: 0.0972 microseconds
Simple open/close: 1.1261 microseconds
ubuntu-10.04 - post live migration
=======
Simple syscall: 0.0526 microseconds
Simple read: 0.1075 microseconds
Simple write: 0.0951 microseconds
Simple open/close: 1.0413 microseconds
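To make the difference between the two guests easier to see, the lmbench figures above can be turned into slowdown factors (post-migration latency divided by first-boot latency). This is a small worked calculation on the numbers reported here; the dictionary layout and function names are illustrative only.

```python
# Latencies in microseconds, transcribed from this report:
# (first boot, post live migration).
RESULTS = {
    "ubuntu-12.04": {
        "Simple syscall": (0.0527, 0.0621),
        "Simple read": (0.1143, 0.2485),
        "Simple write": (0.0953, 0.2252),
        "Simple open/close": (1.0432, 1.4626),
    },
    "ubuntu-10.04": {
        "Simple syscall": (0.0524, 0.0526),
        "Simple read": (0.1135, 0.1075),
        "Simple write": (0.0972, 0.0951),
        "Simple open/close": (1.1261, 1.0413),
    },
}


def slowdown_factors(results):
    """Return {test name: after / before}; values above 1.0 mean slower."""
    return {name: after / before for name, (before, after) in results.items()}


def worst_case(results):
    """Return (test name, factor) for the test with the largest slowdown."""
    factors = slowdown_factors(results)
    name = max(factors, key=factors.get)
    return name, factors[name]


if __name__ == "__main__":
    for guest, results in RESULTS.items():
        for name, factor in slowdown_factors(results).items():
            print(f"{guest:14} {name:18} {factor:.2f}x")
        name, factor = worst_case(results)
        print(f"{guest}: worst case is {name} at {factor:.2f}x")
```

On these figures the 12.04 guest more than doubles its read and write latency after migration, while every 10.04 result stays within about 1% of, or below, its first-boot value.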
Changed in qemu-kvm (Ubuntu): | |
importance: | Medium → High |
status: | Confirmed → Triaged |
Changed in linux (Ubuntu): | |
status: | Incomplete → Confirmed |
Changed in qemu-kvm (Ubuntu): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in qemu-kvm (Ubuntu): | |
status: | Triaged → In Progress |
no longer affects: | linux (Ubuntu) |
Changed in qemu-kvm (Ubuntu Precise): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in qemu-kvm (Ubuntu Quantal): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in qemu-kvm (Ubuntu Raring): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in qemu-kvm (Ubuntu Precise): | |
importance: | Undecided → High |
Changed in qemu-kvm (Ubuntu Quantal): | |
importance: | Undecided → High |
Changed in qemu-kvm (Ubuntu Raring): | |
importance: | Undecided → High |
Changed in qemu-kvm (Ubuntu Saucy): | |
assignee: | Chris J Arges (arges) → nobody |
status: | In Progress → Fix Released |
Changed in qemu-kvm (Ubuntu Raring): | |
status: | New → Triaged |
Changed in qemu-kvm (Ubuntu Quantal): | |
status: | New → Triaged |
Changed in qemu-kvm (Ubuntu Precise): | |
status: | New → In Progress |
description: | updated |
Changed in qemu-kvm (Ubuntu Quantal): | |
assignee: | Chris J Arges (arges) → nobody |
Changed in qemu-kvm (Ubuntu Raring): | |
assignee: | Chris J Arges (arges) → nobody |
Status changed to 'Confirmed' because the bug affects multiple users.