libvirt: blockcommit fails - disk not ready for pivot yet
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
libvirt (Ubuntu) | Fix Released | Medium | Unassigned |
Xenial | Fix Released | Medium | Matthew Ruffell |
Artful | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Medium | Unassigned |
Bug Description
[Impact]
On xenial, if you manually invoke blockcommit through virsh in libvirt, the command fails immediately: it reports the block commit as 100% complete, then errors out because the disk is not ready for pivot yet:
root@xenial-
Block commit: [100 %]
error: failed to pivot job for disk vda
error: block copy still active: disk 'vda' not ready for pivot yet
However, the status of the active blockjob shows that the blockcommit is still running in the background:
root@xenial-
Active Block Commit: [0 %]
root@xenial-
Active Block Commit: [2 %]
root@xenial-
Active Block Commit: [6 %]
The job continues until it reaches 100%, where it gets stuck. To unstick it, you must manually --abort the blockjob.
root@xenial-
Active Block Commit: [100 %]
This happens in VMs which are experiencing load, and is caused by a race condition in libvirt. Users are not able to commit their snapshots to disk.
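The polling and manual abort described above can be sketched as a few shell helpers. This is only an illustration under stated assumptions: the function names are hypothetical, the progress line format is taken from the output shown in this report, and the domain and disk (for example `snapvm` and `vda` from the test case) are passed in by the caller.

```shell
#!/bin/sh
# Extract the percentage from a line such as "Active Block Commit: [6 %]".
# Prints nothing if the text contains no progress field.
blockjob_percent() {
    printf '%s\n' "$1" | sed -n 's/.*\[ *\([0-9][0-9]*\) *%].*/\1/p'
}

# Print the current progress of the block job on a disk.
# Assumes virsh is installed and the domain is running.
blockjob_progress() {
    blockjob_percent "$(virsh blockjob "$1" "$2" --info)"
}

# Cancel a block job, e.g. one that is stuck at 100 %.
blockjob_abort() {
    virsh blockjob "$1" "$2" --abort
}
```

For example, `blockjob_progress snapvm vda` would print the bare number, and `blockjob_abort snapvm vda` would cancel the stuck job.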
[Test Case]
Credit goes to Fabio Martins, who determined how to reproduce this issue.
On an Ubuntu 16.04 host with libvirt 1.3.1-1ubuntu10.27:
1) Create a VG and define a LVM pool:
root@xenial-
<pool type="logical">
<name>LVMpool_
<source>
<device path="/dev/sdb"/>
</source>
<target>
<path>/
</target>
</pool>
# virsh pool-define lvmpool.xml
# virsh pool-start LVMpool_vg
# virsh pool-autostart LVMpool_vg
2) Create a config file to attach as a cdrom device to the new VM (created in the next steps), so that cloud-init can inject a password:
# cat > config <<EOF
> #cloud-config
> password: passw0rd
> chpasswd: { expire: False }
> ssh_pwauth: True
> EOF
# apt install cloud-image-utils
# cloud-localds config.img config
# mv config.img /var/lib/
# chown libvirt-qemu:kvm /var/lib/
# chmod 664 /var/lib/
3) Create one VM using this pool:
# virt-install --connect=
4) Stop the VM
# virsh destroy snapvm
5) Download an Ubuntu cloud image, convert it to raw, and write it to the LV used as a disk by our VM:
# wget https:/
# qemu-img convert ./bionic-
# dd if=./bionic-
6) Start the VM and connect to it in another window
# virsh start snapvm
7) Check that the VM is using the LV as the disk:
root@xenial-
Target Source
-------
vda /dev/LVMpool_
hda /var/lib/
8) Create a snapshot and check that the new domblklist points to the snapshot file:
# virsh snapshot-create-as --domain snapvm --diskspec vda,file=
root@xenial-
Target Source
-------
vda /var/lib/
hda /var/lib/
9) Connect to your VM and start an I/O-intensive job. In this case I'm starting a 'dd' that writes zeroes to a file until it reaches about 10 GB (1024 blocks of 10,240,000 bytes):
ubuntu@ubuntu:~$ dd if=/dev/zero of=file.txt count=1024 bs=10240000
10) Back on the host, monitor the snapshot file and let it grow to at least a bit more than 1 GB, as in the example below (where the file is 3.9G):
root@xenial-
total 5.2G
-rw-rw-r-- 1 libvirt-qemu kvm 329M Sep 3 03:18 bionic-
-rw-r--r-- 1 root root 10G Sep 3 03:28 bionic-
-rw-rw-r-- 1 libvirt-qemu kvm 366K Sep 3 03:19 config.img
-rw------- 1 libvirt-qemu kvm 3.9G Sep 3 04:41 xenial-snapvm.qcow2
11) Start a blockcommit job with --active --verbose --pivot --wait and we'll hit the error when the job gets to 100%:
root@xenial-
Block commit: [100 %]
error: failed to pivot job for disk vda
error: block copy still active: disk 'vda' not ready for pivot yet
12) The blockjob continues in the background, and its progress keeps increasing:
root@xenial-
Active Block Commit: [0 %]
root@xenial-
Active Block Commit: [2 %]
root@xenial-
Active Block Commit: [6 %]
13) The blockjob then stays stuck at 100% until you --abort it:
root@xenial-
Active Block Commit: [100 %]
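On an unpatched host, one possible way to sidestep the failed pivot is to start the commit without --pivot and then retry the pivot until libvirt accepts it. The sketch below is an untested assumption, not the fix applied in this bug: `retry` and `commit_then_pivot` are hypothetical names, and the attempt count and delay are arbitrary.

```shell
#!/bin/sh
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Hypothetical workaround: start the active commit without --pivot,
# then keep requesting the pivot (up to 60 times, 5 s apart).
commit_then_pivot() {
    domain=$1
    disk=$2
    virsh blockcommit "$domain" "$disk" --active --verbose --wait || return 1
    retry 60 5 virsh blockjob "$domain" "$disk" --pivot
}
```

With the test case's names this would be called as `commit_then_pivot snapvm vda`. Retrying is a blunt approach; the patched virsh described below waits for the job to become ready instead.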
I have created a test package with the commits needed to solve the problem, and it is available here:
https:/
What should happen:
If you install the test libvirt-bin and libvirt0 packages from the above PPA and run through the test case, blockcommit does not fail immediately when invoked. Instead, it continues until it reaches 100%, at which point the blockjob completes successfully.
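One way to confirm that the commit and pivot really completed is to check that the disk no longer points at the snapshot file, using the same domblklist output as in steps 7 and 8. A minimal sketch, assuming the two-column "Target Source" layout shown in this report; the function names are hypothetical and the overlay path must be supplied by the caller.

```shell
#!/bin/sh
# Print the source path for a disk target from `virsh domblklist`
# output read on standard input.
disk_source() {
    awk -v target="$1" '$1 == target { print $2 }'
}

# Succeeds if the disk exists and no longer points at the given
# snapshot overlay file.
pivoted_away_from() {
    domain=$1
    disk=$2
    overlay=$3
    current=$(virsh domblklist "$domain" | disk_source "$disk")
    [ -n "$current" ] && [ "$current" != "$overlay" ]
}
```

For example, `pivoted_away_from snapvm vda "$OVERLAY"` would succeed once vda is backed by the LV again, where `$OVERLAY` holds the path of the snapshot file created in step 8.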
[Regression Potential]
While four commits are required to fix this issue, all of them are fairly minor: they only modify how the current status percentage is counted and how states change when blockcommit reaches 100%. All changes are localised to one file.
Most of the commits are limited to blockcommit, so in the event of a regression, only blockcommit and, by extension, some blockjobs would be impacted.
The commits have been upstream for a long time, have been well tested by the community, and come from a libvirt release with a very small delta to the one in xenial (1.3.2 versus 1.3.1). I therefore believe there is little risk of regression.
[Other Info]
The following commits were identified in the upstream bug:
https:/
which are also listed in comment #6.
commit 86c4df83b913dd7
Author: Michael Chapman <email address hidden>
Date: Wed Jan 27 13:24:54 2016 +1100
Subject: virsh: improve waiting for block job readiness
commit 8fa216bbb40df33
Author: Michael Chapman <email address hidden>
Date: Wed Jan 27 13:24:53 2016 +1100
Subject: virsh: ensure SIGINT action is reset on all errors
commit 15dee2ef24f2f19
Author: Michael Chapman <email address hidden>
Date: Wed Jan 27 13:24:52 2016 +1100
Subject: virsh: be consistent with style of loop exit
commit 704dfd6b0fafe7e
Author: Michael Chapman <email address hidden>
Date: Wed Jan 27 13:24:51 2016 +1100
Subject: virsh: avoid unnecessary progress updates
These fix the problem, and were introduced in libvirt 1.3.2 upstream. All commits are clean cherry picks, and the code is still present in B, D, E and F.
Changed in libvirt (Ubuntu):
status: New → Incomplete
status: Incomplete → New
summary:
- libvirt - disk not ready for pivot yet
+ libvirt: blockcommit fails - disk not ready for pivot yet
description: updated
tags: added: sts
Changed in libvirt (Ubuntu Xenial):
status: Won't Fix → In Progress
importance: Undecided → Medium
assignee: nobody → Matthew Ruffell (mruffell)
Bugfix mentioned: https://bugzilla.redhat.com/show_bug.cgi?id=1197592