KVM instances fail at creation with: Failed to get shared "write" lock Is another process using the image?

Bug #1816170 reported by Pedro Guimarães
This bug affects 5 people
Affects           Status        Importance  Assigned to      Milestone
Canonical Juju    Fix Released  High        Joseph Phillips  2.5.3
libvirt (Ubuntu)  Fix Released  Undecided   Unassigned       -
qemu (Ubuntu)     Fix Released  Undecided   Unassigned       -

Bug Description

Juju version: 2.5.0
Qemu version: qemu-system-x86 2.11+dfsg-1ubuntu7.9

When creating multiple KVM instances using Juju, some fail with "exit status 13".

Looking at the unit/machine logs, I can see what seems to be a race condition between the qemu-img create and virsh start commands:

2019-02-15 16:31:27 DEBUG juju.container.kvm run_linux.go:24 running: qemu-img [create -b /var/lib/juju/kvm/guests/bionic-amd64-backing-file.qcow -f qcow2 /var/lib/juju/kvm/guests/juju-720d5c-1-kvm-3.qcow 8G]
2019-02-15 16:31:27 DEBUG juju.container.kvm run_linux.go:25 running as uid: 64055, gid: 117
2019-02-15 16:31:27 DEBUG juju.container.kvm run_linux.go:36 output: Formatting '/var/lib/juju/kvm/guests/juju-720d5c-1-kvm-3.qcow', fmt=qcow2 size=8589934592 backing_file=/var/lib/juju/kvm/guests/bionic-amd64-backing-file.qcow cluster_size=65536 lazy_refcounts=off refcount_bits=16

2019-02-15 16:31:27 DEBUG juju.container.kvm wrappedcmds.go:481 create root image: Formatting '/var/lib/juju/kvm/guests/juju-720d5c-1-kvm-3.qcow', fmt=qcow2 size=8589934592 backing_file=/var/lib/juju/kvm/guests/bionic-amd64-backing-file.qcow cluster_size=65536 lazy_refcounts=off refcount_bits=16

2019-02-15 16:31:27 DEBUG juju.container.kvm run.go:19 virsh [define /var/lib/juju/containers/juju-720d5c-1-kvm-3/juju-720d5c-1-kvm-3.xml]
2019-02-15 16:31:27 DEBUG juju.container.kvm run.go:21 output: Domain juju-720d5c-1-kvm-3 defined from /var/lib/juju/containers/juju-720d5c-1-kvm-3/juju-720d5c-1-kvm-3.xml

2019-02-15 16:31:27 DEBUG juju.container.kvm wrappedcmds.go:204 created domain: Domain juju-720d5c-1-kvm-3 defined from /var/lib/juju/containers/juju-720d5c-1-kvm-3/juju-720d5c-1-kvm-3.xml

2019-02-15 16:31:27 DEBUG juju.container.kvm run.go:19 virsh [start juju-720d5c-1-kvm-3]
2019-02-15 16:31:29 DEBUG juju.container.kvm run.go:21 output: error: Failed to start domain juju-720d5c-1-kvm-3
error: internal error: qemu unexpectedly closed the monitor: 2019-02-15T16:31:28.826568Z qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]
2019-02-15T16:31:28.839983Z qemu-system-x86_64: -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1: Failed to get shared "write" lock
Is another process using the image?

2019-02-15 16:31:29 ERROR juju.provisioner.kvm kvm-broker.go:152 failed to start container: kvm container creation failed: failed to start domain "juju-720d5c-1-kvm-3": exit status 1
2019-02-15 16:31:29 WARNING juju.provisioner provisioner_task.go:1282 failed to start machine 1/kvm/3 (kvm container creation failed: failed to start domain "juju-720d5c-1-kvm-3": exit status 1), retrying in 10s (10 more attempts)

It never succeeds, on any of the 10 retry attempts.

There are many reports of qemu-img requesting a write lock even though it is only reading image info:
https://bugs.launchpad.net/qemu/+bug/1721788
https://bugs.launchpad.net/nova/+bug/1718295
https://bugs.launchpad.net/ubuntu/+source/nova/+bug/1718133
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1716028

One solution seems to be passing --force-share to the qemu-img command. That may be risky in a scenario where Juju AND other things are using qemu at the same time, as discussed in: https://bugs.launchpad.net/qemu/+bug/1721788/comments/1

Can we create a new image file for every VM we deploy? Or is --force-share an alternative?

Either way, there seems to be no workaround at the moment besides changing Juju.

Tags: cpe-onsite
summary: - KVM containers fail at creation with: Failed to get shared "write" lock
+ KVM instances fail at creation with: Failed to get shared "write" lock
Is another process using the image?
description: updated
description: updated
Revision history for this message
Pedro Guimarães (pguimaraes) wrote :

I am setting this bug to field-high as we have two projects that would greatly benefit from a fix.

tags: added: cpe-onsite
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

/var/log from a host with a bunch of failed kvm "containers":
https://private-fileshare.canonical.com/~dima/juju-dumps/kvm-host-2019-03-14.tar.gz

Relevant VM logs: /var/log/libvirt/qemu/*

Some VMs have started successfully, some have not:

pgrep -af qemu | grep -ioP 'qemu.*?guest=.*?\s'
qemu-system-x86_64 -enable-kvm -name guest=juju-5ad6c1-6-kvm-0,debug-threads=on
qemu-system-x86_64 -enable-kvm -name guest=juju-5ad6c1-6-kvm-2,debug-threads=on
qemu-system-x86_64 -enable-kvm -name guest=juju-5ad6c1-6-kvm-3,debug-threads=on

root@control-7:~# pgrep -af qemu | grep -ioP 'drive file=.*?\s'
drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-0.qcow,format=qcow2,if=none,id=drive-virtio-disk0
drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-0-ds.iso,format=raw,if=none,id=drive-virtio-disk1
drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-2.qcow,format=qcow2,if=none,id=drive-virtio-disk0
drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-2-ds.iso,format=raw,if=none,id=drive-virtio-disk1
drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-3.qcow,format=qcow2,if=none,id=drive-virtio-disk0
drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-3-ds.iso,format=raw,if=none,id=drive-virtio-disk1

root@control-7:~# ls /var/lib/juju/kvm/guests/
bionic-amd64-backing-file.qcow juju-5ad6c1-6-kvm-1-ds.iso juju-5ad6c1-6-kvm-2.qcow juju-5ad6c1-6-kvm-4-ds.iso juju-5ad6c1-6-kvm-5.qcow
juju-5ad6c1-6-kvm-0-ds.iso juju-5ad6c1-6-kvm-1.qcow juju-5ad6c1-6-kvm-3-ds.iso juju-5ad6c1-6-kvm-4.qcow
juju-5ad6c1-6-kvm-0.qcow juju-5ad6c1-6-kvm-2-ds.iso juju-5ad6c1-6-kvm-3.qcow juju-5ad6c1-6-kvm-5-ds.iso

root@control-7:~# grep -RiP 'Failed' /var/log/libvirt/qemu/
/var/log/libvirt/qemu/juju-5ad6c1-6-kvm-4.log:2019-03-14T19:44:09.095078Z qemu-system-x86_64: -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1: Failed to get shared "write" lock
/var/log/libvirt/qemu/juju-5ad6c1-6-kvm-4.log:2019-03-14 19:44:09.184+0000: shutting down, reason=failed
/var/log/libvirt/qemu/juju-5ad6c1-6-kvm-5.log:2019-03-14T19:44:11.875801Z qemu-system-x86_64: -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1: Failed to get shared "write" lock
/var/log/libvirt/qemu/juju-5ad6c1-6-kvm-5.log:2019-03-14 19:44:12.160+0000: shutting down, reason=failed
/var/log/libvirt/qemu/juju-5ad6c1-6-kvm-1.log:2019-03-14T19:44:09.219746Z qemu-system-x86_64: -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1: Failed to get shared "write" lock
/var/log/libvirt/qemu/juju-5ad6c1-6-kvm-1.log:2019-03-14 19:44:09.497+0000: shutting down, reason=failed

Each VM uses the following files (an ISO and a qcow image):

-drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-0.qcow
-drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-0-ds.iso

ls -il /var/lib/juju/kvm/guests/
total 1519192
125175532 -rw-r--r-- 1 root root 2361393152 Mar 14 19:46 bionic-amd64-backing-file.qcow
125175524 -rw-r--r-- 1 libvirt-qemu kvm 391168 Mar 14 19:44 juju-5ad6c1-6-kvm-0-ds.iso
125175525 -rw-r--r-- 1 libvirt-qemu kvm 393216...


Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

After manually starting the VMs that had not started due to the above error coming from QEMU:

juju-5ad6c1-6-kvm-0: stuck in initramfs
juju-5ad6c1-6-kvm-1: shows a login prompt (juju-5ad6c1-6-kvm-1 login:)
juju-5ad6c1-6-kvm-2: shows a login prompt
juju-5ad6c1-6-kvm-3: stuck in initramfs
juju-5ad6c1-6-kvm-4: shows a login prompt
juju-5ad6c1-6-kvm-5: shows a login prompt

(initramfs) Gave up waiting for root file system device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! LABEL=cloudimg-rootfs does not exist. Dropping to a shell!

(initramfs) cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-46-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0

If ./init is run by hand from the console in this initramfs, "Gave up waiting for root file system device" is encountered again.

(initramfs) blkid
/dev/vdb: UUID="2019-03-14-19-44-16-00" LABEL="cidata" TYPE="iso9660"

(initramfs) ls /dev/vd*
/dev/vdb /dev/vda

(initramfs) mount | grep /dev/vd ; echo $?
1

http://manpages.ubuntu.com/manpages/bionic/man8/initramfs-tools.8.html
"If a label is used, as in root=LABEL=rootPart the initrd will search all available devices for a filesystem with the appropriate label, and mount that device as the root filesystem."

Revision history for this message
Bogdan Suchok (bogdan-sk) wrote :

subscribed ~field-critical as this bug affects a customer deployment.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

Targeting qemu for input from that end.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

From what I see, the fix would be to use --force-share when invoking `qemu-img create -b <backing-image-path> -f qcow2 <imgpath>` as of qemu 2.10, which is not currently done:

https://github.com/juju/juju/blob/juju-2.5.2/container/kvm/wrappedcmds.go#L474-L485

    out, err := params.runCmd(
        "qemu-img",
        "create",
        "-b", backingPath,
        "-f", "qcow2",
        imgPath,
        fmt.Sprintf("%dG", params.RootDisk))

https://git.qemu.org/?p=qemu.git;a=commit;h=335e9937844449a3fc172dd5e64549ad7d5ad1c2

git --no-pager tag --contains=335e9937844449a3fc172dd5e64549ad7d5ad1c2 | grep -v rc
v2.10.0
v2.10.1
v2.10.2
v2.11.0
v2.11.1
v2.11.2
v2.12.0
v2.12.1
v3.0.0
v3.1.0

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Hi,
looking at this from qemu's POV.
But it seems you already identified most of the related history; I'll add one more (bug 1716028) to the description and then summarize.

description: updated
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Yes, qemu had too many cases in the past where opening the same disk multiple times led to data corruption, so image locking was introduced.
The locking has caused quite some pain (as you found in the old bugs) but turned out to make things much more reliable.

The different entry points use different options to achieve the same end, which is to explicitly allow sharing the disks in the rare cases where that applies:
1. manual qemu usage: the command line needs to set share-rw=on
2. usage through libvirt: sets share-rw=on if "<shareable/>" is set (see bug 1716028 and refs from there)
3. tools like qemu-img: need to set --force-share, as you already identified (see the sketch after this list)
4. openstack: this uses #2 (and has to set shareable as needed) and #3, which is fixed (by bug 1718295 and bug 1718133 that you already found)
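
As a small illustration of entry point #3 (a sketch, not Juju code; the image path below is just a placeholder), qemu-img's read-only subcommands such as `info` accept --force-share / -U since 2.10, so metadata can be read while a running guest holds the write lock:

    package main

    import (
        "fmt"
        "log"
        "os/exec"
    )

    func main() {
        // Placeholder path: any qcow2 image currently opened read-write by a guest.
        imgPath := "/var/lib/juju/kvm/guests/juju-720d5c-1-kvm-3.qcow"

        // --force-share (-U) tells qemu-img not to take the write lock,
        // so this succeeds even while the VM is running.
        out, err := exec.Command("qemu-img", "info",
            "--force-share", "--output=json", imgPath).CombinedOutput()
        if err != nil {
            log.Fatalf("qemu-img info failed: %v\n%s", err, out)
        }
        fmt.Printf("%s", out)
    }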

All of that should be in the virt stack on Bionic and later (where the qemu with the change is) and in the related Cloud Archive versions.

The TL;DR is:
- images really shared (like an ISO) need to be marked shareable
- things not really meant to be opened multiple times, e.g. the same root disk file (which only worked by chance before), should not be used that way

I don't see in the comments above what would be left: are there charms using qemu-img that need to be adapted (that might be what comment #6 says), or what exactly is the missing bit here?

For qemu/libvirt, everything I know to be needed for this is Fix Released; setting the status accordingly unless we identify something new that is needed.

Changed in libvirt (Ubuntu):
status: New → Fix Released
Changed in qemu (Ubuntu):
status: New → Fix Released
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Christian,

In this case Juju (not charms) uses the following command in its code to create a VM image based on a common cloud image:

    out, err := params.runCmd(
        "qemu-img",
        "create",
        "-b", backingPath,
        "-f", "qcow2",
        imgPath,
        fmt.Sprintf("%dG", params.RootDisk))

This is done when charms are requested to be deployed into a VM provisioned by a juju machine agent (--to kvm:<host-machine-id> syntax analogous to --to lxd:<host-machine-id>).

The reported problem specifically occurs when multiple KVM guests are provisioned on the same host machine.

My understanding is:

1) a Redirect-on-Write type of snapshot is safe to create (even if multiple copies of qemu-img are using the same backing image at the same moment) because the backing image does not change in the process;
2) it is safe to run multiple VMs using different snapshots that share the same base image (because of the Redirect-on-Write behavior); see the sketch below.
https://wiki.qemu.org/Documentation/CreateSnapshot
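
To make (2) concrete, here is a minimal sketch (not Juju code; the directory, backing-file name and guest names are placeholders) of the layout in question: one shared, read-only backing file and a separate qcow2 overlay per guest:

    package main

    import (
        "fmt"
        "log"
        "os/exec"
        "path/filepath"
    )

    func main() {
        guestsDir := "/var/lib/juju/kvm/guests"
        backingPath := filepath.Join(guestsDir, "bionic-amd64-backing-file.qcow") // shared base image

        // Each guest gets its own copy-on-write overlay; the backing file is
        // only ever opened read-only, so many overlays can reference it at once.
        for _, name := range []string{"kvm-0", "kvm-1", "kvm-2"} {
            overlay := filepath.Join(guestsDir, name+".qcow")
            out, err := exec.Command("qemu-img", "create",
                "-b", backingPath,
                "-f", "qcow2",
                overlay, "8G").CombinedOutput()
            if err != nil {
                log.Fatalf("creating overlay %s: %v\n%s", overlay, err, out)
            }
            fmt.Printf("%s", out)
        }
    }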

I initially thought that it might be a race between multiple parallel `qemu-img create -b bfile -f qcow2 <targetimg>` invocations, but the errors are present in the libvirt domain log, so it looks like I might be wrong.

Given that each VM has its own snapshot based on the cloud image and its own ISO, I am not sure why the write lock is necessary:

-drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-0.qcow
-drive file=/var/lib/juju/kvm/guests/juju-5ad6c1-6-kvm-0-ds.iso

Any thoughts about that?

I included `ls -il /var/lib/juju/kvm/guests/` output in comment #2 to show whether there were any hard links (none, based on what I see in the first column).

The patch for snapshots and backing images seems to be included in qemu/bionic already, so the qemu-img -b (...) invocation shouldn't be a problem:

➜ qemu git:(ubuntu/bionic) grep -RiP ' Open backing image in force share mode'
debian/patches/ubuntu/qemu-stable-2.11.1.patch:Subject: [PATCH 09/80] block: Open backing image in force share mode for size

https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg02816.html
https://git.qemu.org/?p=qemu.git;a=commit;h=cc954f01e3c004aad081aa36736a17e842b80211

git --no-pager tag --contains=cc954f01e3c004aad081aa36736a17e842b80211 | grep -v rc
v2.12.0
v2.12.1
v3.0.0
v3.1.0

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

I have done some more lab testing and here is what I found:

1) If one VM is provisioned via Juju first, and other VMs are created only after that provisioning is fully complete (i.e. the first VM is started), they are created successfully.

This means that there is no issue with using the same base image for snapshots of multiple VMs at the same time, and that QEMU does not require a backing image to be write-lock protected when multiple VMs are started.

juju add-machine # i
juju deploy cs:ubuntu -n 1 --to kvm:<i>
# wait until 0/kvm/<i> starts
juju add-unit ubuntu --to kvm:<i>
# everything is ok

2) If multiple VMs are provisioned at the same time (without any prior VM provisioning) issues start to appear:

# juju.container and juju.container.kvm tracing is enabled
juju model-config logging-config
<root>=WARNING;juju.container.kvm=TRACE;juju.container=TRACE;unit=DEBUG

juju add-machine # <i>
juju deploy cs:ubuntu -n 6 --to kvm:<i>,kvm:<i>,kvm:<i>,kvm:<i>,kvm:<i>,kvm:<i>
# errors: https://paste.ubuntu.com/p/DrsVw6xhzf/

https://private-fileshare.canonical.com/~dima/juju-dumps/kvm-host-var-log-17-03-2019.tar.gz (/var/log from the kvm host)

Intermediate Juju status shows that image downloads for multiple VMs on the same machine start concurrently:

https://paste.ubuntu.com/p/CXKPCdDZnm/ (multiple consecutive juju status commands showing that there is no synchronization between cloud image downloads)

ubuntu@maas:~$ juju status
#...
Machine State DNS Inst id Series AZ Message
3 started 10.232.24.16 node-2712d152-5daf-11e4-93f3-525400a6ab07 bionic default Deployed
3/kvm/0 pending pending bionic starting
3/kvm/1 pending pending bionic copying http://cloud-images.ubuntu.com/server/releases/bionic/release-20190307/ubuntu-18.04-server-cloudimg-amd64.img 1% (576KB/s)
3/kvm/2 pending pending bionic copying http://cloud-images.ubuntu.com/server/releases/bionic/release-20190307/ubuntu-18.04-server-cloudimg-amd64.img 47% (5.2MB/s)
3/kvm/3 pending pending bionic copying http://cloud-images.ubuntu.com/server/releases/bionic/release-20190307/ubuntu-18.04-server-cloudimg-amd64.img 34% (4.6MB/s)
3/kvm/4 pending pending bionic starting
3/kvm/5 pending pending bionic copying http://cloud-images.ubuntu.com/server/releases/bionic/release-20190307/ubuntu-18.04-server-cloudimg-amd64.img 37% (2.7MB/s)

From the machine log I can see that concurrent image downloads also result in concurrent executions of `qemu-img convert` to the same (!!!) destination file:

machine.log:
https://paste.ubuntu.com/p/fkmDv69pR8/

2019-03-17 13:48:05 DEBUG juju.container.kvm run.go:19 qemu-img [convert -f qcow2 /tmp/juju-kvm-ubuntu-18.04-server-cloudimg-amd64.img-720678751 /var/lib/juju/kvm/guests/bionic-amd64-backing-file.qcow]
2019-03-17 13:48:09 DEBUG juju.container.kvm...


Tim Penhey (thumper)
Changed in juju:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Richard Harding (rharding)
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Answering comment #9 to be complete.
    #1 creating a snapshot should be safe, and in fact a qemu-img create works fine on a base file
       that currently has a running VM associated with it

    #2 running multiple VMs with images referring to the same backing file is safe as well

    Yes, if each has its own disk (.qcow) and its own .iso then the locks will be necessary, but each
    VM should be able to get its locks as nothing else holds them.

    Also, yes, the patch for qemu-img is in Bionic as of 2.11.1.

Ok, in comment #10 you came to the same conclusions (\o/) and identified an issue in the concurrent pull & convert of the initial image. That would make sense, as the convert needs the write lock (which also denies further read locks), and the concurrent actions potentially corrupt the file anyway.

So yeah, I'd conclude similarly to you and would assume that any "image synchronize + convert" actions by Juju should be mutually exclusive to each other, at least per target image file.

You already assigned to Rick for Juju which I think is the right next step, thanks!
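
For illustration, a minimal sketch of that mutual exclusion (not the actual change later proposed for Juju; ensureBackingFile and the fetch callback are hypothetical) would serialise the fetch-and-convert step per target image path within the machine agent process:

    package kvm

    import (
        "fmt"
        "os"
        "os/exec"
        "sync"
    )

    var (
        // backingMu guards backingLocks; backingLocks holds one mutex per
        // target image path so unrelated images do not block each other.
        backingMu    sync.Mutex
        backingLocks = map[string]*sync.Mutex{}
    )

    func lockFor(path string) *sync.Mutex {
        backingMu.Lock()
        defer backingMu.Unlock()
        if _, ok := backingLocks[path]; !ok {
            backingLocks[path] = &sync.Mutex{}
        }
        return backingLocks[path]
    }

    // ensureBackingFile produces targetPath exactly once, even when several
    // guests are provisioned in parallel. fetch stands in for the image
    // download step and should return the path of a temporary source image.
    func ensureBackingFile(targetPath string, fetch func() (string, error)) error {
        mu := lockFor(targetPath)
        mu.Lock()
        defer mu.Unlock()

        // A concurrent provisioning attempt may already have produced the file.
        if _, err := os.Stat(targetPath); err == nil {
            return nil
        }

        src, err := fetch()
        if err != nil {
            return err
        }
        defer os.Remove(src)

        // Only one writer per targetPath reaches this point, so qemu-img can
        // take its exclusive write lock on the destination without contention.
        if out, err := exec.Command("qemu-img", "convert",
            "-f", "qcow2", src, targetPath).CombinedOutput(); err != nil {
            return fmt.Errorf("qemu-img convert: %v: %s", err, out)
        }
        return nil
    }

This only covers a single machine-agent process; if more than one process could write the same backing file, an OS-level lock (e.g. flock on a sidecar lock file) would be needed instead of an in-memory mutex.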

Changed in juju:
status: Triaged → In Progress
assignee: Richard Harding (rharding) → Joseph Phillips (manadart)
Revision history for this message
Joseph Phillips (manadart) wrote :

I have proposed https://github.com/juju/juju/pull/9898 to address this.

Many thanks to all involved in reproducing and delving into this issue. Really nice to have the analysis side handed to the core team on a plate :)

Changed in juju:
milestone: none → 2.5.3
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → In Progress
Changed in juju:
status: In Progress → Fix Committed
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Tried the first branch:

https://github.com/juju/juju/pull/9898

Had positive results for the images themselves, but there seems to be another issue with the datasource (-ds.iso) ISOs, as described in the comment above (https://bugs.launchpad.net/juju/+bug/1816170/comments/10) and in the comment here: https://github.com/juju/juju/pull/9898#issuecomment-474345689

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Tried the latest PR:

https://github.com/juju/juju/pull/9912

➜ juju git:(2.5) ✗ git --no-pager branch --contains=f73e98eeb6971ade1da020e6e58448469fc0ad00
* 2.5

➜ juju git:(2.5) ✗ grep -RiP 1ca489a5e41e9bcb51f0ba8d76616a785c9ce843
parts/juju/state/pull: source-commit: 1ca489a5e41e9bcb51f0ba8d76616a785c9ce843

Managed to reproduce the behavior in #13 on one node:

https://pastebin.canonical.com/p/fvcvjz69Bb/

Changed in juju:
status: Fix Committed → Fix Released