copy-storage migrations fail on nbd-server-add

Bug #1764373 reported by Christian Ehrhardt 
Affects         Status    Importance   Assigned to   Milestone
qemu (Ubuntu)   Invalid   Undecided    Unassigned

Bug Description

In Bionic I see an issue where a migration with the --copy-storage-* options fails:

+ lxc exec testkvm-bionic-from -- virsh migrate --live --copy-storage-inc kvmguest-bionic-normal qemu+ssh://10.220.91.67/system
error: internal error: unable to execute QEMU command 'nbd-server-add': Block node is read-only

This came up when revamping the virt tests (and not before), so it might be a setup issue in the new environment. But OTOH it should be checked at some point, therefore this bug report.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

There is an apparmor deny that could be related, but the path looks just wrong.

audit: type=1400 audit(1523957176.480:37835): apparmor="DENIED"
  namespace="root//lxd-testkvm-bionic-tononshared_<var-lib-lxd>"
  pid=8721 comm="qemu-system-x86"
  fsuid=64055 ouid=64055
  profile="libvirt-1c67131a-7177-4f49-9840-f1092310890d"
  denied_mask="wr"
  operation="open"
  name="/0"
  requested_mask="wr"

Name = /0 ????

The deny is on the migration target.

There are no "magic" devices that could point to /0.

root@testkvm-bionic-from:~# virsh domblklist kvmguest-bionic-normal
Target Source
------------------------------------------------
vda /var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow
vdb /var/lib/uvtool/libvirt/images/kvmguest-bionic-normal-ds.qcow

root@testkvm-bionic-from:~# virsh domblkinfo kvmguest-bionic-normal vda
Capacity: 8589934592
Allocation: 30347264
Physical: 30343168

root@testkvm-bionic-from:~# virsh domblkinfo kvmguest-bionic-normal vdb
Capacity: 378880
Allocation: 458752
Physical: 458752

For now I'd assume this is more a case of libvirt (or QEMU) getting strange ideas about which paths to use than of a missing AppArmor rule.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Target has the original disk paths from a pre-sync.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The receiving QEMU is invoked as:

qemu-system-x86_64
 -enable-kvm
 -name guest=kvmguest-bionic-normal,debug-threads=on
 -S
 -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-5-kvmguest-bionic-norm/master-key.aes
 -machine pc-i440fx-bionic,accel=kvm,usb=off,dump-guest-core=off
 -m 512
 -realtime mlock=off
 -smp 1,sockets=1,cores=1,threads=1
 -uuid 1c67131a-7177-4f49-9840-f1092310890d
 -no-user-config
 -nodefaults
 -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-5-kvmguest-bionic-norm/monitor.sock,server,nowait
 -mon chardev=charmonitor,id=monitor,mode=control
 -rtc base=utc
 -no-shutdown
 -boot strict=on
 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
 -drive file=/var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow,format=qcow2,if=none,id=drive-virtio-disk0
 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -drive file=/var/lib/uvtool/libvirt/images/kvmguest-bionic-normal-ds.qcow,format=qcow2,if=none,id=drive-virtio-disk1
 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1
 -netdev tap,fd=26,id=hostnet0
 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:30:c3:6c,bus=pci.0,addr=0x3
 -chardev pty,id=charserial0
 -device isa-serial,chardev=charserial0,id=serial0
 -vnc 127.0.0.1:0
 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2
 -incoming defer
 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6
 -msg timestamp=on

No disk specified as /0 or anything like it.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I found this warning in the guest's log:
qemu-system-x86_64: -chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)

While maybe not directly related, it at least has a /0 in the path.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Discussing this on IRC we found that the message above really could be related:
pts is a mount point, so the denial on /0 might actually be about /dev/pts/0.

devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,mode=620,ptmxmode=666,max=1024)

As bonus complexity this runs in LXD containers, so it might work on bare-metal hosts and only trigger in containers due to some stacking issue around /dev/pts?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The profile for qemu uses:
  #include <abstractions/consoles>

And that has:
  /dev/pts/[0-9]* rw,

But the strace doesn't show denials on pts, so is this a red herring?
     0.000034 stat("/dev/pts/0", {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0 <0.000017>
     0.000052 openat(AT_FDCWD, "/dev/pts/0", O_RDWR|O_NOCTTY) = 11 <0.000019>
     0.000330 ioctl(11, TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000105>
     0.000139 ioctl(11, TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000010>
     0.000034 ioctl(11, SNDCTL_TMR_CONTINUE or TCSETSF, {B38400 -opost -isig -icanon -echo ...}) = 0 <0.000013>
     0.000037 ioctl(11, TCGETS, {B38400 -opost -isig -icanon -echo ...}) = 0 <0.000010>
     0.000034 ioctl(10, TCGETS, {B38400 -opost -isig -icanon -echo ...}) = 0 <0.000011>
     0.000033 ioctl(10, TIOCGPTN, [0]) = 0 <0.000010>
     0.000033 stat("/dev/pts/0", {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0 <0.000016>
     0.000045 close(11) = 0 <0.000013>

The only "Permission denied" (at all), though, is on /dev/pts/0 with this call:
0.000055 ioctl(10, TIOCGPTPEER, 0x102) = -1 EACCES (Permission denied) <0.000025>
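
For reference, this ioctl can be exercised outside QEMU with a tiny standalone program (a hypothetical demo, not QEMU code). TIOCGPTPEER opens the pty slave relative to the devpts mount, which would also explain AppArmor logging the path as "/0" rather than "/dev/pts/0":

  /* Hypothetical standalone demo (not QEMU code): allocate a pty master and
     open its slave via TIOCGPTPEER, the ioctl that fails with EACCES above.
     Requires Linux >= 4.13. */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <unistd.h>

  #ifndef TIOCGPTPEER
  #define TIOCGPTPEER 0x5441
  #endif

  int main(void)
  {
      int master = posix_openpt(O_RDWR | O_NOCTTY);
      if (master < 0) { perror("posix_openpt"); return 1; }
      grantpt(master);
      unlockpt(master);

      /* TIOCGPTPEER resolves the slave relative to the devpts mount, so an
         LSM like AppArmor sees a path such as "/0" instead of "/dev/pts/0". */
      int slave = ioctl(master, TIOCGPTPEER, O_RDWR | O_NOCTTY);
      if (slave < 0) {
          perror("ioctl(TIOCGPTPEER)");  /* EACCES would match the strace */
          return 1;
      }
      printf("slave fd %d opened via TIOCGPTPEER\n", slave);
      close(slave);
      close(master);
      return 0;
  }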

Maybe try to get gdb on it and break on the open syscall?
That would give the backtrace as well, but I think we are not fast enough to catch it in time.
I tried: I could not catch it fast enough in a few attempts (directly hooking gdb in a pidof loop).

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The message "-chardev pty,id=charserial0: char device redirected to /dev/pts/0 (label charserial0)" appears on the source as well (and there things work just fine).

Nevertheless I created a guest without serial & console so that this isn't needed.
Migrating this guest then results in:
- no such error about /dev/pts/0 being redirected
- no apparmor issue anymore (so the message above and apparmor are tied)
- also, the above only happens in containers, and the console still works (= no issue other than the dmesg entry)
- BUT the migration still fails.

So the former thoughts were a red herring: a valid issue, but still a red herring with regard to tracking the migration failure.
I was able to show that the /dev/pts/0 open is now gone.
I'll break out the /dev/pts/0 issue into a separate bug, then refocus on this issue here.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Reported bug 1764715 for the AppArmor mis-detection of the path.

Now, thinking again about what this is breaking on and why:
  error: internal error: unable to execute QEMU command 'nbd-server-add': Block node is read-only

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

bdrv_check_perm is the qemu function that emits the message.
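
For context, the check that produces this error boils down to roughly the following (a simplified sketch of the logic in QEMU's block.c around this release, not the verbatim source):

  /* Sketch of the check in bdrv_check_perm(): requesting a writer
     permission on a node that is not writable fails with exactly the
     message seen in the migration error. */
  if ((cumulative_perms & (BLK_PERM_WRITE | BLK_PERM_WRITE_UNCHANGED)) &&
      !bdrv_is_writable_after_reopen(bs, q)) {
      error_setg(errp, "Block node is read-only");
      return -EPERM;
  }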

QEMU runs as the following user (# id 64055):
uid=64055(libvirt-qemu) gid=115(kvm) groups=115(kvm),64055(libvirt-qemu)

The images are made root:root by QEMU itself; I adjusted them, but it is not classic file ownership that is missing:
-rw-rw-rw- 1 libvirt-qemu kvm 458752 Apr 17 08:44 /var/lib/uvtool/libvirt/images/kvmguest-bionic-normal-ds.qcow
-rw-rw-rw- 1 libvirt-qemu kvm 226361344 Apr 17 10:45 /var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow

I see one EAGAIN and eventually one EPERM in bdrv_check_perm (showing only the EPERM):
Thread 1 "qemu-system-x86" hit Breakpoint 1, bdrv_check_perm (bs=bs@entry=0x556a76be7150, q=q@entry=0x0, cumulative_perms=3,
    cumulative_shared_perms=23, ignore_children=ignore_children@entry=0x0, errp=errp@entry=0x7ffd515f59b8) at ./block.c:1636
1636 in ./block.c
(gdb) p *bs
$12 = {open_flags = 10242, read_only = false, encrypted = false, sg = false, probed = false, force_share = false, implicit = false,
  drv = 0x556a74c48be0 <bdrv_qcow2>, opaque = 0x556a76bf1fd0, aio_context = 0x556a76bbe1e0, aio_notifiers = {lh_first = 0x0},
  walking_aio_notifiers = false, filename = "/var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow", '\000' <repeats 4037 times>,
  backing_file = "/var/lib/uvtool/libvirt/images/x-uvt-b64-Y29tLnVidW50dS5jbG91ZC5kYWlseTpzZXJ2ZXI6MTguMDQ6YW1kNjQgMjAxODA0MTY=", '\000' <repeats 3986 times>, backing_format = "qcow2\000\000\000\000\000\000\000\000\000\000", full_open_options = 0x556a76c0b510,
  exact_filename = "/var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow", '\000' <repeats 4037 times>, backing = 0x556a76ba4a00,
  file = 0x556a76ba3e00, bl = {request_alignment = 1, max_pdiscard = 0, pdiscard_alignment = 65536, max_pwrite_zeroes = 0,
    pwrite_zeroes_alignment = 65536, opt_transfer = 0, max_transfer = 0, min_mem_alignment = 1, opt_mem_alignment = 4096, max_iov = 1024},
  supported_write_flags = 0, supported_zero_flags = 4, node_name = "#block144", '\000' <repeats 22 times>, node_list = {
    tqe_next = 0x556a76c0d550, tqe_prev = 0x556a76bf07b0}, bs_list = {tqe_next = 0x556a76bed6f0, tqe_prev = 0x556a74c46990 <all_bdrv_states>},
  monitor_list = {tqe_next = 0x0, tqe_prev = 0x0}, refcnt = 1, op_blockers = {{lh_first = 0x0} <repeats 16 times>}, job = 0x0,
  inherits_from = 0x0, children = {lh_first = 0x556a76ba4a00}, parents = {lh_first = 0x556a76b8d600}, options = 0x556a76be4970,
  explicit_options = 0x556a76bea3f0, detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF, backing_blocker = 0x556a76c12de0,
  total_sectors = 16777216, before_write_notifiers = {notifiers = {lh_first = 0x0}}, write_threshold_offset = 0, write_threshold_notifier = {
    notify = 0x0, node = {le_next = 0x0, le_prev = 0x0}}, dirty_bitmap_mutex = {lock = {__data = {__lock = 0, __count = 0, __owner = 0,
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>,
      __align = 0}, initialized = true}, dirty_bitmaps = {lh_first = 0x0}, wr_highest_offset = {value = 0}, copy_on_read = 0, in_flight = 0,
  serialising_in_flight = 0, wakeup = false, io_plugged = 0, enable_write_cache = 0...


Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The files are the same on both hosts, so we really need to find the QEMU code path that decides this isn't writable.

The same check on what seems to be the same file succeeds earlier.
Comparing arguments:
Working call:
cumulative_perms=cumulative_perms@entry=1,
cumulative_shared_perms=cumulative_shared_perms@entry=31,
... in *bs
drv = 0x556a74c4 <bdrv_file>

The failing one is on the qcow2 node.
cumulative_perms=3,
cumulative_shared_perms=23,
... in *bs
drv = 0x556a74c4 <bdrv_qcow2>

Now this is "a stack" of backing files.
I'm sure this gets walked in order; need to track down the first one in the stack that actually breaks.
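
To make those numbers readable, here are the BLK_PERM_* bits behind them (values from QEMU's include/block/block.h in this release; a decoding aid, nothing more):

  enum {
      BLK_PERM_CONSISTENT_READ = 0x01,
      BLK_PERM_WRITE           = 0x02,
      BLK_PERM_WRITE_UNCHANGED = 0x04,
      BLK_PERM_RESIZE          = 0x08,
      BLK_PERM_GRAPH_MOD       = 0x10,
  };
  /* Working call: cumulative_perms = 1  -> CONSISTENT_READ only
                   shared_perms     = 31 -> all of the above shared
     Failing call: cumulative_perms = 3  -> CONSISTENT_READ | WRITE
                   shared_perms     = 23 -> everything except RESIZE
     So the failing caller asks for WRITE, which the read-only check
     sketched earlier refuses. */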

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

The call that issues the message is at:
#0 bdrv_check_perm (bs=bs@entry=0x55ca4c7a7150, q=0x0, q@entry=0xf9574d6356918c00, cumulative_perms=3, cumulative_shared_perms=23,
    ignore_children=ignore_children@entry=0x0, errp=errp@entry=0x7ffdba216258) at ./block.c:1645
#1 0x000055ca4a0645b5 in bdrv_check_update_perm (bs=bs@entry=0x55ca4c7a7150, q=0xf9574d6356918c00, q@entry=0x0,
    new_used_perm=new_used_perm@entry=3, new_shared_perm=new_shared_perm@entry=23, ignore_children=ignore_children@entry=0x0,
    errp=errp@entry=0x7ffdba216258) at ./block.c:1841
#2 0x000055ca4a064d47 in bdrv_root_attach_child (child_bs=child_bs@entry=0x55ca4c7a7150, child_name=child_name@entry=0x55ca4a204ae2 "root",
    child_role=child_role@entry=0x55ca4a701b80 <child_root>, perm=3, shared_perm=23, opaque=opaque@entry=0x55ca4c832c60, errp=0x7ffdba216258)
    at ./block.c:2041
#3 0x000055ca4a0a5796 in blk_insert_bs (blk=blk@entry=0x55ca4c832c60, bs=bs@entry=0x55ca4c7a7150, errp=errp@entry=0x7ffdba216258)
    at ./block/block-backend.c:728
#4 0x000055ca4a0c6347 in nbd_export_new (bs=bs@entry=0x55ca4c7a7150, dev_offset=dev_offset@entry=0, size=size@entry=-1,
    nbdflags=<optimized out>, close=close@entry=0x0, writethrough=writethrough@entry=false, on_eject_blk=0x55ca4c7a6ef0, errp=0x7ffdba216258)
    at ./nbd/server.c:1091
#5 0x000055ca49e7be85 in qmp_nbd_server_add (device=0x55ca4c8501b0 "drive-virtio-disk0", has_writable=<optimized out>,
    writable=<optimized out>, errp=0x7ffdba216258) at ./blockdev-nbd.c:192
#6 0x000055ca49e89625 in qmp_marshal_nbd_server_add (args=<optimized out>, ret=<optimized out>, errp=0x7ffdba2162c8) at qmp-marshal.c:2198
#7 0x000055ca4a11e387 in do_qmp_dispatch (errp=0x7ffdba2162c0, request=0x55ca4cd57c70, cmds=0x55ca4a78aee0 <qmp_commands>)
    at ./qapi/qmp-dispatch.c:104
#8 qmp_dispatch (cmds=0x55ca4a78aee0 <qmp_commands>, request=request@entry=0x55ca4cd57c70) at ./qapi/qmp-dispatch.c:131

Let's extract the NBD call and try to do this without a migration.
qemu-system-x86_64 -S -enable-kvm -m 512 -no-user-config -nodefaults -monitor stdio -drive file=/var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
qemu-system-x86_64: -drive file=/var/lib/uvtool/libvirt/images/kvmguest-bionic-normal.qcow,format=qcow2,if=none,id=drive-virtio-disk0: Failed to get "write" lock

Haaaaaaar - I know what's going on.
This is not properly unshared anymore.
The two containers use the same backing filesystem, and therefore the second one can't lock the image.
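
For background: since about QEMU 2.10 the block layer takes byte-range file locks on image files to prevent exactly this kind of concurrent use. A minimal sketch of the mechanism (a hypothetical standalone test, not QEMU code; QEMU uses OFD locks at specific magic offsets where the kernel supports them):

  /* Hypothetical demo of the conflict behind "Failed to get 'write' lock":
     two open file descriptions on the same image cannot both hold a
     conflicting OFD write lock (Linux >= 3.15, glibc >= 2.20). */
  #define _GNU_SOURCE
  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  static int lock_byte(int fd)
  {
      struct flock fl = {
          .l_type   = F_WRLCK,
          .l_whence = SEEK_SET,
          .l_start  = 0,          /* QEMU locks specific magic offsets */
          .l_len    = 1,
      };
      return fcntl(fd, F_OFD_SETLK, &fl);
  }

  int main(int argc, char **argv)
  {
      const char *path = argc > 1 ? argv[1] : "disk.img";  /* e.g. the qcow2 */
      int fd1 = open(path, O_RDWR);
      int fd2 = open(path, O_RDWR);   /* second open file description */
      if (fd1 < 0 || fd2 < 0) { perror("open"); return 1; }

      if (lock_byte(fd1) != 0) { perror("first lock"); return 1; }
      if (lock_byte(fd2) != 0)
          /* This is what the second QEMU (on the shared backing FS) hits. */
          printf("second lock failed: %s\n", strerror(errno));

      close(fd2);
      close(fd1);
      return 0;
  }

With properly unshared storage per container there is no second lock holder, and the copy-storage migration can set up its writable NBD export again.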

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Test scripts fixed; this was ultimately fallout from our change in container initialization.
The setup now does the unsharing after restoring from the snapshot.

With that in place it now works like a charm again.
  21.11.0 (14:05:09): Test migration options without shared storage of a bionic guest testkvm-bionic-noupd/testkvm-bionic-to
  21.12.0 (14:05:11): Test live migration (extra option '--copy-storage-all') of a bionic guest testkvm-bionic-noupd/testkvm-bionic-tononshared
    21.12.1 (14:05:11): live migration (extra option '--copy-storage-all') testkvm-bionic-noupd -> testkvm-bionic-tononshared => Pass
    21.12.2 (14:05:24): Check if guest kvmguest-bionic-normal on testkvm-bionic-tononshared is alive => Pass
    21.12.3 (14:05:33): live migration back (extra option '--copy-storage-all') testkvm-bionic-tononshared -> testkvm-bionic-noupd => Pass
    21.12.4 (14:06:04): Check if guest kvmguest-bionic-normal on testkvm-bionic-noupd is alive => Pass
  21.13.0 (14:06:12): Test live migration (extra option '--copy-storage-inc') of a bionic guest testkvm-bionic-noupd/testkvm-bionic-tononshared
    21.13.1 (14:06:12): live migration (extra option '--copy-storage-inc') testkvm-bionic-noupd -> testkvm-bionic-tononshared => Pass
    21.13.2 (14:06:25): Check if guest kvmguest-bionic-normal on testkvm-bionic-tononshared is alive => Pass
    21.13.3 (14:06:35): live migration back (extra option '--copy-storage-inc') testkvm-bionic-tononshared -> testkvm-bionic-noupd => Pass
    21.13.4 (14:06:59): Check if guest kvmguest-bionic-normal on testkvm-bionic-noupd is alive => Pass

Changed in qemu (Ubuntu):
status: New → Invalid