Comment 9 for bug 1837869

Christian Ehrhardt  (paelzer) wrote :

#0 raw_handle_perm_lock (bs=0x560aad12b550, op=RAW_PL_PREPARE, new_perm=11, new_shared=21, errp=0x7ffc6ff54090) at ./block/file-posix.c:722
#1 0x0000560aac2b9e8e in bdrv_check_perm (bs=bs@entry=0x560aad12b550, q=0x560aadd834a0, q@entry=0x90af93e91d77c000, cumulative_perms=11, cumulative_shared_perms=<optimized out>,
    ignore_children=ignore_children@entry=0x560aad0c7e70, errp=errp@entry=0x7ffc6ff54090) at ./block.c:1655
#2 0x0000560aac2b9d05 in bdrv_check_update_perm (bs=0x560aad12b550, q=0x90af93e91d77c000, q@entry=0x560aadd834a0, new_used_perm=new_used_perm@entry=11,
    new_shared_perm=new_shared_perm@entry=21, ignore_children=ignore_children@entry=0x560aad0c7e70, errp=errp@entry=0x7ffc6ff54090) at ./block.c:1841
#3 0x0000560aac2b9f54 in bdrv_child_check_perm (errp=0x7ffc6ff54090, ignore_children=0x560aad0c7e70, shared=<optimized out>, perm=11, q=0x560aadd834a0, c=0x560aad0566a0) at ./block.c:1854
#4 bdrv_check_perm (bs=0x560aad125250, bs@entry=0xb, q=0x560aadd834a0, q@entry=0x90af93e91d77c000, cumulative_perms=1, cumulative_shared_perms=21,
    ignore_children=ignore_children@entry=0x560aad03e700, errp=0x7ffc6ff54090, errp@entry=0x15) at ./block.c:1671
#5 0x0000560aac2b9d05 in bdrv_check_update_perm (bs=0xb, q=0x90af93e91d77c000, q@entry=0x560aadd834a0, new_used_perm=new_used_perm@entry=1, new_shared_perm=new_shared_perm@entry=21,
    ignore_children=ignore_children@entry=0x560aad03e700, errp=0x15, errp@entry=0x7ffc6ff54090) at ./block.c:1841
#6 0x0000560aac2b9f54 in bdrv_child_check_perm (errp=0x7ffc6ff54090, ignore_children=0x560aad03e700, shared=<optimized out>, perm=1, q=0x560aadd834a0, c=0x560aad06e800) at ./block.c:1854
#7 bdrv_check_perm (bs=0x560aad105750, bs@entry=0x1, q=0x560aadd834a0, q@entry=0x90af93e91d77c000, cumulative_perms=1, cumulative_shared_perms=21,
    ignore_children=ignore_children@entry=0x560aad03e570, errp=0x7ffc6ff54090, errp@entry=0x15) at ./block.c:1671
#8 0x0000560aac2b9d05 in bdrv_check_update_perm (bs=0x1, q=0x90af93e91d77c000, q@entry=0x560aadd834a0, new_used_perm=new_used_perm@entry=1, new_shared_perm=new_shared_perm@entry=21,
    ignore_children=ignore_children@entry=0x560aad03e570, errp=0x15, errp@entry=0x7ffc6ff54090) at ./block.c:1841
#9 0x0000560aac2b9f54 in bdrv_child_check_perm (errp=0x7ffc6ff54090, ignore_children=0x560aad03e570, shared=<optimized out>, perm=1, q=0x560aadd834a0, c=0x560aad03e200) at ./block.c:1854
#10 bdrv_check_perm (bs=0x560aad0e53d0, q=q@entry=0x560aadd834a0, cumulative_perms=1, cumulative_shared_perms=21, ignore_children=ignore_children@entry=0x0, errp=errp@entry=0x7ffc6ff54090)
    at ./block.c:1671
#11 0x0000560aac2bb7ea in bdrv_reopen_prepare (reopen_state=reopen_state@entry=0x560aadd83418, queue=queue@entry=0x560aadd834a0, errp=errp@entry=0x7ffc6ff54090) at ./block.c:3111
#12 0x0000560aac2bb94f in bdrv_reopen_multiple (ctx=<optimized out>, bs_queue=0x560aadd834a0, errp=errp@entry=0x7ffc6ff540f0) at ./block.c:2887
#13 0x0000560aac2bbacf in bdrv_reopen (bs=bs@entry=0x560aad0e53d0, bdrv_flags=<optimized out>, errp=errp@entry=0x7ffc6ff541f0) at ./block.c:2928
#14 0x0000560aac306f3e in commit_active_start (job_id=job_id@entry=0x0, bs=bs@entry=0x560aadf47890, base=base@entry=0x560aad0e53d0, creation_flags=creation_flags@entry=0,
    speed=speed@entry=0, on_error=on_error@entry=BLOCKDEV_ON_ERROR_REPORT, filter_node_name=0x0, cb=0x0, opaque=0x0, auto_complete=false, errp=0x7ffc6ff541f0) at ./block/mirror.c:1311
#15 0x0000560aac0cf30a in qmp_block_commit (has_job_id=<optimized out>, job_id=0x0, device=<optimized out>, has_base=<optimized out>, base=<optimized out>, has_top=<optimized out>,
    top=0x0, has_backing_file=false, backing_file=0x0, has_speed=false, speed=0, has_filter_node_name=false, filter_node_name=0x0, errp=0x7ffc6ff54288) at ./blockdev.c:3150
#16 0x0000560aac0d9d2e in qmp_marshal_block_commit (args=<optimized out>, ret=<optimized out>, errp=0x7ffc6ff54348) at qmp-marshal.c:168

(gdb) p s->lock_fd
$2 = 18

root@b:~# ll /proc/5942/fd/18
lr-x------ 1 root root 64 Jul 30 06:45 /proc/5942/fd/18 -> /root/base.qcow2

(gdb) p op
$3 = RAW_PL_PREPARE

(gdb) p ~s->shared_perm | ~new_shared
$7 = 18446744073709551594
(gdb) p ~(s->shared_perm) | ~new_shared
$8 = 18446744073709551594

That value goes into:
raw_apply_lock_bytes (s=s@entry=0x560aad112ab0, perm_lock_bits=11, shared_perm_lock_bits=18446744073709551594, unlock=unlock@entry=false, errp=errp@entry=0x7ffc6ff54090)

qemu_lock_fd (fd=18, start=100, len=1, exclusive=false) at ./util/osdep.c:255

qemu_lock_fcntl (fd=18, start=100, len=1, fl_type=0) at ./util/osdep.c:238

The first round of locks, on RAW_LOCK_PERM_BASE, all succeed.
The value was 11, which is

BLK_PERM_CONSISTENT_READ
BLK_PERM_WRITE
BLK_PERM_RESIZE

Now on the shared offset it wants shared_perm_lock_bits=18446744073709551594.
That looks like an error at first, or maybe a gdb display quirk?
It is the 64-bit bitwise inverse of new_shared=21 (0xffffffffffffffea), but the later code path shows it is used as an absolute bitmask, e.g. index 0 is skipped.
The absolute value would be:
BLK_PERM_WRITE
BLK_PERM_RESIZE
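
The mask arithmetic above can be double-checked with a few lines of Python. This is only a sketch: the BLK_PERM_* bit values are assumed to match QEMU's block layer headers of this era, and `perm_names` is a made-up helper, not a QEMU function.

```python
# Assumed bit values (QEMU block layer headers of this era).
BLK_PERM = {
    0x01: "BLK_PERM_CONSISTENT_READ",
    0x02: "BLK_PERM_WRITE",
    0x04: "BLK_PERM_WRITE_UNCHANGED",
    0x08: "BLK_PERM_RESIZE",
    0x10: "BLK_PERM_GRAPH_MOD",
}
BLK_PERM_ALL = 0x1F
U64 = (1 << 64) - 1


def perm_names(mask):
    """Hypothetical helper: names of the known permission bits set in mask."""
    return [name for bit, name in BLK_PERM.items() if mask & bit]


new_perm = 11
new_shared = 21
# gdb prints the uint64_t complement as a large unsigned number.
unshared = ~new_shared & U64

print(unshared)        # 18446744073709551594
print(hex(unshared))   # 0xffffffffffffffea
print(perm_names(new_perm))
print(perm_names(new_shared))
# Only the bits inside BLK_PERM_ALL matter for the lock loops.
print(perm_names(unshared & BLK_PERM_ALL))
```

Under these assumptions the last line lists BLK_PERM_WRITE and BLK_PERM_RESIZE, which matches the two names given above.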

raw_apply_lock_bytes completes and returns 0.
raw_handle_perm_lock continues.

It enters raw_check_lock_bytes, which verifies the success of the former action.
It essentially has the same loops, but with qemu_lock_fd_test instead of qemu_lock_fd.
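
To make the two passes concrete, here is a small model of which byte offsets each pass touches. It is a sketch based on one reading of block/file-posix.c, not the actual code. It assumes RAW_LOCK_PERM_BASE=100 (matching start=100 above), RAW_LOCK_SHARED_BASE=200, five permission bits, and that the check pass probes the opposite base from the one the apply pass locked. `apply_offsets` and `check_offsets` are hypothetical names.

```python
RAW_LOCK_PERM_BASE = 100    # matches start=100 in the qemu_lock_fd call
RAW_LOCK_SHARED_BASE = 200  # assumed value
NUM_PERM_BITS = 5           # assumed: bits 0..4 make up BLK_PERM_ALL


def apply_offsets(perm_lock_bits, shared_perm_lock_bits):
    """Bytes the apply pass would lock (non-exclusively) for these masks."""
    offs = [RAW_LOCK_PERM_BASE + i
            for i in range(NUM_PERM_BITS) if (perm_lock_bits >> i) & 1]
    offs += [RAW_LOCK_SHARED_BASE + i
             for i in range(NUM_PERM_BITS) if (shared_perm_lock_bits >> i) & 1]
    return offs


def check_offsets(perm, shared_perm):
    """Bytes the check pass would probe for a conflict (assumed logic)."""
    # A permission we want conflicts with someone else's "unshared" byte.
    offs = [RAW_LOCK_SHARED_BASE + i
            for i in range(NUM_PERM_BITS) if (perm >> i) & 1]
    # A permission we do not share conflicts with someone else's "perm" byte.
    offs += [RAW_LOCK_PERM_BASE + i
             for i in range(NUM_PERM_BITS) if not (shared_perm >> i) & 1]
    return offs


print(apply_offsets(11, 18446744073709551594))  # [100, 101, 103, 201, 203]
print(check_offsets(11, 21))                    # [200, 201, 203, 101, 103]
```

With the values from this trace, the model probes bytes 200, 201 and 203 for the requested permissions and bytes 101 and 103 for the permissions that are not shared.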

The backtrace shows that it walks the child relationships of the images:

bdrv_reopen_prepare -> bdrv_check_perm -> bdrv_child_check_perm -> bdrv_check_update_perm -> bdrv_check_perm (loop back to bdrv_child_check_perm)
until it reaches the base backing store.

The failing one is (as expected from the error) the write lock, but now we know it is the one at the shared offset:
(gdb) p off
$58 = 201
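
For reference, a byte offset like this can be split back into its lock region and permission bit. A minimal sketch, assuming RAW_LOCK_PERM_BASE=100, RAW_LOCK_SHARED_BASE=200 and the usual BLK_PERM_* bit order; `describe_offset` is a hypothetical helper, not part of QEMU.

```python
RAW_LOCK_PERM_BASE = 100    # assumed, matches start=100 in the trace
RAW_LOCK_SHARED_BASE = 200  # assumed
PERM_BITS = [
    "BLK_PERM_CONSISTENT_READ",  # bit 0
    "BLK_PERM_WRITE",            # bit 1
    "BLK_PERM_WRITE_UNCHANGED",  # bit 2
    "BLK_PERM_RESIZE",           # bit 3
    "BLK_PERM_GRAPH_MOD",        # bit 4
]


def describe_offset(off):
    """Return (region, permission name) for a lock byte offset."""
    for region, base in (("shared", RAW_LOCK_SHARED_BASE),
                         ("perm", RAW_LOCK_PERM_BASE)):
        idx = off - base
        if 0 <= idx < len(PERM_BITS):
            return region, PERM_BITS[idx]
    raise ValueError(f"offset {off} is not a known lock byte")


print(describe_offset(201))  # ('shared', 'BLK_PERM_WRITE')
```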

As we discussed here before, the assumption is that it shouldn't need the write lock here.
Interestingly, the error messages look swapped: the check on RAW_LOCK_SHARED_BASE reports
  "Failed to get \"%s\" lock",
and the check on RAW_LOCK_PERM_BASE reports
  "Failed to get shared \"%s\" lock",

Shouldn't that be vice versa?
Anyway, we now know which code to look at.