test-mic can fail / writes to qemu-nbd device do not persist

Bug #1741096 reported by Scott Moser on 2018-01-03
Affects                Importance   Assigned to
cloud-utils            Undecided    Unassigned
cloud-utils (Ubuntu)   Medium       Unassigned

Bug Description

### What's going on here?
## test-mic is a test of mount-image-callback in cloud-image-utils
##
## It populates some disk images and then uses mic to write some
## files (info.txt and data.txt)
## then calls mic again to verify the files are there.
##
## The failure I'm seeing is that one of the files written during
## the first mic run is not present in the second.
##
## mic does:
## mount via qemu-nbd, run command, umount, qemu-nbd disconnect
##
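## For reference, a mount-image-callback invocation of the kind test-mic
## performs looks roughly like this (image name illustrative; the callback
## is run with $MOUNTPOINT pointing at the mounted partition):
##   sudo mount-image-callback --part=1 disk-mbr.img -- \
##      sh -c 'echo foo1 > "$MOUNTPOINT/data.txt"'
##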

This recreates at least 1 in 3 times in my local tests here.

$ uname -r
4.13.0-17-generic

$ dpkg-query --show qemu-utils
qemu-utils 1:2.10+dfsg-0ubuntu5

$ bzr branch lp:cloud-utils cloud-utils
$ cd cloud-utils

## The output below has some debug info added.
$ sudo PATH=$PWD/bin:$PATH ./test/test-mic 2>&1 | tee out.log
testing partition 1 image
testing partition 2 image
testing --system-mounts on pt1
testing unexpected mount get unmounted.
writing hunks to disk image /tmp/test-mic.zGTR4a/disk-mbr.img
partitioning MBR disk image /tmp/test-mic.zGTR4a/disk-mbr.img
partitioning GPT disk image /tmp/test-mic.zGTR4a/disk-gpt.img
MBR: testing partition 1 in /tmp/test-mic.zGTR4a/disk-mbr.img
pid for /dev/nbd0 is 30004
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.0ikR3l/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.0ikR3l/mp sh -c echo 'foo1' > data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.0ikR3l/mp
disconnecting /dev/nbd0
pid for /dev/nbd0 is 30077
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.2wgbbV/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.2wgbbV/mp sh -c
      r=0; for f in "$@"; do
      cat $f || { echo "$f: cat failed rc=$?"; r=99; }; done;
      exit $r sh-extract info.txt data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.2wgbbV/mp
disconnecting /dev/nbd0
MBR: testing partition 2 in /tmp/test-mic.zGTR4a/disk-mbr.img
pid for /dev/nbd0 is 30153
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 2 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p2 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.9PowSH/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.9PowSH/mp sh -c echo 'foo2' > data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.9PowSH/mp
disconnecting /dev/nbd0
pid for /dev/nbd0 is 30222
connected /tmp/test-mic.zGTR4a/disk-mbr.img (raw) 2 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p2 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.Az1hhv/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.Az1hhv/mp sh -c
      r=0; for f in "$@"; do
      cat $f || { echo "$f: cat failed rc=$?"; r=99; }; done;
      exit $r sh-extract info.txt data.txt
cat: data.txt: No such file or directory
cmd returned 99. unmounting /tmp/mount-image-callback.Az1hhv/mp
disconnecting /dev/nbd0
MBR:
== expected on 2 ==
partition 2
foo2
== found on 2 ==
partition 2
data.txt: cat failed rc=1
MBR: failed testing partition 2 on /tmp/test-mic.zGTR4a/disk-mbr.img
smoser@milhouse:~/src/cloud-utils/trunk$ sudo PATH=$PWD/bin:$PATH ./test/test-mic 2>&1 | tee out.log
testing partition 1 image
testing partition 2 image
testing --system-mounts on pt1
testing unexpected mount get unmounted.
writing hunks to disk image /tmp/test-mic.SCaQUe/disk-mbr.img
partitioning MBR disk image /tmp/test-mic.SCaQUe/disk-mbr.img
partitioning GPT disk image /tmp/test-mic.SCaQUe/disk-gpt.img
MBR: testing partition 1 in /tmp/test-mic.SCaQUe/disk-mbr.img
pid for /dev/nbd0 is 30674
connected /tmp/test-mic.SCaQUe/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.0bIcqZ/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.0bIcqZ/mp sh -c echo 'foo1' > data.txt
cmd returned 0. unmounting /tmp/mount-image-callback.0bIcqZ/mp
disconnecting /dev/nbd0
waiting on pidfile for /dev/nbd0 in /sys/block/nbd0/pid
.
pid for /dev/nbd0 is 30757
connected /tmp/test-mic.SCaQUe/disk-mbr.img (raw) 1 to /dev/nbd0. waiting for device.
mounted /dev/nbd0p1 via qemu-nbd /dev/nbd0 at /tmp/mount-image-callback.ZNIbzM/mp
invoking: MOUNTPOINT=/tmp/mount-image-callback.ZNIbzM/mp sh -c
      r=0; for f in "$@"; do
      cat $f || { echo "$f: cat failed rc=$?"; r=99; }; done;
      exit $r sh-extract info.txt data.txt
cat: data.txt: No such file or directory
cmd returned 99. unmounting /tmp/mount-image-callback.ZNIbzM/mp
disconnecting /dev/nbd0
MBR:
== expected on 1 ==
partition 1
foo1
== found on 1 ==
partition 1
data.txt: cat failed rc=1
MBR: failed testing partition 1 on /tmp/test-mic.SCaQUe/disk-mbr.img

### new data in dmesg during run
[1300609.359437] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.506371] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.619183] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.752534] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300609.889312] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.022198] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.225399] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.374885] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.634821] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300610.830573] EXT4-fs (loop3): mounted filesystem with ordered data mode. Opts: (null)
[1300612.701430] nbd0: p1 p2
[1300612.713894] EXT4-fs (nbd0p1): mounted filesystem with ordered data mode. Opts: (null)
[1300612.798977] block nbd0: NBD_DISCONNECT
[1300612.799003] block nbd0: shutting down sockets
[1300612.800582] print_req_error: 22 callbacks suppressed
[1300612.800584] print_req_error: I/O error, dev nbd0, sector 18954
[1300612.800590] EXT4-fs warning (device nbd0p1): ext4_end_bio:322: I/O error 10 writing to inode 15 (offset 0 size 0 starting block 9478)
[1300612.800592] Buffer I/O error on device nbd0p1, logical block 8453
[1300612.800614] print_req_error: I/O error, dev nbd0, sector 100354
[1300612.800616] buffer_io_error: 17 callbacks suppressed
[1300612.800617] Buffer I/O error on dev nbd0p1, logical block 49153, lost sync page write
[1300612.800622] JBD2: Error -5 detected when updating journal superblock for nbd0p1-8.
[1300612.800623] Aborting journal on device nbd0p1-8.
[1300612.800630] print_req_error: I/O error, dev nbd0, sector 100354
[1300612.800632] Buffer I/O error on dev nbd0p1, logical block 49153, lost sync page write
[1300612.800636] JBD2: Error -5 detected when updating journal superblock for nbd0p1-8.
[1300612.800643] JBD2: Detected IO errors while flushing file data on nbd0p1-8
[1300612.800664] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800693] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800708] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800723] print_req_error: I/O error, dev nbd0, sector 0
[1300612.800815] print_req_error: I/O error, dev nbd0, sector 2050
[1300612.800817] Buffer I/O error on dev nbd0p1, logical block 1, lost sync page write
[1300612.837164] nbd0: p1 p2
[1300612.953403] EXT4-fs (nbd0p1): INFO: recovery required on readonly filesystem
[1300612.953405] EXT4-fs (nbd0p1): write access will be enabled during recovery
[1300612.953745] EXT4-fs (nbd0p1): recovery complete
[1300612.954785] EXT4-fs (nbd0p1): mounted filesystem with ordered data mode. Opts: (null)
[1300613.011345] block nbd0: NBD_DISCONNECT
[1300613.011378] block nbd0: shutting down sockets
[1300613.015916] nbd0: detected capacity change from 0 to 316669952
[1300613.016004] print_req_error: I/O error, dev nbd0, sector 0
[1300613.016010] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016027] print_req_error: I/O error, dev nbd0, sector 0
[1300613.016030] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016039] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016044] ldm_validate_partition_table(): Disk read failed.
[1300613.016050] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016058] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016066] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016074] Buffer I/O error on dev nbd0, logical block 0, async page read
[1300613.016078] Dev nbd0: unable to read RDB block 0
[1300613.016136] nbd0: unable to read partition table

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: cloud-utils 0.30-0ubuntu2
ProcVersionSignature: Ubuntu 4.13.0-17.20-generic 4.13.8
Uname: Linux 4.13.0-17-generic x86_64
NonfreeKernelModules: zfs zunicode zavl zcommon znvpair
ApportVersion: 2.20.8-0ubuntu5
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Wed Jan 3 11:47:00 2018
EcryptfsInUse: Yes
InstallationDate: Installed on 2015-07-23 (895 days ago)
InstallationMedia: Ubuntu 15.10 "Wily Werewolf" - Alpha amd64 (20150722.1)
PackageArchitecture: all
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: cloud-utils
UpgradeStatus: No upgrade log present (probably fresh install)

Scott Moser (smoser) wrote :
Changed in cloud-utils (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Scott Moser (smoser) wrote :

I've played with this a bit. The attached patch adds logging via logger in an effort to serialize things.

Then 'go.sh' (see below) just loops over running test-mic until it fails, giving bookend messages in /var/log/syslog. I've collected a failure here.

Scott Moser (smoser) wrote :

go.sh looks like

#!/bin/sh
# loop test-mic until it fails; logger writes 'bookend' messages to syslog
# so each run can be correlated with kernel/syslog entries.
[ "$(id -u)" = "0" ] || { echo "not root" 1>&2; exit 1; }
i=0
while true; do
   logger --tag=xtest-mic-$i "start xtest-mic-$i"
   printf .
   {
   # capture uptime before/after the run plus test-mic's exit code in out.log
   read up idle </proc/uptime
   echo $up
   ./test/test-mic
   r=$?
   read up idle </proc/uptime
   echo $up
   echo $r
   } >out.log 2>&1
   logger --tag=xtest-mic-$i "end xtest-mic-$i $r"
   [ $r -eq 0 ] || { echo; echo "failed xtest-mic-$i"; exit 1; }
   i=$(($i+1))
done
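
After a failure, the bookend messages written by the logger calls above can be pulled out of syslog by tag, e.g. (the iteration number is whatever go.sh reported):

$ grep xtest-mic-3 /var/log/syslog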

Scott Moser (smoser) wrote :

Run it like:
$ sudo PATH=$PWD/bin:$PATH ./go.sh
....
failed xtest-mic-3
smoser@milhouse:~/src/cloud-utils/trunk$ cat out.log
19536.67
cat: data.txt: No such file or directory
GPT:
== expected on 1 ==
partition 1
foo1
== found on 1 ==
partition 1
data.txt: cat failed rc=1
19543.86
1

Scott Moser (smoser) wrote :

Here's another log of /var/log/syslog: http://paste.ubuntu.com/26315330/
I'm not certain whether the order of entries in /var/log/syslog can be 100% trusted.

If it can be, then we definitely see oddities:
 * mount-image-callback definitely unmounts before running qemu-nbd --disconnect.
 * the 'detected capacity change' seems odd.
 * during qemu-nbd --disconnect we see Buffer I/O errors.

tags: added: patch
Scott Moser (smoser) wrote :

I considered that maybe the partitioned image test-mic was creating was somehow
invalid. To test that, I modified mount-image-callback to export the mount
device as MOUNT_DEV, and then did:

cp "$pt1" /tmp/part1.img
cp "$pt2" /tmp/part2.img
mount-image-callback --read-only --part=1 "$img_mbr" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part1.mbr.img'
mount-image-callback --read-only --part=2 "$img_mbr" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part2.mbr.img'
mount-image-callback --read-only --part=1 "$img_gpt" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part1.gpt.img'
mount-image-callback --read-only --part=2 "$img_gpt" -- sh -c 'dd if=$MOUNT_DEV of=/tmp/part2.gpt.img'

An md5sum then shows that each dd'd image is identical to its source partition image.
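
The comparison itself is just a checksum of each source partition image against its dd'd copy, e.g.:

$ md5sum /tmp/part1.img /tmp/part1.mbr.img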

That seems to validate both the partition table and the data. I was somewhat
concerned that my partition/sector math in test-mic was truncating the
partition or filesystem; that does not seem to be the case.

So, it seems that this is an issue in one of:
 a.) qemu-nbd
 b.) the kernel
 c.) mount-image-callback's assumptions about qemu-nbd usage.

For 'c', the assumption is that the following sequence is valid:
DEV=/dev/nbd0 # unused nbd device
BNAME=$(basename $DEV)
PARTNUM=2
PART_PATH=${DEV}p${PARTNUM}
FILE=/tmp/my-2partition-mbr.img
 1.) qemu-nbd --format=raw --connect $DEV $FILE
 2.) wait until a 'pid' file appears in /sys/block/$BNAME/pid
 3.) wait until $PART_PATH exists
 4.) mount $PART_PATH at MOUNTPOINT
 5.) execute user code that operates on MOUNTPOINT
 6.) umount MOUNTPOINT (and anything under it)
 7.) qemu-nbd --disconnect $DEV

I have assumed that 'umount' would block until all dirty data is flushed
to the backing /dev/nbd device, and that 'qemu-nbd --disconnect' would
block until any writes already sent were written through to the
backing file.
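
For reference, that sequence as a minimal standalone sketch (the device, image path and mount point here are illustrative, not the exact values test-mic uses; the unbounded waits are only to keep the sketch short):

#!/bin/sh
# minimal sketch of the assumed-valid sequence (steps 1-7 above)
set -e
DEV=/dev/nbd0                      # an unused nbd device
BNAME=$(basename "$DEV")
PARTNUM=2
PART_PATH=${DEV}p${PARTNUM}
FILE=/tmp/my-2partition-mbr.img
MOUNTPOINT=$(mktemp -d)

qemu-nbd --format=raw --connect "$DEV" "$FILE"              # 1
while [ ! -e "/sys/block/$BNAME/pid" ]; do sleep 0.1; done  # 2
while [ ! -b "$PART_PATH" ]; do sleep 0.1; done             # 3
mount "$PART_PATH" "$MOUNTPOINT"                            # 4
echo "hello" > "$MOUNTPOINT/data.txt"                       # 5  (user code)
umount "$MOUNTPOINT"                                        # 6
qemu-nbd --disconnect "$DEV"                                # 7
rmdir "$MOUNTPOINT"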

Scott Moser (smoser) wrote :

Attaching a recreate script with the loop included.
It should run correctly just about anywhere with mount-image-callback installed.

It must be run as root.
To run with a different/modified mount-image-callback, just add it to your PATH.

Scott Moser (smoser) wrote :

The recreate script demonstrates the failure on xenial, artful and bionic in a VM.
Xenial needs a change to mount-image-callback to run 'blockdev' on the nbd device; I'm not actually sure how I've never seen that bug before.

With the change in place, all 3 can be made to show the issue. Artful and bionic VMs are actually somewhat reliable, but if you run two copies of the script concurrently it will fail.

=== modified file 'bin/mount-image-callback'
--- bin/mount-image-callback 2018-01-03 15:44:47 +0000
+++ bin/mount-image-callback 2018-01-04 17:19:37 +0000
@@ -316,6 +316,7 @@
        fi

        i=0
+ [ -b "$mdev" ] || blockdev --rereadpt $nbd
        while :; do
                [ -b "$mdev" ] && break
                i=$(($i+1))

Scott Moser (smoser) wrote :

This actually seems to fix the issue and I cannot recreate it with the change in place.
 http://paste.ubuntu.com/26320756/

I suspect what this is showing is that we were operating on nbd devices left
over from a previous run.
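
One quick check for that theory (a hedged diagnostic, not from the original runs): after umount and qemu-nbd --disconnect, see whether partition nodes or the pid attribute linger:

$ ls /dev/nbd0p* 2>/dev/null; cat /sys/block/nbd0/pid 2>/dev/null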

Scott Moser (smoser) wrote :

That diff had some noise; here is a cleaner version:
 === modified file 'bin/mount-image-callback'
--- bin/mount-image-callback 2018-01-03 15:44:47 +0000
+++ bin/mount-image-callback 2018-01-04 18:08:43 +0000
@@ -284,6 +284,10 @@

  debug 1 "connected $img_in ($fmt) ${rwmode} to $nbd. waiting for device."
  i=0
+ local out=""
+ out=$(blockdev --rereadpt "$nbd" 2>&1) ||
+ debug 1 "blockdev rereadpt $nbd failed"
+ udevadm settle
  while i=$(($i+1)):; do
   get_partition "$nbd" && nptnum="$_RET" && break
   [ $i -eq 40 ] && {

Scott Moser (smoser) wrote :

The fix that I've applied adds step '2.5' to the list below.

 1.) qemu-nbd --format=raw --connect $DEV $FILE
 2.) wait until a 'pid' file appears in /sys/block/$BNAME/pid
 2.5) run 'blockdev --rereadpt' and 'udevadm settle'
 3.) wait until $PART_PATH exists
 4.) mount $PART_PATH at MOUNTPOINT
 5.) execute user code that operates on MOUNTPOINT
 6.) umount MOUNTPOINT (and anything under it)
 7.) qemu-nbd --disconnect $DEV

Note that 'udevadm settle' is not entirely unlike 'sleep 1', but if you call blockdev --rereadpt then you do need to settle afterward.
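
In terms of the sketch earlier in this bug, the new step sits between the pid-file wait and the partition wait; as in the patch, a failed re-read is logged rather than treated as fatal:

 # ... after step 2 (pid file for $BNAME exists) ...
 blockdev --rereadpt "$DEV" || echo "blockdev --rereadpt $DEV failed" 1>&2   # 2.5
 udevadm settle                                                              # 2.5
 # ... then step 3: wait for $PART_PATH to appear ...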

Fetched cloud-utils on a new (bionic) system.
Ran go.sh just to see if I could trigger the case at all on my system.

I could, and got to the:
cat: data.txt: No such file or directory

I can confirm the I/O errors; overall I see:
xtest-mic-0: start xtest-mic-0
[4998071.265687] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.372544] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.485091] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.593007] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998071.716770] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998072.500496] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[4998074.532148] EXT4-fs (nbd1p1): mounted filesystem with ordered data mode. Opts: (null)
[4998074.549530] print_req_error: I/O error, dev nbd0, sector 0
[4998074.607975] print_req_error: I/O error, dev nbd0, sector 0
[4998074.633572] Buffer I/O error on dev nbd0, logical block 0, async page read
[4998074.841685] print_req_error: I/O error, dev nbd0, sector 0
[4998074.867993] Buffer I/O error on dev nbd0, logical block 0, async page read
[4998074.907662] block nbd10: shutting down sockets
[4998074.909787] Buffer I/O error on dev nbd10p2, logical block 90113, lost sync page write
[4998074.909796] JBD2: Error -5 detected when updating journal superblock for nbd10p2-8.
[4998074.941009] EXT4-fs (nbd1p2): write access will be enabled during recovery
[4998075.276217] block nbd1: NBD_DISCONNECT
[4998075.276643] nbd1: detected capacity change from 0 to 316669952
[4998075.276796] Dev nbd1: unable to read RDB block 0

I think I need to break the test down to the simplest form that triggers it and check from there.

I only now saw your updates, as I opened this yesterday and hadn't refreshed since then.
With the workaround in cloud-utils known, I consider it low priority for qemu-nbd to find more background details.

Changed in cloud-utils (Ubuntu):
importance: Medium → Wishlist
importance: Wishlist → Medium
Changed in qemu (Ubuntu):
importance: Undecided → Low
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-utils - 0.30-0ubuntu3

---------------
cloud-utils (0.30-0ubuntu3) bionic; urgency=medium

  * sync to trunk at 322
    - mount-image-callback: run blockdev and udevadm settle on nbd devices.
      (LP: #1741096, #1741300)
    - mount-image-callback: Drop support for mounting lxd containers.
      (LP: #1715994)

 -- Scott Moser <email address hidden> Thu, 04 Jan 2018 14:17:26 -0500

Changed in cloud-utils (Ubuntu):
status: Confirmed → Fix Released
Scott Moser (smoser) wrote :

I'm dropping the qemu-nbd task. I'm not certain that there was actually anything wrong there.

no longer affects: qemu (Ubuntu)

This bug is believed to be fixed in cloud-utils in version 0.31. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-utils:
status: New → Fix Released