Thanks for checking this out. Basically just create a 16.04 VM and resize it (e.g. from D1 to D2). Look at the mount/blkid output before and after the resize to see the difference:
azure config mode arm
azure vm quick-create bug1611074 reprovm centralus linux Canonical:UbuntuServer:16.04.0-LTS:latest $USER -M ~/.ssh/id_rsa.pub -z Standard_D1
ssh to machine, `mount|grep '/dev/sd'` should show something like this:
/dev/sda1 on / type ext4 (rw,relatime,discard,data=ordered)
/dev/sdb1 on /mnt type ext4 (rw,relatime,data=ordered)
Now resize the VM, which forces re-creation of the resource disk (formatted as NTFS):
azure vm set bug1611074 reprovm -z Standard_D2
ssh to machine, `mount|grep '/dev/sd'` now shows this:
/dev/sda1 on / type ext4 (rw,relatime,discard,data=ordered)
/dev/sdb1 on /mnt type fuseblk (rw,relatime,user_id=0,group_id=0,allow_other,blksize=4096)
And `blkid` will show
/dev/sda1: LABEL="cloudimg-rootfs" UUID="b2e47a31-37fe-4914-b333-bd1c2a2dacae" TYPE="ext4" PARTUUID="c74ad4d8-01"
/dev/sdb1: LABEL="Temporary Storage" UUID="B82692572692170A" TYPE="ntfs" PARTUUID="4041cb24-01"
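To compare the two `mount` outputs above without eyeballing them, something like the following could be used. It is only a sketch: the function name `fstype_by_device` is made up for this example, and it assumes mount points contain no spaces.

```python
def fstype_by_device(mount_output):
    """Map each /dev/sd* device in `mount` output to its filesystem type."""
    result = {}
    for line in mount_output.splitlines():
        parts = line.split()
        # Expected shape: <device> on <mountpoint> type <fstype> (<options>)
        if (len(parts) >= 5 and parts[0].startswith("/dev/sd")
                and parts[1] == "on" and parts[3] == "type"):
            result[parts[0]] = parts[4]
    return result


# Output copied from the two `mount|grep '/dev/sd'` runs above.
BEFORE = """/dev/sda1 on / type ext4 (rw,relatime,discard,data=ordered)
/dev/sdb1 on /mnt type ext4 (rw,relatime,data=ordered)"""
AFTER = """/dev/sda1 on / type ext4 (rw,relatime,discard,data=ordered)
/dev/sdb1 on /mnt type fuseblk (rw,relatime,user_id=0,group_id=0,allow_other,blksize=4096)"""

if __name__ == "__main__":
    before, after = fstype_by_device(BEFORE), fstype_by_device(AFTER)
    for dev in sorted(after):
        if before.get(dev) != after[dev]:
            print(dev, "changed from", before.get(dev), "to", after[dev])
```

A `fuseblk` type on /dev/sdb1 after the resize means the NTFS resource disk was mounted as-is instead of being reformatted.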
There's a slight chance that it doesn't repro. I noticed a race between SCSI initialization (or udev) and the code in cloud-init that determines whether it should use /dev/disk/azure/resource or /dev/disk/azure/resource-part1. This code checks whether the latter exists and, if it does, uses it. Sometimes this check fails, which leads to the resource disk not being prepared or mounted properly. The resulting incorrect fstab entry prevents the mount on the resized VM, which then allows the reformat to ext4 to go through.
If you run into this, just resize again to any size and it should repro then.
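To illustrate the race described above: the check for resource-part1 can run before udev has created the symlink. The sketch below is not cloud-init's actual code; `pick_resource_device` and its `timeout`/`interval` parameters are hypothetical, and polling for a short time is just one assumed way to make the check less sensitive to timing.

```python
import os
import time


def pick_resource_device(base="/dev/disk/azure/resource",
                         timeout=5.0, interval=0.1):
    """Prefer <base>-part1, waiting briefly for udev to create it.

    Falls back to the whole-disk path if the partition symlink has not
    appeared by the time the timeout expires.
    """
    part = base + "-part1"
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(part):
            return part
        if time.monotonic() >= deadline:
            return base
        time.sleep(interval)
```

With `timeout=0` this degenerates to a single existence check, which is the behavior that loses the race.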
-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Dan Watkins
Sent: Wednesday, August 24, 2016 6:21 AM
To: Paul Meyer <email address hidden>
Subject: [Bug 1611074] Re: Reformatting of ephemeral drive fails on resize of Azure VM
Hi Paul,
Could you give me steps that I can follow to reproduce this issue (ideally using the Azure CLI)? That'll make it easier for us to test fixes.
Title:
Reformatting of ephemeral drive fails on resize of Azure VM
Status in cloud-init:
New
Status in cloud-init package in Ubuntu:
New
Bug description:
After resizing a 16.04 VM on Azure, the VM is presented with a new
ephemeral drive (of a different size), which initially is NTFS
formatted. Cloud-init tries to format the appropriate partition ext4,
but fails because it is mounted. Cloud-init has unmount logic for
exactly this case in the get_data call on the Azure data source, but
this is never called because a fresh cache is found.
Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: start: init-network/check-cache: attempting to read from cache [trust]
Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Reading from /var/lib/cloud/instance/obj.pkl (quiet=False)
Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Read 5950 bytes from /var/lib/cloud/instance/obj.pkl
Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] stages.py[DEBUG]: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
Jun 27 19:07:47 azubuntu1604arm [CLOUDINIT] handlers.py[DEBUG]: finish: init-network/check-cache: SUCCESS: restored from cache: DataSourceAzureNet [seed=/dev/sr0]
...
Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Creating file system None on /dev/sdb1
Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] cc_disk_setup.py[DEBUG]: Using cmd: /sbin/mkfs.ext4 /dev/sdb1
Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Running command ['/sbin/mkfs.ext4', '/dev/sdb1'] with allowed return codes [0] (shell=False, capture=True)
Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[DEBUG]: Creating fs for /dev/disk/cloud/azure_resource took 0.052 seconds
Jun 27 19:07:48 azubuntu1604arm [CLOUDINIT] util.py[WARNING]: Failed during filesystem operation#012Failed to exec of '['/sbin/mkfs.ext4', '/dev/sdb1']':#012Unexpected error while running command.#012Command: ['/sbin/mkfs.ext4', '/dev/sdb1']#012Exit code: 1#012Reason: -#012Stdout: ''#012Stderr: 'mke2fs 1.42.13 (17-May-2015)\n/dev/sdb1 is mounted; will not make a filesystem here!\n'
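The log above ends with mkfs.ext4 refusing to run because /dev/sdb1 is mounted. A check along these lines could detect that situation before formatting. This is a minimal sketch, not the unmount logic in the Azure data source: `mount_points` is a hypothetical helper, and it takes /proc/mounts-style text as an argument so it can be tried without a real device.

```python
import os


def mount_points(device, mounts_text):
    """Return the mount points of `device` found in /proc/mounts-style text.

    Paths are resolved with realpath so that a symlink such as
    /dev/disk/cloud/azure_resource matches the /dev/sdb1 it points to.
    """
    target = os.path.realpath(device)
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: <device> <mountpoint> <fstype> <options> ...
        if len(fields) >= 2 and os.path.realpath(fields[0]) == target:
            points.append(fields[1])
    return points


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        print(mount_points("/dev/sdb1", f.read()))
```

A non-empty result means the device would have to be unmounted before mkfs.ext4 can succeed.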
$ lsb_release -rd
Description: Ubuntu 16.04.1 LTS
Release: 16.04
$ cat /etc/cloud/build.info
build_name: server
serial: 20160721
~$ dpkg -l cloud-init
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-================================-=====================-=====================-=====================================================================
ii cloud-init 0.7.7~bzr1256-0ubuntu all Init scripts for cloud instances
We're seeing ~100% repro of this bug on resize, where the only success
cases are caused by another bug that messes up fstab and prevents
mounting of the drive.