Azure: cloud-init skips formatting the resource disk (ephemeral0) when there is additional data disks' configuration in user-data

Bug #1879552 reported by Anh Vo (MSFT)
This bug affects 1 person
Affects     Status   Importance  Assigned to  Milestone
cloud-init  Expired  Undecided   Unassigned

Bug Description

Deploying a bionic VM on Azure (Canonical:UbuntuServer:18.04-LTS:latest) with VM size Standard_DS1_V2 and an additional data disk, using the following config:

#cloud-config

disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: True

fs_setup:
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
 - [ /dev/disk/azure/scsi1/lun0, /datadisk1, "ext4", "defaults,nofail,discard", "0", "0" ]

Expected Result:
+ Resource disk (ephemeral0) is formatted as ext4 and mounted to /mnt (which is what happens without attaching the data disk)

Actual Result:
+ Resource disk is partitioned but not formatted, and is mounted at /mnt as ntfs

I used this command to create a VM with a data disk and pass in the custom data:
 az vm create -g <resource_group> -n vmname --image Canonical:UbuntuServer:18.04-LTS:latest --admin-username adminuser --ssh-key-value @/home/user/.ssh/key.pub --boot-diagnostics-storage storage_account --size Standard_DS1_V2 --custom-data ./customdata.yml --data-disk-sizes-gb 32

I have attached the cloud-init log and the custom data

Tags: azure
Anh Vo (MSFT) (vtqanh) wrote :
Ryan Harper (raharper) wrote :

Hi Anh,

The issue is with built-in config merging with user-data. The Azure
datasource uses this built-in config:

#cloud-config
disk_setup:
  ephemeral0:
     table_type: gpt
     layout: [100]
     overwrite: True

fs_setup:
 - filesystem: DEFAULT_FS
   device: ephemeral0.1

mounts:
 - [ /dev/ephemeral0, /mnt, auto, "defaults,noexec" ]

And in your example the user provides this config

#cloud-config
disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: True

fs_setup:
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
 - [ /dev/disk/azure/scsi1/lun0, /datadisk1, "ext4", "defaults,nofail,discard", "0", "0" ]

The Azure Datasource will merge these configs together like so:

util.mergemanydict([userdata, builtin]) and the combined config looks like
this:

disk_setup:
    /dev/disk/azure/scsi1/lun0:
        layout: true
        overwrite: true
        table_type: gpt
    ephemeral0:
        layout:
        - 100
        overwrite: true
        table_type: gpt
fs_setup:
-   device: /dev/disk/azure/scsi1/lun0
    filesystem: ext4
    partition: 1
mounts:
-   - /dev/disk/azure/scsi1/lun0
    - /datadisk1
    - ext4
    - defaults,nofail,discard
    - '0'
    - '0'

As you can see the fs_setup and mounts are lists, and the default merging
of lists is replacement; the user-data's fs_setup and mounts will override
the built-in config; disk_setup is a dictionary, which by default will merge
missing keys.
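
The merge behavior described above can be illustrated with a minimal sketch. This is not cloud-init's actual util.mergemanydict implementation; merge_first_wins is a hypothetical stand-in that only assumes the rules stated in this comment (dictionaries gain missing keys, lists from the earlier source are kept whole):

```python
import copy


def merge_first_wins(primary, fallback):
    """Hypothetical stand-in for the default merge described above.

    Dictionaries are merged key by key; any other value (lists included)
    already present in `primary` is kept as-is and never extended.
    """
    merged = copy.deepcopy(primary)
    for key, value in fallback.items():
        if key not in merged:
            merged[key] = copy.deepcopy(value)
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_first_wins(merged[key], value)
    return merged


builtin = {
    "disk_setup": {
        "ephemeral0": {"table_type": "gpt", "layout": [100], "overwrite": True},
    },
    "fs_setup": [{"filesystem": "DEFAULT_FS", "device": "ephemeral0.1"}],
    "mounts": [["/dev/ephemeral0", "/mnt", "auto", "defaults,noexec"]],
}
userdata = {
    "disk_setup": {
        "/dev/disk/azure/scsi1/lun0": {
            "table_type": "gpt", "layout": True, "overwrite": True,
        },
    },
    "fs_setup": [
        {"device": "/dev/disk/azure/scsi1/lun0", "partition": 1, "filesystem": "ext4"},
    ],
    "mounts": [
        ["/dev/disk/azure/scsi1/lun0", "/datadisk1", "ext4",
         "defaults,nofail,discard", "0", "0"],
    ],
}

merged = merge_first_wins(userdata, builtin)
print(sorted(merged["disk_setup"]))                 # both disks survive
print([fs["device"] for fs in merged["fs_setup"]])  # only the user's entry
```

With these inputs, disk_setup ends up with both keys, while fs_setup and mounts contain only the user-data entries, which is why ephemeral0 is partitioned but never formatted.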

This is expected behavior, reserving full user control over the built-in config.
At this time, the only remedy is for users to replicate the built-in config in
their user-data if they would like the ephemeral disk configured the same
way as it would be without any user-supplied disk configuration:

#cloud-config
disk_setup:
  ephemeral0:
     table_type: gpt
     layout: [100]
     overwrite: True
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: True
    overwrite: True

fs_setup:
  - device: ephemeral0.1
    filesystem: DEFAULT_FS
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
 - [ /dev/ephemeral0, /mnt, auto, "defaults,noexec" ]
 - [ /dev/disk/azure/scsi1/lun0, /datadisk1, "ext4", "defaults,nofail,discard", "0", "0" ]

Ryan Harper (raharper) wrote :

I'd like to explore a couple of options here:

One possible solution is to have fs_setup and mounts accept a dictionary format (while still supporting lists), so that built-in and user entries merge by key instead of one list replacing the other:

% yprint built-in-new.cfg
disk_setup:
    ephemeral0:
        layout:
        - 100
        overwrite: true
        table_type: gpt
fs_setup:
    ephemeral0.1:
        device: ephemeral0.1
        filesystem: DEFAULT_FS
mounts:
    mnt:
    -   - /dev/ephemeral0
        - /mnt
        - auto
        - defaults,noexec

% yprint user-data-new.cfg
disk_setup:
    /dev/disk/azure/scsi1/lun0:
        layout: true
        overwrite: true
        table_type: gpt
fs_setup:
    lun0:
        device: /dev/disk/azure/scsi1/lun0
        filesystem: ext4
        partition: 1
mounts:
    datadisk1:
    -   - /dev/disk/azure/scsi1/lun0
        - /datadisk1
        - ext4
        - defaults,nofail,discard
        - '0'
        - '0'

>>> print(yaml.dump(merged, default_flow_style=False, indent=4))
disk_setup:
    /dev/disk/azure/scsi1/lun0:
        layout: true
        overwrite: true
        table_type: gpt
    ephemeral0:
        layout:
        - 100
        overwrite: true
        table_type: gpt
fs_setup:
    ephemeral0.1:
        device: ephemeral0.1
        filesystem: DEFAULT_FS
    lun0:
        device: /dev/disk/azure/scsi1/lun0
        filesystem: ext4
        partition: 1
mounts:
    datadisk1:
    -   - /dev/disk/azure/scsi1/lun0
        - /datadisk1
        - ext4
        - defaults,nofail,discard
        - '0'
        - '0'
    mnt:
    -   - /dev/ephemeral0
        - /mnt
        - auto
        - defaults,noexec
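
As a rough sketch of why the dictionary format helps, the snippet below merges keyed fs_setup and mounts sections and then flattens them back into the lists the existing modules consume. The merge_by_key helper and the flattening step are assumptions for illustration only; nothing here is existing cloud-init code, and the key names (lun0, mnt, datadisk1) are taken from the example above:

```python
import copy


def merge_by_key(primary, fallback):
    """Hypothetical merge: dicts gain missing keys, other values are kept."""
    merged = copy.deepcopy(primary)
    for key, value in fallback.items():
        if key not in merged:
            merged[key] = copy.deepcopy(value)
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = merge_by_key(merged[key], value)
    return merged


builtin_new = {
    "fs_setup": {
        "ephemeral0.1": {"device": "ephemeral0.1", "filesystem": "DEFAULT_FS"},
    },
    "mounts": {
        "mnt": [["/dev/ephemeral0", "/mnt", "auto", "defaults,noexec"]],
    },
}
user_new = {
    "fs_setup": {
        "lun0": {"device": "/dev/disk/azure/scsi1/lun0",
                 "filesystem": "ext4", "partition": 1},
    },
    "mounts": {
        "datadisk1": [["/dev/disk/azure/scsi1/lun0", "/datadisk1", "ext4",
                       "defaults,nofail,discard", "0", "0"]],
    },
}

merged_new = merge_by_key(user_new, builtin_new)

# Flatten the keyed sections back into plain lists (assumed consumer format).
fs_entries = [merged_new["fs_setup"][name] for name in sorted(merged_new["fs_setup"])]
mount_entries = [
    mount
    for name in sorted(merged_new["mounts"])
    for mount in merged_new["mounts"][name]
]
print([fs["device"] for fs in fs_entries])
print([mount[1] for mount in mount_entries])
```

Because every entry has its own key, neither side replaces the other, and a user could still override a built-in entry by reusing its key.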

Changed in cloud-init:
status: New → Confirmed
Ryan Harper (raharper) wrote :

Another option would be to make it easier for users to indicate they want the defaults in addition to their changes:

disk_setup:
  builtin: true
  /dev/disk/azure/scsi1/lun0: {...}

fs_setup:
  - builtin
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
 - builtin
 - [ /dev/disk/azure/scsi1/lun0, /datadisk1, "ext4", "defaults,nofail,discard", "0", "0" ]
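
For the list-based sections, one way this marker could be handled is sketched below. The expand_builtin function and the "builtin" marker value are hypothetical; this is an illustration of the proposal, not an existing cloud-init feature:

```python
def expand_builtin(user_entries, builtin_entries, marker="builtin"):
    """Replace each occurrence of `marker` with the built-in entries.

    Hypothetical helper: entries other than the marker string are passed
    through unchanged, in their original order.
    """
    expanded = []
    for entry in user_entries:
        if entry == marker:
            expanded.extend(builtin_entries)
        else:
            expanded.append(entry)
    return expanded


builtin_mounts = [["/dev/ephemeral0", "/mnt", "auto", "defaults,noexec"]]
user_mounts = [
    "builtin",
    ["/dev/disk/azure/scsi1/lun0", "/datadisk1", "ext4",
     "defaults,nofail,discard", "0", "0"],
]

mounts = expand_builtin(user_mounts, builtin_mounts)
print([mount[1] for mount in mounts])
```

A list without the marker passes through untouched, so existing user-data would keep its current meaning under this scheme.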

Anh Vo (MSFT) (vtqanh) wrote :

Would changing fs_setup and mounts to take a dictionary affect existing user-data out there?
I do like the option of using "builtin" to let users keep whatever default setup the cloud provider has. The defaults might change between cloud-init versions, and users who keep the same user-data for some time might not realize things have changed and miss out on optimizations from the platform.

Dan Watkins (oddbloke) wrote : Re: [Bug 1879552] Re: Azure: cloud-init skips formatting the resource disk (ephemeral0) when there is additional data disks' configuration in user-data

On Tue, May 19, 2020 at 08:14:54PM -0000, Ryan Harper wrote:
> Another option would be to make it easier for users to indicate they
> want the defaults in addition to their changes:
>
> disk_setup:
>   builtin: true
>   /dev/disk/azure/scsi1/lun0: {...}
>
>
> fs_setup:
>   - builtin
>   - device: /dev/disk/azure/scsi1/lun0
>     partition: 1
>     filesystem: ext4
>
> mounts:
>  - builtin
>  - [ /dev/disk/azure/scsi1/lun0, /datadisk1, "ext4", "defaults,nofail,discard", "0", "0" ]

An aside: we use "default" for a similar concept for users[0] and I
think that word works in this case too ("include the default disk setup
and this additional setup" is a perfectly understandable thing to say,
for example); I would suggest using it for consistency across the
interface we provide to users.

[0] https://cloudinit.readthedocs.io/en/latest/topics/modules.html#users-and-groups

James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Confirmed → Expired