cloud-init very slow to set password on arm64 cloud image

Bug #1846355 reported by Anisse Astier on 2019-10-02
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
cloud-init
Undecided
Unassigned

Bug Description

Note: this might be related to bug #1818197, but I'm not sure, so I'm opening another issue. Do not hesitate to mark as duplicate.

Using qemu-system-aarch64, with native kvm or with tcg on x86, it seems the default cloud images (I've tried with eoan and disco) take up to 148 seconds to finish booting (with kvm) or 590 seconds (with tcg). It's impossible to login before the end of this delay.

Here are traces from a qemu boot:
$ cloud-init analyze show
-- Boot Record 01 --
The total time elapsed since completing an event is printed after the "@" character.
The time the event takes is printed after the "+" character.

Starting stage: init-local
|`->no cache found @00.29600s +00.01200s
|`->found local data from DataSourceNoCloud @00.53200s +04.27000s
Finished stage: (init-local) 14.81700 seconds

Starting stage: init-network
|`->restored from cache with run check: DataSourceNoCloud [seed=/dev/sr0][dsmode=net] @69.94700s +00.32200s
|`->setting up datasource @73.46100s +00.01400s
|`->reading and applying user-data @73.95400s +00.23800s
|`->reading and applying vendor-data @74.19500s +00.00600s
|`->activating datasource @75.53100s +00.14900s
|`->config-migrator ran successfully @76.81200s +00.02200s
|`->config-seed_random ran successfully @76.84300s +00.03800s
|`->config-bootcmd ran successfully @76.89100s +00.00800s
|`->config-write-files ran successfully @76.90900s +00.03600s
|`->config-growpart ran successfully @76.95500s +03.55800s
|`->config-resizefs ran successfully @80.53200s +01.20400s
|`->config-disk_setup ran successfully @81.76300s +00.08000s
|`->config-mounts ran successfully @81.85700s +00.07600s
|`->config-set_hostname ran successfully @81.94600s +00.04900s
|`->config-update_hostname ran successfully @82.00500s +00.05300s
|`->config-update_etc_hosts ran successfully @82.06900s +00.01000s
|`->config-ca-certs ran successfully @82.08900s +00.03700s
|`->config-rsyslog ran successfully @82.13600s +00.03600s
|`->config-users-groups ran successfully @82.18200s +01.81000s
|`->config-ssh ran successfully @84.00900s +04.76000s
Finished stage: (init-network) 20.79900 seconds

Starting stage: modules-config
|`->config-emit_upstart ran successfully @468.99800s +00.01500s
|`->config-snap ran successfully @469.02500s +00.04000s
|`->config-snap_config ran successfully @469.07600s +00.03500s
|`->config-ssh-import-id ran successfully @469.12200s +00.04100s
|`->config-locale ran successfully @469.18300s +00.06300s
|`->config-set-passwords ran successfully @469.25800s +02.65100s
|`->config-grub-dpkg ran successfully @471.93200s +06.08300s
|`->config-apt-pipelining ran successfully @478.04600s +00.05200s
|`->config-apt-configure ran successfully @478.10900s +04.43300s
|`->config-ubuntu-advantage ran successfully @482.59600s +00.09700s
|`->config-ntp ran successfully @482.70600s +00.03700s
|`->config-timezone ran successfully @482.75400s +00.03500s
|`->config-disable-ec2-metadata ran successfully @482.80100s +00.01400s
|`->config-runcmd ran successfully @482.82700s +00.03400s
|`->config-byobu ran successfully @482.87400s +00.03700s
Finished stage: (modules-config) 14.58300 seconds

Starting stage: modules-final
|`->config-snappy ran successfully @504.88200s +00.07500s
|`->config-package-update-upgrade-install ran successfully @504.96700s +00.03100s
|`->config-fan ran successfully @505.00900s +00.03600s
|`->config-landscape ran successfully @505.05600s +00.03000s
|`->config-lxd ran successfully @505.09600s +00.03300s
|`->config-ubuntu-drivers ran successfully @505.14000s +00.03300s
|`->config-puppet ran successfully @505.18300s +00.03300s
|`->config-chef ran successfully @505.22600s +00.03500s
|`->config-mcollective ran successfully @505.27100s +00.03300s
|`->config-salt-minion ran successfully @505.31500s +00.03200s
|`->config-rightscale_userdata ran successfully @505.35800s +00.03200s
|`->config-scripts-vendor ran successfully @505.40100s +00.03100s
|`->config-scripts-per-once ran successfully @505.44300s +00.03300s
|`->config-scripts-per-boot ran successfully @505.48700s +00.00900s
|`->config-scripts-per-instance ran successfully @505.50600s +00.03600s
|`->config-scripts-user ran successfully @505.55400s +00.03200s
|`->config-ssh-authkey-fingerprints ran successfully @505.59600s +00.11900s
|`->config-keys-to-console ran successfully @505.72600s +01.57500s
|`->config-phone-home ran successfully @507.33000s +00.05500s
|`->config-final-message ran successfully @507.39600s +00.21700s
|`->config-power-state-change ran successfully @507.62300s +00.03200s
Finished stage: (modules-final) 03.40900 seconds

Total Time: 53.60800 seconds

1 boot records analyzed

$ systemd-analyze blame
    5min 32.717s snapd.seeded.service
    1min 18.889s dev-vda1.device
     1min 3.376s systemd-udev-settle.service
         52.841s cloud-init.service
         47.654s cloud-init-local.service
         36.746s cloud-config.service
         30.937s apparmor.service
         28.002s pollinate.service
         24.492s cloud-final.service
         19.029s networkd-dispatcher.service
         16.215s systemd-logind.service
         15.985s accounts-daemon.service
         15.590s systemd-udev-trigger.service
         12.687s apport.service
         11.720s systemd-resolved.service
         11.424s systemd-journald.service
         11.007s grub-common.service
         10.479s systemd-timesyncd.service
         10.285s rsyslog.service
          9.391s keyboard-setup.service
          9.139s snap.lxd.activate.service
          5.782s systemd-networkd.service
          4.268s grub-initrd-fallback.service

It might be related to snapd since it's the one that takes 5min+ in my case.

How to reproduce ?
The script below should be enough to reproduce the issue:

#!/bin/bash
# supposes you have installed qemu-efi-aarch64, qemu-system-arm and genisoimage
set -euo pipefail

IMG=eoan-server-cloudimg-arm64.img
CLOUDINIT=./cloudinit.iso

wget -O "$IMG" https://cloud-images.ubuntu.com/eoan/current/eoan-server-cloudimg-arm64.img
echo -e '#cloud-config\nsystem_info:\n default_user:\n name: qemu\npassword: qemu\nchpasswd: { expire: False }\nssh_pwauth: True\n' > user-data
echo "local-hostname: qemu-machine" > meta-data
genisoimage -quiet -input-charset utf-8 -output "$CLOUDINIT" -volid cidata -joliet -rock user-data meta-data
rm -f user-data meta-data

qemu-system-aarch64 -machine virt -m 768 -cpu cortex-a72 -smp 2 -accel tcg,thread=multi -nographic -nodefaults \
 -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd \
 -drive "index=0,media=disk,id=disk0,if=none,file=$IMG" -device virtio-blk,drive=disk0 \
 -device virtio-scsi-pci,id=scsi0 -drive "file=$CLOUDINIT,if=none,driver=raw,readonly=on,id=cdrom1" -device scsi-cd,bus=scsi0.0,drive=cdrom1 \
 -serial mon:stdio

Anisse Astier (anisse) wrote :

To clarify, login is impossible, *by password*. Setting an ssh key works, but login by password requires waiting for the end of the delay.

Robert C Jennings (rcj) on 2019-10-04
description: updated
Robert C Jennings (rcj) wrote :

I've changed bug from cloud-images to cloud-init to discuss unit-ordering and dependencies that are at play in this bug report.

affects: cloud-images → cloud-init
Anisse Astier (anisse) wrote :

One possible fix would be to set the user password at the local stage (like the ssh key or hostname), instead of the config stage.

Anisse Astier (anisse) wrote :

Changing the stage of the password setting might be a breaking change though; and it wouldn't solve the issue that the first boot of snap.seeded is very slow.

Dan Watkins (daniel-thewatkins) wrote :

Hi Anisse,

Thanks for using cloud-init, and for taking the time to file a bug! I'm going to mark this as a duplicate of bug 1834190, as the source of the issue is snapd.seeded being unacceptably slow. Apologies for that, we're hoping to see some improvements from that team in the near future.

Unfortunately, I don't think there's currently a great way to work around this problem other than using an SSH key. In the baked-in configuration (which you can see in /etc/cloud/cloud.cfg), `users-groups` is one of the last things to run in the init stage (right before `ssh`, which is what configures the SSH keys). As such, there isn't really any way for you to interject any manual password configuration.

One option would be to modify the image to either (a) modify cloud.cfg to move set-passwords, (b) add your own systemd unit which does the password configuration (and has After=cloud-init-local.service), or (c) disable snapd.seeded. (None of these are options that we'd necessarily be able to provide much support for, I'm afraid.)

Sorry to be the bearer of bad news, hopefully the snapd team will deliver soon!

Thanks again,

Dan

Anisse Astier (anisse) wrote :

Makes sense, thanks.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers