ubuntu-18.04-server-cloudimg-arm64.img often fails to set the cloud-localds password under QEMU

Bug #1818197 reported by Ciro Santilli
This bug affects 2 people
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| cloud-images | Invalid | Undecided | Unassigned | |
| cloud-init | Expired | Medium | Unassigned | |

Bug Description

On an Ubuntu 18.10 host with QEMU 2.12.0, I try to run the cloud image with the script shown below.

Result: I often end up at the user / password prompt, but the password I chose with cloud-localds has not been set.

The failure is not deterministic, but it happens very often. Sometimes the password works. Very rarely a stack trace is shown on the terminal, but most of the time it just fails silently.

The test script:

```
set -eux

# Parameters.
id=ubuntu-18.04-server-cloudimg-arm64
img="${id}.img"
img_snapshot="${id}.img.snapshot.qcow2"
flash0="${id}-flash0.img"
flash1="${id}-flash1.img"
user_data="${id}-user-data"
user_data_img="${user_data}.img"

# Install dependencies.
pkgs='cloud-image-utils qemu-system-arm qemu-efi'
if ! dpkg -s $pkgs >/dev/null 2>&1; then
  sudo apt-get install $pkgs
fi

# Get the image.
if [ ! -f "$img" ]; then
  wget "https://cloud-images.ubuntu.com/releases/18.04/release/${img}"
fi

# Create snapshot.
if [ ! -f "$img_snapshot" ]; then
  qemu-img \
    create \
    -b "$img" \
    -f qcow2 \
    "$img_snapshot" \
    1T \
  ;
fi

# Set the password.
if [ ! -f "$user_data" ]; then
  cat >"$user_data" <<EOF
#cloud-config
password: asdfqwer
chpasswd: { expire: False }
ssh_pwauth: True
EOF
  cloud-localds "$user_data_img" "$user_data"
fi

# Firmware.
if [ ! -f "$flash0" ]; then
  dd if=/dev/zero of="$flash0" bs=1M count=64
  dd if=/usr/share/qemu-efi/QEMU_EFI.fd of="$flash0" conv=notrunc
fi
if [ ! -f "$flash1" ]; then
  dd if=/dev/zero of="$flash1" bs=1M count=64
fi

# Run.
qemu-system-aarch64 \
  -machine virt \
  -cpu cortex-a57 \
  -device rtl8139,netdev=net0 \
  -device virtio-blk-device,drive=hd0 \
  -drive "if=none,file=${img_snapshot},id=hd0" \
  -drive "file=${user_data_img},format=raw" \
  -m 2G \
  -netdev user,id=net0 \
  -nographic \
  -pflash "$flash0" \
  -pflash "$flash1" \
  -smp 2 \
  "$@" \
;
```

A stack trace that I saw once (rare):

```
# [ 113.002366] cloud-init[528]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Thu, 17 Jan 2019 00:22:16 +0000. Up 105.40 seconds.
# [ 113.020759] cloud-init[528]: ci-info: +++++++++++++++++++++++++++Net device info++++++++++++++++++++++++++++
# [ 113.031208] cloud-init[528]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
# [ 113.041449] cloud-init[528]: ci-info: | Device | Up | Address | Mask | Scope | Hw-Address |
# [ 113.051615] cloud-init[528]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
# [ 113.061778] cloud-init[528]: ci-info: | enp0s1 | False | . | . | . | 52:54:00:12:34:56 |
# [ 113.071307] cloud-init[528]: ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | host | . |
# [ 113.088462] cloud-init[528]: ci-info: | lo | True | ::1/128 | . | host | . |
# [ 113.097037] cloud-init[528]: ci-info: +--------+-------+-----------+-----------+-------+-------------------+
# [ 113.100513] cloud-init[528]: ci-info: +++++++++++++++++++Route IPv6 info+++++++++++++++++++
# [ 113.103872] cloud-init[528]: ci-info: +-------+-------------+---------+-----------+-------+
# [ 113.107605] cloud-init[528]: ci-info: | Route | Destination | Gateway | Interface | Flags |
# [ 113.111566] cloud-init[528]: ci-info: +-------+-------------+---------+-----------+-------+
# [ 113.115000] cloud-init[528]: ci-info: +-------+-------------+---------+-----------+-------+
# [ 113.118441] cloud-init[528]: 2019-01-17 00:22:23,983 - util.py[WARNING]: failed stage init
# [ 113.238277] cloud-init[528]: failed run of stage init
# [ 113.252444] cloud-init[528]: ------------------------------------------------------------
# [ 113.255576] cloud-init[528]: Traceback (most recent call last):
# [ 113.268398] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 658, in status_wrapper
# [ 113.272674] cloud-init[528]: ret = functor(name, args)
# [ 113.285055] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 362, in main_init
# [ 113.288945] cloud-init[528]: init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
# [ 113.293122] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 671, in apply_network_config
# [ 113.304953] cloud-init[528]: return self.distro.apply_network_config(netcfg, bring_up=bring_up)
# [ 113.309150] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/distros/__init__.py", line 178, in apply_network_config
# [ 113.322181] cloud-init[528]: dev_names = self._write_network_config(netconfig)
# [ 113.327226] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/distros/debian.py", line 114, in _write_network_config
# [ 113.341030] cloud-init[528]: return self._supported_write_network_config(netconfig)
# [ 113.353698] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/distros/__init__.py", line 93, in _supported_write_network_config
# [ 113.358591] cloud-init[528]: renderer.render_network_config(network_config)
# [ 113.376713] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/net/renderer.py", line 56, in render_network_config
# [ 113.381516] cloud-init[528]: templates=templates, target=target)
# [ 113.385465] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/net/netplan.py", line 210, in render_network_state
# [ 113.389682] cloud-init[528]: self._netplan_generate(run=self._postcmds)
# [ 113.393637] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/net/netplan.py", line 217, in _netplan_generate
# [ 113.397718] cloud-init[528]: util.subp(self.NETPLAN_GENERATE, capture=True)
# [ 113.401648] cloud-init[528]: File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2067, in subp
# [ 113.405796] cloud-init[528]: cmd=args)
# [ 113.411233] cloud-init[528]: cloudinit.util.ProcessExecutionError: Unexpected error while running command.
# [ 113.416694] cloud-init[528]: Command: ['netplan', 'generate']
# [ 113.420785] cloud-init[528]: Exit code: -11
# [ 113.424647] cloud-init[528]: Reason: -
# [ 113.428686] cloud-init[528]: Stdout:
# [ 113.431914] cloud-init[528]: Stderr:
# [ 113.435345] cloud-init[528]: ------------------------------------------------------------
```
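
A note on the "Exit code: -11" line in the trace: assuming cloud-init passes through the return code from Python's subprocess module unchanged, a negative value -N means the child process was killed by signal N, and signal 11 on Linux is SIGSEGV. That would point to `netplan generate` crashing rather than rejecting the config. A minimal sketch for decoding such a value (the function name `describe_exit_code` is made up for illustration):

```shell
# Decode an exit code following the Python subprocess convention:
# a negative value -N means the child was killed by signal N.
describe_exit_code() {
  code=$1
  if [ "$code" -lt 0 ]; then
    sig=$(( 0 - code ))
    # kill -l N prints the name of signal number N.
    echo "killed by signal ${sig} ($(kill -l "$sig"))"
  else
    echo "exited with status ${code}"
  fi
}

describe_exit_code -11
```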

Ryan Harper (raharper) wrote :

I wonder what the config netplan generate complained about.

If you can set a password on your snapshot image, you'll be able to log in even when cloud-init fails.

This will get you a shell inside your image:

sudo mount-image-callback --system-mounts /path/to/qcow2-snapshot -- chroot _MOUNTPOINT_ /bin/bash

From there, you can run passwd and set the root password for the image.

Then boot via QEMU; if cloud-init fails, you can log in on the console with the password you set.

Once logged in, running 'cloud-init collect-logs' and attaching the resulting file would be useful for debugging.

While you are in there, please also look at /etc/netplan/50-cloud-init.yaml, re-run 'netplan --debug generate' to see the error, and attach that output as well.
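
The data requested above could be gathered in one go with something like the following sketch. It is meant to be run as root inside the booted guest; the function name `collect_debug` and the default output directory are assumptions made for illustration, and each tool is skipped if it is not installed.

```shell
# Gather cloud-init and netplan debugging data into one directory.
# collect_debug and the default directory name are illustrative only.
collect_debug() {
  out_dir=${1:-./cloud-init-debug}
  mkdir -p "$out_dir"

  if command -v cloud-init >/dev/null 2>&1; then
    # collect-logs writes its tarball into the current directory by default.
    (cd "$out_dir" && cloud-init collect-logs) || true
    cloud-init status --long >"$out_dir/status.txt" 2>&1 || true
  fi

  # The network config that cloud-init rendered for netplan.
  if [ -f /etc/netplan/50-cloud-init.yaml ]; then
    cp /etc/netplan/50-cloud-init.yaml "$out_dir/" || true
  fi

  if command -v netplan >/dev/null 2>&1; then
    # Keep the debug output, and record the exit status if generate fails.
    netplan --debug generate >"$out_dir/netplan-generate.txt" 2>&1 \
      || echo "netplan generate exit status: $?" >>"$out_dir/netplan-generate.txt"
  fi
}
```

Calling `collect_debug /root/debug` would then leave everything to attach in `/root/debug`.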

Changed in cloud-init:
importance: Undecided → Medium
status: New → Incomplete
Steve Dodd (anarchetic) wrote :

Having just been battling with this: config-set-passwords runs relatively late (in the config stage), which seems to mean the network has to be up properly before it is executed.
If you get dropped into maintenance mode, or manually edit the image to provide a login, "cloud-init status" and "cloud-init analyze show" will tell you what is going on.

TL;DR: relying on "password:" to create a user account you can log in with to diagnose network problems isn't going to work :/

Ryan Harper (raharper) wrote : Re: [Bug 1818197] Re: ubuntu-18.04-server-cloudimg-arm64.img often fails to set the cloud-localds password under QEMU

On Sun, Aug 4, 2019 at 1:21 PM Steve Dodd <email address hidden> wrote:

> Having just been battling with this - config-set-passwords is run
> relatively late (config stage), which seems to mean the network has to be
> up properly before it is executed..
> if you get dropped into maintenance mode, or manually edit the image to
> provide a log in, "cloud-init status" and "cloud-init analyze show" will
> tell you what is going on.
>
> TL;DR : relying on "password:" to create a user account you can log in
> with to diagnose network problems, isn't going to work :/
>

Sorry for being unclear; I was suggesting that you modify the image _before_ you boot it.

# on your arm64 host
1. wget http://cloud-images.ubuntu.com/daily/server/bionic/current/bionic-server-cloudimg-arm64.img
2. sudo apt install cloud-image-utils
3. sudo mount-image-callback bionic-server-cloudimg-arm64.img -- chroot _MOUNTPOINT_ /bin/bash

This drops you into a bash shell inside the cloud image, which has not been booted yet. From here you can set a root password with:

4. passwd

Then type 'exit' to leave the chroot, and boot this modified image along with your cloud-config or anything else. You can then log in as root with the password you set and collect some debugging data.
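
For repeated test runs, the interactive passwd step could be replaced by a non-interactive variant along these lines. This is only a sketch: it assumes the host architecture matches the image (so the chroot can execute the image's binaries) and that chpasswd is available inside the image. The names `run`, `set_root_password` and `DRY_RUN` are made up for illustration; with `DRY_RUN` set, the command is printed instead of executed.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Set the root password inside a cloud image that has not been booted yet.
# Note: the password must not contain single quotes in this simple sketch.
set_root_password() {
  img=$1
  root_pw=$2
  # chpasswd reads "user:password" pairs on stdin, so no prompt is needed.
  run sudo mount-image-callback "$img" -- \
    chroot _MOUNTPOINT_ sh -c "echo 'root:${root_pw}' | chpasswd"
}

# Dry run: only show what would be executed.
DRY_RUN=1
set_root_password bionic-server-cloudimg-arm64.img asdfqwer
```

Unsetting `DRY_RUN` and calling the function again would actually modify the image, so it is worth trying on a copy or snapshot first.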


James Falcon (falcojr) wrote :
Changed in cloud-init:
status: Incomplete → Expired
Éric St-Jean (esj)
Changed in cloud-images:
status: New → Invalid