Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because networking is broken at that point, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Set up a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here: https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install and set up the maas[-cli] snap from the 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos": https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready, select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", select a "Subnet" and "IP mode", then save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS because the network/OVS bridge is not set up correctly in time. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM host (i.e. the LXD container) via "virsh console test1" using the "test:test" credentials.
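To narrow down where boot gets stuck, something like the following could be run on the broken machine after logging in over the serial console. This is only a sketch: the helper names run_check and collect_diag are made up for illustration, and it assumes the usual systemd, Open vSwitch and cloud-init command-line tools are present on the deployed image. Run it as root (or via sudo) so the journal and the OVS database are readable.

```shell
#!/bin/sh
# Sketch: collect networking state on the deployed machine.
# run_check and collect_diag are hypothetical helper names, not part of MAAS.

# Print a header, then run the command if its binary exists.
# A failing command is reported but never aborts the script.
run_check() {
    label=$1
    shift
    echo "== $label =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "(command exited with status $?)"
    else
        echo "($1 not installed)"
    fi
}

collect_diag() {
    # Did the wait-online unit fail, and what did it log during this boot?
    run_check "wait-online unit" systemctl --no-pager status systemd-networkd-wait-online.service
    run_check "wait-online log" journalctl --no-pager -b -u systemd-networkd-wait-online.service
    # Current link state as seen by systemd-networkd.
    run_check "link state" networkctl --no-pager list
    # OVS bridge and port layout; the timeout avoids hanging if ovsdb-server is down.
    run_check "OVS bridges" ovs-vsctl --timeout=5 show
    # Which stage cloud-init reached.
    run_check "cloud-init" cloud-init status --long
}

collect_diag
```

Comparing the timestamps in the wait-online log with the moment the bridge shows up in the OVS output should indicate whether the unit gave up before the bridge was configured.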