2020-10-08 08:34:19 |
Lukas Märdian |
bug |
|
|
added bug |
2020-10-08 08:39:10 |
Lukas Märdian |
description |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials. |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials. |
|
2020-10-08 08:47:55 |
Lukas Märdian |
attachment added |
|
syslog https://bugs.launchpad.net/netplan/+bug/1898997/+attachment/5419434/+files/syslog |
|
2020-10-08 10:32:47 |
Lukas Märdian |
attachment added |
|
cloud-init.log https://bugs.launchpad.net/netplan/+bug/1898997/+attachment/5419464/+files/cloud-init.log |
|
2020-10-08 10:32:58 |
Lukas Märdian |
attachment added |
|
cloud-init-output.log https://bugs.launchpad.net/netplan/+bug/1898997/+attachment/5419465/+files/cloud-init-output.log |
|
2020-10-08 10:33:07 |
Lukas Märdian |
attachment added |
|
ovs-vswitchd.log https://bugs.launchpad.net/netplan/+bug/1898997/+attachment/5419466/+files/ovs-vswitchd.log |
|
2020-10-08 11:05:13 |
Lukas Märdian |
attachment added |
|
50-cloud-init.yaml https://bugs.launchpad.net/netplan/+bug/1898997/+attachment/5419471/+files/50-cloud-init.yaml |
|
2020-10-08 11:13:33 |
Lukas Märdian |
attachment added |
|
systemd-analyze.svg https://bugs.launchpad.net/netplan/+bug/1898997/+attachment/5419472/+files/systemd-analyze.svg |
|
2020-10-08 16:37:01 |
Björn Tillenius |
bug |
|
|
added subscriber Björn Tillenius |
2020-10-08 20:11:55 |
Matthieu Clemenceau |
bug |
|
|
added subscriber Matthieu Clemenceau |
2020-10-09 14:39:33 |
Lukas Märdian |
netplan: status |
New |
In Progress |
|
2020-10-09 14:39:37 |
Lukas Märdian |
netplan: assignee |
|
Lukas Märdian (slyon) |
|
2020-10-09 15:02:51 |
Adam Collard |
bug |
|
|
added subscriber Adam Collard |
2020-10-12 14:51:25 |
Lukas Märdian |
bug task added |
|
cloud-init |
|
2020-10-12 14:51:32 |
Lukas Märdian |
cloud-init: status |
New |
In Progress |
|
2020-10-13 07:57:01 |
Lukas Märdian |
attachment added |
|
cloud-init_ssh_failure.log https://bugs.launchpad.net/cloud-init/+bug/1898997/+attachment/5421597/+files/cloud-init_ssh_failure.log |
|
2020-10-14 10:30:59 |
Lukas Märdian |
description |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials. |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
  98_cloud_init: ["curtin", "in-target", "--", "apt-get", "-y", "install", "cloud-init"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials. |
|
2020-10-14 15:45:51 |
Matthieu Clemenceau |
tags |
|
fr-721 |
|
2020-10-15 07:12:06 |
Lukas Märdian |
netplan: status |
In Progress |
Fix Committed |
|
2020-10-15 10:21:31 |
Lukas Märdian |
netplan: status |
Fix Committed |
Fix Released |
|
2020-10-15 10:58:45 |
Lukas Märdian |
description |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
  98_cloud_init: ["curtin", "in-target", "--", "apt-get", "-y", "install", "cloud-init"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials. |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
  98_cloud_init: ["curtin", "in-target", "--", "apt-get", "-y", "install", "cloud-init"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials.
=== SRU/Focal/netplan.io ===
[Impact]
This update contains bug-fixes and packaging improvements and we would like to make sure all of our supported customers have access to these improvements.
The notable ones are:
* Setup OVS early in network-pre.target to avoid delays (LP: #1898997)
See the changelog entry below for a full list of changes and bugs.
[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NetplanUpdates
Netplan contains an extensive integration test suite that is run using
the SRU package for each release. This test suite's results are available here:
http://autopkgtest.ubuntu.com/packages/n/netplan.io
A successful run is required before the proposed netplan package
can be let into -updates.
The netplan team will be in charge of attaching the artifacts and console
output of the appropriate run to the bug. Netplan team members will not
mark ‘verification-done’ until this has happened.
[Regression Potential]
In order to mitigate the regression potential, the results of the
aforementioned integration tests are attached to this bug.
Focal:
focal_amd64_log
focal_arm64_log
focal_armhf_log
focal_ppc64el_log
focal_s390x_log
[Discussion]
To fully fix the MAAS/OVS problem, cloud-init needs to be updated as well. The fixes to netplan.io and cloud-init can be applied independently, though.
[Changelog]
- Setup OVS early in network-pre.target to avoid delays (LP: #1898997)
- Suggest openvswitch-switch runtime dependency
- Improve stability of autopkgtests |
|
2020-10-15 10:59:29 |
Lukas Märdian |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2020-10-16 16:03:35 |
Steve Langasek |
bug task added |
|
netplan.io (Ubuntu) |
|
2020-10-16 16:04:24 |
Steve Langasek |
nominated for series |
|
Ubuntu Groovy |
|
2020-10-16 16:04:24 |
Steve Langasek |
bug task added |
|
netplan.io (Ubuntu Groovy) |
|
2020-10-16 16:04:24 |
Steve Langasek |
nominated for series |
|
Ubuntu Focal |
|
2020-10-16 16:04:24 |
Steve Langasek |
bug task added |
|
netplan.io (Ubuntu Focal) |
|
2020-10-19 07:46:37 |
Lukas Märdian |
netplan.io (Ubuntu Groovy): status |
New |
Fix Released |
|
2020-10-21 08:49:27 |
Lukas Märdian |
merge proposal linked |
|
https://code.launchpad.net/~slyon/netplan/+git/ubuntu/+merge/392290 |
|
2020-10-21 08:51:38 |
Lukas Märdian |
bug |
|
|
added subscriber Ubuntu Sponsors Team |
2020-10-22 10:23:28 |
Lukas Märdian |
removed subscriber Ubuntu Sponsors Team |
|
|
|
2020-10-22 16:09:50 |
Lukas Märdian |
cloud-init: status |
In Progress |
Confirmed |
|
2020-10-27 21:16:22 |
Brian Murray |
netplan.io (Ubuntu Focal): status |
New |
Fix Committed |
|
2020-10-27 21:16:27 |
Brian Murray |
bug |
|
|
added subscriber SRU Verification |
2020-10-27 21:16:32 |
Brian Murray |
tags |
fr-721 |
fr-721 verification-needed verification-needed-focal |
|
2020-10-28 10:38:17 |
Lukas Märdian |
description |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
  98_cloud_init: ["curtin", "in-target", "--", "apt-get", "-y", "install", "cloud-init"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials.
=== SRU/Focal/netplan.io ===
[Impact]
This update contains bug-fixes and packaging improvements and we would like to make sure all of our supported customers have access to these improvements.
The notable ones are:
* Setup OVS early in network-pre.target to avoid delays (LP: #1898997)
See the changelog entry below for a full list of changes and bugs.
[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NetplanUpdates
Netplan contains an extensive integration test suite that is run using
the SRU package for each release. This test suite's results are available here:
http://autopkgtest.ubuntu.com/packages/n/netplan.io
A successful run is required before the proposed netplan package
can be let into -updates.
The netplan team will be in charge of attaching the artifacts and console
output of the appropriate run to the bug. Netplan team members will not
mark ‘verification-done’ until this has happened.
[Regression Potential]
In order to mitigate the regression potential, the results of the
aforementioned integration tests are attached to this bug.
Focal:
focal_amd64_log
focal_arm64_log
focal_armhf_log
focal_ppc64el_log
focal_s390x_log
[Discussion]
To fully fix the MAAS/OVS problem, cloud-init needs to be updated as well. The fixes to netplan.io and cloud-init can be applied independently, though.
[Changelog]
- Setup OVS early in network-pre.target to avoid delays (LP: #1898997)
- Suggest openvswitch-switch runtime dependency
- Improve stability of autopkgtests |
Problem description:
If we try to deploy a single-NIC machine via MAAS, configuring an Open vSwitch bridge as the primary/PXE interface, the machine will install and boot Ubuntu 20.04 but it cannot finish the whole configuration (e.g. copying of SSH keys) and cannot be accessed/controlled via MAAS. It ends up in a "Failed" state.
This is because systemd-networkd-wait-online.service fails (for some reason) before netplan can fully set up and configure the OVS bridge. Because of the broken networking, cloud-init cannot complete its final stages, such as setting up SSH keys or signaling its state back to MAAS. If we wait a little longer, the OVS bridge does come online and networking works, but SSH is still not set up and the MAAS state remains "Failed".
Steps to reproduce:
* Setup a (virtual) MAAS system, e.g. inside a LXD container using a KVM host, as described here:
https://discourse.maas.io/t/setting-up-a-flexible-virtual-maas-test-environment/142
* Install & setup maas[-cli] snap from 2.9/beta channel (instead of the deb/PPA from the discourse post)
* Configure netplan PPA+key for testing via "Settings" -> "Package repos":
https://launchpad.net/~slyon/+archive/ubuntu/ovs
* Prepare curtin preseed in /var/snap/maas/current/preseeds/curtin_userdata, inside the LXD container (so you can access the broken machine afterwards):
======================
#cloud-config
debconf_selections:
  maas: |
    {{for line in str(curtin_preseed).splitlines()}}
    {{line}}
    {{endfor}}
late_commands:
  maas: [wget, '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
  90_create_user: ["curtin", "in-target", "--", "sh", "-c", "sudo useradd test -g 0 -G sudo"]
  92_set_user_password: ["curtin", "in-target", "--", "sh", "-c", "echo 'test:test' | sudo chpasswd"]
  94_cat: ["curtin", "in-target", "--", "sh", "-c", "cat /etc/passwd"]
  98_cloud_init: ["curtin", "in-target", "--", "apt-get", "-y", "install", "cloud-init"]
======================
* Compose a new virtual machine via MAAS' "KVM" menu, named e.g. "test1"
* Watch it being commissioned via MAAS' "Machines" menu
* Once it's ready select your machine (e.g. "test1.maas") -> Network
* Select the single network interface (e.g. "ens4") -> Create bridge
* Choose "Bridge type: Open vSwitch (ovs)", Select "Subnet" and "IP mode", save.
* Deploy machine to Ubuntu 20.04 via "Take action" button
The machine will install the OS and boot, but will end up in a "Failed" state inside MAAS due to network/OVS not being setup correctly. MAAS/SSH has no control over it. You can access the (broken) machine via serial console from the KVM-host (i.e. LXD container) via "virsh console test1" using the "test:test" credentials.
=== SRU/Focal/netplan.io ===
[Impact]
This update contains bug-fixes and packaging improvements and we would like to make sure all of our supported customers have access to these improvements.
The notable ones are:
* Setup OVS early in network-pre.target to avoid delays (LP: #1898997)
See the changelog entry below for a full list of changes and bugs.
[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NetplanUpdates
Netplan contains an extensive integration test suite that is run using
the SRU package for each release. This test suite's results are available here:
http://autopkgtest.ubuntu.com/packages/n/netplan.io
A successful run is required before the proposed netplan package
can be let into -updates.
The netplan team will be in charge of attaching the artifacts and console
output of the appropriate run to the bug. Netplan team members will not
mark ‘verification-done’ until this has happened.
[Regression Potential]
In order to mitigate the regression potential, the results of the
aforementioned integration tests are attached to this bug.
Focal:
https://git.launchpad.net/~slyon/+git/files/tree/LP1898997/focal_amd64.log
https://git.launchpad.net/~slyon/+git/files/tree/LP1898997/focal_arm64.log
https://git.launchpad.net/~slyon/+git/files/tree/LP1898997/focal_armhf.log
https://git.launchpad.net/~slyon/+git/files/tree/LP1898997/focal_ppc64el.log
https://git.launchpad.net/~slyon/+git/files/tree/LP1898997/focal_s390x.log
[Discussion]
To fully fix the MAAS/OVS problem, cloud-init needs to be updated as well. The fixes to netplan.io and cloud-init can be applied independently, though.
[Changelog]
- Setup OVS early in network-pre.target to avoid delays (LP: #1898997)
- Suggest openvswitch-switch runtime dependency
- Improve stability of autopkgtests |
|
2020-10-28 10:40:16 |
Lukas Märdian |
tags |
fr-721 verification-needed verification-needed-focal |
fr-721 verification-done-focal verification-needed |
|
2020-10-28 10:40:47 |
Lukas Märdian |
netplan.io (Ubuntu): assignee |
|
Lukas Märdian (slyon) |
|
2020-10-28 10:40:48 |
Lukas Märdian |
netplan.io (Ubuntu Groovy): assignee |
|
Lukas Märdian (slyon) |
|
2020-10-28 10:40:49 |
Lukas Märdian |
netplan.io (Ubuntu Focal): assignee |
|
Lukas Märdian (slyon) |
|
2020-10-29 15:19:22 |
Lukas Märdian |
cloud-init: status |
Confirmed |
Fix Committed |
|
2020-11-04 00:21:29 |
Chris Halse Rogers |
removed subscriber Ubuntu Stable Release Updates Team |
|
|
|
2020-11-04 00:28:08 |
Launchpad Janitor |
netplan.io (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2020-11-24 17:59:07 |
Chad Smith |
cloud-init: status |
Fix Committed |
Fix Released |
|
2023-05-12 09:44:50 |
James Falcon |
bug watch added |
|
https://github.com/canonical/cloud-init/issues/3792 |
|