MAAS can no longer commission machines with OpenVSwitch Bridges

Bug #2033442 reported by Alan Baghumian
This bug affects 3 people
Affects      Status   Importance  Assigned to  Milestone
maas-images  Triaged  High        Unassigned

Bug Description

Hello MAAS Team,

I just realized that commissioning machines whose network interfaces are configured with OpenVSwitch bridges no longer works. I am using MAAS 3.3 and tried commissioning with both Focal and Jammy images, with the same result.

SSH'ing into the commissioning machine reveals the following issue:

Aug 29 22:39:27 os-vm-4 ovs-ctl[4169]: modprobe: FATAL: Module openvswitch not found in directory /lib/modules/5.15.0-69-generic
Aug 29 22:39:27 os-vm-4 ovs-ctl[4161]: * Inserting openvswitch module
Aug 29 22:39:27 os-vm-4 systemd[1]: ovs-vswitchd.service: Control process exited, code=exited, status=1/FAILURE

Also tried:

$ sudo modprobe openvswitch
modprobe: FATAL: Module openvswitch not found in directory /lib/modules/5.15.0-69-generic
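
A quick way to tell a missing module file apart from a load failure is to search the modules tree directly. The following is a minimal sketch, not part of the original report; the function name `module_file_present` and its optional arguments are hypothetical, and it assumes module files are named `<module>.ko`, possibly with a compression suffix.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a file for MODULE exists under the modules
# tree of the given kernel release.
# Usage: module_file_present MODULE [KERNEL_RELEASE] [MODULES_ROOT]
module_file_present() {
    module="$1"
    release="${2:-$(uname -r)}"     # default: the running kernel
    root="${3:-/lib/modules}"       # default: the standard modules directory
    # Module files may be compressed (.ko.zst, .ko.xz), so match on the prefix.
    [ -n "$(find "$root/$release" -type f -name "${module}.ko*" 2>/dev/null)" ]
}

# Example (run on the commissioning machine):
#   module_file_present openvswitch && echo present || echo missing
```

If the function reports the file as missing, `modprobe` cannot succeed no matter how the service is configured, which points at the boot image rather than at Open vSwitch itself.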

It's been a while (approx 5-6 months) since I commissioned a machine with this configuration, so I'm not quite sure when this issue actually started.

I'm also attaching a screenshot of the network configuration of the affected machine.

Please review and let me know if there are any questions.

Best,
Alan

Alan Baghumian (alanbach) wrote :
summary: - MAAS can no longer commisson machines with OpenVSwithch Bridges
+ MAAS can no longer commisson machines with OpenVSwitch Bridges
summary: - MAAS can no longer commisson machines with OpenVSwitch Bridges
+ MAAS can no longer commission machines with OpenVSwitch Bridges
Dagmawi Biru (dagbiru) wrote :

I'm also able to reproduce this behavior.

- MAAS 3.3.2
- Commissioning Image: Ubuntu Focal

Console error shown during commissioning image boot:
---
[FAILED] Failed to start Open vSwitch Forwarding Unit.
See 'systemctl status ovs-vswitchd.service' for details.
[DEPEND] Dependency failed for Open vSwitch Record Hostname.
[ 37.201868] cloud-init[2033]: Could not execute systemctl: at /usr/bin/deb-systemd-invoke line 142.
[ OK ] Stopped Open vSwitch Forwarding Unit.
         Starting Open vSwitch Forwarding Unit...
[FAILED] Failed to start Open vSwitch Forwarding Unit.
See 'systemctl status ovs-vswitchd.service' for details.
[DEPEND] Dependency failed for Open vSwitch Record Hostname.
[DEPEND] Dependency failed for Open vSwitch.
[ 37.458764] cloud-init[2033]: A dependency job for openvswitch-switch.service failed. See 'journalctl -xe' for details.
[ 37.459807] cloud-init[2033]: invoke-rc.d: initscript openvswitch-switch, action "start" failed.
[ 37.462900] cloud-init[2033]: ○ openvswitch-switch.service - Open vSwitch
[ 37.463269] cloud-init[2033]: Loaded: loaded (/lib/systemd/system/openvswitch-switch.service; enabled; vendor preset: enabled)
[ 37.463402] cloud-init[2033]: Active: inactive (dead)
[ 37.463711] cloud-init[2033]:
[ 37.465523] cloud-init[2033]: Aug 29 22:35:17 maas-large systemd[1]: Dependency failed for Open vSwitch.
[ 37.465911] cloud-init[2033]: Aug 29 22:35:17 maas-large systemd[1]: openvswitch-switch.service: Job openvswitch-switch.service/start failed with result 'dependency'.
[ 37.466121] cloud-init[2033]: dpkg: error processing package openvswitch-switch (--configure):
[ 37.466371] cloud-init[2033]: installed openvswitch-switch package post-installation script subprocess returned error exit status 1
[ 37.480777] cloud-init[2033]: Processing triggers for man-db (2.10.2-1) ...

---

Alan Baghumian (alanbach) wrote :

I retested Focal with the 5.4 kernel and Bionic with the 4.15 kernel, and both also failed :-(

Alan Baghumian (alanbach) wrote :

I found the following workaround:

- Commission without an OpenVSwitch bridge
- After the commissioning is done, remove the Linux Bridge and add the OpenVSwitch bridge
- Deploy

Mauricio Faria de Oliveira (mfo) wrote :

The linux-modules package has the kernel module file:

$ dpkg-deb -c linux-modules-5.15.0-69-generic_5.15.0-69.76~20.04.1_amd64.deb | grep openvswitch.ko
-rw-r--r-- root/root 870793 2023-03-20 12:32 ./lib/modules/5.15.0-69-generic/kernel/net/openvswitch/openvswitch.ko

Mauricio Faria de Oliveira (mfo) wrote :

The kernel modules are not present in the squashfs filesystem (expected):

$ wget http://images.maas.io/ephemeral-v3/stable/jammy/amd64/20230328/squashfs
$ unsquashfs squashfs
$ ls -l squashfs-root/lib/modules
total 0
$ grep 'Package: linux-modules' squashfs-root/var/lib/dpkg/status
$

The openvswitch module is also not among the modules shipped in the initrd:

$ wget http://images.maas.io/ephemeral-v3/stable/jammy/amd64/20230328/ga-22.04/generic/boot-initrd
$ unmkinitramfs boot-initrd initrd-ga
$ find initrd-ga/ -name '*.ko' | wc -l
1622
$ find initrd-ga/ -name '*.ko' | grep openvswitch
$
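
The same check can be repeated across several images or several modules with a small helper. This is only a sketch under stated assumptions: `report_modules` is a hypothetical name, the tree is assumed to be a directory already unpacked with `unmkinitramfs` (or `unsquashfs`), and modules are matched by file name only.

```shell
#!/bin/sh
# Hypothetical helper: for each named module, print whether a matching .ko
# file exists anywhere under TREE. Returns non-zero if any module is missing.
# Usage: report_modules TREE MODULE...
report_modules() {
    tree="$1"
    shift
    missing=0
    for module in "$@"; do
        # Match compressed variants too (.ko.zst, .ko.xz).
        if [ -n "$(find "$tree" -type f -name "${module}.ko*" 2>/dev/null)" ]; then
            echo "$module: present"
        else
            echo "$module: missing"
            missing=1
        fi
    done
    return "$missing"
}

# Example, using the directory unpacked above:
#   report_modules initrd-ga openvswitch
```

Because the return code reflects whether anything is missing, the helper can be used in a loop over unpacked stable and candidate images to compare them.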

Changed in maas:
status: New → Triaged
importance: Undecided → Medium
Igor Brovtsin (igor-brovtsin) wrote :

I checked multiple GA images in the stable stream and can confirm that none of them contains the `openvswitch` module.

Candidate stream images for focal (20230907) and jammy (20230519) both contain the modules.

Jerzy Husakowski (jhusakowski) wrote :

Test and promote recent candidate images to fix the issue in initramfs.

affects: maas → maas-images
Changed in maas-images:
importance: Medium → High
Alan Baghumian (alanbach) wrote :

I pulled the candidate images into my local repository last night and just completed a round of commissioning and testing with a machine using an OpenVSwitch bridge, and everything worked.

Thank you so much for resolving this. I believe the images are ready to be pushed to production.

Claiton Campos (penacleiton) wrote :

Hi Alan,
Could you please clarify whether having just one machine with an Open vSwitch VLAN is enough? Can I deploy the other machines with a normal Linux bridge? I have six physical machines and I am trying to deploy them all with Open vSwitch, but only one of them works; on the others the Open vSwitch VLAN reports an error and I cannot create VMs.

Can you confirm if that's right?

Thank you.
