OVN controllers dead (XXX) after zed upgrade
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kolla-ansible | New | Medium | Unassigned |
Bug Description
Performed a major upgrade from Yoga to Zed on a cloud running Kayobe + Kolla Ansible on Ubuntu Jammy 22.04 hosts with OVN networking.
After the upgrade, metadata is broken for existing instances and new instances fail to boot.
All OVN controllers are seen as dead by Neutron: they show XXX in the Alive column of openstack network agent list, even though their State is UP.
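For anyone wanting to spot this condition quickly, here is a minimal sketch of a filter that prints only the agent rows whose Alive column shows XXX. It assumes the default table output of openstack network agent list is piped in on stdin and that the header row contains a column named "Alive"; the function name dead_agents is made up for illustration.

```shell
# Hypothetical helper: read the default table output of
# `openstack network agent list` on stdin and print only the rows
# whose Alive column is XXX.
dead_agents() {
    awk -F'|' '
        # Until the header row is found, look for the "Alive" column index.
        !col {
            for (i = 1; i <= NF; i++) {
                h = $i
                gsub(/^ +| +$/, "", h)
                if (h == "Alive") col = i
            }
            next
        }
        # Data rows: print those whose Alive cell is exactly XXX.
        col && NF >= col {
            v = $col
            gsub(/^ +| +$/, "", v)
            if (v == "XXX") print
        }
    '
}
```

It would be used as: openstack network agent list | dead_agents. An empty result means no agent is reported dead.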
There are various warnings in the OVN controller logs about the Southbound database schema:
2024-03-
2024-03-
2024-03-
2024-03-
2024-03-
2024-03-
and one slightly more weird/scary one:
2024-03-
Restarting the OVN controller services did not help.
Restarting one Southbound DB (ovn_sb_db) container brought all agents back to life. Metadata and booting an instance now work.
Some versions:
(ovn-controller)# ovn-controller --version
ovn-controller 22.09.1
Open vSwitch Library 3.0.3
OpenFlow versions 0x6:0x6
SB DB Schema 20.25.0
(ovn-sb-db)# ovn-sbctl --version
ovn-sbctl 22.09.1
Open vSwitch Library 3.0.3
DB Schema 20.25.0
It's not clear what happened, but it seems that for some reason the SB DB had not performed or completed its DB upgrade.
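One way to test that hypothesis would be to compare the schema version the tools expect with the version the running Southbound DB actually serves. The sketch below is only an illustration: the function names are made up, and it assumes the served version is obtained separately (for example with ovsdb-client get-schema-version against the OVN_Southbound database) and passed in as a string.

```shell
# Hypothetical helper: extract the schema version from the output of
# `ovn-sbctl --version` or `ovn-controller --version` read on stdin
# (matches both the "DB Schema" and "SB DB Schema" lines).
expected_schema() {
    awk '/DB Schema/ { print $NF; exit }'
}

# Hypothetical helper: compare the expected schema version ($1) with the
# version served by the database ($2). Returns 1 on mismatch.
check_schema() {
    expected=$1
    served=$2
    if [ "$expected" = "$served" ]; then
        echo "OK: schema $served"
    else
        echo "MISMATCH: tools expect $expected, server has $served"
        return 1
    fi
}
```

A possible invocation, where SB_REMOTE is a placeholder for the Southbound DB connection string of the deployment: check_schema "$(ovn-sbctl --version | expected_schema)" "$(ovsdb-client get-schema-version "$SB_REMOTE" OVN_Southbound)".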
Changed in kolla-ansible:
importance: Undecided → Medium
I have no idea what could be done better in kolla-ansible - but maybe we should do some DB consistency check if possible.
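As a rough idea of what such a check could look like, ovsdb-client has a needs-conversion command that prints yes or no depending on whether the database served by a server differs from a given schema file. The sketch below only interprets that answer so it could gate a post-upgrade step; the function name conversion_gate is hypothetical, and this is not something kolla-ansible is known to do today.

```shell
# Hypothetical gate: interpret the yes/no answer printed by
# `ovsdb-client needs-conversion`.
# Returns 0 when no conversion is pending, 1 when one is pending,
# and 2 on any unexpected answer (e.g. the command failed).
conversion_gate() {
    case "$1" in
        no)
            echo "schema up to date"
            return 0
            ;;
        yes)
            echo "schema conversion pending"
            return 1
            ;;
        *)
            echo "unexpected answer: $1"
            return 2
            ;;
    esac
}
```

It might be called as conversion_gate "$(ovsdb-client needs-conversion "$SB_REMOTE" "$SB_SCHEMA")", where SB_REMOTE and SB_SCHEMA are placeholders for the Southbound DB connection string and the path to the ovn-sb schema file shipped in the new image.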