[OVN] neutron agent not cleaned up at departure causes new units to fail

Bug #1932070 reported by Pedro Guimarães
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| OpenStack Neutron API Charm | New | Undecided | Unassigned | |
| charm-ovn-central | New | Undecided | Unassigned | |
| charm-ovn-chassis | New | Undecided | Unassigned | |

Bug Description

$ openstack network agent list shows:

| juju-d34d64-25-lxd-6.DOMAIN | OVN Controller agent | juju-d34d64-25-lxd-6.DOMAIN | | XXX | UP | ovn-controller |

Here the agent is marked as not alive (XXX).
It corresponds to an octavia unit (25/lxd/6) that was previously removed.
I can also see it in the Chassis table, along with other ovn-chassis units that were previously removed: https://pastebin.ubuntu.com/p/pSSNqr5bBC/

I can also see it in Encap table:
_uuid : 157c9a73-0ddf-4091-b7f2-9ea10241103c
chassis_name : juju-d34d64-23-lxd-6.DOMAIN
ip : "IP" <<<<<-----------------------------------
options : {csum="true"}
type : geneve

My issue is that redeploying a new unit for octavia (in this case, 25/lxd/10) does not work, because the ovn-chassis charm accompanying it cannot register itself.
In /var/log/ovn/ovn-controller.log, I see entries like the following roughly every minute:

2021-06-15T14:34:21.321Z|00010|ovsdb_idl|WARN|transaction error: {"details":"Transaction causes multiple rows in \"Encap\" table to have identical values (geneve and \"IP\") for index on columns \"type\" and \"ip\". First row, with UUID 0c7b68b5-3238-4a4c-883f-238590c18f37, was inserted by this transaction. Second row, with UUID 157c9a73-0ddf-4091-b7f2-9ea10241103c, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}
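
To identify which existing row blocks the registration, the UUID of the pre-existing row can be pulled out of the warning. The following is a minimal sketch, assuming the message keeps the wording shown above; the function name find_stale_row is hypothetical and not part of any OVN tooling.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal UUID.
UUID = r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}"

# Matches the table name and the UUID of the row that already existed.
# The optional backslashes cover the escaped quotes seen in the log file.
STALE_ROW = re.compile(
    r'multiple rows in \\?"(?P<table>\w+)\\?" table.*?'
    r"Second row, with UUID (?P<uuid>" + UUID + r"), existed in the database"
)


def find_stale_row(log_line):
    """Return (table, uuid) of the pre-existing conflicting row, or None."""
    match = STALE_ROW.search(log_line)
    if match is None:
        return None
    return match.group("table"), match.group("uuid")


# Shortened copy of the warning quoted in this report.
sample = (
    r'2021-06-15T14:34:21.321Z|00010|ovsdb_idl|WARN|transaction error: '
    r'{"details":"Transaction causes multiple rows in \"Encap\" table to have '
    r'identical values (geneve and \"IP\") for index on columns \"type\" and '
    r'\"ip\". First row, with UUID 0c7b68b5-3238-4a4c-883f-238590c18f37, was '
    r'inserted by this transaction. Second row, with UUID '
    r'157c9a73-0ddf-4091-b7f2-9ea10241103c, existed in the database before '
    r'this transaction and was not modified by the transaction.",'
    r'"error":"constraint violation"}'
)

print(find_stale_row(sample))
```

The UUID returned for the sample is the _uuid of the stale Encap row listed earlier in this report.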

According to:
https://bugzilla.redhat.com/show_bug.cgi?id=1946179

That happens because "IP" is being reused: the automation (in our case, the charms) did not clean up the previous entries, so the new ovn-chassis unit cannot register itself.

A comment on the bug above suggests that ovn-sbctl should be used to clean those entries.
The equivalent neutron agent should also be cleaned up.
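
As an illustration of that manual cleanup, the sketch below only builds the two commands and prints them for review; it does not run anything. It assumes that "ovn-sbctl chassis-del" removes the chassis together with its Encap rows, and that the neutron agent ID is the same as the chassis name, as the agent list output above suggests. The helper cleanup_commands is hypothetical, and where ovn-sbctl has to be run (for example on an ovn-central unit) depends on the deployment.

```python
import shlex


def cleanup_commands(chassis_name, agent_id=None):
    """Build, without running, the commands to remove a departed chassis.

    agent_id defaults to chassis_name, which is an assumption based on the
    agent list output in this report.
    """
    if agent_id is None:
        agent_id = chassis_name
    return [
        # Remove the stale chassis from the OVN southbound database.
        ["ovn-sbctl", "--if-exists", "chassis-del", chassis_name],
        # Remove the matching agent record from neutron.
        ["openstack", "network", "agent", "delete", agent_id],
    ]


# Print the commands for the departed unit so they can be checked first.
for command in cleanup_commands("juju-d34d64-25-lxd-6.DOMAIN"):
    print(shlex.join(command))
```

Printing rather than executing keeps the step reviewable, since deleting the wrong chassis would disconnect a live unit.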
