neutron-openvswitch-agent deletes existing other_config and thus breaks undercloud's MAC address assignment in tripleo

Bug #1795716 reported by Andreas Karis
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Invalid
Undecided
Unassigned

Bug Description

https://github.com/openstack/neutron/commit/1f8378e0ac4b8c3fc4670144e6efc51940d796ad was supposed to change other-config:mac-table-size to set a higher value of mac-table-size
option can be set for bridge. Instead, it overwrites the entire other-config array and thus interferes with tripleo's undercloud settings where other-config:hwaddr is a requirement.

neutron-openvswitch-agent thus resets the MAC address to a random value although it should be fixed to the underlying interface's MAC>

The original bridge configuration is:
~~~
ov[root@undercloud-7 ~]# ovs-vsctl list-bridge br-ctlplane
ovs-vsctl: unknown command 'list-bridge'; use --help for help
[root@undercloud-7 ~]# ovs-vsctl list bridge br-ctlplane
_uuid : d56235c5-4933-4334-b33b-be2134c99995
auto_attach : []
controller : []
datapath_id : "0000525400ec14c2"
datapath_type : ""
datapath_version : "<unknown>"
external_ids : {bridge-id=br-ctlplane}
fail_mode : standalone
flood_vlans : []
flow_tables : {}
ipfix : []
mcast_snooping_enable: false
mirrors : []
name : br-ctlplane
netflow : []
other_config : {hwaddr="52:54:00:ec:14:c2"}
ports : [054cde3c-0e02-497d-ac25-be8e6992f708, fcbfcff7-d6b8-4bce-824d-085a681663cf]
protocols : []
rstp_enable : false
rstp_status : {}
sflow : []
status : {}
stp_enable : false
~~~

The new version of neutron-openvswitch-agent sets this:
~~~
2018-10-02 12:31:43.032 3949 DEBUG neutron.agent.ovsdb.impl_idl [-] Running txn command(idx=1): DbSetCommand(table=Bridge, col_values=(('other_config', {'mac-table-size': '50000'}),), record=br-ctlplane) do_commit /usr/lib/python2.7/site-packages/neutron/agent/ovsdb/impl_idl.py:98
~~~

Which removes the hwaddr:
~~~
[root@undercloud-7 ~]# ovs-vsctl list bridge br-ctlplane
_uuid : 334f1314-e024-4c0e-ad6f-acddaa43bb40
auto_attach : []
controller : [505d73e7-4049-44b8-862c-e19e556bc051]
datapath_id : "000016134f330e4c"
datapath_type : system
datapath_version : "<unknown>"
external_ids : {bridge-id=br-ctlplane}
fail_mode : secure
flood_vlans : []
flow_tables : {}
ipfix : []
mcast_snooping_enable: false
mirrors : []
name : br-ctlplane
netflow : []
other_config : {mac-table-size="50000"}
ports : [18c205e9-c869-4c0b-a24a-18e249cf4f3e, 90ab6c75-f108-4716-a328-9c26ba7b1a75]
protocols : ["OpenFlow10", "OpenFlow13"]
rstp_enable : false
rstp_status : {}
sflow : []
status : {}
stp_enable : false
~~~

When it should run something similar to this manual command:
~~~
[root@undercloud-7 ~]# ovs-vsctl set bridge br-ctlplane other-config:mac-table-size=50000
[root@undercloud-7 ~]# ovs-vsctl list bridge br-ctlplane
_uuid : d56235c5-4933-4334-b33b-be2134c99995
auto_attach : []
controller : []
datapath_id : "0000525400ec14c2"
datapath_type : ""
datapath_version : "<unknown>"
external_ids : {bridge-id=br-ctlplane}
fail_mode : standalone
flood_vlans : []
flow_tables : {}
ipfix : []
mcast_snooping_enable: false
mirrors : []
name : br-ctlplane
netflow : []
other_config : {hwaddr="52:54:00:ec:14:c2", mac-table-size="50000"}
ports : [054cde3c-0e02-497d-ac25-be8e6992f708, fcbfcff7-d6b8-4bce-824d-085a681663cf]
protocols : []
rstp_enable : false
rstp_status : {}
sflow : []
status : {}
stp_enable : false
[root@undercloud-7 ~]#
[root@undercloud-7 ~]# ip link ls dev br-ctlplane
14: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:ec:14:c2 brd ff:ff:ff:ff:ff:ff
~~~

The neutron OVS agent issue can be reproduced manually:
~~~
[root@undercloud-7 ~]# ovs-vsctl set bridge br-ctlplane other-config='{mac-table-size=50000}'
[root@undercloud-7 ~]# ovs-vsctl list bridge br-ctlplane | grep other
other_config : {mac-table-size="50000"}
[root@undercloud-7 ~]# ip link ls dev br-ctlplane
14: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether c6:35:62:d5:34:43 brd ff:ff:ff:ff:ff:ff
[root@undercloud-7 ~]#
~~~

Andreas Karis (akaris)
Changed in neutron:
assignee: nobody → Andreas Karis (akaris)
Revision history for this message
Andreas Karis (akaris) wrote :

This should fix it:

[root@undercloud-7 ~]# diff -u /usr/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py{.bck,}
--- /usr/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py.bck 2018-10-02 13:46:50.075675926 -0400
+++ /usr/lib/python2.7/site-packages/neutron/agent/common/ovs_lib.py 2018-10-02 14:29:37.638086084 -0400
@@ -226,8 +226,13 @@
                               check_error=True)

     def create(self, secure_mode=False):
- other_config = {
- 'mac-table-size': str(cfg.CONF.bridge_mac_table_size)}
+ other_config = self.db_get_val('Bridge',
+ self.br_name, 'other_config')
+ if other_config is None:
+ other_config = {}
+
+ other_config['mac-table-size'] = str(cfg.CONF.bridge_mac_table_size)
+
         with self.ovsdb.transaction() as txn:
             txn.add(
                 self.ovsdb.add_br(self.br_name,

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/607341

Changed in neutron:
status: New → In Progress
Revision history for this message
Andreas Karis (akaris) wrote :
Download full text (6.7 KiB)

I dug into this further. I discovered this issue in RH OSP 10, but it's fixed in RH OSP 13 and upstream master.

The problem in RH OSP 10 is that stuff is simply replaced when a map needs to be set:

OSP 10: /usr/lib/python2.7/site-packages/neutron/agent/ovsdb/native/commands.py
~~~
class DbSetCommand(BaseCommand):
    def __init__(self, api, table, record, *col_values):
        super(DbSetCommand, self).__init__(api)
        self.table = table
        self.record = record
        self.col_values = col_values

    def run_idl(self, txn):
        record = idlutils.row_by_record(self.api.idl, self.table, self.record)
        for col, val in self.col_values:
            # TODO(twilson) Ugh, the OVS library doesn't like OrderedDict
            # We're only using it to make a unit test work, so we should fix
            # this soon.
            if isinstance(val, collections.OrderedDict):
                val = dict(val)
            setattr(record, col, val)
~~~

This seems to be fixed here in OSP 13, if I read this correctly:

OSP 13: /usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/command.py
~~~
class DbSetCommand(BaseCommand):
    def __init__(self, api, table, record, *col_values):
        super(DbSetCommand, self).__init__(api)
        self.table = table
        self.record = record
        self.col_values = col_values

    def run_idl(self, txn):
        record = idlutils.row_by_record(self.api.idl, self.table, self.record)
        for col, val in self.col_values:
            # TODO(twilson) Ugh, the OVS library doesn't like OrderedDict
            # We're only using it to make a unit test work, so we should fix
            # this soon.
            if isinstance(val, collections.OrderedDict):
                val = dict(val)
            if isinstance(val, dict):
                # NOTE(twilson) OVS 2.6's Python IDL has mutate methods that
                # would make this cleaner, but it's too early to rely on them.
                existing = getattr(record, col, {})
                existing.update(val)
                val = existing
            self.set_column(record, col, val)

~~~
I suppose it's that commit here: https://github.com/openstack/ovsdbapp/commit/558783eba2987a86a1b838a5a89d8957eb950f94 (but I'm not sure)

Unfortunately, it's not trivial to simply backport that change due to the significant API changes over time, so I cannot prove this:
~~~
class DbSetCommand(BaseCommand):
    def __init__(self, api, table, record, *col_values):
        super(DbSetCommand, self).__init__(api)
        self.table = table
        self.record = record
        self.col_values = col_values

# def run_idl(self, txn):
# record = idlutils.row_by_record(self.api.idl, self.table, self.record)
# for col, val in self.col_values:
# # TODO(twilson) Ugh, the OVS library doesn't like OrderedDict
# # We're only using it to make a unit test work, so we should fix
# # this soon.
# if isinstance(val, collections.OrderedDict):
# val = dict(val)
# setattr(record, col, val)

    @classmethod
    def set_column(cls, row, col, val):
        setattr(row, col, idlutils.db_repl...

Read more...

Changed in neutron:
status: In Progress → Invalid
Changed in neutron:
assignee: Andreas Karis (akaris) → Brian Haley (brian-haley)
status: Invalid → In Progress
Revision history for this message
Andreas Karis (akaris) wrote :

Hi,

You decided to pursue this for upstream? Because this bug was only found in earlier versions of OpenStack, it was fixed in master due to other changes in neutron (see my last comment). Just in case this wasn't clear.

- Andreas

Revision history for this message
Brian Haley (brian-haley) wrote :

Andreas - yes, I guess it wasn't clear to me, I will take a look at the downstream bug again.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Brian Haley (<email address hidden>) on branch: master
Review: https://review.opendev.org/607341
Reason: No longer needed

Changed in neutron:
assignee: Brian Haley (brian-haley) → nobody
Changed in neutron:
status: In Progress → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.