Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=1795198
Looks like neutron-server is pinging agents too frequently as per what's observed in the logs. nb-cfg being bumped at a non-fixed rate:
For example, in this part of the log I could find 11 updates in less than 2 minutes:
2020-01-27 12:23:04.247 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49008, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49007) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:23:05.179 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49009, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49008) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:23:32.216 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49010, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49009) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:23:41.248 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49011, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49010) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:23:42.183 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49012, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49011) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:24:09.210 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49013, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49012) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:24:18.252 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49014, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49013) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:24:19.179 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49015, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49014) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:24:46.205 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49016, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49015) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:24:55.254 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49017, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49016) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
2020-01-27 12:24:56.177 43567 DEBUG ovsdbapp.backend.ovs_idl.event [-] Matched UPDATE: SbGlobalUpdateEvent(events=('update',), table='SB_Global', conditions=None, old_conditions=None) to row=SB_Global(ipsec=False, ssl=[], nb_cfg=49018, options={'mac_prefix': 'b2:64:0d'}, external_ids={}) old=SB_Global(nb_cfg=49017) matches /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/event.py:44
This is triggering too frequent writes from *all* metadata-agents and ovn-controllers in the cloud which creates a lot of traffic. At scale, this can be a problem.
Imagine a 500 node deployment, with one update per 10 seconds as in the example above. That will translate into 1K (1 metadata agent + 1 ovn-controller per node) write transactions into the SB database every 10 seconds so 100 transactions per second that trigger a JSON RPC command update to every single client into the cloud.
Related fix proposed to branch: master /review. opendev. org/705295
Review: https:/