In the following scenario (especially at large scale, when many ovs-agents are restarted at the same time), OpenFlow flow entries go missing and do not self-recover.
As a simple example, restarting two ovs-agents at the same time:
```
network.local_ip=30.0.1.6,output="vxlan-1e000106"
compute1.local_ip=30.0.1.7,output="vxlan-1e000107"
compute2.local_ip=30.0.1.8,output="vxlan-1e000108"

network.port=('192.168.1.2')
compute1.port=('192.168.1.11')
compute2.port=('192.168.1.141')

// iter_num=0 of compute1
DEBUG neutron.plugins.ml2.db [req-f8093da8-9f1a-4da2-a27f-03f1b4d50dfd - - - - -] For port cb7fad87-7dc7-4008-a349-3a17e3b8be71, host compute1, got binding levels [PortBindingLevel(driver='openvswitch',host='compute1',level=0,port_id=cb7fad87-7dc7-4008-a349-3a17e3b8be71,segment=NetworkSegment(0bcd776d-92cd-4d96-9e54-92350700c4ca),segment_id=0bcd776d-92cd-4d96-9e54-92350700c4ca)] get_binding_level_objs /usr/lib/python3.6/site-packages/neutron/plugins/ml2/db.py:78
DEBUG neutron.plugins.ml2.drivers.l2pop.mech_driver [req-f8093da8-9f1a-4da2-a27f-03f1b4d50dfd - - - - -] host: compute1, agent_active_ports: 3, refresh_tunnels: True update_port_up
// rpc-1
Notify l2population agent compute1 at q-agent-notifier the message add_fdb_entries with {'8883e077-aadb-4b79-9315-3c029e94a857': {'segment_id': 22, 'network_type': 'vxlan', 'ports': {'30.0.1.6': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address='fa:16:3e:db:75:11', ip_address='192.168.1.2')], '30.0.1.8': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address='fa:16:3e:45:eb:6a', ip_address='192.168.1.141')]}}} _notification_host
// rpc-2
Fanout notify l2population agents at q-agent-notifier the message add_fdb_entries with {'8883e077-aadb-4b79-9315-3c029e94a857': {'segment_id': 22, 'network_type': 'vxlan', 'ports': {'30.0.1.7': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address='fa:16:3e:21:34:43', ip_address='192.168.1.11')]}}} _notification_fanout

// iter_num>0 of compute1
DEBUG neutron.plugins.ml2.db [req-f8093da8-9f1a-4da2-a27f-03f1b4d50dfd - - - - -] For port cb7fad87-7dc7-4008-a349-3a17e3b8be71, host compute1, got binding levels [PortBindingLevel(driver='openvswitch',host='compute1',level=0,port_id=cb7fad87-7dc7-4008-a349-3a17e3b8be71,segment=NetworkSegment(0bcd776d-92cd-4d96-9e54-92350700c4ca),segment_id=0bcd776d-92cd-4d96-9e54-92350700c4ca)] get_binding_level_objs /usr/lib/python3.6/site-packages/neutron/plugins/ml2/db.py:78
2022-06-09 17:45:39.546 833566 DEBUG neutron.plugins.ml2.drivers.l2pop.mech_driver [req-f8093da8-9f1a-4da2-a27f-03f1b4d50dfd - - - - -] host: compute1, agent_active_ports: 3, refresh_tunnels: False update_port_up
...
// iter_num=0 of compute2
DEBUG neutron.plugins.ml2.db [req-2e977b20-4438-4928-85bb-59de4c7389f6 - - - - -] For port ccca9701-19c0-4590-92d0-5fbd909d4eeb, host compute2, got binding levels [PortBindingLevel(driver='openvswitch',host='compute2',level=0,port_id=ccca9701-19c0-4590-92d0-5fbd909d4eeb,segment=NetworkSegment(0bcd776d-92cd-4d96-9e54-92350700c4ca),segment_id=0bcd776d-92cd-4d96-9e54-92350700c4ca)] get_binding_level_objs /usr/lib/python3.6/site-packages/neutron/plugins/ml2/db.py:78
DEBUG neutron.plugins.ml2.drivers.l2pop.mech_driver [req-2e977b20-4438-4928-85bb-59de4c7389f6 - - - - -] host: compute2, agent_active_ports: 3, refresh_tunnels: True update_port_up
// rpc-3
Notify l2population agent compute2 at q-agent-notifier the message add_fdb_entries with {'8883e077-aadb-4b79-9315-3c029e94a857': {'segment_id': 22, 'network_type': 'vxlan', 'ports': {'30.0.1.6': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address='fa:16:3e:db:75:11', ip_address='192.168.1.2')], '30.0.1.7': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address='fa:16:3e:21:34:43', ip_address='192.168.1.11')]}}} _notification_host
// rpc-4
Fanout notify l2population agents at q-agent-notifier the message add_fdb_entries with {'8883e077-aadb-4b79-9315-3c029e94a857': {'segment_id': 22, 'network_type': 'vxlan', 'ports': {'30.0.1.8': [('00:00:00:00:00:00', '0.0.0.0'), PortInfo(mac_address='fa:16:3e:45:eb:6a', ip_address='192.168.1.141')]}}} _notification_fanout
```
1. After iter_num=0, cleanup_stale_flows clears the stale flows in table=21 and table=22
2. If compute1 receives rpc-4 first, tunnels_missing=False
3. rpc-1 times out and is never received by compute1
4. As a result, the table=22,priority=1 flow is missing output="vxlan-1e000106", and table=21,priority=1 is missing the ARP responder entry for 192.168.1.2
5. The missing flows are never restored, so VMs on this network cannot communicate at layer 2 with VMs on the network node
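
The five steps above can be replayed with a small toy model. This is only a sketch of the ordering problem, not Neutron code: `ToyAgent`, `simulate_compute1_restart` and `vxlan_port_name` are hypothetical names, and the rule used here for `tunnels_missing` (false as soon as any tunnel output is installed) is an assumption made for illustration. The IPs and port names are the ones from the example.

```python
import ipaddress
from dataclasses import dataclass, field

# Tunnel endpoint IP -> VM IP behind it, taken from the example above.
REMOTE_PORTS = {
    "30.0.1.6": "192.168.1.2",    # network
    "30.0.1.7": "192.168.1.11",   # compute1
    "30.0.1.8": "192.168.1.141",  # compute2
}
FULL_FDB = ["30.0.1.6", "30.0.1.8"]  # tunnel IPs carried by rpc-1 for compute1


def vxlan_port_name(ip: str) -> str:
    """Port name as seen in the example: 'vxlan-' + the IPv4 address in hex."""
    return "vxlan-%08x" % int(ipaddress.IPv4Address(ip))


@dataclass
class ToyAgent:
    """Hypothetical stand-in for one ovs-agent; this is not Neutron code."""
    local_ip: str
    flood_outputs: set = field(default_factory=set)   # models table=22 outputs
    arp_responders: set = field(default_factory=set)  # models table=21 entries

    def cleanup_stale_flows(self) -> None:
        # Toy version of step 1: flows from before the restart are dropped.
        self.flood_outputs.clear()
        self.arp_responders.clear()

    @property
    def tunnels_missing(self) -> bool:
        # Assumption: any installed tunnel output makes the agent look healthy.
        return not self.flood_outputs

    def add_fdb_entries(self, tunnel_ips) -> None:
        for ip in tunnel_ips:
            if ip == self.local_ip:
                continue  # an agent installs no tunnel flow towards itself
            self.flood_outputs.add(vxlan_port_name(ip))
            self.arp_responders.add(REMOTE_PORTS[ip])


def simulate_compute1_restart(rpc1_delivered: bool) -> ToyAgent:
    agent = ToyAgent(local_ip="30.0.1.7")
    agent.add_fdb_entries(FULL_FDB)      # state before the restart
    agent.cleanup_stale_flows()          # step 1
    agent.add_fdb_entries(["30.0.1.8"])  # step 2: rpc-4 arrives first
    if rpc1_delivered:                   # step 3: rpc-1 may time out
        agent.add_fdb_entries(FULL_FDB)
    if agent.tunnels_missing:            # a refresh is requested only in this case
        agent.add_fdb_entries(FULL_FDB)
    return agent


if __name__ == "__main__":
    broken = simulate_compute1_restart(rpc1_delivered=False)
    print(sorted(broken.flood_outputs))
    print(sorted(broken.arp_responders))
```

With `rpc1_delivered=False`, the toy agent ends up with only the `vxlan-1e000108` output and no ARP responder entry for 192.168.1.2, and because `tunnels_missing` is already false nothing triggers a later refresh, which mirrors steps 4 and 5.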