Comment 0 for bug 1794991

Gaëtan Trellu (goldyfruit) wrote: Inconsistent flows with DVR l2pop on br-tun

We are running Neutron (Pike) configured with DVR, l2pop, and the ARP responder. For the past few weeks we have been experiencing unexpected behavior:

- [1] Some instances are unable to obtain a DHCP address
- [2] Instances are unable to ping instances on a different compute node

This is completely random: sometimes it works as expected, and sometimes we see the behavior described above.

After comparing the flows on the network and compute nodes, we found that behavior [1] is caused by missing flows on the compute nodes, namely the ones pointing to the DHCP agent on the network node.

Behavior [2] is also caused by missing flows: some compute nodes lack the outputs to other compute nodes, which prevents an instance on compute 1 from communicating with an instance on compute 2.

When we add the missing flows for [1] and [2] manually, both issues are fixed, but if we restart neutron-openvswitch-agent the flows go missing again.
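
One way to pin down exactly which flows disappear across an agent restart is to save the output of `ovs-ofctl dump-flows br-tun` before and after the restart and compare the two. Below is a minimal sketch of such a comparison, not a definitive tool. It assumes the unicast and flood tables on br-tun are tables 20 and 22 (the UCAST_TO_TUN and FLOOD_TO_TUN constants in the Neutron OVS agent, to be verified for your release), and the function names `parse_br_tun` and `missing_after_restart` are hypothetical.

```python
import re

# Assumed table numbers, taken from the Neutron OVS agent constants
# (UCAST_TO_TUN = 20, FLOOD_TO_TUN = 22). Verify them for your release.
UCAST_TO_TUN = 20
FLOOD_TO_TUN = 22


def parse_br_tun(dump):
    """Return (unicast MACs, flood output ports) found in dump-flows text."""
    ucast_macs = set()
    flood_ports = set()
    for line in dump.splitlines():
        table = re.search(r"table=(\d+)", line)
        if not table:
            continue
        table_id = int(table.group(1))
        if table_id == UCAST_TO_TUN:
            # l2pop installs one unicast flow per remote MAC address.
            mac = re.search(r"dl_dst=([0-9a-fA-F:]{17})", line)
            if mac:
                ucast_macs.add(mac.group(1).lower())
        elif table_id == FLOOD_TO_TUN:
            # Flood flows list one output action per remote tunnel port.
            actions = line.split("actions=", 1)[-1]
            flood_ports.update(re.findall(r'output:"?([\w-]+)"?', actions))
    return ucast_macs, flood_ports


def missing_after_restart(before, after):
    """Return the MACs and flood ports present before but absent after."""
    before_macs, before_ports = parse_br_tun(before)
    after_macs, after_ports = parse_br_tun(after)
    return before_macs - after_macs, before_ports - after_ports
```

To use it, redirect the output of `ovs-ofctl dump-flows br-tun` to a file while the flows are correct, do the same after restarting the agent, and pass the contents of both files to `missing_after_restart`. The same comparison between two compute nodes hosting instances on the same network can show which node lacks outputs toward its peers.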

For [1], disabling and re-enabling the port of each DHCP agent on the network nodes sometimes solves the problem and sometimes does not.

For [2], the only way we have found to fix the flows without adding them manually is to remove all instances of a network from the compute node and then create a new instance on that network, which sends a notification message to all compute and network nodes.

We cherry-picked the following commits, but nothing changed:
  - https://review.openstack.org/#/c/600151/
  - https://review.openstack.org/#/c/573785/

Any ideas?