L2 pop notifications are not reliable
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Opinion | Wishlist | Unassigned |
Bug Description
Problem: nodes/VMs in an L2 segment can lose connectivity (e.g. missing VXLAN tunnels or OVS flows) because of partial RabbitMQ unavailability, RPC message loss, or an agent failing to apply FDB entry updates.
Why: currently the neutron server sends FDB entries to L2 agents one-way, with no feedback, so an agent has no way to detect whether all required tunnels/flows have been built. Likewise, the server has no way to detect whether all the FDB entries it sent were delivered and the required flows applied. If some messages are lost, only an agent restart fixes the resulting issues.
Way to address: a new synchronization mechanism on the L2 agent side that periodically requests the network topology from the server, compares it with the configuration actually applied on the node, and applies any missing parts.
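To make the proposed resync concrete, here is a minimal sketch of the compare-and-apply step only. It is not neutron code: the `FdbEntries` shape (tunnel endpoint IP mapped to a set of `(mac, ip)` pairs) and the names `find_missing` and `resync` are assumptions made for illustration, and fetching the topology from the server and programming OVS are left out.

```python
from typing import Dict, Set, Tuple

# Hypothetical, simplified FDB view: remote tunnel endpoint IP -> {(mac, ip), ...}
FdbEntries = Dict[str, Set[Tuple[str, str]]]


def find_missing(expected: FdbEntries, applied: FdbEntries) -> FdbEntries:
    """Return entries the server expects that are not applied on the node."""
    missing: FdbEntries = {}
    for endpoint, wanted in expected.items():
        absent = wanted - applied.get(endpoint, set())
        if absent:
            missing[endpoint] = absent
    return missing


def resync(expected: FdbEntries, applied: FdbEntries) -> FdbEntries:
    """Apply the missing entries to the local state and report what was added.

    In a real agent this is where tunnels/flows would be created; here the
    'applied' dict simply stands in for the node's actual configuration.
    """
    missing = find_missing(expected, applied)
    for endpoint, entries in missing.items():
        applied.setdefault(endpoint, set()).update(entries)
    return missing


if __name__ == "__main__":
    expected = {
        "10.0.0.2": {("fa:16:3e:00:00:01", "192.168.1.5")},
        "10.0.0.3": {("fa:16:3e:00:00:02", "192.168.1.6")},
    }
    # The update for 10.0.0.3 was lost, so the node only knows about 10.0.0.2.
    applied = {"10.0.0.2": {("fa:16:3e:00:00:01", "192.168.1.5")}}
    print(resync(expected, applied))
```

An agent would run something like `resync` from a periodic task. The sketch only adds missing entries, as described above; removing stale entries would be a separate decision.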
Option 2: move from RPC fanouts and casts to RPC calls, which guarantee message delivery. Concerns: scalability and increased load on the neutron server.
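The difference between the two styles can be illustrated with a toy model. This is not oslo.messaging code: `LossyTransport`, `cast`, and `call` are hypothetical names, the transport drops messages at fixed positions so the result is deterministic, and a set stands in for the agent's applied FDB entries. In real RPC a failed call would typically surface as a timeout rather than a boolean.

```python
from typing import Iterable, Set


class LossyTransport:
    """Toy message bus that drops the messages at the given send positions."""

    def __init__(self, drop_indexes: Iterable[int]) -> None:
        self.drop_indexes = set(drop_indexes)
        self.sent = 0  # total number of send attempts

    def deliver(self, agent_fdb: Set[str], entry: str) -> bool:
        index = self.sent
        self.sent += 1
        if index in self.drop_indexes:
            return False  # message lost in transit
        agent_fdb.add(entry)
        return True


def cast(transport: LossyTransport, agent_fdb: Set[str], entry: str) -> None:
    """Fire-and-forget: the sender never learns whether the entry arrived."""
    transport.deliver(agent_fdb, entry)


def call(transport: LossyTransport, agent_fdb: Set[str], entry: str,
         max_attempts: int = 3) -> bool:
    """Wait for a result and retry on failure; True once the entry is applied."""
    for _ in range(max_attempts):
        if transport.deliver(agent_fdb, entry):
            return True
    return False


if __name__ == "__main__":
    entry = "fa:16:3e:00:00:01@10.0.0.3"

    agent_a: Set[str] = set()
    cast(LossyTransport(drop_indexes=[0]), agent_a, entry)
    print("after cast:", agent_a)  # the loss goes unnoticed

    agent_b: Set[str] = set()
    bus = LossyTransport(drop_indexes=[0])
    print("call ok:", call(bus, agent_b, entry), "attempts:", bus.sent)
```

The retry in `call` is also where the scalability concern comes from: every update now costs a round trip per agent, and more when messages are lost.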
tags: removed: rfe
Hi Oleg,
I will add this RFE to the agenda of our next drivers meeting: http://eavesdrop.openstack.org/#Neutron_drivers_Meeting. It would be great if you could join in case there are any additional questions, but the RFE should be discussed even if you are not able to attend.