[L2][scale issue] ovs-agent meets unexpected tunnel lost
Bug #1813715 reported by LIU Yulong
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | Undecided | LIU Yulong |
Bug Description
The ovs-agent can lose some of its tunnels to other nodes, for instance to a DHCP node or an L3 node; these lost tunnels can sometimes cause VMs to fail to boot or take the data plane down.
When the number of subnets or security-group ports reaches 2000+, this issue can be seen with high probability.
This is a subproblem of bug #1813703; for more information, please see the summary:
https:/
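As a diagnostic aid (not part of the report itself), one way to spot lost tunnels is to compare the `remote_ip` values of the tunnel ports on `br-tun` against the list of peers the node is expected to reach. The sketch below parses `ovs-vsctl show`-style output; the function name, the sample output, and the peer list are illustrative assumptions, not anything from the bug.

```python
import re

def find_missing_tunnels(ovs_show_output, expected_peers):
    """Return expected peers for which no tunnel port with a matching
    remote_ip appears in the given `ovs-vsctl show` output."""
    present = set(re.findall(r'remote_ip="?([0-9.]+)"?', ovs_show_output))
    return sorted(set(expected_peers) - present)

# Illustrative fragment of `ovs-vsctl show` output for br-tun.
sample = '''
    Bridge br-tun
        Port "vxlan-0a000002"
            Interface "vxlan-0a000002"
                type: vxlan
                options: {in_key=flow, local_ip="10.0.0.1", out_key=flow, remote_ip="10.0.0.2"}
'''

print(find_missing_tunnels(sample, ["10.0.0.2", "10.0.0.3"]))  # → ['10.0.0.3']
```

In a real deployment the output would come from running `ovs-vsctl show` on the affected node, and the expected peer list from the set of other tunnel endpoints known to the Neutron server.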
tags: added: ovs
Changed in neutron:
status: New → Confirmed
summary: [L2][scale issue] ovs-agent meet unexpected tunnel lost → [L2][scale issue] ovs-agent meets unexpected tunnel lost
Changed in neutron:
assignee: LIU Yulong (dragon889) → Brian Haley (brian-haley)
Changed in neutron:
assignee: Brian Haley (brian-haley) → LIU Yulong (dragon889)
tags: added: neutron-proactive-backport-potential
I think the root cause is that the ovs-agent and ovs-vswitchd are not able to keep up with the connections under heavy load.
So these issues are probably side effects and probably can't be fixed on their own.
I would say we should focus on the efficiency of the ovs-agent's handling of port info.
Also, you mentioned that this happens when there are just 200 VM ports on the compute node, so the ovs-agent on the compute side may be able to handle more than 200 VM ports, but the server side may not be able to cope when the port count reaches 2000 or more, right?
Not sure if this is an issue with the ML2 mechanism driver, the DB, or RPC.
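To illustrate the "efficiency of port-info handling" point above: fetching details one port at a time costs one round trip per port, while batching costs ceil(N/batch_size) round trips. The sketch below uses a stub in place of a real agent-to-server RPC call (the stub name and batch size are assumptions for illustration; Neutron's actual agent RPC does offer bulk device-details calls, but this is not that code).

```python
def rpc_get_devices_details_list(port_ids):
    # Stub standing in for one bulk agent->server RPC round trip;
    # a real agent would go over the message bus here.
    return {pid: {"id": pid, "admin_state_up": True} for pid in port_ids}

def fetch_port_details(port_ids, batch_size=100):
    """Fetch port details in batches: ceil(N/batch_size) round trips
    instead of N single-port calls."""
    results = {}
    for i in range(0, len(port_ids), batch_size):
        results.update(rpc_get_devices_details_list(port_ids[i:i + batch_size]))
    return results

ports = ["port-%d" % n for n in range(250)]
print(len(fetch_port_details(ports)))  # → 250 (in 3 round trips of <=100)
```

At 2000+ ports the difference between 2000 round trips and ~20 is exactly the kind of load reduction that could keep the server side responsive.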