L2-population fanout-cast leads to performance and scalability issue
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | Expired | Undecided | Unassigned | |
Bug Description
https:/
def _notification_
....
The fanout_cast publishes the message to every L2 agent listening on the "l2population" topic.
If there are 1000 agents (a small cloud) and all of them listen on the "l2population" topic, adding one new port produces 1000 sub-messages. RabbitMQ can generally handle about 10k messages per second, so fanout_cast causes serious performance problems and makes the neutron service hard to scale: the number of concurrent VM port requests it can sustain will be very small.
No matter how many ports are in the subnet, performance depends on the number of L2 agents listening on the topic.
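The numbers above can be turned into a rough upper bound. This is only a back-of-envelope sketch: the function name is made up for illustration, and it assumes the broker is the sole bottleneck and that every port event is delivered once to every listening agent.

```python
def max_port_events_per_second(broker_msgs_per_sec, listening_agents):
    """Rough ceiling on port events/s when each event fans out to all agents.

    Hypothetical helper for illustration only; it ignores every other cost
    (database, agent processing, other RPC traffic).
    """
    if listening_agents <= 0:
        raise ValueError("listening_agents must be positive")
    return broker_msgs_per_sec / listening_agents


# Figures from the report: about 10k messages/s and 1000 listening agents.
print(max_port_events_per_second(10_000, 1000))  # 10.0
```

Under these assumptions the ceiling falls in proportion to the number of agents and does not depend on how many ports the subnet has.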
The way to solve this performance and scalability issue is to have each L2 agent listen on a topic tied to a network, for example by using the network UUID as the topic. When a port is activated in a subnet, only the agents hosting VMs on the same network should receive the L2-pop message. This is the partial mesh that the original design intended but that has not been implemented yet.
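A minimal sketch of the idea, assuming a per-network topic derived from the network UUID. It is a plain in-memory simulation, not neutron or oslo.messaging code; `topic_for_network`, `build_subscriptions` and the agent and network names are hypothetical.

```python
from collections import defaultdict

L2POP_TOPIC = "l2population"  # topic name taken from the report


def topic_for_network(network_id):
    """Hypothetical per-network topic name; not an existing neutron convention."""
    return "%s.%s" % (L2POP_TOPIC, network_id)


def build_subscriptions(agent_networks):
    """Map each per-network topic to the agents that host ports on that network."""
    subs = defaultdict(set)
    for agent, networks in agent_networks.items():
        for network_id in networks:
            subs[topic_for_network(network_id)].add(agent)
    return subs


def recipients_fanout(agent_networks):
    """Current behaviour: every listening agent receives every message."""
    return set(agent_networks)


def recipients_scoped(subs, network_id):
    """Proposed behaviour: only agents subscribed to the network's topic."""
    return set(subs.get(topic_for_network(network_id), ()))


# Toy deployment: which networks have ports on which agent.
agent_networks = {
    "agent-1": {"net-a"},
    "agent-2": {"net-a", "net-b"},
    "agent-3": {"net-b"},
    "agent-4": set(),
}
subs = build_subscriptions(agent_networks)

print(sorted(recipients_fanout(agent_networks)))  # all four agents
print(sorted(recipients_scoped(subs, "net-a")))   # ['agent-1', 'agent-2']
```

In this toy setup a port update on net-a reaches two agents instead of four. A real implementation would also need agents to subscribe and unsubscribe as the first and last port of a network appears on or leaves a host, which this sketch does not cover.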
tags: added: loadimpact; removed: l2
Changed in neutron:
assignee: nobody → steve (ruansx)
assignee: steve (ruansx) → nobody
The problem described in this bug looks like a request for a new feature to improve performance at scale.
It can't really be considered a bug, because the described behavior is as designed.
I suggest working on this problem within the scope of an appropriate blueprint.