Large number of FIPs and subnets causes slow sync_routers response
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | Medium | Adam Oswick |
Bug Description
When a large number of subnets and FIPs are configured on a network, the response time for neutron.
Based on profiling data, a large amount of time is spent waiting on _get_sync_
ncalls tottime percall cumtime percall filename:
...TRUNCATED...
16 2 0.000 0.000 19.827 9.913 /var/lib/
...TRUNCATED...
In the above example, the total execution time logged for sync_routers was 26.645s.
Further investigation reveals that the call to l3_obj.
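For anyone wanting to gather comparable data, the column layout above (ncalls, tottime, percall, cumtime, percall) matches what Python's standard `cProfile` and `pstats` modules print. The following is a minimal sketch of collecting such a report sorted by cumulative time. The functions `handler` and `slow_helper` are hypothetical stand-ins and are not Neutron code; how the profile in this report was actually captured is not stated.

```python
import cProfile
import io
import pstats
import time


def slow_helper():
    # Hypothetical stand-in for an expensive call; not Neutron code.
    time.sleep(0.01)


def handler():
    # Hypothetical entry point that calls the helper twice.
    for _ in range(2):
        slow_helper()


def profile_report(func, limit=5):
    """Run func under cProfile and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" orders rows by the cumtime column, so callers that spend
    # most of their time waiting on callees appear first.
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_report(handler)
print(report)
```

In a report like this, the second percall column is cumtime divided by ncalls. Reading the row above as 2 calls with 19.827s cumulative time gives roughly 9.913s per call, which would be about 74% of the 26.645s logged for sync_routers.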
Reproduction steps:
- Set up OpenStack with DVR enabled
- Create a network
- Attach a large number of subnets (the above has 27)
- Create a large number of FIPs and attach them to VMs (the above has around 1000 attached FIPs)
- Restart neutron_l3_agent on a compute node and observe slow response times for the get_routers() RPC
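The network and subnet steps above could be scripted roughly as follows. This is a sketch under assumptions, not the procedure used for this report: the base range `10.0.0.0/16`, the `/24` subnet size, and the resource names are arbitrary choices, and `conn` is assumed to be an openstacksdk `Connection` that the caller has already created. DVR setup, FIP creation, and the agent restart are not covered.

```python
import ipaddress
from itertools import islice


def subnet_cidrs(base_cidr, count, new_prefix=24):
    """Return the first `count` subnets of `base_cidr` as CIDR strings."""
    network = ipaddress.ip_network(base_cidr)
    cidrs = [str(s) for s in islice(network.subnets(new_prefix=new_prefix), count)]
    if len(cidrs) < count:
        raise ValueError(f"{base_cidr} cannot hold {count} /{new_prefix} subnets")
    return cidrs


def create_network_with_subnets(conn, name, cidrs):
    """Create one network and attach one IPv4 subnet per CIDR.

    `conn` is assumed to be an openstacksdk Connection; the names used
    here are illustrative only.
    """
    network = conn.network.create_network(name=name)
    for index, cidr in enumerate(cidrs):
        conn.network.create_subnet(
            network_id=network.id,
            name=f"{name}-subnet-{index}",
            ip_version=4,
            cidr=cidr,
        )
    return network


# 27 subnets, matching the count mentioned in the reproduction steps.
cidrs = subnet_cidrs("10.0.0.0/16", 27)
print(len(cidrs), cidrs[0], cidrs[-1])
```

With a real connection, `create_network_with_subnets(conn, "repro-net", cidrs)` would create the network and its subnets; the call is left out here so the snippet runs without a cloud.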
Version:
- OpenStack: Zed
- Kernel/distro: N/A
tags: added: l3-dvr-backlog loadimpact
Changed in neutron: importance: Undecided → Medium
Changed in neutron: status: Fix Committed → Fix Released
Some details relating to the environment have not been included in the above description.
I'm hoping that these won't be necessary as https://review.opendev.org/c/openstack/neutron/+/876168 has already been created to resolve the issue.
However, I am happy to update the description above with more details if needed.