neutron iptables manager is slow modifying a large amount of rules
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Brian Haley |
Havana | Fix Released | Undecided | Unassigned |
Icehouse | Fix Released | Undecided | Unassigned |
Bug Description
Sudhakar Gariganti noticed that, with a very large number of iptables rules, _modify_rules() was taking so long to complete (140 seconds) that VMs couldn't be reliably booted: the rules weren't in place before the initial DHCP requests timed out. With a small change the update can be done much more quickly, which also allows each node to support a larger set of iptables rules.
I've included a snippet from the related bug for reference, https:/
"We have done significant testing with this patch and want to share few results from our experiments.
We were basically trying to see how many VMs we can scale to with the OVS agent in use. With default security groups (which have a remote security group), beyond 250-300 VMs, VMs were not able to get DHCP IPs. We had 16 compute nodes (CNs), with VMs uniformly distributed across them. The VM image had a wait period of 120 seconds to receive the DHCP response.
By the time we had around 18-19 VMs on each CN (around 6K iptables rules), each RPC loop was taking close to 140 seconds (if there was any update). The VMs were not getting IPs because the iptables rules required for a VM to send out its DHCP request were not in place before the 120-second wait period expired. On further investigation we discovered that the "for loop searching iptables rules" in the _modify_rules method of iptables_manager.py was consuming a big chunk of the overall time.
After this patch, close to 680 VMs were able to get IPs. The number of iptables rules at this point was close to 20K, with around 40 VMs per CN.
To summarize, we were able to increase the processing capability of a compute node from 6K iptables rules to 20K iptables rules, which helped more VMs get a DHCP IP within the 120-second wait period. You can imagine the situation when the wait time is less than 120 seconds."
Changed in neutron: | |
assignee: | nobody → Brian Haley (brian-haley) |
Changed in neutron: | |
status: | New → In Progress |
Changed in neutron: | |
importance: | Undecided → High |
milestone: | none → juno-1 |
tags: | added: icehouse-backport-potential |
Changed in neutron: | |
status: | Fix Committed → Fix Released |
Changed in neutron: | |
milestone: | juno-1 → 2014.2 |
tags: | added: sg-fw |
Reviewed: https://review.openstack.org/77549
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0c202ab3e453e38c09f04978e4fce30d6ee6350c
Submitter: Jenkins
Branch: master
commit 0c202ab3e453e38c09f04978e4fce30d6ee6350c
Author: Sudhakar <email address hidden>
Date: Mon Mar 3 15:35:20 2014 +0530
Improve iptables_manager _modify_rules() method
As the number of ports per default security group increases, the
number of iptables entries on the Compute Node grows. Because of
this, there is a gradual increase in the time taken to apply chains
and rules.
Currently we are using list comprehensions to find if a new chain or
rule matches an existing one. Instead, walk through the list in
reverse to find a matching entry.
Added a new method, _find_last_entry(), to return the entry we are
searching for.
Change-Id: I3585479ffa00be556b8b21dc9dbd6b36ad37f4de
Closes-Bug: #1302272
Related-Bug: #1253993
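The reverse walk described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch itself: the function names, the substring-matching rule, and the sample rule lines are assumptions made for illustration. It contrasts a comprehension-based lookup, which always scans every line, with a reverse walk that stops at the first match found from the bottom.

```python
def find_last_entry(lines, match_str):
    """Return the last line containing match_str (stripped), or None.

    Hypothetical stand-in for the _find_last_entry() helper named in
    the commit message; the real method's signature may differ.
    """
    # Walk from the bottom and stop at the first hit, so a match near
    # the end of a long rule list is found after only a few comparisons.
    for line in reversed(lines):
        line = line.strip()
        if match_str in line:
            return line
    return None


def find_all_then_take_last(lines, match_str):
    """Comprehension-based lookup, shown only for contrast.

    It builds a list of every match, so it always visits every line
    even when only the last match is needed.
    """
    matches = [line.strip() for line in lines if match_str in line.strip()]
    return matches[-1] if matches else None


# Illustrative sample data, not taken from a real iptables-save dump.
rules = [
    ":neutron-filter-top - [0:0]",
    "[0:0] -A FORWARD -j neutron-filter-top",
    "  [5:300] -A neutron-filter-top -j neutron-l3-agent-local  ",
]

last = find_last_entry(rules, "-A neutron-filter-top")
```

Both functions return the same entry; the difference is only in how many lines are examined, which matters when the lookup is repeated for each chain and rule against a list of thousands of lines.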