Batch DVR ARP updates

Bug #1511134 reported by Rawlin Peters
This bug affects 3 people
Affects: neutron | Status: Won't Fix | Importance: Undecided | Assigned to: Unassigned

Bug Description

The L3 agent currently issues ARP updates one at a time while processing a DVR router. Each ARP update spawns a separate external process, which must go through the neutron-rootwrap helper and run "ip netns exec <qrouter namespace>" every time.

The ip command has a "-batch <FILENAME>" option that can batch all of the "ip neigh replace" commands into a single external process per qrouter namespace. This would greatly reduce the time the L3 agent takes to update large numbers of ARP entries, particularly as the number of VMs in a deployment grows.

The benefit of batching ip commands can be seen in this simple bash example:

$ time for i in {0..50}; do sudo ip netns exec qrouter-bc38451e-0c2f-4ad2-b76b-daa84066fefb ip a > /dev/null; done

real 0m2.437s
user 0m0.183s
sys 0m0.359s
$ for i in {0..50}; do echo a >> /tmp/ip_batch_test; done
$ time sudo ip netns exec qrouter-bc38451e-0c2f-4ad2-b76b-daa84066fefb ip -b /tmp/ip_batch_test > /dev/null

real 0m0.046s
user 0m0.003s
sys 0m0.007s

If just 50 ARP updates are batched together, there is roughly a 50x speedup. Repeating this test with 500 commands showed a 250x speedup (disclaimer: this was a rudimentary test, intended only as a rough estimate of the performance benefit).

Note: see comments #1-3 for less-artificial performance data.
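To make the proposal concrete, here is a sketch of how such a batch file of "ip neigh replace" commands could be built; the device name, addresses, and MACs are made-up example values, not taken from a real deployment:

```shell
# Sketch only: collect all neigh updates into one batch file, so a single
# rootwrap + "ip netns exec" invocation can flush them all at once.
BATCH_FILE=$(mktemp)
while read -r ip mac; do
    # "qr-0" and the IP/MAC pairs below are hypothetical examples
    echo "neigh replace $ip lladdr $mac dev qr-0 nud permanent" >> "$BATCH_FILE"
done <<'EOF'
10.0.0.5 fa:16:3e:aa:bb:01
10.0.0.6 fa:16:3e:aa:bb:02
EOF
# One external process per namespace instead of one per ARP entry:
#   sudo ip netns exec qrouter-<uuid> ip -batch "$BATCH_FILE"
wc -l < "$BATCH_FILE"   # prints 2: two entries queued
```

The point of the batch file is that the expensive parts (rootwrap, namespace entry, process spawn) are paid once per namespace rather than once per entry.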

Changed in neutron:
assignee: nobody → Rawlin Peters (rawlin-peters)
Changed in neutron:
status: New → In Progress
Revision history for this message
Rawlin Peters (rawlin-peters) wrote :

Here is some initial, less-artificial performance data, gathered by applying the following diff and restarting the L3 agent 10 times:

diff --git a/neutron/agent/l3/agent.py b/neutron/agent/l3/agent.py
index 8191c5a..428ee79 100644
--- a/neutron/agent/l3/agent.py
+++ b/neutron/agent/l3/agent.py
@@ -498,7 +498,12 @@ class L3NATAgent(firewall_l3_agent.FWaaSL3AgentRpcCallback,
                 continue

             try:
+                import time
+                t0 = time.time()
                 self._process_router_if_compatible(router)
+                t1 = time.time()
+                delta = t1 - t0
+                LOG.debug("RAWLIN: _process_router delta: %s" % delta)
             except n_exc.RouterNotCompatibleWithAgent as e:
                 LOG.exception(e.msg)
                 # Was the router previously handled by this agent?

WITH batched ARP updates:

4.00962495804
4.05432415009
3.92502999306
3.85153913498
3.89367389679
3.91031813622
3.93485879898
3.99531412125
3.91884207726
3.98265600204
Average: 3.94761812687

WITHOUT batched ARP updates:
4.10144209862
4.33488178253
4.28370594978
4.1496078968
4.27167916298
4.32324385643
4.16499876976
3.97995710373
4.2998650074
4.12419891357
Average: 4.20335805416

Batching the ARP updates saves about 0.26 seconds here, and this was on a devstack with only 5 nova instances with floating IPs (on one net/subnet attached to one router).

Revision history for this message
Rawlin Peters (rawlin-peters) wrote :

Here is some more performance data (gathered in the same manner as the previous comment) on a larger devstack setup with 30 nova instances (larger devstack has 32GB RAM and 8 cpus):

BATCHING:
2.32522296906
2.52060699463
2.39257097244
2.33361196518
2.44776415825
2.35187101364
2.28699588776
2.34910392761
2.24793386459
2.59339404106
Average: 2.38490757942

NO BATCHING:
3.66568088531
3.58729600906
3.75545597076
3.5426402092
3.5883140564
3.92159104347
3.81680822372
3.67821598053
3.56479907036
3.72220802307
Average: 3.68430094719

Batching ARP updates saves about 1.3 seconds in this scenario.

Revision history for this message
Rawlin Peters (rawlin-peters) wrote :

Here is some more data from a devstack running 30 instances (same setup as the previous comment), this time with an additional optimization: the BatchIpCommand object is given a set of existing devices to check against when it is created:

WITH BATCHING:
1.0600669384
1.15013003349
0.992280960083
0.97692489624
1.01580500603
0.974492073059
0.984760046005
0.974772930145
1.00022315979
0.968346834183
Average: 1.00978028774

WITHOUT:
3.83378100395
3.8090801239
3.71890711784
3.92692112923
3.93453598022
3.72982597351
3.60387706757
3.82278084755
3.69037294388
4.00546097755
Average: 3.80755431652

With that added optimization, _process_router_if_compatible speeds up by about 73% with only 30 instances, saving about 2.8 seconds in this case.

Note: this new optimization is introduced in patchset 6 of https://review.openstack.org/#/c/239543.
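As an illustration of the optimization described above, a batching helper that pre-fetches the namespace's devices once and skips entries for devices that no longer exist might look like the following. This is purely a sketch of the idea; the class and method names are hypothetical and do not reflect the actual patch's API:

```python
# Hypothetical sketch of the batching idea, not the abandoned patch's code.

class BatchNeighUpdate:
    """Collect 'ip neigh replace' lines, to be flushed in one 'ip -batch' call."""

    def __init__(self, existing_devices):
        # One up-front listing of devices in the namespace avoids a
        # per-entry existence check (the patchset-6 optimization).
        self._devices = set(existing_devices)
        self._lines = []

    def add(self, ip, mac, device):
        if device not in self._devices:
            return False  # device vanished; skip it instead of failing the batch
        self._lines.append(
            "neigh replace %s lladdr %s dev %s nud permanent" % (ip, mac, device))
        return True

    def render(self):
        # This text would be written to a temp file and executed once with:
        #   ip netns exec <ns> ip -batch <file>
        return "\n".join(self._lines) + "\n"

batch = BatchNeighUpdate(existing_devices={"qr-0"})
batch.add("10.0.0.5", "fa:16:3e:aa:bb:01", "qr-0")
batch.add("10.0.0.6", "fa:16:3e:aa:bb:02", "missing-dev")  # silently skipped
```

Checking device existence against one cached set, instead of running a lookup per entry, is what removes the remaining per-entry process overhead.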

description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/239543
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

ping?

Changed in neutron:
status: In Progress → Incomplete
assignee: Rawlin Peters (rawlin-peters) → nobody
Revision history for this message
Rawlin Peters (rawlin-peters) wrote :

There was a decision to disregard https://review.openstack.org/#/c/239543/ in favor of integrating privsep and subsequent usage of the pyroute2 library.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired
Revision history for this message
Gustavo Randich (gustavo-randich) wrote :

Hi, our organization is affected by this issue. We don't know whether the privsep/pyroute2 integration has landed yet.

Background:

When our hosts boot up, the L3 agent's ARP cache population loop delays the start of neutron-ns-metadata-proxy by around a minute (for a subnet with 170 used ports). Then, when nova-compute launches VMs, all of the cloud-init runs fail with a timeout while reading metadata.

To work around this, we've made a systemd unit on which nova-compute depends; this unit waits for the ns-metadata-proxy process to appear, and only then does nova-compute start.
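A workaround like the one described above could be sketched as a systemd unit along these lines (all file names and values here are hypothetical, not the reporter's actual configuration):

```ini
# /etc/systemd/system/wait-ns-metadata-proxy.service  (hypothetical sketch)
[Unit]
Description=Wait for neutron-ns-metadata-proxy before starting nova-compute
After=neutron-l3-agent.service

[Service]
Type=oneshot
# Poll until a neutron-ns-metadata-proxy process appears, up to ~120s
ExecStart=/bin/sh -c 'for i in $(seq 1 120); do pgrep -f neutron-ns-metadata-proxy >/dev/null && exit 0; sleep 1; done; exit 1'
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

nova-compute would then be ordered after this unit, e.g. via a drop-in adding `After=wait-ns-metadata-proxy.service` and `Requires=wait-ns-metadata-proxy.service`.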

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/418609

Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)
status: Expired → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/418609
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d7a6827e43a60a740a7d12c40fc1ead027046348
Submitter: Jenkins
Branch: master

commit d7a6827e43a60a740a7d12c40fc1ead027046348
Author: Brian Haley <email address hidden>
Date: Tue Jan 10 17:44:58 2017 -0500

    Change neighbour commands to use pyroute2

    Change ip_lib's IpNeighCommand class to use pyroute2
    for adding, deleting and dumping entries, rather than
    using 'ip neigh'. This will increase performance when
    many ARP updates happen at once.

    Change-Id: Idd528c0b402d1c9fc4b030f2aaa6d641d86ec02a
    Partial-Bug: #1511134
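For reference, the pyroute2 call this commit moves to passes netlink attributes directly instead of shelling out to "ip neigh". A minimal sketch of the equivalent of "ip neigh replace" follows; the helper name is ours, and the real neutron code additionally goes through privsep:

```python
# Sketch of the pyroute2-style replacement for 'ip neigh replace'.
# NUD_PERMANENT mirrors pyroute2's ndmsg.states['permanent'] constant.
NUD_PERMANENT = 0x80

def neigh_replace_kwargs(dst, lladdr, ifindex):
    """Keyword arguments for pyroute2's IPRoute.neigh('replace', **kwargs)."""
    return {"dst": dst, "lladdr": lladdr,
            "ifindex": ifindex, "state": NUD_PERMANENT}

# With root privileges, the actual call would be roughly:
#   from pyroute2 import IPRoute
#   ip = IPRoute()
#   idx = ip.link_lookup(ifname="qr-0")[0]
#   ip.neigh("replace", **neigh_replace_kwargs("10.0.0.5",
#                                              "fa:16:3e:aa:bb:01", idx))
args = neigh_replace_kwargs("10.0.0.5", "fa:16:3e:aa:bb:01", 3)
```

Because each update is a netlink message on an already-open socket, no external process, rootwrap invocation, or namespace exec is needed per entry.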

tags: added: neutron-proactive-backport-potential
Revision history for this message
Brian Haley (brian-haley) wrote :

Unfortunately, the changes enabling pyroute2 depended on privsep work that is only in Ocata. Unless that is also backported, the change in https://review.openstack.org/418609 can't be cherry-picked.

Adding the tag actually reminded me that pyroute2 does support batching, just not for neigh entries. I need to look at that code and/or file a bug, as batching would yield an even better increase than the 3-5x we got with these changes.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/431197

Revision history for this message
Kevin Benton (kevinbenton) wrote : auto-abandon-script

This bug has had a related patch abandoned and has been automatically un-assigned due to inactivity. Please re-assign yourself if you are continuing work or adjust the state as appropriate if it is no longer valid.

Changed in neutron:
assignee: Brian Haley (brian-haley) → nobody
status: In Progress → New
tags: added: timeout-abandon
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kevin Benton (<email address hidden>) on branch: master
Review: https://review.openstack.org/431197
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/431197
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Lajos Katona (lajos-katona) wrote :

Looking at the history, I think we can close this as Won't Fix. If you disagree, please reopen this bug.

Changed in neutron:
status: New → Won't Fix