Should send multiple ARPs after floating IP assignment

Bug #1043796 reported by Phil Day
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Brian Haley
Milestone: 2012.2

Bug Description

When an IP address is assigned to an instance, linux_net can be configured to send a gratuitous ARP to announce the change to the network. This is controlled by the "send_arp_for_ha" flag.

The command used, arping, accepts an argument "-c" to control the number of ARPs sent, but in linux_net.py this is currently hard-coded to 1.

We have seen that in some circumstances it is necessary (especially if the network is loaded) to send more than one gratuitous ARP to ensure that the network devices see and respond to this change.

This should be a simple change, introducing a new "arp_count" flag which can be passed to the arping command instead of the current hard-coded value.

Revision history for this message
Mark McLoughlin (markmc) wrote :

Sounds reasonable that we should want to send more than 1

Don't need to add yet another flag - just sending 5 or more should be fine for everyone

For reference, qemu sends 5 after a live migration:
http://git.qemu.org/?p=qemu.git;a=blob;f=savevm.c;h=c7fe283145;hb=HEAD#l135

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
milestone: none → folsom-rc1
summary: - Number of ARPs sent for floating IP assignment needs to be configurable
+ Should send multiple ARPs after floating IP assignment
Revision history for this message
Brian Haley (brian-haley) wrote :

I think this should be a flag since for every additional ARP we send we add 1 second of startup time for an instance (setting to 5 adds 4 seconds). Plus it gives someone the ability to change this for their environment without hacking the code.

I'll send out a patch.

Revision history for this message
Mark McLoughlin (markmc) wrote :

Hmm, I wonder does this really need to be done synchronously?

Changed in nova:
assignee: nobody → Brian Haley (brian-haley)
Revision history for this message
Phil Day (philip-day) wrote :

I guess there is no reason why the arping couldn't be run from a separate thread, but do we really need that additional complexity to save 4 seconds from VM start-up time?

I'd suggest we fix the basic problem for now, and optimise in the next pass.

Revision history for this message
Mark McLoughlin (markmc) wrote :

4 seconds of VM startup time seems pretty significant to me

Running it from eventlet.spawn_n() isn't terribly complicated either

Again, this makes it possible to just "do the right thing" rather than add a "you get to choose slow or broken" flag
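
A minimal sketch of the asynchronous idea raised here, under stated assumptions: inside nova the call would presumably be eventlet.spawn_n(func, *args), but this example swaps in the standard library's threading.Thread so that it runs without eventlet installed. The name send_arps_async and the send_fn parameter are hypothetical and do not come from nova.

```python
import threading


def send_arps_async(send_fn, ip, device, count):
    """Run send_fn(ip, device, count) in the background and return at once.

    Stand-in for eventlet.spawn_n(send_fn, ip, device, count); the caller
    no longer waits for the ARPs to be sent one after another.
    """
    worker = threading.Thread(target=send_fn, args=(ip, device, count))
    worker.daemon = True  # do not keep the process alive just for ARPs
    worker.start()
    return worker
```

The trade-off is the one debated in this thread: the caller returns immediately, but any failure of the ARP command is no longer visible at the call site.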

Revision history for this message
Brian Haley (brian-haley) wrote :

There is already a "broken" flag, since by default FLAGS.send_arp_for_ha=False, so you'll get no ARPs unless you change that. People will want a tunable if they set that flag (it maybe should have been an int from the beginning). To get that simple change out and make it easy enough to add spawn_n() later I'll propose the following:

Create a new flag for the number of ARPs:

    "int" FLAGS.send_arp_for_ha_count, default: 3 (high enough?)

Create a new function to send the ARPs:

    def send_arp_for_ip(ip, device, count)

Change the two call sites that call arping directly to call that instead, making spawn_n() somewhat painless.
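
The proposal above could look roughly like the following. This is an illustrative sketch, not the code that was merged: the helper build_arping_cmd, the DEFAULT_ARP_COUNT constant, and the injectable execute parameter are assumptions added for the example, and it calls subprocess directly where nova would use its own root-wrapped execute helper. The arping options shown (-U, -A, -I, -c) are those of the iputils arping tool.

```python
import subprocess

# Mirrors the proposed default for send_arp_for_ha_count.
DEFAULT_ARP_COUNT = 3


def build_arping_cmd(ip, device, count=DEFAULT_ARP_COUNT):
    """Return the arping argument list for sending `count` gratuitous ARPs."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # -U: unsolicited ARP, -A: ARP reply mode,
    # -I: outgoing interface, -c: number of packets to send.
    return ["arping", "-U", ip, "-A", "-I", device, "-c", str(count)]


def send_arp_for_ip(ip, device, count=DEFAULT_ARP_COUNT,
                    execute=subprocess.call):
    """Send gratuitous ARPs for `ip` on `device`.

    `execute` is injectable (hypothetical) so that both call sites share
    one code path and a later change can swap in an asynchronous runner.
    """
    return execute(build_arping_cmd(ip, device, count))
```

Funnelling both call sites through one function means the count, and later the choice between synchronous and asynchronous sending, is decided in a single place.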

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/12436

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/12436
Committed: http://github.com/openstack/nova/commit/e9b05caca3c47c57225a9dfcc36bf77d91324bfe
Submitter: Jenkins
Branch: master

commit e9b05caca3c47c57225a9dfcc36bf77d91324bfe
Author: Brian Haley <email address hidden>
Date: Wed Sep 5 12:19:33 2012 -0400

    Add a tunable to control how many ARPs are sent.

    This new flag, send_arp_for_ha_count, controls how many
    ARPs are sent when binding a floating IP address to an
    instance. Also increased the default number to 3 from 1,
    to make this more robust and guarantee other network
    devices see them.

    Fixes bug 1043796.

    Change-Id: Ib9118fcc5334ef4a8c5d7a5e765364e26fea68da

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-rc1 → 2012.2