better to send gratuitous ARPs to support HA

Bug #782364 reported by Tushar Patil
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Wishlist
Assigned to: Tushar Patil

Bug Description

To support high availability of the nova-network service, nova-network needs to send gratuitous (unsolicited) ARPs for the following IP addresses:

1) Gateway address of each vlan
2) Floating IPs

The reason is straightforward: when the master dies, the backup server takes over, but because the MAC address of the backup server's vlan/public interface differs from the master's, any clients connected to the VM instances will be dropped. The clients will keep sending requests to the master server because their ARP caches have not yet been updated. To avoid this, the backup server should send gratuitous ARPs, which cause routers and other hardware to update their ARP caches.
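To make the mechanism concrete, here is a minimal sketch of what a gratuitous ARP frame looks like on the wire. It is an illustration only, not the code nova-network uses: the function names are hypothetical, and it assumes the common "ARP request whose sender and target IP are the same" form, broadcast to the whole segment.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6   # Ethernet broadcast address
ETHERTYPE_ARP = 0x0806
ARP_REQUEST = 1


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP (hypothetical helper).

    Sender and target protocol addresses are both `ip`, so every host that
    receives the broadcast can refresh its ARP cache entry for `ip` with `mac`.
    """
    sender_mac = mac_to_bytes(mac)
    sender_ip = socket.inet_aton(ip)

    # Ethernet header: destination, source, EtherType
    ethernet = BROADCAST_MAC + sender_mac + struct.pack("!H", ETHERTYPE_ARP)

    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, operation 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_REQUEST)
    arp += sender_mac + sender_ip      # sender hardware / protocol address
    arp += b"\x00" * 6 + sender_ip     # target hardware (unknown) / protocol address

    return ethernet + arp
```

Actually transmitting such a frame requires a raw socket and root privileges, which is why tools such as arping are normally used instead.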

Any service provider can choose open-source software such as heartbeat or keepalived to detect the failure and start the nova-network service on the backup server in an active-passive setup. It is also possible to send the gratuitous ARPs externally without modifying the nova-network source code, but design-wise I feel it is more appropriate to do this in the nova-network service.

I am thinking of introducing one more flag, for example send_arps, which should be set to True to send gratuitous ARPs in a high-availability environment. If this flag is set to False, no gratuitous ARPs will be sent.
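As a rough sketch of how such a flag could gate the behaviour, the helper below builds an arping command line only when the flag is on. The function name, its parameters, and the "br100" device in the usage line are assumptions for illustration; the -U (unsolicited ARP), -I (interface) and -c (count) options are those of the iputils arping tool. The command is only constructed here, not executed, since running arping needs root.

```python
from typing import List, Optional


def arping_command(ip: str, device: str, send_arps: bool,
                   count: int = 3) -> Optional[List[str]]:
    """Return the arping invocation for `ip`, or None when the flag is off.

    Hypothetical helper: `send_arps` stands in for the proposed flag.
    """
    if not send_arps:
        # Flag disabled: no gratuitous ARP is sent at all.
        return None
    # -U: unsolicited ARP mode, updates neighbours' ARP caches
    # -I: interface to send from, -c: number of packets to send
    return ["arping", "-U", "-I", device, "-c", str(count), ip]


# Example: gateway address of a vlan on an assumed bridge device.
print(arping_command("10.0.0.1", "br100", send_arps=True))
```

The same helper could be called once per vlan gateway address and once per floating IP.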

This is a very small change to the nova-network source code, so I am filing it as a bug rather than creating a blueprint.

Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Wishlist
status: New → Confirmed
Devin Carlen (devcamcar) wrote :

I'd also love to see this feature added.

Mark Gius (markgius) wrote :

I think I might be missing some context, because I'm not seeing a compelling reason why this should be a part of nova-network rather than managed entirely by external software such as heartbeat.

In the case of an active/passive nova cluster, if the active host fails, the passive host will need to do the following:

  - Bring up a network interface for the IP address that the active host used to respond on
    - It is possible that this interface is already active and configured to not ARP
  - Send a gratuitous ARP on the new interface so the new mac address/switch port gets registered across the network
  - Start up nova-network on the passive host, ready to continue work of active via some shared resource or synchronization
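The steps above can be sketched as an ordered list of commands that an external HA tool might run on the passive host. This is only an illustration under stated assumptions: the function name is hypothetical, the address and device in the usage line are placeholders, and the "service nova-network start" invocation is an assumed way of starting the service that will differ between installations. The commands are built, not executed.

```python
from typing import List


def takeover_plan(cidr: str, device: str, garp_count: int = 3) -> List[List[str]]:
    """Return the commands for a passive host taking over `cidr` (hypothetical)."""
    ip = cidr.split("/")[0]
    return [
        # 1. Bring up the address the active host used to respond on.
        ["ip", "addr", "add", cidr, "dev", device],
        # 2. Announce the new MAC address for that IP with gratuitous ARPs.
        ["arping", "-U", "-I", device, "-c", str(garp_count), ip],
        # 3. Start nova-network (assumed service name and init mechanism).
        ["service", "nova-network", "start"],
    ]


for command in takeover_plan("192.0.2.1/24", "eth0"):
    print(" ".join(command))
```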

Currently, Nova cannot do any of these things, so even if gratuitous ARP support is added, Nova is still dependent on third-party HA software to bring up the interface and start nova-network. So why not let the third-party software manage the NICs and ARPs?

The only way adding this makes sense to me is if there is a long term goal to add full HA services to nova such that no third-party HA software is necessary. I didn't see any blueprint that talks about full HA services in launchpad.

Mark Gius (markgius) wrote :

Context has been delivered to me. I hadn't realized that nova-network in some cases serves as a gateway for instances, which are endpoints not likely to be visible easily to heartbeat or keepalived, etc. It also sounds like nova-network as it currently exists is deprecated and is going to be split up in such a way as to render this ARPing nonsense unnecessary.

TL;DR: I see the reason to add this while waiting for nova-network to be split out and discarded a la https://blueprints.launchpad.net/nova/+spec/ha-flatdhcp and https://blueprints.launchpad.net/nova/+spec/making-nova-components-ha

Tushar Patil (tpatil) wrote :

Mark, your understanding is correct.
I am going to discuss this bug again within our team and see whether it's worth adding this change now and discarding it later, after nova-network is split out.

Tushar Patil (tpatil) wrote :

Added a patch for those who want to test nova-network with HA.

You will need to install the iputils-arping package on compute and network nodes.

Nova configuration changes:
--send_arp_for_ha = True # By default it is set to False

Thierry Carrez (ttx)
Changed in nova:
status: Confirmed → Triaged
Thierry Carrez (ttx)
Changed in nova:
assignee: nobody → Tushar Patil (tpatil)
milestone: none → 2011.3
status: Triaged → In Progress
Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Stephen Cole (stephen-c1ud) wrote :

Why is the iputils-arping package needed on compute nodes? Wouldn't it just be needed on the nova-network nodes (to send a GARP when switching between physical hosts)?
