Comment 4 for bug 1454420

chhandak (chhandak) wrote : Re: [Bug 1454420] [NEW] [R2.20] Bond interface flap can break haproxy/vrouter networking

Hi Amit,

We can recreate the problem in your setup.

It is a known problem in Ubuntu 14.04:
https://bugs.launchpad.net/ubuntu/+source/ifenslave/+bug/1288196

We followed a workaround similar to the one suggested there: we added
hwaddress to the bond config, which keeps the bond MAC address consistent
across restarts.

With this change, we tried restarting networking multiple times and never
hit the problem.
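
A quick way to confirm this after each restart is to compare the MAC of
bond0 with the MAC of vhost0. The sketch below is only an illustration and
not part of any existing script: it assumes the legacy ifconfig line format
shown in the quoted bug description, and the function names and the default
interface names are hypothetical.

```python
import re

# Matches lines such as "bond0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a",
# i.e. the legacy net-tools ifconfig format quoted in the bug description.
HWADDR_RE = re.compile(
    r"^(\S+)\s+Link encap:Ethernet\s+HWaddr\s+([0-9A-Fa-f:]{17})"
)


def parse_hwaddrs(ifconfig_output):
    """Return {interface name: lowercase MAC} for every HWaddr line found."""
    macs = {}
    for line in ifconfig_output.splitlines():
        match = HWADDR_RE.match(line.strip())
        if match:
            macs[match.group(1)] = match.group(2).lower()
    return macs


def stale_interfaces(macs, reference="bond0", dependents=("vhost0",)):
    """List dependents whose MAC differs from the reference interface's MAC.

    A dependent that is missing from `macs` is also reported as stale.
    """
    expected = macs[reference]
    return [name for name in dependents if macs.get(name) != expected]
```

Fed the post-restart output from the bug description (bond0 at ...:8b,
vhost0 still at ...:8a), stale_interfaces() returns ["vhost0"]; with the
workaround in place it should return an empty list.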

We should be able to push the same change through our provisioning script.
Leaving that open for further discussion.
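
As a starting point for that discussion, here is a minimal sketch of how a
provisioning step could add the hwaddress line to an existing bond stanza in
/etc/network/interfaces. It is not taken from our provisioning code: the
function name pin_bond_mac is hypothetical, the text handling is
deliberately simple, and choosing which MAC to pin (for example, that of the
first slave) is left to the caller.

```python
# Keywords that begin a new stanza in /etc/network/interfaces.
STANZA_STARTS = ("iface ", "auto ", "allow-", "mapping ", "source")


def pin_bond_mac(interfaces_text, iface, mac):
    """Insert 'hwaddress ether <mac>' right after the 'iface <iface> ...' line.

    The text is returned unchanged if there is no stanza for `iface` or if
    the stanza already contains an hwaddress option.
    """
    lines = interfaces_text.splitlines()
    for i, line in enumerate(lines):
        words = line.split()
        if len(words) >= 2 and words[0] == "iface" and words[1] == iface:
            break
    else:
        return interfaces_text  # no stanza for this interface

    # Walk the option lines of the stanza until the next stanza begins.
    for body_line in lines[i + 1:]:
        stripped = body_line.strip()
        if stripped.startswith(STANZA_STARTS):
            break
        if stripped.startswith("hwaddress"):
            return interfaces_text  # MAC is already pinned

    lines.insert(i + 1, "    hwaddress ether " + mac)
    return "\n".join(lines) + "\n"
```

The early returns make the function safe to run repeatedly, which matters
if the provisioning script is re-applied to a node that already has the
workaround.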

For now, you should not hit the problem with the workaround in place.

Thanks and Regards,
Chhandak

On 5/13/15, 3:04 AM, "amit surana" <email address hidden> wrote:

>Private bug reported:
>
>R2.20 build 13 Ubuntu 14.04.
>
>
>Flapping a bond interface can cause its MAC address to change. Other
>services that rely on this MAC address should also be restarted so
>that they can attach to the new MAC address. Details below.
>
>
>Consider a case where the controller/compute node is connected to the IP
>fabric via a two-interface bond (LAG). The bond assumes the MAC address of
>one of the slave interfaces (usually the first one that comes up).
>
>On the compute, the vhost0 interface also gets assigned the same MAC
>address as that of the bond interface. This MAC address is also used by
>keepalived/haproxy as the MAC address corresponding to the VIP.
>
>Now, if the bond interface flaps (networking service was restarted for
>instance) and the MAC address of the bond interface changes to that of a
>different slave interface, it is seen that vhost0 interface still points
>to the old MAC and this breaks all the connectivity to the compute. As
>far as the KA/HAproxy is concerned, though the services are running, the
>VIP isn't owned by any of the nodes, and so this functionality also
>breaks.
>
>The above scenario was simulated on the solution testbed by having the
>fab script add a static route to all the nodes on a working cluster.
>After adding the static route the fab add_static_route script restarts
>the networking service; this flaps the bond interface and causes the
>MAC address of the bond interface to change, which in turn leads to the
>above noted issues.
>
>If vrouter/keepalive services are restarted, the issue is resolved.
>
>
>bond0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>em1 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>em2 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>vhost0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>
>==restart networking==
>
>bond0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>em1 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>em2 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>vhost0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>
>** Affects: juniperopenstack
> Importance: High
> Status: New
>
>** Affects: juniperopenstack/r2.20
> Importance: High
> Status: New
>
>** Affects: juniperopenstack/trunk
> Importance: High
> Status: New
>
>** Also affects: juniperopenstack/r2.20
> Importance: Undecided
> Status: New
>
>** Also affects: juniperopenstack/trunk
> Importance: High
> Status: New
>
>** Changed in: juniperopenstack/r2.20
> Importance: Undecided => High
>
>** Changed in: juniperopenstack/r2.20
> Milestone: None => r2.20-fcs
>
>** Changed in: juniperopenstack/trunk
> Milestone: r2.20-fcs => r2.30-fcs
>
>** Description changed:
>
>- Consider a case where the controller/compute node is connected to the IP
>- fabric via a 2 interface bond (LAG). The bond assumes the mac address of
>- one of the slave interfaces (usually the first one that comes up).
>+ R2.20 build 13 Ubuntu 14.0.4.
>+
>+
>+ Flapping a bond interface can cause its MAC address to change. Other
>+ services that rely on this MAC address, should also be restarted so
>+ that they can attach to the new MAC address. Details below.
>+
>+
>+ Consider a case where the controller/compute node is connected to the
>IP fabric via a 2 interface bond (LAG). The bond assumes the mac address
>of one of the slave interfaces (usually the first one that comes up).
>
> On the compute, the vhost0 interface also gets assigned the same MAC
> address as that of the bond interface. This MAC address is also used by
> keepalived/haproxy as the MAC address corresponding to the VIP.
>
> Now, if the bond interface flaps (networking service was restarted for
> instance) and the MAC address of the bond interface changes to that of a
> different slave interface, it is seen that vhost0 interface still points
> to the old MAC and this breaks all the connectivity to the compute. As
> far as the KA/HAproxy is concerned, though the services are running, the
> VIP isn't owned by any of the nodes, and so this functionality also
> breaks.
>
> The above scenario was simulated on the solution testbed by having the
> fab script add a static route to all the nodes on a working cluster.
> After adding the static route the fab add_static_route script restarts
>- the networking service; this flaps the bond interface and caused the
>- MAC address of the bond interface to change, which in turn lead to the
>+ the networking service; this flaps the bond interface and causes the
>+ MAC address of the bond interface to change, which in turn leads to the
> above noted issues.
>
>- If vrouter/keepalive services are restarted, the issue is resolved -
>- which points towards a potential solution.
>+ If vrouter/keepalive services are restarted, the issue is resolved.
>
>
>- bond0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>- em1 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>- em2 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>- vhost0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>+ bond0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>+ em1 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>+ em2 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>+ vhost0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>
> ==restart networking==
>
> bond0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>- em1 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>- em2 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>+ em1 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
>+ em2 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8b
> vhost0 Link encap:Ethernet HWaddr 08:9e:01:d9:27:8a
>
>