vRouter responds to ARP req for default GW from BMS with vhost0 MAC

Bug #1485804 reported by amit surana
This bug affects 2 people
Affects            Status         Importance  Assigned to            Milestone
Juniper Openstack  (status tracked in Trunk)
  R2.20            Fix Committed  High        Praveen
  R2.20.x          Fix Committed  High        Divakar Dharanalakota
  Trunk            Fix Committed  High        Divakar Dharanalakota

Bug Description

If DM is managing 2 MX routers and configures the virtual-gateway-address (the VN's default GW IP) on both of those routers for a VN, the TSN/vRouter responds to ARP requests from BMSs for the gateway IP address with its own vhost0 MAC address, rather than forwarding the ARP request to the MX routers. This breaks inter-VXLAN routing.

172.16.183.1 is the TOR, 172.16.180.9 is the TSN. 1.1.1.3 is the BMS, and it is trying to ping 1.1.1.1.

16:38:50.751062 10:0e:7e:be:79:00 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 106: 172.16.183.1.4212 > 172.16.180.9.4789: VXLAN, flags [I] (0x08), vni 126
00:e0:ed:20:fa:53 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 56: Request who-has 1.1.1.1 tell 1.1.1.3, length 42
16:38:50.751136 90:e2:ba:50:ac:89 > 10:0e:7e:be:79:00, ethertype IPv4 (0x0800), length 106: 172.16.180.9.65478 > 172.16.183.1.4789: VXLAN, flags [I] (0x08), vni 126
90:e2:ba:50:ac:89 > 00:e0:ed:20:fa:53, ethertype ARP (0x0806), length 56: Reply 1.1.1.1 is-at 90:e2:ba:50:ac:89, length 42
16:38:50.751319 10:0e:7e:be:79:00 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 148: 172.16.183.1.42359 > 172.16.180.9.4789: VXLAN, flags [I] (0x08), vni 126
00:e0:ed:20:fa:53 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 98: 1.1.1.3 > 1.1.1.1: ICMP echo request, id 9023, seq 1, length 64
16:38:51.750447 10:0e:7e:be:79:00 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 148: 172.16.183.1.42359 > 172.16.180.9.4789: VXLAN, flags [I] (0x08), vni 126
00:e0:ed:20:fa:53 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 98: 1.1.1.3 > 1.1.1.1: ICMP echo request, id 9023, seq 2, length 64
16:38:52.750465 10:0e:7e:be:79:00 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 148: 172.16.183.1.42359 > 172.16.180.9.4789: VXLAN, flags [I] (0x08), vni 126
00:e0:ed:20:fa:53 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 98: 1.1.1.3 > 1.1.1.1: ICMP echo request, id 9023, seq 3, length 64
16:38:53.750465 10:0e:7e:be:79:00 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 148: 172.16.183.1.42359 > 172.16.180.9.4789: VXLAN, flags [I] (0x08), vni 126
00:e0:ed:20:fa:53 > 90:e2:ba:50:ac:89, ethertype IPv4 (0x0800), length 98: 1.1.1.3 > 1.1.1.1: ICMP echo request, id 9023, seq 4, length 64
^C
1602 packets captured
1603 packets received by filter
0 packets dropped by kernel
root@csol2-node9:/var/log# ifconfig vhost0
vhost0 Link encap:Ethernet HWaddr 90:e2:ba:50:ac:89
          inet addr:172.16.180.9 Bcast:172.16.180.255 Mask:255.255.255.0
          inet6 addr: fe80::92e2:baff:fe50:ac89/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:6322707 errors:0 dropped:2102 overruns:0 frame:0
          TX packets:1497677 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:435504428 (435.5 MB) TX bytes:601016317 (601.0 MB)
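
The symptom in the capture above can be checked mechanically: any ARP reply whose "is-at" MAC equals the vhost0 HWaddr is the TSN answering on the gateway's behalf. A minimal sketch, assuming tcpdump -e style lines like the ones above; the function name and the regular expression are illustrative and not part of any Contrail tool.

```python
import re

# Matches the ARP reply text that tcpdump prints, e.g.
# "Reply 1.1.1.1 is-at 90:e2:ba:50:ac:89, length 42"
ARP_REPLY = re.compile(
    r"Reply (\d+\.\d+\.\d+\.\d+) is-at ([0-9a-f:]{17})", re.IGNORECASE
)

def suspicious_replies(capture_lines, vhost0_mac):
    """Return (ip, mac) for every ARP reply that advertises vhost0_mac."""
    hits = []
    for line in capture_lines:
        m = ARP_REPLY.search(line)
        if m and m.group(2).lower() == vhost0_mac.lower():
            hits.append((m.group(1), m.group(2)))
    return hits

# Two inner-frame lines copied from the capture in this report.
capture = [
    "00:e0:ed:20:fa:53 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), "
    "length 56: Request who-has 1.1.1.1 tell 1.1.1.3, length 42",
    "90:e2:ba:50:ac:89 > 00:e0:ed:20:fa:53, ethertype ARP (0x0806), "
    "length 56: Reply 1.1.1.1 is-at 90:e2:ba:50:ac:89, length 42",
]

# 90:e2:ba:50:ac:89 is the vhost0 HWaddr from the ifconfig output.
print(suspicious_replies(capture, "90:e2:ba:50:ac:89"))
```

A non-empty result for the gateway IP (1.1.1.1 here) means the reply came from the TSN's vhost0 rather than from an MX router.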

Tags: vrouter
Nischal Sheth (nsheth)
no longer affects: juniperopenstack/r2.30
amit surana (asurana-t)
summary: - vRouter responds to ARP req from BMS for def. GW with vhost0 IP
+ vRouter responds to ARP req for def. GW from BMS with vhost0 IP
Nischal Sheth (nsheth) wrote : Re: vRouter responds to ARP req for def. GW from BMS with vhost0 IP

I suspect this happens because there's an ECMP route for the GW address.
RouteKSyncEntry::BuildArpFlags has an exception for ipam subnet route,
but maybe that's not sufficient.

Do we also need to add an exception for GW address and ensure that the
route has an IP-MAC binding?

information type: Proprietary → Public
Nischal Sheth (nsheth)
summary: - vRouter responds to ARP req for def. GW from BMS with vhost0 IP
+ vRouter responds to ARP req for default GW from BMS with vhost0 MAC
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/13340
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/13340
Committed: http://github.org/Juniper/contrail-vrouter/commit/cfb0524933c1bcbfc895613d53f5db482e4fef17
Submitter: Zuul
Branch: R2.20

commit cfb0524933c1bcbfc895613d53f5db482e4fef17
Author: Divakar <email address hidden>
Date: Wed Aug 26 17:39:11 2015 +0530

No source IP lookup for ARP requests from BMS

For packets from a VM to an ECMP destination, we force the packets to
be L3 routed. When an ARP request for that VM comes from one of the
ECMP sources, we reply with the vhost MAC even though we have the
stitching for the VM's IP, because packets need to be routed in this
direction as well. This functionality was added with the fix for bug
1472796.
But the fix for that bug should not handle ARP requests coming from a
BMS (in TSN), as the TSN is never a gateway for a BMS. Such ARP requests
should be flooded. So the fix is to not force L3 if the ARP request is
from a BMS.

Change-Id: Ib2626f27a89d34cd98b04e6084aac12ca8eb4ac9
closes-bug: #1485804
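
The decision described in this commit message can be sketched as follows. This is a hedged illustration only, not the contrail-vrouter code (which is written in C): the enum, the function, and its three boolean inputs are hypothetical names, and the non-ECMP branches are an assumption about the pre-existing behaviour.

```python
from enum import Enum

class ArpAction(Enum):
    FLOOD = "flood the request"
    REPLY_VHOST_MAC = "reply with the vhost MAC (force L3)"
    REPLY_STITCHED_MAC = "reply with the stitched MAC"

def arp_action(from_bms_on_tsn, src_is_ecmp, has_stitching):
    """Pick how to treat an ARP request (hypothetical model)."""
    # The fix: a TSN is never a gateway for a BMS, so never force L3.
    if from_bms_on_tsn:
        return ArpAction.FLOOD
    # Behaviour from the fix for bug 1472796: ECMP sources are answered
    # with the vhost MAC even when stitching exists for the VM's IP.
    if src_is_ecmp:
        return ArpAction.REPLY_VHOST_MAC
    # Assumed default behaviour, outside the scope of this commit.
    if has_stitching:
        return ArpAction.REPLY_STITCHED_MAC
    return ArpAction.FLOOD

# The case from this bug: the gateway address has an ECMP route,
# but the request arrives from a BMS behind the TSN.
print(arp_action(from_bms_on_tsn=True, src_is_ecmp=True, has_stitching=False))
```

The BMS check is placed first so that it overrides the ECMP rule, which is the ordering the commit message implies.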

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/16374
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16374
Committed: http://github.org/Juniper/contrail-vrouter/commit/c5098ddeb28bf0db2e0cc43a4cb59c390d882ae1
Submitter: Zuul
Branch: master

commit c5098ddeb28bf0db2e0cc43a4cb59c390d882ae1
Author: Divakar <email address hidden>
Date: Wed Aug 26 17:39:11 2015 +0530

No source IP lookup for ARP requests from BMS

For packets from a VM to an ECMP destination, we force the packets to
be L3 routed. When an ARP request for that VM comes from one of the
ECMP sources, we reply with the vhost MAC even though we have the
stitching for the VM's IP, because packets need to be routed in this
direction as well. This functionality was added with the fix for bug
1472796.
But the fix for that bug should not handle ARP requests coming from a
BMS (in TSN), as the TSN is never a gateway for a BMS. Such ARP requests
should be flooded. So the fix is to not force L3 if the ARP request is
from a BMS.

Change-Id: Ib2626f27a89d34cd98b04e6084aac12ca8eb4ac9
closes-bug: #1485804

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20.x

Review in progress for https://review.opencontrail.org/17480
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17480
Committed: http://github.org/Juniper/contrail-vrouter/commit/806037a7bc25ad1f4dfff9e4713b414be62c38b2
Submitter: Zuul
Branch: R2.20.x

commit 806037a7bc25ad1f4dfff9e4713b414be62c38b2
Author: Divakar <email address hidden>
Date: Tue Sep 22 21:59:40 2015 +0530

Replying to ARP request of ECMP source only if VM is hosted

When an ARP request from an ECMP source is received on the fabric
interface of a compute node, the ARP response is sent with the vhost MAC
even though the request is not meant for any VM on that compute node.
Because of this, even if a BMS pings another BMS, every compute node
receiving the ARP request responds with the vhost MAC, leading to ARP
cache poisoning in the BMS.

As a fix, the response is sent with the vhost MAC only if the ARP
request is meant for a VM on the compute node.

No source IP lookup for ARP requests from BMS

For packets from a VM to an ECMP destination, we force the packets to
be L3 routed. When an ARP request for that VM comes from one of the
ECMP sources, we reply with the vhost MAC even though we have the
stitching for the VM's IP, because packets need to be routed in this
direction as well. This functionality was added with the fix for bug
1472796.
But the fix for that bug should not handle ARP requests coming from a
BMS (in TSN), as the TSN is never a gateway for a BMS. Such ARP requests
should be flooded. So the fix is to not force L3 if the ARP request is
from a BMS.

Change-Id: I4036dcd6eaf757b579de8ae391855aa7269a9ac1
closes-bug: #1485804
closes-bug: #1491644
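
The additional condition in the R2.20.x commit (reply with the vhost MAC only when the ARP target is a VM hosted on this compute node) can be sketched as a single predicate. This is an assumption-laden illustration: the function name, its parameters, and the sample addresses are made up and do not come from contrail-vrouter.

```python
def should_reply_with_vhost_mac(src_is_ecmp, target_ip, hosted_vm_ips):
    """True only when an ECMP source asks for a VM hosted on this node."""
    return src_is_ecmp and target_ip in hosted_vm_ips

# Made-up addresses: one VM hosted locally, one target hosted elsewhere.
hosted = {"10.0.0.5"}

print(should_reply_with_vhost_mac(True, "10.0.0.5", hosted))   # True
print(should_reply_with_vhost_mac(True, "10.0.0.9", hosted))   # False
print(should_reply_with_vhost_mac(False, "10.0.0.5", hosted))  # False
```

With this check, a compute node that does not host the target stays silent, which avoids the ARP cache poisoning described in the commit message.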

Robert Rosiak (robert-rosiak) wrote :

This issue seems to have been reintroduced in contrail 3.0.2.0-51 kilo.
There are 2 MX routers with the virtual-gateway-address configured.
When the BMS sends an ARP request, it sometimes gets a response with the vRouter vhost0 MAC address instead of the VM MAC.

Here's the tcpdump on BMS
[root@BMS ~]# tcpdump -e -n -i bond0.1500 arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0.1500, link-type EN10MB (Ethernet), capture size 65535 bytes
01:38:07.260082 02:2e:ee:16:58:fd > 1c:c1:de:e7:b4:70, ethertype ARP (0x0806), length 56: Request who-has 11.0.1.200 tell 11.0.1.111, length 42
01:38:07.260099 1c:c1:de:e7:b4:70 > 02:2e:ee:16:58:fd, ethertype ARP (0x0806), length 42: Reply 11.0.1.200 is-at 1c:c1:de:e7:b4:70, length 28
01:38:13.257002 1c:c1:de:e7:b4:70 > 02:2e:ee:16:58:fd, ethertype ARP (0x0806), length 42: Request who-has 11.0.1.111 tell 11.0.1.200, length 28
01:38:13.257215 e4:1f:13:7a:6a:7a > 1c:c1:de:e7:b4:70, ethertype ARP (0x0806), length 56: Reply 11.0.1.111 is-at e4:1f:13:7a:6a:7a, length 42
01:38:14.258962 1c:c1:de:e7:b4:70 > 02:2e:ee:16:58:fd, ethertype ARP (0x0806), length 42: Request who-has 11.0.1.111 tell 11.0.1.200, length 28
01:38:14.259253 e4:1f:13:7a:6a:7a > 1c:c1:de:e7:b4:70, ethertype ARP (0x0806), length 56: Reply 11.0.1.111 is-at e4:1f:13:7a:6a:7a, length 42

02:2e:ee:16:58:fd is VM MAC on compute1
1c:c1:de:e7:b4:70 is BMS MAC address on VLAN1500
e4:1f:13:7a:6a:7a is the vhost0 on compute1:

root@compute-kvm-1:~# ifconfig vhost0
vhost0 Link encap:Ethernet HWaddr e4:1f:13:7a:6a:7a
          inet addr:10.10.1.13 Bcast:10.10.1.255 Mask:255.255.255.0
