[EVPN VXLAN] Multi Homing: Traffic from BMS to VM dropped at compute for Invalid NH

Bug #1724681 reported by chhandak
This bug affects 1 person
Affects             Status          Importance  Assigned to            Milestone
Juniper Openstack   Status tracked in Trunk
  R4.1              Fix Committed   Critical    Divakar Dharanalakota
  Trunk             Fix Committed   Critical    Divakar Dharanalakota

Bug Description

Trying this with a private agent binary for L2 ECMP.

Description:
When the BMS MAC has a composite nexthop, ICMP traffic from the BMS to the VM is dropped at the respective compute node with Invalid NH. This happens only while the composite nexthop is programmed. Eventually the agent programs only the VTEP IP of the QFX where the MAC is locally learned, and traffic resumes.

Steps to reproduce:

Step 1: Initially traffic is fine. Only the VTEP from which traffic is arriving is programmed in the agent.

root@5b11s15:~# tcpdump -ni ens2f1 udp port 4789
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens2f1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:25:19.465489 IP 172.16.2.1.52689 > 172.16.180.102.4789: VXLAN, flags [I] (0x08), vni 4
IP 1.1.1.6 > 1.1.1.10: ICMP echo request, id 3535, seq 9, length 64
13:25:19.465708 IP 172.16.180.102.54005 > 172.16.2.1.4789: VXLAN, flags [I] (0x08), vni 4
IP 1.1.1.10 > 1.1.1.6: ICMP echo reply, id 3535, seq 9, length 64
13:25:20.465482 IP 172.16.2.1.52689 > 172.16.180.102.4789: VXLAN, flags [I] (0x08), vni 4
IP 1.1.1.6 > 1.1.1.10: ICMP echo request, id 3535, seq 10, length 64
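
For readers checking the capture above, the `flags [I] (0x08), vni 4` fields can be decoded from the 8-byte VXLAN header. The sketch below follows the RFC 7348 layout; the function name and the sample bytes are illustrative (reconstructed from the printed values, not copied from a packet dump).

```python
import struct


def parse_vxlan_header(header: bytes):
    """Return (flags, vni) from an 8-byte VXLAN header (RFC 7348 layout)."""
    if len(header) < 8:
        raise ValueError("a VXLAN header is 8 bytes long")
    # Word 0: flags (8 bits) followed by 24 reserved bits.
    # Word 1: VNI (24 bits) followed by 8 reserved bits.
    word0, word1 = struct.unpack("!II", header[:8])
    flags = word0 >> 24
    vni = word1 >> 8
    return flags, vni


# Illustrative header: I flag (0x08) set, VNI 4, reserved bits zero.
sample_header = bytes([0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00])
flags, vni = parse_vxlan_header(sample_header)
print(f"flags=0x{flags:02x} vni={vni}")
```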

root@5b11s15:~# rt --dump 2 --family bridge
Flags: L=Label Valid, Df=DHCP flood, Mm=Mac Moved, L2c=L2 Evpn Control Word, N=New Entry, Ec=EvpnControlProcessing
vRouter bridge table 0/2
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 15942
40992 90:e2:ba:c4:2e:6c LDf 4 29 75
54820 2:72:59:8:e6:1 - 31 13314950
112924 ff:ff:ff:ff:ff:ff LDf 4 37 1075
170732 2:62:15:24:9c:d6 LDf 4 19 0
229640 90:e2:ba:a7:30:ad Df - 3 0
root@5b11s15:~# nh --get 29
Id:29 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
              Flags:Valid, Vxlan, Etree Root,
              Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
              Sip:172.16.180.102 Dip:172.16.2.1
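
The `Len:14 Data:` field in the tunnel nexthop can be read as an outer Ethernet header. This is a minimal sketch under the assumption that the 14 bytes are laid out as destination MAC, source MAC, EtherType; the helper name is hypothetical.

```python
def decode_rewrite_data(data_field: str):
    """Split a 14-byte hex string into dst MAC, src MAC and EtherType.

    Assumption: the bytes are a plain Ethernet II header
    (6 bytes destination, 6 bytes source, 2 bytes EtherType).
    """
    octets = bytes(int(part, 16) for part in data_field.split())
    if len(octets) != 14:
        raise ValueError("expected exactly 14 bytes")

    def as_mac(raw: bytes) -> str:
        return ":".join(f"{b:02x}" for b in raw)

    return {
        "dst_mac": as_mac(octets[0:6]),
        "src_mac": as_mac(octets[6:12]),
        "ethertype": int.from_bytes(octets[12:14], "big"),
    }


# Data string copied from the `nh --get 29` output above.
rewrite = decode_rewrite_data("84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00")
print(rewrite["dst_mac"], rewrite["src_mac"], hex(rewrite["ethertype"]))
```

Under that assumption the source MAC decodes to 90:e2:ba:a7:30:ad, which also appears as an entry in the bridge table above, and the EtherType is 0x0800 (IPv4).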

root@5b11-qfx2# run show ethernet-switching table

MAC flags (S - static MAC, D - dynamic MAC, L - locally learned, P - Persistent static
           SE - statistics enabled, NM - non configured MAC, R - remote PE MAC, O - ovsdb MAC)

Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
   Vlan MAC MAC Logical Active
   name address flags interface source
   contrail_vn-test-1-l2-4 02:62:15:24:9c:d6 D vtep.32769 172.16.180.101
   contrail_vn-test-1-l2-4 02:72:59:08:e6:01 D vtep.32771 172.16.180.102
   contrail_vn-test-1-l2-4 90:e2:ba:c4:2e:6c DL ae127.100 >>> Traffic locally learned on 172.16.2.1

Ethernet switching table : 3 entries, 3 learned
Routing instance : default-switch
   Vlan MAC MAC Logical Active
   name address flags interface source
   contrail_vn-test-1-l2-4 02:62:15:24:9c:d6 D vtep.32769 172.16.180.101
   contrail_vn-test-1-l2-4 02:72:59:08:e6:01 D vtep.32773 172.16.180.102
   contrail_vn-test-1-l2-4 90:e2:ba:c4:2e:6c DR ae127.100 >>> Remotely learned on 172.16.3.1. Not present in agent table

Step 2: Disable the active interface on QFX2, so the BMS MAC is learned only on 172.16.3.1. The agent programs the nexthop accordingly and traffic continues.

{master:0}[edit]
root@5b11-qfx2# set interfaces xe-0/0/46 disable

root@5b11s15:~# rt --dump 2 --family bridge
Flags: L=Label Valid, Df=DHCP flood, Mm=Mac Moved, L2c=L2 Evpn Control Word, N=New Entry, Ec=EvpnControlProcessing
vRouter bridge table 0/2
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 15966
40992 90:e2:ba:c4:2e:6c LDf 4 18 76
54820 2:72:59:8:e6:1 - 31 13315203
112924 ff:ff:ff:ff:ff:ff LDf 4 34 1075
170732 2:62:15:24:9c:d6 LDf 4 19 0
229640 90:e2:ba:a7:30:ad Df - 3 0
root@5b11s15:~# nh --get 18
Id:18 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
              Flags:Valid, Vxlan, Etree Root,
              Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
              Sip:172.16.180.102 Dip:172.16.3.1 >>> Switched to new QFX

Step 3: Now enable the interface on QFX2. For some time, the agent programs both QFXs for the BMS MAC. As long as the BMS MAC points to a composite nexthop, traffic is dropped at the compute for Invalid NH. Eventually the agent again programs only the active VTEP QFX and traffic resumes.

root@5b11s15:~# rt --dump 2 --family bridge
Flags: L=Label Valid, Df=DHCP flood, Mm=Mac Moved, L2c=L2 Evpn Control Word, N=New Entry, Ec=EvpnControlProcessing
vRouter bridge table 0/2
Index DestMac Flags Label/VNID Nexthop Stats
31264 0:0:5e:0:1:0 Df - 3 16001
40992 90:e2:ba:c4:2e:6c LDf -1 33 428
54820 2:72:59:8:e6:1 - 31 13315557
112924 ff:ff:ff:ff:ff:ff LDf 4 36 1075
170732 2:62:15:24:9c:d6 LDf 4 19 0
229640 90:e2:ba:a7:30:ad Df - 3 0
root@5b11s15:~# nh --get 33
Id:33 Type:Composite Fmly: AF_INET Rid:0 Ref_cnt:2 Vrf:2
              Flags:Valid, Ecmp, Etree Root,
              Valid Hash Key Parameters: Proto,SrcIP,SrcPort,DstIp,DstPort
              Sub NH(label): 29(4) 18(4)

Id:29 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
              Flags:Valid, Vxlan, Etree Root,
              Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
              Sip:172.16.180.102 Dip:172.16.2.1

Id:18 Type:Tunnel Fmly: AF_INET Rid:0 Ref_cnt:3 Vrf:0
              Flags:Valid, Vxlan, Etree Root,
              Oif:0 Len:14 Data:84 b5 9c c8 00 00 90 e2 ba a7 30 ad 08 00
              Sip:172.16.180.102 Dip:172.16.3.1
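
To make the composite nexthop easier to read, the `Sub NH(label)` line can be expanded into its member tunnels. This is a rough sketch that only parses the text format shown above; the function name is hypothetical and the `tunnel_dip` mapping is filled in by hand from the two `Dip:` lines in this output.

```python
import re


def parse_sub_nh(line: str):
    """Parse 'Sub NH(label): 29(4) 18(4)' into [(nh_id, label), ...]."""
    _, _, tail = line.partition(":")
    pairs = re.findall(r"(\d+)\((-?\d+)\)", tail)
    return [(int(nh_id), int(label)) for nh_id, label in pairs]


# Line copied from the `nh --get 33` output above.
members = parse_sub_nh("Sub NH(label): 29(4) 18(4)")

# Tunnel destinations taken from the Dip fields printed for NH 29 and NH 18.
tunnel_dip = {29: "172.16.2.1", 18: "172.16.3.1"}

for nh_id, label in members:
    print(f"nh {nh_id} label/vni {label} -> {tunnel_dip[nh_id]}")
```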

root@5b11s15:~# dropstats | grep -v " 0$"
IF Drop 3

Flow Action Drop 11184705
Flow Queue Limit Exceeded 36

Discards 13
Cloned Original 50

Invalid NH 1409
Invalid Mcast Source 2

Invalid Source 187
No L2 Route 3

root@5b11s15:~# dropstats | grep -v " 0$"
IF Drop 3

Flow Action Drop 11184705
Flow Queue Limit Exceeded 36

Discards 13
Cloned Original 50

Invalid NH 1421
Invalid Mcast Source 2

Invalid Source 192
No L2 Route 3
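
The two `dropstats` snapshots above differ only in a few counters. A small sketch like the following can diff them; it assumes each counter line has the form `<name> <integer>`, as in the output shown, and the sample text is an excerpt of the two snapshots rather than the full output.

```python
def parse_dropstats(text: str):
    """Turn 'Name With Spaces 123' lines into a {name: count} dict."""
    counters = {}
    for line in text.splitlines():
        name, _, value = line.strip().rpartition(" ")
        if name and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def diff_counters(before, after):
    """Return only the counters whose value changed, with the delta."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


first = parse_dropstats("""
Flow Action Drop 11184705
Invalid NH 1409
Invalid Source 187
No L2 Route 3
""")
second = parse_dropstats("""
Flow Action Drop 11184705
Invalid NH 1421
Invalid Source 192
No L2 Route 3
""")
delta = diff_counters(first, second)
print(delta)
```

Between the two snapshots, Invalid NH grows by 12 and Invalid Source by 5, while the other counters are unchanged.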

chhandak (chhandak) wrote :

Gcore taken in the problematic state has been copied to /auto/cores/1724681

Changed in juniperopenstack:
importance: Undecided → Critical
assignee: nobody → Manish Singh (manishs)
information type: Proprietary → Public
summary: - [EVPN VXLAN] Multi Homing: Traffic from BMS to VM dropped at compute
- from Invalid NH
+ [EVPN VXLAN] Multi Homing: Traffic from BMS to VM dropped at compute for
+ Invalid NH
Changed in juniperopenstack:
milestone: none → r4.1.0.0-fcs
tags: added: blocker
Manish Singh (manishs)
Changed in juniperopenstack:
assignee: Manish Singh (manishs) → Divakar Dharanalakota (ddivakar)
OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/37092
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/37093
Submitter: Divakar Dharanalakota (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/37093
Committed: http://github.com/Juniper/contrail-vrouter/commit/c24c559e29a3685e3a980ce14cabc30243912569
Submitter: Zuul (<email address hidden>)
Branch: master

commit c24c559e29a3685e3a980ce14cabc30243912569
Author: Divakar D <email address hidden>
Date: Thu Nov 2 14:32:30 2017 +0530

RPF check for Vxlan Packets

For Vxlan tunneled packets that are received on the Fabric interface, the RPF
callback was missing, leading to no RPF validation of the Vxlan packets.

Due to Tor Evpn support, Tors can be in Ecmp, which is a composite Ecmp
nexthop in Vrouter. When the first packet corresponding to a unique 5-tuple is
received on Fabric (rather than from a VM) from one of the Ecmp sources of the
Ecmp composite nexthop, the component nexthop is pinned to the source only if
the RPF callback exists. Lack of this RPF callback was making the Ecmp go
wrong.

As a fix, a Vxlan RPF callback is provided.

Change-Id: If53a0bc76398cfc8c176a3a94a0aede8b26262b4
closes-bug: #1724681
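
The pinning behaviour described in the commit message can be illustrated with a toy model. This is not vrouter code: the class, function names and the hash fallback are assumptions made for illustration only, and the member addresses are the two tunnel Dip values from the bug description.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class TunnelNH:
    nh_id: int
    dip: str  # remote VTEP address of this ECMP member


RpfCallback = Callable[[Sequence[TunnelNH], str], Optional[int]]


def vxlan_rpf(members: Sequence[TunnelNH], outer_src_ip: str) -> Optional[int]:
    """Hypothetical RPF callback: index of the member the packet came from."""
    for index, member in enumerate(members):
        if member.dip == outer_src_ip:
            return index
    return None


def pin_component(members, outer_src_ip, flow_hash, rpf: Optional[RpfCallback] = None):
    """Choose the ECMP member for the first packet of a flow.

    With an RPF callback the flow is pinned to the member matching the
    outer source. Without one, this toy model falls back to the flow hash,
    which may select a member the packet did not arrive from.
    """
    if rpf is not None:
        index = rpf(members, outer_src_ip)
        if index is not None:
            return index
    return flow_hash % len(members)


members = [TunnelNH(29, "172.16.2.1"), TunnelNH(18, "172.16.3.1")]

# flow_hash=1 stands in for a hash that happens to select the other member.
pinned = members[pin_component(members, "172.16.2.1", 1, rpf=vxlan_rpf)]
unpinned = members[pin_component(members, "172.16.2.1", 1)]
print(pinned.nh_id, unpinned.nh_id)
```

In this model the packet from 172.16.2.1 is pinned to NH 29 when the callback is present, and lands on NH 18 when it is absent, which mirrors the mismatch the commit message attributes to the missing callback.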

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/37092
Committed: http://github.com/Juniper/contrail-vrouter/commit/b7033979694fbd9d4b936ced0a3381c3af7ef771
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit b7033979694fbd9d4b936ced0a3381c3af7ef771
Author: Divakar D <email address hidden>
Date: Thu Nov 2 14:32:30 2017 +0530

RPF check for Vxlan Packets

For Vxlan tunneled packets that are received on the Fabric interface, the RPF
callback was missing, leading to no RPF validation of the Vxlan packets.

Due to Tor Evpn support, Tors can be in Ecmp, which is a composite Ecmp
nexthop in Vrouter. When the first packet corresponding to a unique 5-tuple is
received on Fabric (rather than from a VM) from one of the Ecmp sources of the
Ecmp composite nexthop, the component nexthop is pinned to the source only if
the RPF callback exists. Lack of this RPF callback was making the Ecmp go
wrong.

As a fix, a Vxlan RPF callback is provided.

Change-Id: If53a0bc76398cfc8c176a3a94a0aede8b26262b4
closes-bug: #1724681
