[CentOS 2.0-2462~icehouse] Ping between VMs thru a ECMP in-net SVC is failing

Bug #1394089 reported by Ganesha HV
This bug affects 1 person
Affects             Status          Importance   Assigned to   Milestone
Juniper Openstack   Fix Committed   High         Naveen N
R2.0                Fix Committed   High         Naveen N

Bug Description

Setup
====
env.roledefs = {
    'all': [host1, host2, host3, host4, host5],
    'cfgm': [host1],
    'control': [host2, host3],
    'compute': [host4, host5],
    'collector': [host1],
    'openstack': [host1],
    'webui': [host1],
    'database': [host1],
    'build': [host_build],
}

env.hostnames = {
     'all': ['nodec4', 'nodec5', 'nodec26', 'nodei27', 'nodei28']
}

1]. Created two networks in_network_vn2-35690624(84.106.240.0/24) and in_network_vn1-94859801(77.134.83.0/24).

2]. Launched VMs in_network_vm1-52509176(77.134.83.3) on nodei27 and in_network_vm2-65296005(84.106.240.3) on nodei28.

3]. Created service template
Template : in_net_svc_template_1-07951226
Display Name : in_net_svc_template_1-07951226
Mode : In-network
Type : Firewall
Scaling : Enabled
Availability Zone :
Interface Type: Left(Shared IP), Right(Shared IP)
Image : ubuntu-in-net
Instances : TestECMPSanity-99493442:in_net_svc_instance-29132965_1
Flavor : contrail_flavor_2cpu

4]. Launched a service instance with max_count set to 3 :
Instance Name : in_net_svc_instance-29132965_1
Display Name : in_net_svc_instance-29132965_1
Template : in_net_svc_template_1-07951226 (In-network)
Number of instances : 3 Instances
Networks : Left Network : in_network_vn1-94859801,Right Network : in_network_vn2-35690624
Image : ubuntu-in-net
Flavor : contrail_flavor_2cpu
Availability Zone : ANY:ANY
Instance Details :
default-domain__TestECMPSanity-99493442__5ea1020c-63cf-47f4-a4a8-a86a5a09687c__3 ACTIVE RUNNING
in_network_vn1-94859801:77.134.83.2 in_network_vn2-35690624:84.106.240.2
default-domain__TestECMPSanity-99493442__5ea1020c-63cf-47f4-a4a8-a86a5a09687c__1 ACTIVE RUNNING
in_network_vn1-94859801:77.134.83.2 in_network_vn2-35690624:84.106.240.2
default-domain__TestECMPSanity-99493442__5ea1020c-63cf-47f4-a4a8-a86a5a09687c__2 ACTIVE RUNNING
in_network_vn1-94859801:77.134.83.2 in_network_vn2-35690624:84.106.240.2

5]. Created a policy between the two networks to allow ANY traffic through the service-chain :
Display Name : policy_in_network-35729388
Associated Networks : in_network_vn1-94859801 in_network_vn2-35690624
Rules : protocol any network in_network_vn1-94859801 ports [ 0--1 ] <> network in_network_vn2-35690624 ports [ 0--1 ] services in_net_svc_instance-29132965_1

6]. Ping from in_network_vm1-52509176(77.134.83.3) to in_network_vm2-65296005(84.106.240.3) is failing.

7]. Using tcpdump, I see that the ICMP request is reaching the service instance default-domain__TestECMPSanity-99493442__5ea1020c-63cf-47f4-a4a8-a86a5a09687c__1 on nodei28 :
[root@nodei28 ~]# tcpdump -ni tap7810f352-75
tcpdump: WARNING: tap7810f352-75: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tap7810f352-75, link-type EN10MB (Ethernet), capture size 65535 bytes
12:25:30.184671 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7891, length 64
12:25:31.184745 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7892, length 64
12:25:32.184734 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7893, length 64
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
[root@nodei28 ~]# tcpdump -ni tapcdc1e4e8-9c
tcpdump: WARNING: tapcdc1e4e8-9c: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tapcdc1e4e8-9c, link-type EN10MB (Ethernet), capture size 65535 bytes
12:25:36.184915 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7897, length 64
12:25:37.184794 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7898, length 64
12:25:38.184784 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7899, length 64
12:25:39.185029 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7900, length 64
^C
4 packets captured
4 packets received by filter

8]. The ICMP request is not reaching the tap interface of in_network_vm2-65296005.
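
A capture like the ones in step 7 can be summarized per tap interface to confirm that only echo requests are seen and no echo replies come back. The following is a minimal sketch, assuming the capture text was saved from `tcpdump -ni <tap>`; the helper name `summarize_icmp` is hypothetical and not part of any Contrail tooling.

```python
import re

# Matches tcpdump's one-line ICMP echo summary, e.g.
# "12:25:30.184671 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7891, length 64"
ICMP_LINE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>[\d.]+): "
    r"ICMP echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def summarize_icmp(capture_text):
    """Return the ICMP sequence numbers seen as echo requests and echo replies."""
    summary = {"requests": [], "replies": []}
    for line in capture_text.splitlines():
        match = ICMP_LINE.search(line)
        if match is None:
            continue  # skip tcpdump banners, prompts and packet counters
        bucket = "requests" if match.group("kind") == "request" else "replies"
        summary[bucket].append(int(match.group("seq")))
    return summary


# Lines copied from the tap7810f352-75 capture in step 7.
capture = """\
listening on tap7810f352-75, link-type EN10MB (Ethernet), capture size 65535 bytes
12:25:30.184671 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7891, length 64
12:25:31.184745 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7892, length 64
12:25:32.184734 IP 77.134.83.3 > 84.106.240.3: ICMP echo request, id 1405, seq 7893, length 64
"""
print(summarize_icmp(capture))
```

An empty `replies` list on every tap along the path is consistent with the request being dropped before it reaches the destination VM.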

9]. I see the Invalid Source counter incrementing on nodei28:

[root@nodei28 ~]# dropstats
GARP 0
ARP notme 769266
Invalid ARPs 0

Invalid IF 0
Trap No IF 0
IF TX Discard 0
IF Drop 0
IF RX Discard 0

Flow Unusable 0
Flow No Memory 0
Flow Table Full 0
Flow NAT no rflow 0
Flow Action Drop 411
Flow Action Invalid 0
Flow Invalid Protocol 0
Flow Queue Limit Exceeded 0

Discards 1281
TTL Exceeded 0
Mcast Clone Fail 0
Cloned Original 0

Invalid NH 35
Invalid Label 0
Invalid Protocol 0
Rewrite Fail 0
Invalid Mcast Source 0

Push Fails 0
Pull Fails 0
Duplicated 2037
Head Alloc Fails 0
Head Space Reserve Fails 0
PCOW fails 0
Invalid Packets 0

Misc 11
Nowhere to go 0
Checksum errors 0
No Fmd 0
Ivalid VNID 0
Fragment errors 0
Invalid Source 73572
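
To see which counters are actually incrementing, two `dropstats` snapshots can be compared. The sketch below assumes the output format shown above (a counter name followed by an integer on each line); the function names are hypothetical, and the second snapshot uses made-up values purely for illustration.

```python
import re


def parse_dropstats(text):
    """Parse dropstats-style output into a {counter name: value} dict."""
    counters = {}
    for line in text.splitlines():
        # A counter line is a name followed by a non-negative integer.
        match = re.match(r"^\s*(.+?)\s+(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def incrementing(before, after):
    """Return {name: delta} for counters that grew between two snapshots."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value > before.get(name, 0)
    }


# First snapshot: values taken from the output above.
first = parse_dropstats("Flow Action Drop 411\nInvalid NH 35\nInvalid Source 73572\n")
# Second snapshot: hypothetical values, for illustration only.
second = parse_dropstats("Flow Action Drop 411\nInvalid NH 35\nInvalid Source 73580\n")
print(incrementing(first, second))
```

Taking the snapshots a few seconds apart while the ping is running makes it easier to tie a growing counter to the failing traffic.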

Sandip Dey (sandipd)
tags: added: blocker
information type: Proprietary → Public
Changed in juniperopenstack:
milestone: r2.0-fcs → r2.1-fcs
Changed in juniperopenstack:
milestone: r2.1-fcs → none
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/5426
Committed: http://github.org/Juniper/contrail-controller/commit/1a189c814d4eb132a661fa7f87ef7227504df78e
Submitter: Zuul
Branch: R2.0

commit 1a189c814d4eb132a661fa7f87ef7227504df78e
Author: Naveen N <email address hidden>
Date: Tue Dec 9 01:16:19 2014 -0800

* In case of inline service instance, VRF translate action would specify
the vrf in which route lookup happens, this translated VRF might not
have the local vm peer path, since control node would have leaked the
routes.
In this case with ecmp, where one SI resides locally and other SI
instance is on remote compute node, both forward flow and reverse flow
should have the key set to policy enabled nexthop of the interface.
For picking the reverse flow key, we were looking inside the composite
NH and getting interface NH, but this interface NH would be policy
disabled, hence reverse flow key calculated was wrong.
Correcting the same
Closes-bug:#1394089

Change-Id: I74ee5c6506a1c95ed796a12932c1ea18a4acc4fd
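
The key mismatch described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions and is not the vrouter agent code: the classes, function names, nexthop ids and the interface name `tap-si-left` are all hypothetical, and the composite NH holds only a local member for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class InterfaceNH:
    nh_id: int       # hypothetical nexthop id
    interface: str   # interface the nexthop points to
    policy: bool     # True for the policy-enabled nexthop


@dataclass(frozen=True)
class CompositeNH:
    components: Tuple[InterfaceNH, ...]  # ECMP members (local ones only here)


def reverse_key_from_composite(ecmp: CompositeNH, interface: str) -> Optional[int]:
    """Behaviour described as faulty: use the member NH found inside the composite."""
    for nh in ecmp.components:
        if nh.interface == interface:
            return nh.nh_id
    return None


def reverse_key_policy_enabled(policy_nhs: Dict[str, InterfaceNH],
                               interface: str) -> Optional[int]:
    """Behaviour described as the fix: use the interface's policy-enabled NH."""
    nh = policy_nhs.get(interface)
    return nh.nh_id if nh is not None and nh.policy else None


# The same interface has two nexthops; the ECMP composite holds the policy-disabled one.
member_nh = InterfaceNH(nh_id=20, interface="tap-si-left", policy=False)
policy_nh = InterfaceNH(nh_id=21, interface="tap-si-left", policy=True)
ecmp = CompositeNH(components=(member_nh,))

forward_key = policy_nh.nh_id  # forward flow is keyed on the policy-enabled NH
print(reverse_key_from_composite(ecmp, "tap-si-left") == forward_key)
print(reverse_key_policy_enabled({"tap-si-left": policy_nh}, "tap-si-left") == forward_key)
```

In this model the forward and reverse keys agree only when both are derived from the policy-enabled nexthop, which is the condition the commit message says the flows must satisfy.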

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/5388
Committed: http://github.org/Juniper/contrail-controller/commit/4b36963f32d56003b27d458f0fad7e13899b6606
Submitter: Zuul
Branch: master

commit 4b36963f32d56003b27d458f0fad7e13899b6606
Author: Naveen N <email address hidden>
Date: Mon Dec 8 03:28:15 2014 -0800

* In case of inline service instance, VRF translate action would specify
the vrf in which route lookup happens, this translated VRF might not
have the local vm peer path, since control node would have leaked the
routes.
In this case with ecmp, where one SI resides locally and other SI
instance is on remote compute node, both forward flow and reverse flow
should have the key set to policy enabled nexthop of the interface.
For picking the reverse flow key, we were looking inside the composite
NH and getting interface NH, but this interface NH would be policy
disabled, hence reverse flow key calculated was wrong.
Correcting the same
Closes-bug:#1394089

Change-Id: Ia07edf6997c7abffe817049b7bf536c9d47d8e13

Changed in juniperopenstack:
status: New → Fix Committed