TSN was not forwarding Unknown Unicast traffic

Bug #1686287 reported by mehul
This bug affects 2 people
Affects: Juniper Openstack (status tracked in Trunk)

Series   Status   Importance   Assigned to         Milestone
R3.1     New      Critical     Hari Prasad Killi   -
R3.2     New      Critical     Hari Prasad Killi   -
R4.0     New      High         Hari Prasad Killi   -
Trunk    New      Critical     Hari Prasad Killi   -

Bug Description

Hi Team,

Issue:

The customer upgraded Contrail from version 2.21.2 to 3.1.3.0-72 on 2017/4/19 (JST), then configured a VN and LIF with Contrail. They suspect the issue may have started at that time.

This issue occurred in the customer's lab and still persists.

Communication fails for the following patterns.
However, communication in the reverse direction works.

[QFX11] to [QFX6]
 30:6B:0b:00:00:0b ->30:6b:06:00:00:0b
 30:6B:0b:00:00:20 ->30:6b:06:00:00:20
 30:6B:0b:00:00:45 ->30:6b:06:00:00:45
 30:6B:0b:00:00:4a ->30:6b:06:00:00:4a

BUM Tree Topology:

[QFX11]-[TSN(openc-34)]-[TSN(openc-35)]-[QFX6]

BUM Tree status: Normal

Packet Capture:

[QFX11]->[QFX6]
result: TSN(openc-35) could not forward Unknown Unicast traffic to QFX6.

[QFX11]<-[QFX6]
result: Normal

The BUM tree status was normal. However, TSN(openc-35) was not forwarding the Unknown Unicast traffic.

Attaching the following to help investigate the issue:

bum-tree.txt
20170421_cap.zip-->Capture
20170421_openc31_contrail.tar.gz
20170421_openc32_contrail.tar.gz
20170421_openc33_contrail.tar.gz
20170421_openc34_contrail.tar.gz
20170421_openc35_contrail.tar.gz
Topology Diagram
testped.py file

Core files:

core.17069.openc-34-tor-agent-11
core.16936.openc-34-tor-agent-6
core.16940.openc-34-vrouter-agent
core.17295.openc-35-tor-agent-6
core.17299.openc-35-vrouter-agent
core.17448.openc-35-tor-agent-11

Could you investigate the core files and try to find out the root cause of this issue?

-Regards,
Mehul Patel

Tags: bms vrouter nttc
mehul (pmehul) wrote :

Hi Team,

Please find all the core files, testped.py, and logs at the location below:

IP:10.219.48.123, root/Jtaclab123
path:/home/mehul/2017-0424-0342

-Regards,
Mehul Patel

Jeba Paulaiyan (jebap)
tags: added: bms vrouter
mehul (pmehul) wrote :

The customer will keep the environment where this issue occurs until the morning of 5/1 (JST). If we need any logs, please let me know before 5/1 09:00 (JST).

tags: added: nttc
vivekananda shenoy (vshenoy83) wrote :

Hi Hari,

Any updates on this issue?

Regards,
Vivek

mehul (pmehul) wrote :

Hi Hari/Vivek,

The customer also did a clean install of Contrail 3.1.3 and they still see this issue. Do let me know if you need any specific information.

-Regards,
Mehul Patel

mehul (pmehul)
information type: Proprietary → Public
Manish Singh (manishs) wrote :

The same fabric label was allocated to two VRFs.
This causes the issue because the MPLS label no longer points to one of the BUM trees.
Say there are two VRFs, v1 and v2, both using label x. If v1 programs x and then v2 updates the same label, the label will point to the v2 BUM tree and not the v1 tree. Hence all BUM traffic in v1 gets dropped.

Label in question (from core):
$1 = (AgentPath *) 0x7fd5d44e0760
(gdb) p $1->label_
$2 = 129229
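
The overwrite described above can be illustrated with a minimal sketch. This is a toy model and not Contrail agent code: the `LabelTable` class, its `program` and `lookup` methods, the `forward_bum` helper, and the tree names are all hypothetical. Only the label value 129229 is taken from the core. The drop is modelled as a simple VRF mismatch on lookup, which is an assumption made for illustration.

```python
class LabelTable:
    """Toy MPLS label table (hypothetical): one label maps to one BUM tree."""

    def __init__(self):
        self._entries = {}

    def program(self, label, vrf, bum_tree):
        # Last writer wins: a second VRF programming the same label
        # silently replaces the entry installed by the first VRF.
        self._entries[label] = (vrf, bum_tree)

    def lookup(self, label):
        return self._entries.get(label)


def forward_bum(table, label, expected_vrf):
    """Return the BUM tree used for the packet, or None if it is dropped."""
    entry = table.lookup(label)
    if entry is None or entry[0] != expected_vrf:
        # Assumption: the label now points at another VRF's tree, so
        # traffic for expected_vrf has no usable tree and is dropped.
        return None
    return entry[1]


table = LabelTable()
table.program(129229, "v1", "bum-tree-v1")  # v1 programs label x
table.program(129229, "v2", "bum-tree-v2")  # v2 updates the same label

result_v1 = forward_bum(table, 129229, "v1")
result_v2 = forward_bum(table, 129229, "v2")
print("v1:", result_v1)
print("v2:", result_v2)
```

In this model v2 keeps forwarding while v1 loses its tree, which matches the one-directional failure pattern only in spirit; the real agent data path is more involved.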

Manish Singh (manishs) wrote :

Needs investigation from the CN (control node) side on how this can happen.

Manish Singh (manishs) wrote :

Debugs:

(gdb) p $13->mac_
$51 = ff:ff:ff:ff:ff:ff
(gdb) p $13->vrf_->name_
$52 = "default-domain:commonmax-010-pr-0434:commonmax-010-vn-0434:commonmax-010-vn-0434"
(gdb) dump_route_paths $13
Number of paths : 3
Path : 0x7fd5b58958b0
Path : 0x7fd5cdb8bd40
Path : 0x7fd5b58956f0
(gdb) p ((AgentPath *) 0x7fd5b58958b0)->label_
$55 = 129229

(gdb) p $14->mac_
$53 = ff:ff:ff:ff:ff:ff
(gdb) p $14->vrf_->name_
$54 = "default-domain:commonmax-008-pr-0001:commonmax-008-vn-0001:commonmax-008-vn-0001"
(gdb) dump_route_paths $14
Number of paths : 7
Path : 0x7fd5d44e0760
Path : 0x7fd59caf50d0
Path : 0x7fd59c90aa80
Path : 0x7fd57582a240
Path : 0x7fd57582a7a0
Path : 0x7fd5e2c4c910
Path : 0x7fd5d613b400
(gdb) p ((AgentPath *) 0x7fd5d44e0760)->label_
$56 = 129229

Both routes are using the same FMG label.
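
The manual comparison above could be automated once the (VRF, label) pairs have been extracted from a core. The following is a rough sketch under that assumption: the `find_shared_labels` function and the input format are hypothetical, and the two entries are simply the values printed in the gdb session above.

```python
from collections import defaultdict


def find_shared_labels(routes):
    """Return {label: [vrf, ...]} for labels used by more than one VRF.

    routes is an iterable of (vrf_name, label) pairs; this input format
    is an assumption, not something the agent exports directly.
    """
    by_label = defaultdict(set)
    for vrf, label in routes:
        by_label[label].add(vrf)
    return {
        label: sorted(vrfs)
        for label, vrfs in by_label.items()
        if len(vrfs) > 1
    }


# Values copied from the gdb output for the two broadcast routes.
routes = [
    ("default-domain:commonmax-010-pr-0434:commonmax-010-vn-0434:commonmax-010-vn-0434", 129229),
    ("default-domain:commonmax-008-pr-0001:commonmax-008-vn-0001:commonmax-008-vn-0001", 129229),
]

shared = find_shared_labels(routes)
for label, vrfs in shared.items():
    print(label, "is shared by", len(vrfs), "VRFs")
```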

Manish Singh (manishs) wrote :

It would be great if a CN core could be taken as well.

mehul (pmehul) wrote :

Hi Manish,

Are you referring to all the tor-agent and vrouter-agent core files from the problematic TSNs after applying the binary?

-Regards,
Mehul Patel

Manish Singh (manishs) wrote :

Control-node - all
Tor-agent - all
vrouter-agent - all

Though I need only the CN core, when the issue is reproduced the tor-agent and vrouter-agent cores need to be taken as well to get the VXLAN info.

Can they reproduce it consistently? If yes, what steps do they follow?

mehul (pmehul) wrote :

Hi Manish,

Since JN-322 (TSN was not transferred Unknown Unicast) and JN-323 (The remote MAC address is not learned) are merged temporarily, the customer has requested that we stop the investigation for JN-322. They said that if the root cause turns out to be different, they will ask us to debug JN-322 and will provide the details.

For now, as per the customer's request, I will wait until the investigation for JN-323 is done.

mehul (pmehul) wrote :

Hi Hari,

After applying the third binary on the customer setup, this issue is not seen at the moment.

Manish Singh (manishs) wrote :

The fix tried here in the binary is for:
https://bugs.launchpad.net/juniperopenstack/+bug/1692795

Moving it to duplicate of this issue. If it is seen again, please re-open it as a separate bug.

vivekananda shenoy (vshenoy83) wrote :

Hi Mehul,

If this bug is fixed as part of 1682795, can you please mark this bug as closed? Currently this bug has no meaningful state.
