Changes to the Service Chain in a policy disrupt route leaking for several minutes

Bug #1549454 reported by Ato
This bug affects 3 people
Affects: Juniper Openstack (status tracked in Trunk)

Series  Status         Importance  Assigned to   Milestone
R3.0    Fix Committed  High        Manish Singh
Trunk   Fix Committed  High        Manish Singh

Bug Description

Issue found in several 3.0 builds; it was most recently seen in build 2715.

Initially we have the following policy:

pass protocol any network net1 ports any <> network net2 ports any services SI-1

SI-1 is In-Network. Bidirectional leaking is fine, and takes place almost immediately after the policy is applied to net1 and net2.

Then we add SI-2, an In-Network-NAT Service Instance, to the chain:

pass protocol any network net1 ports any <> network net2 ports any services SI-1,SI-2

As expected, route leaking from left to right doesn't happen. This is due to SI-2 being NAT.

However, route leaking from right to left should happen. It does happen, but only after approximately three minutes.

The same delay occurs when we remove SI-2 from the chain.

Similar symptoms have been observed in service chains with purely In-Network services and no NAT.
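
For reference, the expected behaviour described above can be written down as a small check. This is only a sketch of the rule as stated in this report (any In-Network-NAT service in the chain suppresses left-to-right leaking, right-to-left leaking always happens); the names `ServiceMode` and `expected_leak_directions` are made up for illustration and are not part of Contrail.

```python
from enum import Enum


class ServiceMode(Enum):
    IN_NETWORK = "in-network"
    IN_NETWORK_NAT = "in-network-nat"


def expected_leak_directions(chain):
    """Return the leak directions expected for a chain of service modes.

    Assumption taken from the bug description: right-to-left leaking is
    always expected, left-to-right only when no service in the chain is NAT.
    """
    directions = {"right-to-left"}
    if all(mode is ServiceMode.IN_NETWORK for mode in chain):
        directions.add("left-to-right")
    return directions


# services SI-1
print(expected_leak_directions([ServiceMode.IN_NETWORK]))
# services SI-1,SI-2
print(expected_leak_directions([ServiceMode.IN_NETWORK, ServiceMode.IN_NETWORK_NAT]))
```

The bug is that the right-to-left direction, which this rule says should be present in both cases, only shows up after roughly three minutes.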

Nischal Sheth (nsheth)
tags: added: service-chain
no longer affects: opencontrail
Nischal Sheth (nsheth)
tags: added: blocker
amit surana (asurana-t) wrote :

This bug is seen if the left/right VM happens to land on the same compute as the SIs.

If the right VM is on the same compute as the SIs, no routes are leaked into the right SI RIs. If the left VM is on the same compute as the SIs, no routes are leaked into the left SI RIs. If all are on the same compute, either of the above could happen. The SI RIs are created/deleted correctly when an SI is added to or removed from the chain, and the acl/vrf-assign rule also looked correct. For some reason the routes weren't leaked for several minutes, resulting in vrouter dropping the packets (NoSrcRt or NoDstRt).
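
The placement condition above can be summarised as a small helper for picking test topologies. This is a sketch of the observation in this comment only; the function name and the host-name arguments are hypothetical.

```python
def possibly_affected_si_ris(left_vm_host, right_vm_host, si_host):
    """Return which SI RIs ("left", "right") may miss leaked routes.

    Based on the observation in this comment: a side is at risk when its
    VM shares a compute node with the SIs. When both sides share it,
    either one could be hit, so both are reported.
    """
    affected = set()
    if left_vm_host == si_host:
        affected.add("left")
    if right_vm_host == si_host:
        affected.add("right")
    return affected


# Right VM co-located with the SIs: only the right SI RIs are at risk.
print(possibly_affected_si_ris("compute-1", "compute-2", "compute-2"))
```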

Prakash Bailkeri (prakashmb) wrote :

Debugged it further and found that the agent sends a delayed subscription to the control-node. Because the subscription is missing, the control-node doesn't re-originate the route.

Routing instance config received:
2016-02-26 15:52:53.298 AgentXmppMessage: Received xmpp message from: 172.16.180.8 Port 5269 Size: 770 Packet: <?xml version="1.0"?> <iq type="set" <email address hidden>" to="default-global-system-config:csol2-node11/config"> <config> <update> <node type="routing-instance"> <name>default-domain:obs-sc:net-right:service-9f0454e8-e5e0-4dc1-91f5-41662750b95d-default-domain_obs-sc_obs-in-net1</name> <routing-instance-has-pnf>false</routing-instance-has-pnf> </node> <link> <node type="virtual-network"> <name>default-domain:obs-sc:net-right</name> </node> <node type="routing-instance"> <name>default-domain:obs-sc:net-right:service-9f0454e8-e5e0-4dc1-91f5-41662750b95d-default-domain_obs-sc_obs-in-net1</name> </node> <metadata type="virtual-network-routing-instance" /> </link> </update> </config> </iq> $ controller/src/vnsw/agent/controller/controller_init.cc 839
2016-02-26 15:52:53.383 AgentXmppMessage: Received xmpp message from: 172.16.180.8 Port 5269 Size: 357 Packet: <?xml version="1.0"?> <iq type="set" <email address hidden>" to="default-global-system-config:csol2-node11/config"> <config> <update> <node type="routing-instance"> <name>default-domain:obs-sc:net-right:service-9f0454e8-e5e0-4dc1-91f5-41662750b95d-default-domain_obs-sc_obs-in-net1</name> </node> </update> </config> </iq> $ controller/src/vnsw/agent/controller/controller_init.cc 839
2016-02-26 15:52:53.436 AgentXmppMessage: Received xmpp message from: 172.16.180.8 Port 5269 Size: 835 Packet: <?xml version="1.0"?> <iq type="set" <email address hidden>" to="default-global-system-config:csol2-node11/config"> <config> <update> <node type="routing-instance"> <name>default-domain:obs-sc:net-left:service-9f0454e8-e5e0-4dc1-91f5-41662750b95d-default-domain_obs-sc_obs-in-net1</name> <service-chain-information> <routing-instance>default-domain:obs-sc:net-right:net-right</routing-instance> <prefix>83.1.1.0/24</prefix> <prefix>84.1.1.0/24</prefix> <service-chain-address>81.1.1.6</service-chain-address> <service-instance>default-domain:obs-sc:obs-in-net1</service-instance> <source-routing-instance>default-domain:obs-sc:net-left:net-left</source-routing-instance> </service-chain-information> <static-route-entries /> </node> </update> </config> </iq> $ controller/src/vnsw/agent/controller/controller_init.cc 839
2016-02-26 15:52:53.436 AgentXmppMessage: Received xmpp message from: 172.16.180.8 Port 5269 Size: 807 Packet: <?xml version="1.0"?> <iq type="set" <email address hidden>" to="default-global-system-config:csol2-node11/config"> <config> <update> <node type="routing-instance"> <name>default-domain:obs-sc:net-right:service-9f0454e8-e5e0-4dc1-91f5-41662750b95d-default-domain_obs-sc_obs-in-net1</name> <service-chain-information> <routing-instance>default-domain:obs-sc:net-left:net-left</routing-instance> <prefix>81.1.1.0/24</prefix> <prefix>82.1.1.0/24</prefix> <service-chain-address>83.1.1.6</service-chain-address> <service-instance>...
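
When reading traces like the ones above, it can help to pull the service-chain-information out of each config update. The following is a minimal sketch using only the Python standard library. The sample payload is the 15:52:53.436 net-left update from the log with the XML declaration and the hidden `from` attribute dropped so that it parses; `service_chain_info` is a hypothetical helper, not an existing tool.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the net-left routing-instance update from the log above.
SAMPLE = """<iq type="set" to="default-global-system-config:csol2-node11/config">
<config><update><node type="routing-instance">
<name>default-domain:obs-sc:net-left:service-9f0454e8-e5e0-4dc1-91f5-41662750b95d-default-domain_obs-sc_obs-in-net1</name>
<service-chain-information>
<routing-instance>default-domain:obs-sc:net-right:net-right</routing-instance>
<prefix>83.1.1.0/24</prefix>
<prefix>84.1.1.0/24</prefix>
<service-chain-address>81.1.1.6</service-chain-address>
<service-instance>default-domain:obs-sc:obs-in-net1</service-instance>
<source-routing-instance>default-domain:obs-sc:net-left:net-left</source-routing-instance>
</service-chain-information>
<static-route-entries />
</node></update></config></iq>"""


def service_chain_info(iq_xml):
    """Map routing-instance name -> service chain details found in one update."""
    root = ET.fromstring(iq_xml)
    result = {}
    for node in root.iter("node"):
        if node.get("type") != "routing-instance":
            continue
        info = node.find("service-chain-information")
        if info is None:
            continue  # plain routing-instance update without chain details
        result[node.findtext("name")] = {
            "routing_instance": info.findtext("routing-instance"),
            "prefixes": [p.text for p in info.findall("prefix")],
            "service_chain_address": info.findtext("service-chain-address"),
            "service_instance": info.findtext("service-instance"),
            "source_routing_instance": info.findtext("source-routing-instance"),
        }
    return result


for name, details in service_chain_info(SAMPLE).items():
    print(name, details["prefixes"], details["service_chain_address"])
```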


amit surana (asurana-t)
tags: added: releasenote
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/17971
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/17972
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17972
Committed: http://github.org/Juniper/contrail-controller/commit/465d0b9d83148b4356b6b4987fc8b9ca319b33b8
Submitter: Zuul
Branch: R3.0

commit 465d0b9d83148b4356b6b4987fc8b9ca319b33b8
Author: Manish <email address hidden>
Date: Sun Feb 28 09:28:17 2016 +0530

Adding/deleting SI in chain results in traffic loss.

Problem:
While re-evaluating flow/rflow, the old reverse flows of flow/rflow were no
longer pointing back to them (they pointed to NULL) and hence were marked as
short flows, but this was not reported to stats_collector. As a result the flows
were left dangling until aging, which in turn held the respective VRF in deleted
state. After aging, the VRF was removed and the new VRF added, restoring traffic.

Solution:
Notify stats collector on marking as short flow.

Change-Id: Ibfac8308002e1f91bd423426b771d699fc9bb3f4
Closes-bug: 1549454
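
The mechanism in the commit message can be illustrated with a toy model. This is not the vrouter agent code; every class and function name below is hypothetical, and aging is left out so that only the missing notification is shown. The assumption is that a flow holds a reference on its VRF and that only the stats collector releases flows it has been told about.

```python
class Vrf:
    def __init__(self, name):
        self.name = name
        self.refcount = 0      # number of flows still holding this VRF
        self.deleted = False   # delete requested by config

    def is_gone(self):
        # A deleted VRF can only be removed once nothing references it.
        return self.deleted and self.refcount == 0


class Flow:
    def __init__(self, vrf):
        self.vrf = vrf
        self.short_flow = False
        vrf.refcount += 1


class StatsCollector:
    def __init__(self):
        self.pending = []

    def notify_short_flow(self, flow):
        self.pending.append(flow)

    def run(self):
        # Release every short flow the collector knows about.
        for flow in self.pending:
            flow.vrf.refcount -= 1
        self.pending.clear()


def mark_short_flow(flow, collector, notify):
    flow.short_flow = True
    if notify:  # the fix: tell the collector when a flow becomes short
        collector.notify_short_flow(flow)


def vrf_removed_after_reeval(notify):
    vrf = Vrf("old-si-vrf")
    flow = Flow(vrf)
    vrf.deleted = True  # SI removed from the chain
    collector = StatsCollector()
    mark_short_flow(flow, collector, notify)
    collector.run()
    return vrf.is_gone()


print("without notification:", vrf_removed_after_reeval(notify=False))
print("with notification:   ", vrf_removed_after_reeval(notify=True))
```

Without the notification the old VRF stays in deleted state until something else (aging, in the real agent) releases the flow, which matches the multi-minute delay reported here.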

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/18085
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/18085
Submitter: Praveen K V (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/18085
Committed: http://github.org/Juniper/contrail-controller/commit/d2c75e1595b7149ab24443430d102b3baf439366
Submitter: Zuul
Branch: master

commit d2c75e1595b7149ab24443430d102b3baf439366
Author: Manish <email address hidden>
Date: Sun Feb 28 09:28:17 2016 +0530

Adding/deleting SI in chain results in traffic loss.

Problem:
While re-evaluating flow/rflow, the old reverse flows of flow/rflow were no
longer pointing back to them (they pointed to NULL) and hence were marked as
short flows, but this was not reported to stats_collector. As a result the flows
were left dangling until aging, which in turn held the respective VRF in deleted
state. After aging, the VRF was removed and the new VRF added, restoring traffic.

Solution:
Notify stats collector on marking as short flow.

(cherry picked from commit 74e0dff2e00ffdf55cdbfbcbe74544e169453daf)
Closes-bug: 1549454
Change-Id: Ibfac8308002e1f91bd423426b771d699fc9bb3f4
