IPv6 traffic failed after vSRX2.0 failover, vRouter still sends traffic to old active node

Bug #1592119 reported by zongwang@juniper.net
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)

Series     Status         Importance  Assigned to        Milestone
R3.0       Fix Committed  High        Anand H. Krishnan
R3.0.2.x   Fix Committed  High        Anand H. Krishnan
R3.1       Fix Committed  High        Anand H. Krishnan
Trunk      Fix Committed  High        Anand H. Krishnan

Bug Description

Please keep this private, since this is Juniper's own lab.

We currently use allowed-address-pairs to support vSRX HA in Contrail. We previously hit a problem with vSRX 2.0 HA failover on the Contrail server; at that time the issue disappeared after we created a new HA setup. It has now been reported again in QA's setup. The tcpdump on the tap interface shows that ICMPv6 neighbor advertisements are sent out from the vSRX instance, but the vRouter does not seem to refresh its forwarding table: it does not detect the failover and still sends the IPv6 traffic to the old active node.
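
For reference, here is a minimal Python sketch of what an unsolicited neighbor advertisement of this kind looks like on the wire, following RFC 4861. It is only an illustration, not taken from vSRX or Contrail code. It assumes the message carries only the Override flag plus a target link-layer address option; the function names and the addresses in the usage lines are placeholders chosen for the example.

```python
import ipaddress
import struct

ICMPV6_NEXT_HEADER = 58
NEIGHBOR_ADVERTISEMENT = 136
OVERRIDE_FLAG = 0x20000000  # "O" bit of the NA flags word (RFC 4861)


def icmpv6_checksum(src, dst, payload):
    """Internet checksum over the IPv6 pseudo-header plus the ICMPv6 message."""
    pseudo = src.packed + dst.packed + struct.pack(
        "!I3xB", len(payload), ICMPV6_NEXT_HEADER)
    data = pseudo + payload
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_unsolicited_na(src_ip, target_ip, mac):
    """Build the ICMPv6 part of an NA addressed to all-nodes (ff02::1)."""
    src = ipaddress.IPv6Address(src_ip)
    dst = ipaddress.IPv6Address("ff02::1")
    target = ipaddress.IPv6Address(target_ip)
    lladdr = bytes.fromhex(mac.replace(":", ""))
    # type, code, checksum placeholder, flags word
    body = struct.pack("!BBHI", NEIGHBOR_ADVERTISEMENT, 0, 0, OVERRIDE_FLAG)
    body += target.packed
    # option type 2 (target link-layer address), length 1 unit of 8 bytes
    body += struct.pack("!BB6s", 2, 1, lladdr)
    csum = icmpv6_checksum(src, dst, body)
    return body[:2] + struct.pack("!H", csum) + body[4:]


# Placeholder addresses, for illustration only.
na = build_unsolicited_na("1000::3", "1000::3", "00:10:db:ff:80:00")
print(len(na))  # 32: 8-byte header/flags + 16-byte target + 8-byte option
```

Because the destination is the all-nodes multicast group rather than a specific neighbor, every listener on the link, including the vRouter, would have to pick the message up for the new owner of the address to be learned.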

Contrail https://spg86-centos.spglab.juniper.net:8143
OpenStack http://spg86-centos.spglab.juniper.net/dashboard/
Username: admin
Password: contrail123

PC1 (centos-pc1) ipv4 1.1.1.4 ipv6 1000::4
PC2 (centos-pc2) ipv4 10.1.1.4 ipv6 1100::4

vSRX cluster HA-D50-Simon
ipv4 left 1.1.1.3 right 10.1.1.3
ipv6 left 1000::3 right 1100::3

From PC1 we can ping 1100::4. When I then do a failover on the vSRX for RG1, traffic stops for about 10 minutes and then resumes.

You can access the PCs via OpenStack; their instance names are centos-pc1 and centos-pc2.

How to do a failover on the vSRX:
#request chassis cluster failover reset redundancy-group 1
#request chassis cluster failover redundancy-group 1 node 0

I have kept the last working setup; you can find it at the link below (I didn't see any difference in the setup or configuration):
https://nyesx11.spglab.juniper.net:8143/#p=config_sc_svcInstances
HA setup: HA-0512-ipv6
PC: centos-left
PC: centos-right

Here is the tcpdump from another testbed; exactly the same neighbor advertisement packets are sent out there.

18:01:56.546088 00:10:db:ff:80:00 (oui Unknown) > 33:33:00:00:00:01 (oui Unknown), ethertype IPv6 (0x86dd), length 86: 9000::10 > ff02::1: ICMP6, neighbor advertisement, tgt is 9000::10, length 32
18:01:57.542663 00:10:db:ff:80:00 (oui Unknown) > 33:33:00:00:00:01 (oui Unknown), ethertype IPv6 (0x86dd), length 86: 9000::10 > ff02::1: ICMP6, neighbor advertisement, tgt is 9000::10, length 32
18:01:58.540912 00:10:db:ff:80:00 (oui Unknown) > 33:33:00:00:00:01 (oui Unknown), ethertype IPv6 (0x86dd), length 86: 9000::10 > ff02::1: ICMP6, neighbor advertisement, tgt is 9000::10, length 32
18:01:59.536963 00:10:db:ff:80:00 (oui Unknown) > 33:33:00:00:00:01 (oui Unknown), ethertype IPv6 (0x86dd), length 86: 9000::10 > ff02::1: ICMP6, neighbor advertisement, tgt is 9000::10, length 32

18:05:13.798424 00:10:db:ff:80:00 (oui Unknown) > 02:05:3f:2c:c8:30 (oui Unknown), ethertype IPv6 (0x86dd), length 118: 1000::11 > 9000::11: ICMP6, echo request, seq 3492, length 64
18:05:14.798618 00:10:db:ff:80:00 (oui Unknown) > 02:05:3f:2c:c8:30 (oui Unknown), ethertype IPv6 (0x86dd), length 118: 1000::11 > 9000::11: ICMP6, echo request, seq 3493, length 64
18:05:15.177313 00:00:5e:00:01:00 (oui Unknown) > 33:33:00:00:00:01 (oui Unknown), ethertype IPv6 (0x86dd), length 110: fe80::5e00:100 > ff02::1: ICMP6, router advertisement, length 56
18:05:15.798598 00:10:db:ff:80:00 (oui Unknown) > 02:05:3f:2c:c8:30 (oui Unknown), ethertype IPv6 (0x86dd), length 118: 1000::11 > 9000::11: ICMP6, echo request, seq 3494, length 64
18:05:16.798765 00:10:db:ff:80:00 (oui Unknown) > 02:05:3f:2c:c8:30 (oui Unknown), ethertype IPv6 (0x86dd), length 118: 1000::11 > 9000::11: ICMP6, echo request, seq 3495, length 64
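
To check a longer capture for the same pattern, a small Python sketch like the one below could pick out the neighbor advertisements sent to the all-nodes group. It assumes the tcpdump line layout shown above; the regular expression and the function names are made up for this example. Treating an NA to ff02::1 as unsolicited is a heuristic, since RFC 4861 also allows a solicited reply to go to all-nodes when the solicitation came from the unspecified address. The multicast MAC mapping (33:33 followed by the low 32 bits of the group address) is the one defined in RFC 2464.

```python
import ipaddress
import re


def multicast_mac(addr):
    """Map an IPv6 multicast address to its Ethernet MAC (33:33 + low 32 bits)."""
    ip = ipaddress.IPv6Address(addr)
    if not ip.is_multicast:
        raise ValueError(f"{addr} is not an IPv6 multicast address")
    return "33:33:" + ":".join(f"{b:02x}" for b in ip.packed[-4:])


# Matches tcpdump -e output lines of the form shown in the capture above.
NA_LINE = re.compile(
    r"^(?P<time>\S+) (?P<src_mac>[0-9a-f:]{17}) .*? > "
    r"(?P<dst_mac>[0-9a-f:]{17}) .*?: "
    r"(?P<src_ip>\S+) > (?P<dst_ip>[0-9a-f:]+): "
    r"ICMP6, neighbor advertisement, tgt is (?P<target>[0-9a-f:]+),"
)


def parse_all_nodes_na(line):
    """Return the parsed fields if the line is an NA sent to ff02::1, else None."""
    match = NA_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if ipaddress.IPv6Address(fields["dst_ip"]) != ipaddress.IPv6Address("ff02::1"):
        return None
    if fields["dst_mac"] != multicast_mac(fields["dst_ip"]):
        return None
    return fields


sample = ("18:01:56.546088 00:10:db:ff:80:00 (oui Unknown) > "
          "33:33:00:00:00:01 (oui Unknown), ethertype IPv6 (0x86dd), length 86: "
          "9000::10 > ff02::1: ICMP6, neighbor advertisement, "
          "tgt is 9000::10, length 32")
info = parse_all_nodes_na(sample)
print(info["src_mac"], info["target"])
```

In the capture above, the sender MAC and the advertised target are the same in every advertisement, while the echo requests that follow are unicast frames and are not matched by this filter.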

Setup information:
[root@spg86-centos ~]# openstack-status
== Nova services ==
openstack-nova-api: active
openstack-nova-cert: dead
openstack-nova-compute: active
openstack-nova-network: active (disabled on boot)
openstack-nova-scheduler: active
openstack-nova-conductor: active
== Glance services ==
openstack-glance-api: active
openstack-glance-registry: active
== Keystone service ==
openstack-keystone: active
== Horizon service ==
openstack-dashboard: active
== neutron services ==
neutron-server: active
neutron-dhcp-agent: active (disabled on boot)
neutron-l3-agent: failed (disabled on boot)
neutron-metadata-agent: active (disabled on boot)
== Cinder services ==
openstack-cinder-api: dead
openstack-cinder-scheduler: dead
openstack-cinder-volume: failed (disabled on boot)
openstack-cinder-backup: failed (disabled on boot)
== Heat services ==
openstack-heat-api: failed (disabled on boot)
openstack-heat-api-cfn: inactive (disabled on boot)
openstack-heat-api-cloudwatch: inactive (disabled on boot)
openstack-heat-engine: active (disabled on boot)
== Support services ==
mysqld: active (disabled on boot)
libvirtd: active
dbus: active
rabbitmq-server: active
memcached: active
== Keystone users ==
Warning keystonerc not sourced
[root@spg86-centos ~]# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
unix:///tmp/supervisord_vrouter.sockno

== Contrail Control ==
supervisor-control: active
unix:///tmp/supervisord_control.sockno

== Contrail Analytics ==
supervisor-analytics: active
unix:///tmp/supervisord_analytics.sockno

== Contrail Config ==
supervisor-config: active
unix:///tmp/supervisord_config.sockno

== Contrail Web UI ==
supervisor-webui: active
contrail-webui active
contrail-webui-middleware active

== Contrail Database ==
contrail-database: active
supervisor-database: active
unix:///tmp/supervisord_database.sockno

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

[root@spg86-centos ~]# contrail-version
Package Version Build-ID | Repo | RPM Name
-------------------------------------- ------------------------------ ----------------------------------
contrail-analytics 3.0.1.0-23.el7.centos 23
contrail-config 3.0.1.0-23.el7.centos 23
contrail-config-openstack 3.0.1.0-23.el7.centos 23
contrail-control 3.0.1.0-23.el7.centos 23
contrail-database 3.0.1.0-23.el7.centos 23
contrail-dns 3.0.1.0-23.el7.centos 23
contrail-docs 3.0.1.0-23.el7.centos 23
contrail-fabric-utils 3.0.1.0-23 23
contrail-heat 3.0.1.0-23.el7.centos 23
contrail-install-packages 3.0.1.0-23~kilo.el7.centos contrail-install-packages-3.0.1.0-23~centos71kilo.el7.centos.noarch
contrail-lib 3.0.1.0-23.el7.centos 23
contrail-nodemgr 3.0.1.0-23.el7.centos 23
contrail-nova-networkapi 3.0.1.0-231.el7.centos 23
contrail-openstack 3.0.1.0-23.el7.centos 23
contrail-openstack-analytics 3.0.1.0-23.el7.centos 23
contrail-openstack-config 3.0.1.0-23.el7.centos 23
contrail-openstack-control 3.0.1.0-23.el7.centos 23
contrail-openstack-database 3.0.1.0-23.el7.centos 23
contrail-openstack-vrouter 3.0.1.0-23.el7.centos 23
contrail-openstack-webui 3.0.1.0-23.el7.centos 23
contrail-setup 3.0.1.0-23.el7.centos 23
contrail-utils 3.0.1.0-23.el7.centos 23
contrail-vrouter 3.0.1.0-23.el7.centos 23
contrail-vrouter-agent 3.0.1.0-23.el7.centos 23
contrail-vrouter-common 3.0.1.0-23.el7.centos 23
contrail-vrouter-init 3.0.1.0-23.el7.centos 23
contrail-vrouter-utils 3.0.1.0-23.el7.centos 23
contrail-web-controller 3.0.1.0-23 23
contrail-web-core 3.0.1.0-23 23
neutron-plugin-contrail 3.0.1.0-23.el7.centos 23
python-contrail 3.0.1.0-23.el7.centos 23
python-contrail-vrouter-api 3.0.1.0-23.el7.centos 23
python-opencontrail-vrouter-netns 3.0.1.0-23.el7.centos 23
[root@spg86-centos ~]# uname -a
Linux spg86-centos 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Revision history for this message
zongwang@juniper.net (zongwang) wrote :

I forgot to mention that this problem only happens with IPv6 traffic; IPv4 works fine. It is also reproducible.

Changed in juniperopenstack:
importance: Undecided → High
tags: added: service-chain vrouter
Revision history for this message
zongwang@juniper.net (zongwang) wrote :

Can you help assign this bug to somebody? This is a critical PR for vSRX HA support on Contrail and customers are waiting for the release, so please treat it as high priority. Thanks a lot.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/21376
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21376
Committed: http://github.org/Juniper/contrail-controller/commit/6d3008cb1e0fa8bd8316398959785f45e0eef218
Submitter: Zuul
Branch: master

commit 6d3008cb1e0fa8bd8316398959785f45e0eef218
Author: ashoksingh <email address hidden>
Date: Thu Jun 23 17:20:07 2016 +0530

Add route to trap neighbor advertisement messages to Agent.

Issue:
In case of IPv6 Allowed Address Pair, when switchover happens to new interface,
the traffic does not move to new interface. As part of switchover, Unsolicited
Neighbor Advertisement message from new interface is not trapped to agent.

Fix
Agent now adds a route to trap IPv6 All Nodes Multicast address in each VRF so
that Unsolicited Neighbor Advertisement message reaches agent. Agent, as part
of this message processing updates the Allowed Address pair state machine so
that route preference is updated for new interface and traffic starts flowing
from new interface. Also define separate counters in agent for Solicited and
Unsolicited Neighbor advertisement message.
Also update the UT.

Change-Id: Ibadaa5cbe84d973b6e284f7a412d9c583b363d32
Closes-Bug: #1592119
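
The behaviour described in this commit message can be pictured with the toy Python model below. It is not the agent code: the class, the interface names and the preference values are invented for the illustration. The only assumption is that an NA received from an interface holding the allowed address pair raises that interface's route preference and lowers the others. The two counters mirror the separate solicited and unsolicited counters mentioned in the message.

```python
from dataclasses import dataclass, field

ALL_NODES = "ff02::1"
LOW, HIGH = 100, 200  # hypothetical preference values


@dataclass
class AapPreferenceModel:
    """Toy model: per allowed-address-pair IP, a preference for each interface."""
    paths: dict = field(default_factory=dict)
    solicited_na: int = 0
    unsolicited_na: int = 0

    def add_path(self, aap_ip, interface, preference=LOW):
        self.paths.setdefault(aap_ip, {})[interface] = preference

    def on_neighbor_advert(self, interface, target_ip, dst_ip):
        # An NA sent to the all-nodes group is counted as unsolicited.
        if dst_ip == ALL_NODES:
            self.unsolicited_na += 1
        else:
            self.solicited_na += 1
        candidates = self.paths.get(target_ip)
        if candidates is None or interface not in candidates:
            return  # NA for an address or interface we do not track
        for name in candidates:
            candidates[name] = HIGH if name == interface else LOW

    def active_interface(self, aap_ip):
        candidates = self.paths[aap_ip]
        return max(candidates, key=candidates.get)


model = AapPreferenceModel()
model.add_path("1000::3", "tap-node0", HIGH)  # old active node
model.add_path("1000::3", "tap-node1", LOW)   # standby node
before = model.active_interface("1000::3")
model.on_neighbor_advert("tap-node1", "1000::3", ALL_NODES)  # failover NA
after = model.active_interface("1000::3")
print(before, "->", after)
```

Without the trap route, the `on_neighbor_advert` step would never run in this model, which corresponds to traffic staying on the old active node as reported in this bug.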

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/21416
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
Ashok Singh (ashoksr) wrote :

Have submitted contrail-vrouter-agent changes. Assigning to Anand for vrouter changes.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/21416
Committed: http://github.org/Juniper/contrail-controller/commit/de1f03eb1d2902ba37e4883a99b0d643a80db8ef
Submitter: Zuul
Branch: R3.0

commit de1f03eb1d2902ba37e4883a99b0d643a80db8ef
Author: ashoksingh <email address hidden>
Date: Thu Jun 23 17:20:07 2016 +0530

Add route to trap neighbor advertisement messages to Agent.

Issue:
In case of IPv6 Allowed Address Pair, when switchover happens to new interface,
the traffic does not move to new interface. As part of switchover, Unsolicited
Neighbor Advertisement message from new interface is not trapped to agent.

Fix
Agent now adds a route to trap IPv6 All Nodes Multicast address in each VRF so
that Unsolicited Neighbor Advertisement message reaches agent. Agent, as part
of this message processing updates the Allowed Address pair state machine so
that route preference is updated for new interface and traffic starts flowing
from new interface. Also define separate counters in agent for Solicited and
Unsolicited Neighbor advertisement message.
Also update the UT.

Closes-Bug: #1592119
(cherry picked from commit 6d3008cb1e0fa8bd8316398959785f45e0eef218)

Change-Id: Ic46a99cd096c03d6af23c8fc50cac4bd2c70d0b1

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/22548
Submitter: Anand H. Krishnan (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/22548
Committed: http://github.org/Juniper/contrail-vrouter/commit/f48acc642306a0718aa63d03e2bcfc4881a25406
Submitter: Zuul
Branch: R3.1

commit f48acc642306a0718aa63d03e2bcfc4881a25406
Author: Anand H. Krishnan <email address hidden>
Date: Thu Jul 28 13:58:43 2016 +0530

Trap & Flood neighbor advertisements

Change-Id: I822e1f1d4ca8f43bd648383a46f3f5bb30d0274a
Closes-Bug: #1592119

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/24073
Submitter: Anand H. Krishnan (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/24074
Submitter: Anand H. Krishnan (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0.2.x

Review in progress for https://review.opencontrail.org/24076
Submitter: Anand H. Krishnan (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/24073
Committed: http://github.org/Juniper/contrail-vrouter/commit/30519e3ee90acc47609a7df2807f2fc0fb9e595f
Submitter: Zuul
Branch: master

commit 30519e3ee90acc47609a7df2807f2fc0fb9e595f
Author: Anand H. Krishnan <email address hidden>
Date: Thu Jul 28 13:58:43 2016 +0530

Trap & Flood neighbor advertisements

Change-Id: I822e1f1d4ca8f43bd648383a46f3f5bb30d0274a
Closes-Bug: #1592119

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/24074
Committed: http://github.org/Juniper/contrail-vrouter/commit/931d74e3e6fb04beaad7e75327feb7885c3ba43e
Submitter: Zuul
Branch: R3.0

commit 931d74e3e6fb04beaad7e75327feb7885c3ba43e
Author: Anand H. Krishnan <email address hidden>
Date: Thu Jul 28 13:58:43 2016 +0530

Trap & Flood neighbor advertisements

Change-Id: I822e1f1d4ca8f43bd648383a46f3f5bb30d0274a
Closes-Bug: #1592119

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0.2.x

Review in progress for https://review.opencontrail.org/24123
Submitter: Ashok Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/24076
Committed: http://github.org/Juniper/contrail-vrouter/commit/859f4b11c03cbce19ac0e56d595dafbddd758d37
Submitter: Zuul
Branch: R3.0.2.x

commit 859f4b11c03cbce19ac0e56d595dafbddd758d37
Author: Anand H. Krishnan <email address hidden>
Date: Thu Jul 28 13:58:43 2016 +0530

Trap & Flood neighbor advertisements

Change-Id: I822e1f1d4ca8f43bd648383a46f3f5bb30d0274a
Closes-Bug: #1592119

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/24123
Committed: http://github.org/Juniper/contrail-controller/commit/991fe4a7a1aba74ace881f6471d636146b6aad20
Submitter: Zuul
Branch: R3.0.2.x

commit 991fe4a7a1aba74ace881f6471d636146b6aad20
Author: ashoksingh <email address hidden>
Date: Thu Jun 23 17:20:07 2016 +0530

Add route to trap neighbor advertisement messages to Agent.

Issue:
In case of IPv6 Allowed Address Pair, when switchover happens to new interface,
the traffic does not move to new interface. As part of switchover, Unsolicited
Neighbor Advertisement message from new interface is not trapped to agent.

Fix
Agent now adds a route to trap IPv6 All Nodes Multicast address in each VRF so
that Unsolicited Neighbor Advertisement message reaches agent. Agent, as part
of this message processing updates the Allowed Address pair state machine so
that route preference is updated for new interface and traffic starts flowing
from new interface. Also define separate counters in agent for Solicited and
Unsolicited Neighbor advertisement message.
Also update the UT.

Closes-Bug: #1592119
(cherry picked from commit 6d3008cb1e0fa8bd8316398959785f45e0eef218)

Change-Id: I2363856cb7d1d57e073573140bafc13590031b40

information type: Proprietary → Public