3.1.1-85:Agent programs the olist but not the source-label

Bug #1724114 reported by Sandeep Sridhar
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)

Series  Status         Importance  Assigned to   Milestone
R3.1    Fix Committed  Critical    Manish Singh
R3.2    Fix Committed  Critical    Manish Singh
R4.0    Fix Committed  Critical    Manish Singh
R4.1    Fix Committed  Critical    Manish Singh
Trunk   Fix Committed  Critical    Manish Singh

Bug Description

Nischal and Ananth debugged the control-node core and verified that the ribout (what was sent) to the agent is correct and matches what's in the tree.
Also, contrail-control sent the olist and the source-label together. So the fact that the agent has the olist programmed but not the source-label indicates that the agent most likely either never programmed the source-label or deleted it for some reason.

Requesting the vrouter-agent team to take a look and let us know what went wrong.

(gdb) p *$adv
$47 = {
  slist_node = {
    <boost::intrusive::detail::generic_hook<boost::intrusive::get_slist_node_algo<void*>, boost::intrusive::member_tag, (boost::intrusive::link_mode_type)1, 0>> = {
      <boost::intrusive::detail::no_default_definer> = {<No data fields>},
      <boost::intrusive::slist_node<void*>> = {
        next_ = 0x0
      }, <No data fields>}, <No data fields>},
  bitset = {
    <BitSet> = {
      static npos = 18446744073709551615,
      blocks_ = std::vector of length 1, capacity 1 = {16}
    }, <No data fields>},
  roattr = {
    attr_out_ = (boost::intrusive_ptr<BgpAttr const>) 0x7f6ac1087160,
    nexthop_list_ = std::vector of length 1, capacity 1 = {{
        address_ = <IPv4 0.0.0.0 >,
        label_ = 192707,
        origin_vn_index_ = 21,
        encap_ = std::vector of length 0, capacity 0
      }},
    is_xmpp_ = true,
    vrf_originated_ = true,
    repr_ = ""
  }
}

(gdb) p *(BgpOList *)0x7f6ac0105280
$50 = (BgpOList) {
  _vptr.BgpOList = 0xd93ad0 <vtable for BgpOList+16>,
  elements_ = std::vector of length 1, capacity 1 = {0x7f6ac32d0340},
  refcount_ = (tbb::atomic) 1,
  olist_db_ = 0x2596b60,
  olist_spec_ = {
    <BgpAttribute> = {
      <ParseObject> = {
        _vptr.ParseObject = 0xd93a90 <vtable for BgpOListSpec+16>
      },
      members of BgpAttribute:
      static FLAG_MASK = 192 '\300',
      code = 0 '\000',
      subcode = 1 '\001',
      flags = 0 '\000'
    },
    members of BgpOListSpec:
    static kSize = 0,
    elements = std::vector of length 1, capacity 1 = {{
        address = <<172.23.10.201>>,
        label = 192619,
        encap = std::vector of length 2, capacity 2 = {"gre", "udp"}
      }}
  }
}

Tags: vrouter
Changed in juniperopenstack:
assignee: nobody → Hari Prasad Killi (haripk)
importance: Undecided → Critical
Changed in juniperopenstack:
milestone: none → r3.1.4.0
information type: Proprietary → Public
Revision history for this message
vivekananda shenoy (vshenoy83) wrote :

Hi Hari,

Any updates on this bug?

Regards,
Vivek

tags: added: vrouter
Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Manish Singh (manishs)
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/39893
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.2

Review in progress for https://review.opencontrail.org/39894
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/39895
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.0

Review in progress for https://review.opencontrail.org/39896
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/39897
Submitter: Manish Singh (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/39894
Committed: http://github.com/Juniper/contrail-controller/commit/4f1ec8c0fce98bfe47b11600543b459b74f08319
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit 4f1ec8c0fce98bfe47b11600543b459b74f08319
Author: Manish <email address hidden>
Date: Tue Feb 20 12:21:52 2018 +0530

FMG label missed.

An FMG label can transiently be assigned to multiple trees because of control
node restarts or flaps. Say tree-1 is using label A and the label was supposed
to be withdrawn or marked stale. The CN re-uses label A and assigns it to
tree-2. The XMPP messages withdrawing label A from tree-1 and adding it to
tree-2 can arrive in any order, so in the problematic case the add for tree-2
is seen before the withdrawal. The withdrawal then removes label A even though
tree-2 is an active user.

Solution:
Maintain a list of users for each label. The label remains intact until the
list is empty. In the above case tree-2 is still present in the list after
tree-1 is withdrawn, so the label remains intact.

Change-Id: Ia2bb22d0c958355a5c2295709fd5eb90c1a79a65
CLoses-bug: #1724114
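
The user-list idea described in the commit message above can be illustrated with a minimal C++ sketch. This is not the vrouter-agent code: the class name `LabelUsers`, its methods, and the use of tree names as user identifiers are all assumptions made for illustration. It only shows why an add for tree-2 that arrives before the withdrawal for tree-1 no longer causes the label to be removed.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch; names and types are illustrative, not the agent's API.
class LabelUsers {
public:
    // Record that `tree` uses `label`.
    // Returns true if this is the first user (label must be programmed now).
    bool Add(uint32_t label, const std::string &tree) {
        assert(!tree.empty());
        std::set<std::string> &users = users_[label];
        const bool first_user = users.empty();
        users.insert(tree);
        return first_user;
    }

    // Drop `tree` as a user of `label`.
    // Returns true only when no users remain and the label can be removed.
    bool Withdraw(uint32_t label, const std::string &tree) {
        auto it = users_.find(label);
        if (it == users_.end()) {
            return false;  // unknown label, nothing to remove
        }
        it->second.erase(tree);
        if (!it->second.empty()) {
            return false;  // another tree still uses the label
        }
        users_.erase(it);
        return true;
    }

    // True while at least one tree is using the label.
    bool IsProgrammed(uint32_t label) const {
        return users_.count(label) != 0;
    }

private:
    std::map<uint32_t, std::set<std::string>> users_;
};
```

With this structure, the sequence add(A, tree-1), add(A, tree-2), withdraw(A, tree-1) leaves label A programmed, and only the later withdraw(A, tree-2) reports that the label can be deleted.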

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/39896
Committed: http://github.com/Juniper/contrail-controller/commit/3b003876a203f2c55d9b0d2293e5884bfc96a4d3
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 3b003876a203f2c55d9b0d2293e5884bfc96a4d3
Author: Manish <email address hidden>
Date: Tue Feb 20 12:34:50 2018 +0530

FMG label missed.

An FMG label can transiently be assigned to multiple trees because of control
node restarts or flaps. Say tree-1 is using label A and the label was supposed
to be withdrawn or marked stale. The CN re-uses label A and assigns it to
tree-2. The XMPP messages withdrawing label A from tree-1 and adding it to
tree-2 can arrive in any order, so in the problematic case the add for tree-2
is seen before the withdrawal. The withdrawal then removes label A even though
tree-2 is an active user.

Solution:
Maintain a list of users for each label. The label remains intact until the
list is empty. In the above case tree-2 is still present in the list after
tree-1 is withdrawn, so the label remains intact.

Change-Id: I5bbe817b9db2d9ac8ad23d812b88b343f94a09a3
CLoses-bug: #1724114

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/39895
Committed: http://github.com/Juniper/contrail-controller/commit/a98bb1aa505b52161d32c3268da94a95f8d03104
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit a98bb1aa505b52161d32c3268da94a95f8d03104
Author: Manish <email address hidden>
Date: Tue Feb 20 12:34:50 2018 +0530

FMG label missed.

An FMG label can transiently be assigned to multiple trees because of control
node restarts or flaps. Say tree-1 is using label A and the label was supposed
to be withdrawn or marked stale. The CN re-uses label A and assigns it to
tree-2. The XMPP messages withdrawing label A from tree-1 and adding it to
tree-2 can arrive in any order, so in the problematic case the add for tree-2
is seen before the withdrawal. The withdrawal then removes label A even though
tree-2 is an active user.

Solution:
Maintain a list of users for each label. The label remains intact until the
list is empty. In the above case tree-2 is still present in the list after
tree-1 is withdrawn, so the label remains intact.

Change-Id: I5bbe817b9db2d9ac8ad23d812b88b343f94a09a3
CLoses-bug: #1724114

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/39897
Committed: http://github.com/Juniper/contrail-controller/commit/2bac062e3cec032866b7a48eeaff4689a44fc497
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 2bac062e3cec032866b7a48eeaff4689a44fc497
Author: Manish <email address hidden>
Date: Tue Feb 20 12:34:50 2018 +0530

FMG label missed.

An FMG label can transiently be assigned to multiple trees because of control
node restarts or flaps. Say tree-1 is using label A and the label was supposed
to be withdrawn or marked stale. The CN re-uses label A and assigns it to
tree-2. The XMPP messages withdrawing label A from tree-1 and adding it to
tree-2 can arrive in any order, so in the problematic case the add for tree-2
is seen before the withdrawal. The withdrawal then removes label A even though
tree-2 is an active user.

Solution:
Maintain a list of users for each label. The label remains intact until the
list is empty. In the above case tree-2 is still present in the list after
tree-1 is withdrawn, so the label remains intact.

Change-Id: I5bbe817b9db2d9ac8ad23d812b88b343f94a09a3
CLoses-bug: #1724114

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/39893
Committed: http://github.com/Juniper/contrail-controller/commit/6f8d7b4ae781157a034a785934ede0c4fc753a61
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit 6f8d7b4ae781157a034a785934ede0c4fc753a61
Author: Manish <email address hidden>
Date: Tue Feb 20 12:21:52 2018 +0530

FMG label missed.

An FMG label can transiently be assigned to multiple trees because of control
node restarts or flaps. Say tree-1 is using label A and the label was supposed
to be withdrawn or marked stale. The CN re-uses label A and assigns it to
tree-2. The XMPP messages withdrawing label A from tree-1 and adding it to
tree-2 can arrive in any order, so in the problematic case the add for tree-2
is seen before the withdrawal. The withdrawal then removes label A even though
tree-2 is an active user.

Solution:
Maintain a list of users for each label. The label remains intact until the
list is empty. In the above case tree-2 is still present in the list after
tree-1 is withdrawn, so the label remains intact.

Change-Id: Ia2bb22d0c958355a5c2295709fd5eb90c1a79a65
CLoses-bug: #1724114

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/47233
Committed: http://github.com/Juniper/contrail-controller/commit/118e005ebe1ea91d0a58b32bc8215b1d741c5f40
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 118e005ebe1ea91d0a58b32bc8215b1d741c5f40
Author: Pramodh D'Souza <email address hidden>
Date: Mon Oct 22 12:42:46 2018 -0700

Fix label allocation for multicast

During BFD scale tests a crash was observed when attempting to
free an MPLS label. It turns out that a local interface had been
allocated a label in the multicast range.
Root cause: during the allocation of label blocks per control node,
the last label in the block wasn't actually reserved. This label was
then allocated locally on the agent for a VMI. When the agent attempts
to free the label, the check for the multicast range succeeds; it then
tries to look up the VRF name from the label data (changes made in
bug #1724114), but the label data is not set in this case, which
causes the crash.

Fix: Correct the label block allocation for multicast to reserve all
labels in the range.

Change-Id: I47feb2a23b721a616b879d5e542fb2f80348d4fa
Closes-Bug: 1798483
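
The off-by-one described in the commit message above can be sketched as follows. This is a hypothetical illustration, not the agent's label allocator: `LabelTable`, `ReserveMulticastBlock`, and `AllocLocal` are invented names, and the block boundaries in any usage are arbitrary. The point is only that the reservation loop has to include the last label of the block (`<=` rather than `<`), otherwise a local allocation can land inside the multicast range.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical sketch; not the agent's real label table.
class LabelTable {
public:
    // Reserve every label in the inclusive range [first, last] for multicast.
    void ReserveMulticastBlock(uint32_t first, uint32_t last) {
        assert(first <= last);
        assert(last <= 0xFFFFF);  // MPLS labels are 20 bits wide
        // Note "<=": stopping at "< last" would leave the last label of the
        // block unreserved, which is the failure mode described above.
        for (uint32_t label = first; label <= last; ++label) {
            reserved_.insert(label);
        }
    }

    // Allocate the lowest label >= start that is neither reserved for
    // multicast nor already handed out locally.
    uint32_t AllocLocal(uint32_t start) {
        uint32_t label = start;
        while (reserved_.count(label) != 0 || local_.count(label) != 0) {
            ++label;
        }
        local_.insert(label);
        return label;
    }

    // True if the label belongs to a reserved multicast block.
    bool IsMulticast(uint32_t label) const {
        return reserved_.count(label) != 0;
    }

private:
    std::set<uint32_t> reserved_;
    std::set<uint32_t> local_;
};
```

Once the whole block is reserved, a local allocation that starts inside the block skips past its end, so a label handed to a VMI never passes the multicast-range check on free.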

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/47197
Committed: http://github.com/Juniper/contrail-controller/commit/7df5a97c1aa5b5e87eb50bc5139daaa0824a35bf
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit 7df5a97c1aa5b5e87eb50bc5139daaa0824a35bf
Author: Pramodh D'Souza <email address hidden>
Date: Mon Oct 22 12:42:46 2018 -0700

Fix label allocation for multicast

During BFD scale tests a crash was observed when attempting to
free an MPLS label. It turns out that a local interface had been
allocated a label in the multicast range.
Root cause: during the allocation of label blocks per control node,
the last label in the block wasn't actually reserved. This label was
then allocated locally on the agent for a VMI. When the agent attempts
to free the label, the check for the multicast range succeeds; it then
tries to look up the VRF name from the label data (changes made in
bug #1724114), but the label data is not set in this case, which
causes the crash.

Fix: Correct the label block allocation for multicast to reserve all
labels in the range.

Change-Id: I47feb2a23b721a616b879d5e542fb2f80348d4fa
Closes-Bug: 1798483

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/47232
Committed: http://github.com/Juniper/contrail-controller/commit/67d2bc7c38d918d1382c053166c02d1c3f964aad
Submitter: Zuul v3 CI (<email address hidden>)
Branch: R5.0

commit 67d2bc7c38d918d1382c053166c02d1c3f964aad
Author: Pramodh D'Souza <email address hidden>
Date: Mon Oct 22 12:42:46 2018 -0700

Fix label allocation for multicast

During BFD scale tests a crash was observed when attempting to
free an MPLS label. It turns out that a local interface had been
allocated a label in the multicast range.
Root cause: during the allocation of label blocks per control node,
the last label in the block wasn't actually reserved. This label was
then allocated locally on the agent for a VMI. When the agent attempts
to free the label, the check for the multicast range succeeds; it then
tries to look up the VRF name from the label data (changes made in
bug #1724114), but the label data is not set in this case, which
causes the crash.

Fix: Correct the label block allocation for multicast to reserve all
labels in the range.

Change-Id: I47feb2a23b721a616b879d5e542fb2f80348d4fa
Closes-Bug: 1798483
