vrouter crash during VM deletion: assert in "bool VrfEntry::DeleteTimeout()" at assert.c:101

Bug #1577517 reported by manishkn
This bug affects 3 people
Affects              Status           Importance   Assigned to    Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.20              Fix Committed    Critical     Manish Singh
  R2.21.x            Fix Committed    Critical     Manish Singh
  R2.22.x            Fix Committed    Critical     Manish Singh
  R3.0               Fix Committed    Critical     Manish Singh
  Trunk              Fix Committed    Critical     Manish Singh

Bug Description

I see a vrouter crash on all compute nodes during VM/VN deletion.

contrail-version: 2.22.2-10

Will copy the core to /cs-shared/bugs/

I see a similar crash in Launchpad; please let me know if you need a new bug for this issue.

Program terminated with signal SIGABRT, Aborted.
#0 0x00007f364e878cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f364e878cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f364e87c0d8 in __GI_abort () at abort.c:89
#2 0x00007f364e871b86 in __assert_fail_base (fmt=0x7f364e9c2830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x108faf5 "0",
    file=file@entry=0x109eca0 "controller/src/vnsw/agent/oper/vrf.cc", line=line@entry=333, function=function@entry=0x109ee60 "bool VrfEntry::DeleteTimeout()")
    at assert.c:92
#3 0x00007f364e871c32 in __GI___assert_fail (assertion=0x108faf5 "0", file=0x109eca0 "controller/src/vnsw/agent/oper/vrf.cc", line=333,
    function=0x109ee60 "bool VrfEntry::DeleteTimeout()") at assert.c:101
#4 0x0000000000a0a9dd in VrfEntry::DeleteTimeout() ()
#5 0x00000000010568e9 in Timer::TimerTask::Run() ()
#6 0x000000000105034c in TaskImpl::execute() ()
#7 0x00007f364f447b3a in ?? ()
#8 0x00007f3648087f28 in ?? ()
#9 0x00007f3648087f40 in ?? ()
#10 0x0000000000000001 in ?? ()
#11 0x00007f36480e3180 in ?? ()
#12 0x00007f3648087f28 in ?? ()
#13 0x0000000000000000 in ?? ()
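
Frames #2-#5 show the path to the abort: a timer task (Timer::TimerTask::Run) invokes VrfEntry::DeleteTimeout(), which hits assert(0) at controller/src/vnsw/agent/oper/vrf.cc:333 because the deleted VRF never finished draining. Below is a minimal sketch of that retry-then-assert pattern; all names, values, and the retry policy are invented for illustration and are not the agent's actual code.

// Illustrative sketch only -- class, field, and constant names are invented.
// A VRF marked for deletion is re-checked by a periodic timer; if references
// (routes, next-hops, labels) never drain, the agent asserts rather than
// leak the VRF, which is the abort captured in the backtrace above.
#include <cassert>

class VrfDeleteTimeoutSketch {
public:
    // Invoked from the timer task; returning true re-arms the timer.
    bool DeleteTimeout() {
        if (outstanding_refs_ == 0)
            return false;                 // everything drained, stop the timer
        if (++retries_ < kMaxRetries)
            return true;                  // try again on the next tick
        assert(0);                        // stuck reference -> SIGABRT seen here
        return false;
    }

    int outstanding_refs_ = 1;            // e.g. the CompositeNH references shown below

private:
    static const int kMaxRetries = 8;     // invented bound
    int retries_ = 0;
};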

root@sdkvse25:~# flow
Usage:flow [-f flow_index]
           [-d flow_index]
           [-i flow_index]
           [--mirror=mirror table index]
           [-l]
           [--show-evicted]
           [-r]
           [-s]

-f <flow_index> Set forward action for flow at flow_index <flow_index>
-d <flow_index> Set drop action for flow at flow_index <flow_index>
-i <flow_index> Invalidate flow at flow_index <flow_index>
--mirror Mirror index to mirror to
-l List flows
--show-evicted Show evicted flows too
-r Start dumping flow setup rate
-s Start dumping flow stats
--help Print this help
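
For reference, the flow state on a compute node at crash time can be captured with the flags listed above, for example:

    flow -l                   # list flows
    flow -l --show-evicted    # list flows, including evicted ones
    flow -s                   # start dumping flow stats

These invocations use only the options shown in the help output; the exact output format depends on the vrouter build.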

manishkn (manishkn)
Changed in juniperopenstack:
assignee: nobody → Hari Prasad Killi (haripk)
importance: Undecided → Critical
amit surana (asurana-t)
tags: added: soln
Jeba Paulaiyan (jebap)
information type: Proprietary → Public
Jeba Paulaiyan (jebap)
tags: added: blocker
Revision history for this message
Hari Prasad Killi (haripk) wrote :

(gdb) p ((VrfEntry *) 0x7f36182c1260)->refcount_
$3 = (tbb::atomic) 2

(gdb) p (DBEntry *) 0x7f363020a580
$4 = (CompositeNH *) 0x7f363020a570
(gdb) p (DBEntry *) 0x7f361427e210
$5 = (CompositeNH *) 0x7f361427e200
(gdb) p *$4
$6 = (CompositeNH) {
  <NextHop> = {
    <AgentRefCount<NextHop>> = {
      _vptr.AgentRefCount = 0x108ee90 <vtable for CompositeNH+16>,
      refcount_ = (tbb::atomic) 1
    },
    <AgentDBEntry> = {
      <DBEntry> = {
        <DBEntryBase> = {
          _vptr.DBEntryBase = 0x108ef18 <vtable for CompositeNH+152>,
          chg_list_ = <boost::intrusive_hook> next = 0x0 prev = 0x0,
          tpart_ = 0x7f36380152d0,
          state_ = std::map with 3 elements = {
            [0] = 0x7f36300c1830,
            [1] = 0x7f363021ce90,
            [2] = 0x7f3630104010
          },
          flags = 0 '\000',
          onremoveq_ = (tbb::atomic) false,
          last_change_at_ = 1462040507773241
        },
        members of DBEntry:
        node_ = <boost::intrusive_hook> parent = 0x7f36340d5c10 left = 0x7f3610191570 right = 0x7f361c102090
      },
      members of AgentDBEntry:
      flags_ = 0 '\000'
    },
    members of NextHop:
    static kInvalidIndex = 4294967295,
    type_ = NextHop::COMPOSITE,
    valid_ = true,
    policy_ = false,
    id_ = 883
  },
  members of CompositeNH:
  static kInvalidComponentNHIdx = 4294967295,
  composite_nh_type_ = Composite::L2COMP,
  component_nh_key_list_ = std::vector of length 1, capacity 1 = {(boost::shared_ptr<ComponentNHKey const>) (count 2, weak count 1) 0x7f36301bc400},
  component_nh_list_ = std::vector of length 1, capacity 1 = {(boost::shared_ptr<ComponentNH const>) (count 1, weak count 1) 0x7f36301fc9b0},
  vrf_ = (boost::intrusive_ptr<VrfEntry>) 0x7f36182c1260
}
(gdb) p *$5
$7 = (CompositeNH) {
  <NextHop> = {
    <AgentRefCount<NextHop>> = {
      _vptr.AgentRefCount = 0x108ee90 <vtable for CompositeNH+16>,
      refcount_ = (tbb::atomic) 1
    },
    <AgentDBEntry> = {
      <DBEntry> = {
        <DBEntryBase> = {
          _vptr.DBEntryBase = 0x108ef18 <vtable for CompositeNH+152>,
          chg_list_ = <boost::intrusive_hook> next = 0x0 prev = 0x0,
          tpart_ = 0x7f36380152d0,
          state_ = std::map with 3 elements = {
            [0] = 0x7f36141a07b0,
            [1] = 0x7f36142cfbd0,
            [2] = 0x7f361403b2f0
          },
          flags = 0 '\000',
          onremoveq_ = (tbb::atomic) false,
          last_change_at_ = 1462040507472091
        },
        members of DBEntry:
        node_ = <boost::intrusive_hook> parent = 0x7f36303442e0 left = 0x7f36142c8be0 right = 0x7f36402a0110
      },
      members of AgentDBEntry:
      flags_ = 0 '\000'
    },
    members of NextHop:
    static kInvalidIndex = 4294967295,
    type_ = NextHop::COMPOSITE,
    valid_ = true,
    policy_ = false,
    id_ = 884
  },
  members of CompositeNH:
  static kInvalidComponentNHIdx = 4294967295,
  composite_nh_type_ = Composite::EVPN,
  component_nh_key_list_ = std::vector of length 1, capacity 1 = {(boost::shared_ptr<ComponentNHKey const>) (count 3, weak count 1) 0x7f36141ded80},
  component_nh_list_ = std::vector of length ...
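
The gdb output above is the key evidence: the deleted VrfEntry at 0x7f36182c1260 still shows refcount_ == 2, and the two CompositeNH entries (L2COMP, id 883 and EVPN, id 884) each hold it through their vrf_ intrusive pointer, so the VRF can never finish deletion and DeleteTimeout() eventually asserts. Below is a simplified sketch of how such an intrusive-pointer member pins the refcount; the types are reduced to the essentials and the names are invented.

// Simplified illustration: each composite next-hop holds the VRF through an
// intrusive pointer, so the VRF's refcount_ cannot reach zero until every
// such next-hop is freed.
#include <boost/intrusive_ptr.hpp>
#include <tbb/atomic.h>

struct Vrf {
    tbb::atomic<int> refcount_;
    Vrf() { refcount_ = 0; }
};
inline void intrusive_ptr_add_ref(Vrf *v) { v->refcount_.fetch_and_increment(); }
inline void intrusive_ptr_release(Vrf *v) {
    if (v->refcount_.fetch_and_decrement() == 1) delete v;
}

struct CompositeNhSketch {
    boost::intrusive_ptr<Vrf> vrf_;   // mirrors CompositeNH::vrf_ in the dump
};

int main() {
    Vrf *vrf = new Vrf();
    CompositeNhSketch l2comp{ boost::intrusive_ptr<Vrf>(vrf) };   // cf. NH id 883
    CompositeNhSketch evpn{ boost::intrusive_ptr<Vrf>(vrf) };     // cf. NH id 884
    // refcount_ is now 2, matching "$3 = (tbb::atomic) 2" above; the VRF's
    // deletion cannot complete until both next-hops drop their reference.
    return 0;
}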


Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.0

Review in progress for https://review.opencontrail.org/20188
Submitter: Manish Singh (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/20191
Submitter: Hari Prasad Killi (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20191
Committed: http://github.org/Juniper/contrail-controller/commit/17dc2e2d3321a35a79f6a380f3e6d5d07f4b94cb
Submitter: Zuul
Branch: R2.22.x

commit 17dc2e2d3321a35a79f6a380f3e6d5d07f4b94cb
Author: Manish <email address hidden>
Date: Fri May 13 13:47:27 2016 +0530

Vrf pending because of mpls label.

Problem:
An MPLS label was left pending, pointing to a composite NH, and that NH holds
a reference to the VRF. The label was programmed because, while adding the
EVPN path to the multicast route, the VXLAN tag was set as the Ethernet
identifier and as the label as well. When every route except the EVPN peer
path is gone, this VXLAN tag (which was copied into the label) gets copied
to the master path, which in turn tries to rebake the NH for this label.
The EVPN MPLS label should only be picked from the local path or the local
VM peer path.

Solution:
Pass an invalid label for the EVPN path.

Change-Id: Iea899d752be101d7a48ebe0e41635ffe23e06602
Closes-bug: #1577517
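
In other words, the EVPN peer path must not contribute an MPLS label to the multicast route, so nothing stays programmed against a composite NH (which pins the VRF) once the other paths are withdrawn. Below is a rough sketch of that selection rule only; the names, enum, and sentinel value are invented, and the actual change is the one in the review linked above.

// Rough illustration of the rule described in the commit message; this is
// not the agent's API.
#include <cstdint>

static const uint32_t kInvalidLabel = 0xFFFFFFFF;   // invented sentinel

enum PeerType { LOCAL_PEER, LOCAL_VM_PEER, EVPN_PEER, OTHER_PEER };

// Decide which MPLS label a multicast-route path may contribute.
uint32_t LabelForPath(PeerType peer, uint32_t local_label) {
    switch (peer) {
    case LOCAL_PEER:
    case LOCAL_VM_PEER:
        return local_label;     // only local paths may supply the EVPN MPLS label
    case EVPN_PEER:
    default:
        return kInvalidLabel;   // the fix: the EVPN peer path passes an invalid
                                // label, so the VXLAN tag is never reused as one
    }
}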

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20188
Committed: http://github.org/Juniper/contrail-controller/commit/4b1d1ce53ba37c2dd46639866325c05fc1144d82
Submitter: Zuul
Branch: R3.0

commit 4b1d1ce53ba37c2dd46639866325c05fc1144d82
Author: Manish <email address hidden>
Date: Fri May 13 13:47:27 2016 +0530

Vrf pending because of mpls label.

Problem:
An MPLS label was left pending, pointing to a composite NH, and that NH holds
a reference to the VRF. The label was programmed because, while adding the
EVPN path to the multicast route, the VXLAN tag was set as the Ethernet
identifier and as the label as well. When every route except the EVPN peer
path is gone, this VXLAN tag (which was copied into the label) gets copied
to the master path, which in turn tries to rebake the NH for this label.
The EVPN MPLS label should only be picked from the local path or the local
VM peer path.

Solution:
Pass an invalid label for the EVPN path.

Change-Id: Ia685e9179cb710ce3d0a38ee90330e19bb590ac7
Closes-bug: #1577517

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/20369
Submitter: Hari Prasad Killi (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/20371
Submitter: Hari Prasad Killi (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/20372
Submitter: Hari Prasad Killi (<email address hidden>)

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/20369
Committed: http://github.org/Juniper/contrail-controller/commit/93e18f9402eb8f935e6dc61da6b2d6cb5e91a6c6
Submitter: Zuul
Branch: R2.20

commit 93e18f9402eb8f935e6dc61da6b2d6cb5e91a6c6
Author: Manish <email address hidden>
Date: Fri May 13 13:47:27 2016 +0530

Vrf pending because of mpls label.

Problem:
An MPLS label was left pending, pointing to a composite NH, and that NH holds
a reference to the VRF. The label was programmed because, while adding the
EVPN path to the multicast route, the VXLAN tag was set as the Ethernet
identifier and as the label as well. When every route except the EVPN peer
path is gone, this VXLAN tag (which was copied into the label) gets copied
to the master path, which in turn tries to rebake the NH for this label.
The EVPN MPLS label should only be picked from the local path or the local
VM peer path.

Solution:
Pass an invalid label for the EVPN path.

Change-Id: Iea899d752be101d7a48ebe0e41635ffe23e06602
Closes-bug: #1577517

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20371
Committed: http://github.org/Juniper/contrail-controller/commit/50de9ff1fb461015c89a6ffca9b7746e72024772
Submitter: Zuul
Branch: master

commit 50de9ff1fb461015c89a6ffca9b7746e72024772
Author: Manish <email address hidden>
Date: Fri May 13 13:47:27 2016 +0530

Vrf pending because of mpls label.

Problem:
An MPLS label was left pending, pointing to a composite NH, and that NH holds
a reference to the VRF. The label was programmed because, while adding the
EVPN path to the multicast route, the VXLAN tag was set as the Ethernet
identifier and as the label as well. When every route except the EVPN peer
path is gone, this VXLAN tag (which was copied into the label) gets copied
to the master path, which in turn tries to rebake the NH for this label.
The EVPN MPLS label should only be picked from the local path or the local
VM peer path.

Solution:
Pass an invalid label for the EVPN path.

Change-Id: Iea899d752be101d7a48ebe0e41635ffe23e06602
Closes-bug: #1577517

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/20372
Committed: http://github.org/Juniper/contrail-controller/commit/b7e81966b936772939b5a0f186fb291cc39ab791
Submitter: Zuul
Branch: R2.21.x

commit b7e81966b936772939b5a0f186fb291cc39ab791
Author: Manish <email address hidden>
Date: Fri May 13 13:47:27 2016 +0530

Vrf pending because of mpls label.

Problem:
An MPLS label was left pending, pointing to a composite NH, and that NH holds
a reference to the VRF. The label was programmed because, while adding the
EVPN path to the multicast route, the VXLAN tag was set as the Ethernet
identifier and as the label as well. When every route except the EVPN peer
path is gone, this VXLAN tag (which was copied into the label) gets copied
to the master path, which in turn tries to rebake the NH for this label.
The EVPN MPLS label should only be picked from the local path or the local
VM peer path.

Solution:
Pass an invalid label for the EVPN path.

Change-Id: Iea899d752be101d7a48ebe0e41635ffe23e06602
Closes-bug: #1577517
