[3.0.2.0-28 ] contrail-control crash @ BgpXmppChannel::XmppPeer::~XmppPeer()
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R2.20 | Fix Committed | High | Prakash Bailkeri |
R2.21.x | Fix Committed | High | Prakash Bailkeri |
R2.22.x | Fix Committed | High | Prakash Bailkeri |
R3.0 | Fix Committed | High | Prakash Bailkeri |
Trunk | Fix Committed | High | Prakash Bailkeri |
Bug Description
Observed this control node crash while deleting LIFs and VMIs in a scale setup.
Backtrace
----------------
(gdb) bt
#0 0x00007f7b6170ccc9 in __GI_raise (sig=sig@entry=6) at ../nptl/
#1 0x00007f7b617100d8 in __GI_abort () at abort.c:89
#2 0x00007f7b61705b86 in __assert_fail_base (fmt=0x7f7b61856830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=
file=
function=
#3 0x00007f7b61705c32 in __GI___assert_fail (assertion=0xcdba4a "GetRefCount() == 0",
file=0xd0a1b0 "controller/
function=
#4 0x000000000041b580 in ?? ()
#5 0x000000000097efa0 in ?? ()
#6 0x00000000009555f7 in ?? ()
#7 0x0000000000955fd9 in ?? ()
#8 0x000000000094d7ad in ?? ()
#9 0x00000000009876e3 in ?? ()
#10 0x0000000000687cac in ?? ()
#11 0x00007f7b624e3b3a in ?? () from /usr/lib/
#12 0x00007f7b624df816 in ?? () from /usr/lib/
#13 0x00007f7b624def4b in ?? () from /usr/lib/
#14 0x00007f7b624db0ff in ?? () from /usr/lib/
#15 0x00007f7b624db2f9 in ?? () from /usr/lib/
#16 0x00007f7b626ff182 in start_thread (arg=0x7f7b58b7
#17 0x00007f7b617d047d in clone () at ../sysdeps/
root@5b7s2:~# contrail-version | grep control
contrail-control 3.0.2.0-28 28
contrail-
root@5b7s2:~# contrail-status | grep control
supervisor-control: active
contrail-control initializing (IFMap Server End-Of-RIB not computed)
contrail-
Contrail-control log
-------
Oper: LinkRemove instance-
2016-04-23 Sat 23:12:33:150.535 PDT 5b7s2 [Thread 139788032145152, Pid 25240]: SANDESH: Queue Drop: IFMap [SYS_DEBUG]: LinkOper: LinkRemove instance-
2016-04-23 Sat 23:12:33:184.153 PDT 5b7s2 [Thread 139786421532416, Pid 25240]: XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: PassiveOpen in state: Idle peer ip: 172.17.90.7 ( ) controller/
Changed in juniperopenstack: | |
importance: | Undecided → High |
importance: | High → Critical |
description: | updated |
summary: | contrail-control crash @ BgpXmppChannel::XmppPeer::~XmppPeer() → [3.0.2.0-28 28] contrail-control crash @ BgpXmppChannel::XmppPeer::~XmppPeer() |
summary: | [3.0.2.0-28 28] contrail-control crash @ BgpXmppChannel::XmppPeer::~XmppPeer() → [3.0.2.0-28 ] contrail-control crash @ BgpXmppChannel::XmppPeer::~XmppPeer() |
information type: | Proprietary → Public Security |
information type: | Public Security → Public |
I think we missed one scenario in the previous fix.
An add operation is enqueued to the DB, and then an
unsubscribe for the instance is received. The membership
manager API to unregister the table is called and the
membership manager starts the leave. Part of the table
has been walked, but the membership manager has not yet
removed the IPeerRib, when the add operation is processed
and the path gets added.
The fix could be to invalidate the subscribe gen id before
or when the leave starts.
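
The ordering described above can be illustrated with a small standalone model. This is only a sketch of the idea, not the contrail-control code: `ToyTable`, `AddRequest`, `kInvalidGenId`, `SimulateLeaveRace` and the prefix `10.0.0.1/32` are all hypothetical names and values invented for illustration, and the real membership manager, DB queue and IPeerRib handling are much more involved. The model assumes that each queued add carries the subscribe gen id that was current when it was enqueued, and that an add is dropped if that id no longer matches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <set>
#include <string>

// Hypothetical marker for "no valid subscription".
constexpr uint64_t kInvalidGenId = 0;

// A queued add, stamped with the subscribe gen id seen at enqueue time.
struct AddRequest {
    std::string prefix;
    uint64_t gen_id;
};

// Toy stand-in for one peer's membership in one table.
class ToyTable {
public:
    explicit ToyTable(bool invalidate_on_leave)
        : invalidate_on_leave_(invalidate_on_leave) {}

    void Subscribe() { gen_id_ = ++next_gen_id_; }

    void EnqueueAdd(const std::string &prefix) {
        pending_.push_back(AddRequest{prefix, gen_id_});
    }

    // Leave starts: optionally invalidate the gen id first, then "walk"
    // the table and remove the peer's paths.
    void StartLeave() {
        if (invalidate_on_leave_) gen_id_ = kInvalidGenId;
        paths_.clear();
    }

    // Process queued adds; stale requests (gen id mismatch) are dropped.
    void ProcessPending() {
        while (!pending_.empty()) {
            AddRequest req = pending_.front();
            pending_.pop_front();
            if (req.gen_id == kInvalidGenId || req.gen_id != gen_id_) continue;
            paths_.insert(req.prefix);
        }
    }

    // Leave completes: the subscription is gone in either variant.
    void FinishLeave() { gen_id_ = kInvalidGenId; }

    std::size_t path_count() const { return paths_.size(); }

private:
    bool invalidate_on_leave_;
    uint64_t gen_id_ = kInvalidGenId;
    uint64_t next_gen_id_ = 0;
    std::deque<AddRequest> pending_;
    std::set<std::string> paths_;
};

// Replays the sequence from the comment above and returns how many paths
// are left behind after the leave has finished.
std::size_t SimulateLeaveRace(bool invalidate_on_leave) {
    ToyTable table(invalidate_on_leave);
    table.Subscribe();
    table.EnqueueAdd("10.0.0.1/32");  // add is queued, not yet processed
    table.StartLeave();               // unsubscribe arrives, walk runs
    table.ProcessPending();           // queued add is processed after the walk
    table.FinishLeave();
    return table.path_count();
}
```

In this model, `SimulateLeaveRace(false)` leaves one path behind after the leave completes, which is the kind of leftover reference that would trip a `GetRefCount() == 0` assertion, while `SimulateLeaveRace(true)` drops the stale add and leaves none.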