tor-agent crash at OVSDB::OvsdbDBEntry::NotifyAdd on tor-scale setup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
R2.1 | Fix Committed | High | Prabhjot Singh Sethi |
Trunk | Fix Committed | High | Prabhjot Singh Sethi |
Bug Description
2.1 Build 39 Ubuntu 14.04 Icehouse multinode setup
On this setup, with 128 tor-agents, 110 TORs, 11K VMIs, and 1.1K real endpoints,
three crashes were seen with the same backtrace.
I am not sure what was going on in the testbed at the time.
Crash files will be in http://
(gdb) bt
#0 0x00007fac3ad51bb9 in raise () from /lib/x86_
#1 0x00007fac3ad54fc8 in abort () from /lib/x86_
#2 0x00007fac3ad4aa76 in ?? () from /lib/x86_
#3 0x00007fac3ad4ab22 in __assert_fail () from /lib/x86_
#4 0x00000000008cabba in OVSDB::
at controller/
#5 0x00000000008d7866 in OVSDB::
row=
#6 0x00000000008e7b82 in ovsdb_idl_
#7 0x00000000008e6dfe in ovsdb_idl_
#8 0x00000000008e6bda in ovsdb_idl_
#9 0x00000000008e6861 in ovsdb_idl_
#10 0x00000000008e60ce in ovsdb_idl_
#11 0x00000000008c6b74 in OVSDB::
at controller/
#12 0x00000000008ca30a in operator() (a0=0x7fabc4010610, this=0x7fac11bf
#13 RunQueue (this=0x7fabec0
#14 QueueTaskRunner
at controller/
#15 0x0000000000cfb5d0 in TaskImpl::execute (this=0x7fac345
#16 0x00007fac3bf59b3a in ?? () from /usr/lib/
#17 0x00007fac3bf55816 in ?? () from /usr/lib/
#18 0x00007fac3bf54f4b in ?? () from /usr/lib/
#19 0x00007fac3bf510ff in ?? () from /usr/lib/
#20 0x00007fac3bf512f9 in ?? () from /usr/lib/
#21 0x00007fac3c175182 in start_thread () from /lib/x86_
#22 0x00007fac3ae15fbd in clone () from /lib/x86_
(gdb)
root@nodei38:
total 336744
-rw------- 1 root root 162619392 Feb 27 10:17 core.contrail-
-rw------- 1 root root 153624576 Feb 27 10:18 core.contrail-
-rw------- 1 root root 1808727 Feb 27 10:18 core.contrail-
-rw------- 1 root root 158982144 Feb 27 10:25 core.contrail-
-rw------- 1 root root 153276416 Feb 27 10:48 core.contrail-
-rw------- 1 root root 155561984 Feb 27 11:32 core.contrail-
-rw------- 1 root root 161374208 Feb 27 11:32 core.contrail-
-rw------- 1 root root 161402880 Feb 27 11:33 core.contrail-
-rw------- 1 root root 157286400 Feb 27 11:35 core.contrail-
root@nodei38:
Changed in juniperopenstack:
assignee: Hari Prasad Killi (haripk) → Prabhjot Singh Sethi (prabhjot)
tags: added: blocker scale
tags: added: bms
It seems the OVS schema does not have the MAC as a key in the route tables, so we can end up with two ovs_idl rows for the same MAC in ovsdb-server.
When the TOR agent receives these two updates with the virtual network config already available, it crashes because it cannot handle two rows with the same MAC.
Workaround: disassociate the tor-agent from the physical router and, after the connection to the TOR is established, associate the tor-agent with the TOR again to recover from this situation.
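
To illustrate the failure mode described in the comment above, here is a minimal Python sketch rather than the actual tor-agent code (which is C++). It assumes a client-side table keyed by MAC that receives row-add notifications identified by a row UUID. The class names `StrictMacTable` and `TolerantMacTable` and their methods are hypothetical and exist only for illustration. The first class asserts on a second row with the same MAC, which is the kind of assumption that breaks when the server does not enforce uniqueness on that column. The second class tracks every row per MAC, which is one possible way to tolerate duplicates and not necessarily the fix that was committed for this bug.

```python
from collections import defaultdict


class StrictMacTable:
    """Hypothetical table that assumes at most one row per MAC."""

    def __init__(self):
        self.entries = {}  # mac -> row uuid

    def notify_add(self, mac, row_uuid):
        # A second row carrying the same MAC violates the assumption
        # and trips the assertion, analogous to the crash reported here.
        assert mac not in self.entries, f"duplicate row for {mac}"
        self.entries[mac] = row_uuid


class TolerantMacTable:
    """Hypothetical table that tracks every row UUID seen for a MAC."""

    def __init__(self):
        self.entries = defaultdict(set)  # mac -> set of row uuids

    def notify_add(self, mac, row_uuid):
        self.entries[mac].add(row_uuid)

    def notify_delete(self, mac, row_uuid):
        rows = self.entries.get(mac)
        if rows is None:
            return
        rows.discard(row_uuid)
        if not rows:
            # Drop the key once the last row for this MAC is gone.
            del self.entries[mac]

    def duplicates(self):
        # MACs that currently have more than one row on the server side.
        return {mac: rows for mac, rows in self.entries.items() if len(rows) > 1}
```

With the tolerant variant, two adds for the same MAC are recorded side by side and can be reported through `duplicates()` so that stale rows can be cleaned up, instead of aborting the process.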