R2.20-Ubuntu-14.04-Icehouse-build-16 agent core@UpdateStateStatusType(InstanceTask*, int)

Bug #1453956 reported by shajuvk on 2015-05-11
This bug affects 2 people
Affects            Status         Importance  Assigned to            Milestone
Juniper Openstack  Status tracked in Trunk
  R1.1             In Progress    High        Divakar Dharanalakota
  R2.0             Fix Committed  Undecided   Divakar Dharanalakota
  R2.20            Fix Committed  High        Divakar Dharanalakota
  Trunk            Fix Committed  High        Divakar Dharanalakota

Bug Description

Build : 2.20-16
CoreLocation : /cs-shared/test_runs/nodeb2/2015_05_11_21_32_08
cores : {'10.204.216.33': ['core.contrail-vroute.15821.nodeb2.1431367625', 'core.contrail-vroute.3153.nodeb2.1431364549']}
LogsLocation : http://10.204.216.50/Docs/logs/2.20-16_2015_05_11_21_32_08/logs/
Report : http://10.204.216.50/Docs/logs/2.20-16_2015_05_11_21_32_08/junit-noframes.html
Topology :
Config Nodes : [u'nodeb2']
Control Nodes : [u'nodeb2']
Compute Nodes : [u'nodeb2']
Openstack Node : nodeb2
WebUI Node : nodeb2
Analytics Nodes : [u'nodeb2']

Bt
============
core.contrail-vroute.15821.nodeb2.1431367625
--------------------------------------------------------
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-vrouter-agent'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000000000e852a4 in DBEntryBase::GetState(DBTableBase*, int) ()
#0 0x0000000000e852a4 in DBEntryBase::GetState(DBTableBase*, int) ()
#1 0x000000000095e7ae in InstanceManager::UpdateStateStatusType(InstanceTask*, int) ()
#2 0x000000000095ecb7 in InstanceManager::SigChldEventHandler(InstanceManager::InstanceManagerChildEvent) ()
#3 0x000000000095ee03 in InstanceManager::DequeueEvent(InstanceManager::InstanceManagerChildEvent) ()
#4 0x000000000096118a in boost::detail::function::function_obj_invoker1<boost::_bi::bind_t<bool, boost::_mfi::mf1<bool, InstanceManager, InstanceManager::InstanceManagerChildEvent>, boost::_bi::list2<boost::_bi::value<InstanceManager*>, boost::arg<1> > >, bool, InstanceManager::InstanceManagerChildEvent>::invoke(boost::detail::function::function_buffer&, InstanceManager::InstanceManagerChildEvent) ()
#5 0x0000000000963ad1 in QueueTaskRunner<InstanceManager::InstanceManagerChildEvent, WorkQueue<InstanceManager::InstanceManagerChildEvent> >::RunQueue() ()
#6 0x0000000000f8e1d0 in TaskImpl::execute() ()
#7 0x00007f7659e44b3a in ?? () from /usr/lib/libtbb.so.2
#8 0x00007f7659e40816 in ?? () from /usr/lib/libtbb.so.2
#9 0x00007f7659e3ff4b in ?? () from /usr/lib/libtbb.so.2
#10 0x00007f7659e3c0ff in ?? () from /usr/lib/libtbb.so.2
#11 0x00007f7659e3c2f9 in ?? () from /usr/lib/libtbb.so.2
#12 0x00007f765a060182 in start_thread (arg=0x7f7633bfe700)
    at pthread_create.c:312
#13 0x00007f765933947d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

information type: Proprietary → Public
tags: added: blocker

Review in progress for https://review.opencontrail.org/10961
Submitter: Divakar Dharanalakota (<email address hidden>)

Review in progress for https://review.opencontrail.org/10889
Submitter: Divakar Dharanalakota (<email address hidden>)

Reviewed: https://review.opencontrail.org/10961
Committed: http://github.org/Juniper/contrail-controller/commit/abdb26d0b61b759089b3572648394d16413e06ee
Submitter: Zuul
Branch: R2.20

commit abdb26d0b61b759089b3572648394d16413e06ee
Author: Divakar <email address hidden>
Date: Wed May 27 20:59:31 2015 -0700

Clearing service instance DB State after STOP command

If Delete of service instance is received before it was marked
unusable, to stop the namespace, StopServiceInstance command is issued,
which invokes the deletion of the namespace script. And immediately after
this, DB state is cleared. Clearing of DB State can result in deleting
of service instance DBEntry before namespace script completion. The SIGCHLD
event after script completion looks for the service instance that triggered the
task, and results in a crash as the service instance is already deleted.
To avoid the crash, the DBState is cleared after processing the SIGCHLD
event. Also, corresponding to the service instance, there can be many
tasks queued for execution. To identify how many SIGCHLD events to wait for,
a running task count is maintained in DBState. This count is updated
every time a task is created, destroyed, or error handled for
the service instance.

Testcases to follow in the next commit

Change-Id: I4b9853de271b8ee54016bb91b7a2e199cbce6b0d
closes-bug: #1449166
closes-bug: #1453956
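
The deferred clearing described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `InstanceState`, `InstanceRegistry` and all of their members are hypothetical stand-ins for the agent's DBState and service instance table. It assumes that holding the state is what keeps the entry alive, so the state is released only once a delete has been seen and every started task has had its child-exit event processed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the per-instance DBState described in the commit.
struct InstanceState {
    int running_tasks = 0;       // tasks started whose child exit is not yet processed
    bool clear_pending = false;  // delete seen; release state once tasks drain
};

// Hypothetical stand-in for the service instance table plus its attached state.
class InstanceRegistry {
public:
    void Add(const std::string &name) { states_[name] = InstanceState(); }

    bool Has(const std::string &name) const { return states_.count(name) != 0; }

    // A task (for example the namespace script) was launched for this instance.
    void OnTaskStart(const std::string &name) {
        auto it = states_.find(name);
        if (it != states_.end()) ++it->second.running_tasks;
    }

    // Delete request: the state is released right away only if nothing is running.
    void OnDelete(const std::string &name) {
        auto it = states_.find(name);
        if (it == states_.end()) return;
        it->second.clear_pending = true;
        MaybeClear(it);
    }

    // Child-exit event for one task. Returns false if the instance is unknown,
    // instead of dereferencing an entry that no longer exists.
    bool OnChildExit(const std::string &name) {
        auto it = states_.find(name);
        if (it == states_.end()) return false;
        assert(it->second.running_tasks > 0);
        --it->second.running_tasks;
        MaybeClear(it);  // may erase the entry; 'it' is not used afterwards
        return true;
    }

private:
    typedef std::map<std::string, InstanceState> StateMap;

    // Release the state only when a delete is pending and no task is outstanding.
    void MaybeClear(StateMap::iterator it) {
        if (it->second.clear_pending && it->second.running_tasks == 0) {
            states_.erase(it);
        }
    }

    StateMap states_;
};
```

With two tasks started and a delete received while both are still running, the entry in this sketch survives until the second child-exit event has been handled, which is the ordering the commit message relies on to avoid the lookup on a deleted entry.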

Review in progress for https://review.opencontrail.org/10889
Submitter: Divakar Dharanalakota (<email address hidden>)

Review in progress for https://review.opencontrail.org/10887
Submitter: Divakar Dharanalakota (<email address hidden>)

Reviewed: https://review.opencontrail.org/10887
Committed: http://github.org/Juniper/contrail-controller/commit/080dda0c6af5eb172e7690854326ba230d349388
Submitter: Zuul
Branch: master

commit 080dda0c6af5eb172e7690854326ba230d349388
Author: Divakar <email address hidden>
Date: Tue May 26 21:40:44 2015 -0700

Clearing service instance DB State after STOP command

If Delete of service instance is received before it was marked
unusable, StopServiceInstance command is issued, which invokes
the namespace script. And immediately after this, DB state is cleared.
Clearing of DB State can result in deleting of service instance
DBEntry before namespace script completion. The SIGCHLD event after
script completion looks for the service instance that triggered the
task, and results in a crash as the service instance is already deleted.
To avoid the crash, the DBState is cleared after processing the SIGCHLD
event.

Testcases to follow in the next commit

Change-Id: I212063ede8f7df4b693771aa394c4536a76d8d20
closes-bug: #1453956

Review in progress for https://review.opencontrail.org/12052
Submitter: Sylvain Afchain (<email address hidden>)

Review in progress for https://review.opencontrail.org/12052
Submitter: Édouard Thuleau (<email address hidden>)

Reviewed: https://review.opencontrail.org/10889
Committed: http://github.org/Juniper/contrail-controller/commit/06b38bcdbfe6773adf3c41f416d5af0ebf47f7c9
Submitter: Zuul
Branch: R2.0

commit 06b38bcdbfe6773adf3c41f416d5af0ebf47f7c9
Author: Divakar <email address hidden>
Date: Sun May 31 23:33:00 2015 -0700

Clearing service instance DB State after STOP command

If Delete of service instance is received before it was marked
unusable, to stop the namespace, StopServiceInstance command is issued,
which invokes the deletion of the namespace script. And immediately after
this, DB state is cleared. Clearing of DB State can result in deleting
of service instance DBEntry before namespace script completion. The SIGCHLD
event after script completion looks for the service instance that triggered the
task, and results in a crash as the service instance is already deleted.
To avoid the crash, the DBState is cleared after processing the SIGCHLD
event. Also, corresponding to the service instance, there can be many
tasks queued for execution. To identify how many SIGCHLD events to wait for,
a running task count is maintained in DBState. This count is updated
every time a task is created, destroyed, or error handled for
the service instance.
Also, config filters for missing tables, which treat a NULL uuid as a node
delete, are added.

Change-Id: I406f9e3d70768151e4390955153bc3176d3da99f
closes-bug: #1453956
