contrail-collector and contrail-query-engine crash

Bug #1499129 reported by Megh Bhatt
This bug affects 3 people
Affects              Status          Importance  Assigned to  Milestone
Juniper Openstack    Status tracked in Trunk
  R2.20              Fix Committed   Medium      Megh Bhatt
  R2.21.x            Fix Committed   Medium      Megh Bhatt
  R2.22.x            Fix Committed   Medium      Megh Bhatt
  Trunk              Fix Committed   Medium      Megh Bhatt

Bug Description

Observed the following crashes in contrail-collector and contrail-query-engine on R2.20, build 99.

root@a6s23:~# gdb /var/tmp/contrail-collector.orig.debug /var/crashes/core.contrail-collec.15609.a6s23.1442972297
GNU gdb (Ubuntu 7.7-0ubuntu3) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /var/tmp/contrail-collector.orig.debug...done.

warning: core file may not match specified executable file.
[New LWP 15609]
[New LWP 15619]
[New LWP 15617]
[New LWP 15618]
[New LWP 15616]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-collector'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 OpServerProxy::OpServerImpl::toConnectCallbackProcess (this=0x1750af0, c=0x1751fc0, r=0x0, privdata=0x0) at controller/src/analytics/OpServerProxy.cc:229
229 controller/src/analytics/OpServerProxy.cc: No such file or directory.
(gdb) bt
#0 OpServerProxy::OpServerImpl::toConnectCallbackProcess (this=0x1750af0, c=0x1751fc0, r=0x0, privdata=0x0) at controller/src/analytics/OpServerProxy.cc:229
#1 0x00000000004db38a in operator() (a2=0x0, a1=0x0, a0=0x1751fc0, this=0x7fff5ddb1890) at /usr/include/boost/function/function_template.hpp:767
#2 RedisAsyncConnection::RAC_AsyncCmdCallback (c=0x1751fc0, r=0x0, privdata=0x0) at controller/src/analytics/redis_connection.cc:239
#3 0x000000000074b689 in __redisRunCallback (cb=0x7fff5ddb19a0, cb=0x7fff5ddb19a0, reply=0x0, ac=<optimized out>) at build/third_party/hiredis/src/async.c:219
#4 __redisAsyncFree (ac=0x1751fc0) at build/third_party/hiredis/src/async.c:233
#5 0x000000000074d5c9 in redisBoostClient::handle_read (this=0x1751550, ec=...) at build/third_party/hiredis/hiredis-boostasio-adapter/boostasio.cpp:62
#6 0x000000000074dc94 in call<boost::shared_ptr<redisBoostClient>, boost::system::error_code> (b1=<synthetic pointer>, u=..., this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:156
#7 operator()<boost::shared_ptr<redisBoostClient> > (a1=..., u=..., this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:171
#8 operator()<boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<const boost::system::error_code&, long unsigned int const&> > (a=<synthetic pointer>, f=..., this=<optimized out>) at /usr/include/boost/bind/bind.hpp:313
#9 operator()<boost::system::error_code, long unsigned int> (a2=<optimized out>, a1=..., this=<optimized out>) at /usr/include/boost/bind/bind_template.hpp:102
#10 operator() (this=<optimized out>) at /usr/include/boost/asio/detail/bind_handler.hpp:127
#11 asio_handler_invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> >, boost::system::error_code, unsigned long> > (function=...) at /usr/include/boost/asio/handler_invoke_hook.hpp:64
#12 invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> >, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> > > (context=..., function=...)
    at /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#13 boost::asio::detail::reactive_null_buffers_op<boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> > >::do_complete (
    owner=<optimized out>, base=<optimized out>) at /usr/include/boost/asio/detail/reactive_null_buffers_op.hpp:75
#14 0x0000000000562027 in complete (bytes_transferred=0, ec=..., owner=..., this=0x17a65a0) at /usr/include/boost/asio/detail/task_io_service_operation.hpp:37
#15 do_run_one (ec=..., this_thread=..., lock=..., this=0x173b0b0) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:384
#16 boost::asio::detail::task_io_service::run (this=0x173b0b0, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#17 0x000000000057f308 in run (this=0x1725700, ec=...) at /usr/include/boost/asio/impl/io_service.ipp:66
#18 EventManager::Run (this=0x1725700) at controller/src/io/event_manager.cc:32
#19 0x0000000000421933 in main (argc=<optimized out>, argv=<optimized out>) at controller/src/analytics/main.cc:412
(gdb) q
root@a6s23:~# gdb /var/tmp/contrail-query-engine.orig.debug /var/crashes/core.contrail-query-.15610.a6s23.1442972301
GNU gdb (Ubuntu 7.7-0ubuntu3) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /var/tmp/contrail-query-engine.orig.debug...done.

warning: core file may not match specified executable file.
[New LWP 15610]
[New LWP 15613]
[New LWP 15612]
[New LWP 15614]
[New LWP 15615]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-query-engine'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 QEOpServerProxy::QEOpServerImpl::ConnectCallbackProcess (this=0x29826c0, cnum=<optimized out>, c=<optimized out>, r=0x0, privdata=<optimized out>) at controller/src/query_engine/QEOpServerProxy.cc:871
871 controller/src/query_engine/QEOpServerProxy.cc: No such file or directory.
(gdb) bt
#0 QEOpServerProxy::QEOpServerImpl::ConnectCallbackProcess (this=0x29826c0, cnum=<optimized out>, c=<optimized out>, r=0x0, privdata=<optimized out>) at controller/src/query_engine/QEOpServerProxy.cc:871
#1 0x00000000004f005a in operator() (a2=0x0, a1=0x0, a0=0x2982e20, this=0x7fff161163b0) at /usr/include/boost/function/function_template.hpp:767
#2 RedisAsyncConnection::RAC_AsyncCmdCallback (c=0x2982e20, r=0x0, privdata=0x0) at controller/src/analytics/redis_connection.cc:239
#3 0x00000000006cbe09 in __redisRunCallback (cb=0x7fff161164c0, cb=0x7fff161164c0, reply=0x0, ac=<optimized out>) at build/third_party/hiredis/src/async.c:219
#4 __redisAsyncFree (ac=0x2982e20) at build/third_party/hiredis/src/async.c:233
#5 0x00000000006cdd49 in redisBoostClient::handle_read (this=0x297dc00, ec=...) at build/third_party/hiredis/hiredis-boostasio-adapter/boostasio.cpp:62
#6 0x00000000006ce714 in call<boost::shared_ptr<redisBoostClient>, boost::system::error_code> (b1=<synthetic pointer>, u=..., this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:156
#7 operator()<boost::shared_ptr<redisBoostClient> > (a1=..., u=..., this=<optimized out>) at /usr/include/boost/bind/mem_fn_template.hpp:171
#8 operator()<boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<const boost::system::error_code&, long unsigned int const&> > (a=<synthetic pointer>, f=..., this=<optimized out>) at /usr/include/boost/bind/bind.hpp:313
#9 operator()<boost::system::error_code, long unsigned int> (a2=<optimized out>, a1=..., this=<optimized out>) at /usr/include/boost/bind/bind_template.hpp:102
#10 operator() (this=<optimized out>) at /usr/include/boost/asio/detail/bind_handler.hpp:127
#11 asio_handler_invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> >, boost::system::error_code, unsigned long> > (function=...) at /usr/include/boost/asio/handler_invoke_hook.hpp:64
#12 invoke<boost::asio::detail::binder2<boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> >, boost::system::error_code, unsigned long>, boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> > > (context=..., function=...)
    at /usr/include/boost/asio/detail/handler_invoke_helpers.hpp:37
#13 boost::asio::detail::reactive_null_buffers_op<boost::_bi::bind_t<void, boost::_mfi::mf1<void, redisBoostClient, boost::system::error_code>, boost::_bi::list2<boost::_bi::value<boost::shared_ptr<redisBoostClient> >, boost::arg<1> (*)()> > >::do_complete (
    owner=<optimized out>, base=<optimized out>) at /usr/include/boost/asio/detail/reactive_null_buffers_op.hpp:75
#14 0x00000000005c2567 in complete (bytes_transferred=0, ec=..., owner=..., this=0x2990740) at /usr/include/boost/asio/detail/task_io_service_operation.hpp:37
#15 do_run_one (ec=..., this_thread=..., lock=..., this=0x295aff0) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:384
#16 boost::asio::detail::task_io_service::run (this=0x295aff0, ec=...) at /usr/include/boost/asio/detail/impl/task_io_service.ipp:153
#17 0x00000000005cdb78 in run (this=0x7fff16116b70, ec=...) at /usr/include/boost/asio/impl/io_service.ipp:66
#18 EventManager::Run (this=this@entry=0x7fff16116b70) at controller/src/io/event_manager.cc:32
#19 0x000000000041d7e7 in main (argc=<optimized out>, argv=<optimized out>) at controller/src/query_engine/qed.cc:284
(gdb) f 4
#4 __redisAsyncFree (ac=0x2982e20) at build/third_party/hiredis/src/async.c:233
233 build/third_party/hiredis/src/async.c: No such file or directory.
(gdb) p ac
$1 = (redisAsyncContext *) 0x2982e20
(gdb) p *ac
$2 = {c = {err = 3, errstr = "Server closed the connection", '\000' <repeats 99 times>, fd = 8, flags = 22, obuf = 0x298f498 "", reader = 0x2982bc0}, err = 3, errstr = 0x2982e24 "Server closed the connection", data = 0x297dc00, ev = {data = 0x2982e20,
    addRead = 0x6cde30 <call_C_addRead(void*)>, delRead = 0x6cd200 <call_C_delRead(void*)>, addWrite = 0x6cde50 <call_C_addWrite(void*)>, delWrite = 0x6cd210 <call_C_delWrite(void*)>, cleanup = 0x6cd220 <call_C_cleanup(void*)>},
  onDisconnect = 0x4efb00 <RedisAsyncConnection::RAC_DisconnectCallback(redisAsyncContext const*, int)>, onConnect = 0x4efcb0 <RedisAsyncConnection::RAC_ConnectCallback(redisAsyncContext const*, int)>, replies = {head = 0x0, tail = 0x0}, sub = {invalid = {head = 0x0,
      tail = 0x0}, channels = 0x2982af0, patterns = 0x2982b30}}
(gdb) p/x *ac
$3 = {c = {err = 0x3, errstr = {0x53, 0x65, 0x72, 0x76, 0x65, 0x72, 0x20, 0x63, 0x6c, 0x6f, 0x73, 0x65, 0x64, 0x20, 0x74, 0x68, 0x65, 0x20, 0x63, 0x6f, 0x6e, 0x6e, 0x65, 0x63, 0x74, 0x69, 0x6f, 0x6e, 0x0 <repeats 100 times>}, fd = 0x8, flags = 0x16, obuf = 0x298f498,
    reader = 0x2982bc0}, err = 0x3, errstr = 0x2982e24, data = 0x297dc00, ev = {data = 0x2982e20, addRead = 0x6cde30, delRead = 0x6cd200, addWrite = 0x6cde50, delWrite = 0x6cd210, cleanup = 0x6cd220}, onDisconnect = 0x4efb00, onConnect = 0x4efcb0, replies = {
    head = 0x0, tail = 0x0}, sub = {invalid = {head = 0x0, tail = 0x0}, channels = 0x2982af0, patterns = 0x2982b30}}
(gdb)

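ConnectCallbackProcess (toConnectCallbackProcess in contrail-collector) needs to handle a NULL reply from Redis. This happens when the server closes the connection: the context printed above has err = 3 (REDIS_ERR_EOF in hiredis) with errstr "Server closed the connection", and __redisAsyncFree then runs the pending callbacks with reply=0x0 (frames #3 and #4). Frame #0 in both backtraces is therefore entered with r=0x0.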
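The failure mode and the kind of guard that would avoid it can be sketched without hiredis. This is only an illustration, not the committed fix: FakeReply, FakeAsyncContext, FreeContext, HandleConnectReply and SimulateServerClose are hypothetical stand-ins invented for this sketch, modelling just enough of redisReply, redisAsyncContext and the teardown path seen in frames #3 and #4.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for hiredis' redisReply and redisAsyncContext.
// Only the fields this sketch needs are modelled.
struct FakeReply {
    bool is_error;
    std::string str;
};

struct FakeAsyncContext;
using ReplyCallback =
    std::function<void(const FakeAsyncContext*, const FakeReply*)>;

struct FakeAsyncContext {
    int err = 0;                         // 3 in the core dump above
    std::string errstr;                  // "Server closed the connection"
    std::vector<ReplyCallback> pending;  // callbacks still waiting for a reply
};

enum class ConnectResult { kOk, kError, kNoReply };

// Guarded handler: check the reply pointer before touching any field of it.
ConnectResult HandleConnectReply(const FakeAsyncContext* c, const FakeReply* r) {
    if (r == nullptr) {
        // No reply was delivered, e.g. the context is being torn down.
        std::fprintf(stderr, "connect callback: no reply (err=%d: %s)\n",
                     c != nullptr ? c->err : 0,
                     c != nullptr ? c->errstr.c_str() : "no context");
        return ConnectResult::kNoReply;
    }
    return r->is_error ? ConnectResult::kError : ConnectResult::kOk;
}

// Mimics the teardown path in the backtrace: every pending callback is
// invoked once with a null reply before the context goes away.
void FreeContext(FakeAsyncContext* c) {
    std::vector<ReplyCallback> pending = std::move(c->pending);
    c->pending.clear();
    for (const ReplyCallback& cb : pending) {
        cb(c, nullptr);
    }
}

// Registers one guarded callback, simulates the server closing the
// connection, and returns how many callbacks saw (and survived) a null reply.
int SimulateServerClose() {
    FakeAsyncContext ctx;
    ctx.err = 3;
    ctx.errstr = "Server closed the connection";
    int null_replies_handled = 0;
    ctx.pending.push_back(
        [&null_replies_handled](const FakeAsyncContext* c, const FakeReply* r) {
            if (HandleConnectReply(c, r) == ConnectResult::kNoReply) {
                ++null_replies_handled;
            }
        });
    FreeContext(&ctx);
    return null_replies_handled;
}
```

Returning a distinct kNoReply result, rather than folding it into kError, is a choice made for this sketch so the caller can decide whether to reconnect. What the real callbacks should do in that case is not stated in this report.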

Tags: analytics
Changed in juniperopenstack:
importance: Undecided → Medium
milestone: none → r3.0-fcs
tags: added: analytics
Changed in juniperopenstack:
assignee: nobody → Megh Bhatt (meghb)
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/14627
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/14627
Committed: http://github.com/Juniper/contrail-controller/commit/6786de564f945765fb0d8a4644c0bd6bbe80c9b8
Submitter: Zuul
Branch: master

commit 6786de564f945765fb0d8a4644c0bd6bbe80c9b8
Author: Megh Bhatt <email address hidden>
Date: Fri Oct 23 16:20:48 2015 -0700

Handle NULL redisReply in connect callback process in opserver
proxy code. This can happen during read EOF resulting in
redisAsyncFree calling the callbacks with NULL replies.
Closes-Bug: #1499129

Change-Id: Ic691f8075f314e34254a94c7a9aea21b577f5524

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/16888
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/16890
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/17044
Submitter: Megh Bhatt (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16890
Committed: http://github.com/Juniper/contrail-controller/commit/b62c8dac34f18b1ed45cdcb0ae3ca9d11fbd6c13
Submitter: Zuul
Branch: R2.21.x

commit b62c8dac34f18b1ed45cdcb0ae3ca9d11fbd6c13
Author: Megh Bhatt <email address hidden>
Date: Fri Oct 23 16:20:48 2015 -0700

Handle NULL redisReply in connect callback process in opserver
proxy code. This can happen during read EOF resulting in
redisAsyncFree calling the callbacks with NULL replies.
Closes-Bug: #1499129

Conflicts:
 src/query_engine/QEOpServerProxy.cc

Change-Id: Ic691f8075f314e34254a94c7a9aea21b577f5524
(cherry picked from commit 6786de564f945765fb0d8a4644c0bd6bbe80c9b8)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/16888
Committed: http://github.com/Juniper/contrail-controller/commit/8fc52f62fc159ece2f5fcd10ff094edac345ead3
Submitter: Zuul
Branch: R2.20

commit 8fc52f62fc159ece2f5fcd10ff094edac345ead3
Author: Megh Bhatt <email address hidden>
Date: Fri Oct 23 16:20:48 2015 -0700

Handle NULL redisReply in connect callback process in opserver
proxy code. This can happen during read EOF resulting in
redisAsyncFree calling the callbacks with NULL replies.
Closes-Bug: #1499129

Conflicts:
 src/query_engine/QEOpServerProxy.cc

Change-Id: Ic691f8075f314e34254a94c7a9aea21b577f5524
(cherry picked from commit 6786de564f945765fb0d8a4644c0bd6bbe80c9b8)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17044
Committed: http://github.com/Juniper/contrail-controller/commit/9cbae40077b869de83c1d735b62ca1e93ab14597
Submitter: Zuul
Branch: R2.22.x

commit 9cbae40077b869de83c1d735b62ca1e93ab14597
Author: Megh Bhatt <email address hidden>
Date: Fri Oct 23 16:20:48 2015 -0700

Handle NULL redisReply in connect callback process in opserver
proxy code. This can happen during read EOF resulting in
redisAsyncFree calling the callbacks with NULL replies.
Closes-Bug: #1499129

Conflicts:
 src/query_engine/QEOpServerProxy.cc

Change-Id: Ic691f8075f314e34254a94c7a9aea21b577f5524
(cherry picked from commit 6786de564f945765fb0d8a4644c0bd6bbe80c9b8)

information type: Proprietary → Public