Vcenter-as-compute: Query-engine crashes with basic_filebuf::underflow error during fab_setup_all(setup_collector)

Bug #1656139 reported by Sarath
This bug affects 1 person
Affects             Status          Importance  Assigned to  Milestone
Juniper Openstack   (status tracked in Trunk)
  R3.1              Fix Committed   Critical    Arvind
  R3.1.1.x          New             Critical    Arvind
  R3.2              Invalid         Critical    Arvind
  Trunk             Invalid         Critical    Arvind

Bug Description

This issue was seen during a fresh install of Vcenter-as-compute build 3.1.1.0-51 (Kilo) on Ubuntu 14.04.5 with kernel 3.13.0-100.

[New LWP 19523]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/contrail-query-engine --conf_file /etc/contrail/contrail-query-engine.'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f9c97826c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f9c97826c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f9c9782a028 in __GI_abort () at abort.c:89
#2 0x00007f9c9781fbf6 in __assert_fail_base (fmt=0x7f9c979703b8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x770ec9 "0",
    file=file@entry=0x71cc04 "controller/src/base/task.cc", line=line@entry=291, function=function@entry=0x71e780 "virtual tbb::task* TaskImpl::execute()") at assert.c:92
#3 0x00007f9c9781fca2 in __GI___assert_fail (assertion=0x770ec9 "0", file=0x71cc04 "controller/src/base/task.cc", line=291, function=0x71e780 "virtual tbb::task* TaskImpl::execute()") at assert.c:101
#4 0x000000000045d283 in ?? ()
#5 0x00007f9c985fdb3a in ?? () from /usr/lib/libtbb.so.2
#6 0x00007f9c985f9816 in ?? () from /usr/lib/libtbb.so.2
#7 0x00007f9c985f8f4b in ?? () from /usr/lib/libtbb.so.2
#8 0x00007f9c985f50ff in ?? () from /usr/lib/libtbb.so.2
#9 0x00007f9c985f52f9 in ?? () from /usr/lib/libtbb.so.2
#10 0x00007f9c98819184 in start_thread (arg=0x7f9c8dcba700) at pthread_create.c:312
#11 0x00007f9c978ea37d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

>> contrail-query-engine-log

2017-01-12 Thu 15:48:28:728.254 PST a6s38 [Thread 140310353766144, Pid 19500]: SANDESH: Send FAILED: 1484264908728217 [SYS_INFO]: QELog: In ConnectCallbackProcess.. NULL Reply controller/src/query_engine/QEOpServerProxy.cc 959
2017-01-12 Thu 15:48:28:728.323 PST a6s38 [Thread 140310353766144, Pid 19500]: SANDESH: Send FAILED: 1484264908728242 [SYS_INFO]: QELog: ConnDown.. DOWN.. Reconnect..2 controller/src/query_engine/QEOpServerProxy.cc 948
2017-01-12 Thu 15:48:28:728.283 PST a6s38 [Thread 140310370559744, Pid 19500]: !!!! ERROR !!!! Task caught fatal exception: basic_filebuf::underflow error reading the file TaskImpl: 0x7f9c907e7e40
2017-01-12 Thu 15:48:28:906.934 PST a6s38 [Thread 139892143634368, Pid 20752]: SANDESH: No Client: 1484264908906860 SandeshModuleClientTrace: data= [ name = a6s38:Analytics:contrail-query-engine:0 client_info= [ status = Idle successful_connections = 0 pid = 20752 http_port = 8091 start_time = 1484264908906661 collector_name = primary = 0.0.0.0:0 secondary = 0.0.0.0:0 rx_socket_stats= [ bytes = 0 calls = 0 average_bytes = 0 blocked_duration = 00:00:00 blocked_count = 0 average_blocked_duration = errors = 0 ] tx_socket_stats= [ bytes = 0 calls = 0 average_bytes = 0 blocked_duration = 00:00:00 blocked_count = 0 average_blocked_duration = errors = 0 ] ] msg_type_diff= [ [ _iter99->first = TcpServerMessageLog [ messages_sent = 0 messages_sent_dropped_no_queue = 0 messages_sent_dropped_no_client = 1 messages_sent_dropped_no_session = 0 messages_sent_dropped_queue_level = 0 messages_sent_dropped_client_send_failed = 0 messages_sent_dropped_session_not_connected = 0 messages_sent_dropped_header_write_failed = 0 messages_sent_dropped_write_failed = 0 messages_sent_dropped_wrong_client_sm_state = 0 messages_sent_dropped_validation_failed = 0 messages_sent_dropped_rate_limited = 0 ], ] ] tx_msg_diff= [ [ _iter103->first = dropped_no_client _iter103->second = 1, ] ] ]

Sarath (nsarath) wrote :

nsarath@ubuntu-build04:/auto/cores/1656139$ ls -l
total 327120
-rwxrwxrwx 1 nsarath test 116789248 Jan 12 17:11 core.contrail-query-.19500.a6s38.1484264908
-rwxrwxrwx 1 nsarath test 49940480 Jan 12 17:06 Ctrl-A-log.tar
-rwxrwxrwx 1 nsarath test 47656960 Jan 12 17:06 Ctrl-B-log.tar
-rwxrwxrwx 1 nsarath test 51752960 Jan 12 17:06 Ctrl-C-log.tar
-rwxrwxrwx 1 nsarath test 18698240 Jan 12 17:08 Esxi-1-log.tar
-rwxrwxrwx 1 nsarath test 18708480 Jan 12 17:08 Esxi-2-log.tar
-rwxrwxrwx 1 nsarath test 30074880 Jan 12 17:07 Kvm-1-log.tar
nsarath@ubuntu-build04:/auto/cores/1656139$

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/28076
Submitter: Arvind (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/28076
Committed: http://github.org/Juniper/contrail-controller/commit/3670d8a1bd0e231aab94694c492b4fdefa62c4e8
Submitter: Zuul (<email address hidden>)
Branch: R3.1

commit 3670d8a1bd0e231aab94694c492b4fdefa62c4e8
Author: arvindvis <email address hidden>
Date: Fri Jan 20 15:53:36 2017 -0800

Crash was reported in 3.1.1.0 build 51 with qed. The TB's pointed
to a TaskTrigger that gets CpuLoadInfo. The CpuLoadInfo is defined in
base. We suspect an issue in the ifstream operation. Adding asserts
to confirm the same. It's not reproducible consistently.
Closes-Bug:#1656139

Change-Id: Icfe671d55cc99ddb270f292d80d9621b6ebab5cb
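
For illustration only: the query-engine log in the description shows the task dying on an uncaught "basic_filebuf::underflow error reading the file" exception, which libstdc++ raises when the underlying read on an already-open file fails. The sketch below shows one way such a read could be guarded so that the failure is returned to the caller instead of escaping into the task scheduler. ReadFileContents is a hypothetical helper, and how CpuLoadInfo actually reads its input is an assumption here. This is not the code in the R3.1 change, which adds asserts to confirm the suspicion rather than handling the error.

```cpp
#include <cassert>
#include <exception>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper, not part of contrail-controller: read a whole file
// and report failure through the return value instead of letting an
// exception propagate to the caller.
bool ReadFileContents(const std::string &path, std::string *out,
                      std::string *err) {
    assert(out != NULL && err != NULL);

    std::ifstream file(path.c_str());
    if (!file.is_open()) {
        *err = "cannot open " + path;
        return false;
    }
    try {
        // istreambuf_iterator pulls characters directly from the stream
        // buffer, so a failing read surfaces here as an exception
        // (std::ios_base::failure, which derives from std::exception).
        out->assign(std::istreambuf_iterator<char>(file),
                    std::istreambuf_iterator<char>());
    } catch (const std::exception &e) {
        *err = "read failed for " + path + ": " + e.what();
        return false;
    }
    return true;
}
```

A caller would pass the path plus two output strings and check the boolean result, logging err and skipping that sample when the read fails. Catching std::exception rather than std::ios_base::failure is a deliberate choice in this sketch, since it does not depend on which libstdc++ ABI the exception was thrown under.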
