Contrail 2.21-14: contrail-vrouter-agent Crash with no traffic, not recovering
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
Trunk | Incomplete | High | Hari Prasad Killi |
Bug Description
Hit this core dump on one of the compute nodes, and the service never came back up.
root@ccra-07:~# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
contrail-
contrail-
========Run time service failures=
/var/crashes/
/var/crashes/
root@ccra-16:~# contrail-version
Package Version Build-ID | Repo | Package Name
-------
contrail-
contrail-
contrail-lib 2.21.1-14 14
contrail-nodemgr 2.21.1-14 14
contrail-nova-vif 2.21.1-14 14
contrail-
root@ccra-16:~# scp root@ccra-
The authenticity of host 'ccra-07 (10.102.28.81)' can't be established.
ECDSA key fingerprint is c2:09:b2:
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ccra-07,
root@ccra-07's password:
core.contrail-
root@ccra-16:~# which contrail-
/usr/bin/
root@ccra-16:~# gdb /usr/bin/
GNU gdb (Ubuntu 7.7.1-0ubuntu5~
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://
Find the GDB manual and other documentation resources online at:
<http://
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/
warning: core file may not match specified executable file.
[New LWP 4588]
[New LWP 4589]
[New LWP 4601]
[New LWP 4593]
[New LWP 4602]
[New LWP 4631]
[New LWP 4616]
[New LWP 4624]
[New LWP 4605]
[New LWP 4606]
[New LWP 4615]
[New LWP 4713]
[New LWP 4586]
[New LWP 4614]
[New LWP 4622]
[New LWP 4607]
[New LWP 4710]
[New LWP 4608]
[New LWP 4604]
[New LWP 4595]
[New LWP 4612]
[New LWP 4714]
[New LWP 4599]
[New LWP 4625]
[New LWP 4626]
[New LWP 4627]
[New LWP 4591]
[New LWP 4617]
[New LWP 4587]
[New LWP 4628]
[New LWP 4600]
[New LWP 4715]
[New LWP 4712]
[New LWP 4609]
[New LWP 4629]
[New LWP 4618]
[New LWP 4597]
[New LWP 4623]
[New LWP 4619]
[New LWP 4610]
[New LWP 4611]
[New LWP 4590]
[New LWP 4711]
[New LWP 4632]
[New LWP 4630]
[New LWP 4621]
[New LWP 3283]
[New LWP 4598]
[New LWP 4594]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_
Core was generated by `/usr/bin/
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f425e176cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/
56 ../nptl/
(gdb) bt
#0 0x00007f425e176cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/
#1 0x00007f425e17a0d8 in __GI_abort () at abort.c:89
#2 0x00007f425e16fb86 in __assert_fail_base (fmt=0x7f425e2c0830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=
file=
#3 0x00007f425e16fc32 in __GI___assert_fail (assertion=
function=
#4 0x00000000009e177d in VrfEntry:
#5 0x0000000000fe3eb9 in Timer::
#6 0x0000000000fdd8b0 in TaskImpl::execute() ()
#7 0x00007f425ed45b3a in ?? () from /usr/lib/
#8 0x00007f425ed41816 in ?? () from /usr/lib/
#9 0x00007f425ed40f4b in ?? () from /usr/lib/
#10 0x00007f425ed3d0ff in ?? () from /usr/lib/
#11 0x00007f425ed3d2f9 in ?? () from /usr/lib/
#12 0x00007f425ef61182 in start_thread (arg=0x7f4256ff
#13 0x00007f425e23a47d in clone () at ../sysdeps/
(gdb) bt full
#0 0x00007f425e176cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/
resultvar = 0
pid = 3283
selftid = 4588
#1 0x00007f425e17a0d8 in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007f425e16fb86 in __assert_fail_base (fmt=0x7f425e2c0830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=
file=
str = 0x7f424c0a7fb0 "\220\304\
total = 4096
#3 0x00007f425e16fc32 in __GI___assert_fail (assertion=
function=
No locals.
#4 0x00000000009e177d in VrfEntry:
No symbol table info available.
#5 0x0000000000fe3eb9 in Timer::
No symbol table info available.
#6 0x0000000000fdd8b0 in TaskImpl::execute() ()
No symbol table info available.
#7 0x00007f425ed45b3a in ?? () from /usr/lib/
No symbol table info available.
#8 0x00007f425ed41816 in ?? () from /usr/lib/
No symbol table info available.
#9 0x00007f425ed40f4b in ?? () from /usr/lib/
No symbol table info available.
#10 0x00007f425ed3d0ff in ?? () from /usr/lib/
No symbol table info available.
#11 0x00007f425ed3d2f9 in ?? () from /usr/lib/
No symbol table info available.
#12 0x00007f425ef61182 in start_thread (arg=0x7f4256ff
__res = <optimized out>
pd = 0x7f4256ffb700
now = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139922904168192, -19575361470367
pagesize_m1 = <optimized out>
sp = <optimized out>
freesize = <optimized out>
#13 0x00007f425e23a47d in clone () at ../sysdeps/
No locals.
information type: | Proprietary → Public |
tags: | added: vrouter |
Changed in juniperopenstack: | |
milestone: | none → r3.0-fcs |
A reboot didn't help either.
root@ccra-07:~# uptime
22:28:11 up 9 min, 1 user, load average: 0.13, 0.27, 0.20
A service restart also didn't help.
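
For triage, it can help to pull the first non-libc frame out of a pasted backtrace, since that is usually where the failing assertion lives. The following is a minimal sketch, not part of the original report: `parse_frames`, `first_application_frame`, and the `RUNTIME_PREFIXES` heuristic are hypothetical names introduced here, and the sample text is copied from the `bt` output above exactly as it was pasted (truncated frames included).

```python
import re

# Frames copied verbatim from the (truncated) `bt` output in this report.
SAMPLE_BT = r"""
#0 0x00007f425e176cc9 in __GI_raise (sig=sig@entry=6) at ../nptl/
#1 0x00007f425e17a0d8 in __GI_abort () at abort.c:89
#2 0x00007f425e16fb86 in __assert_fail_base (fmt=0x7f425e2c0830 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=
#3 0x00007f425e16fc32 in __GI___assert_fail (assertion=
#4 0x00000000009e177d in VrfEntry:
#5 0x0000000000fe3eb9 in Timer::
#6 0x0000000000fdd8b0 in TaskImpl::execute() ()
"""

# Matches lines of the form "#<n> <address> in <function> ...".
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-fA-F]+)\s+in\s+(\S+)")

# Assumption: frames whose symbol starts with one of these prefixes belong
# to glibc's raise/abort/assert path rather than to the agent itself.
RUNTIME_PREFIXES = ("__GI_", "__assert_fail")


def parse_frames(text):
    """Return a list of (frame_number, address, function) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


def first_application_frame(frames, runtime_prefixes=RUNTIME_PREFIXES):
    """Return the first frame that is not on the glibc abort path, or None."""
    for frame in frames:
        if not frame[2].startswith(runtime_prefixes):
            return frame
    return None


if __name__ == "__main__":
    print(first_application_frame(parse_frames(SAMPLE_BT)))
```

On the frames pasted here, the sketch selects frame #4 (`VrfEntry:`, truncated in the paste), which is reached from a `Timer::` callback. The assertion text itself is cut off in the paste, so the failing condition cannot be recovered from this output alone.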