RCA required - contrail-vrouter-agent status times out on the compute node; XMPP hold timer expiry messages for that compute node are printed on the control nodes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Juniper Openstack | Status tracked in Trunk | | |
Trunk | Incomplete | High | vijaya kumar shankaran |
Bug Description
Contrail-
At the time of the issue
kw1bp-vscp0040n contrail]$ date; sudo contrail-status
Thu Dec 21 18:47:05 UTC 2017
== Contrail vRouter ==
supervisor-vrouter: active
contrail-
contrail-
From the contrail-
2017-12-21 Thu 17:02:29:863.012 UTC kw1bp-vscp0040n [Thread 47598182344448, Pid 13594]: XMPP [SYS_NOTICE]: XmppEventLog: Mode Client: Event: Tcp Connected peer ip: 10.3.135.73 ( <email address hidden> ) controller/
2017-12-21 Thu 19:21:20:023.090 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: vhost interface name : vhost0
2017-12-21 Thu 19:21:20:023.187 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: vhost IP Address : 10.2.128.111/26
2017-12-21 Thu 19:21:20:023.197 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: vhost gateway : 10.2.128.65
2017-12-21 Thu 19:21:20:023.201 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: Ethernet port : bond1.61
2017-12-21 Thu 19:21:20:023.208 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: Xmpp Authentication : 0
2017-12-21 Thu 19:21:20:023.215 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: XMPP Server-1 : 0.0.0.0
2017-12-21 Thu 19:21:20:023.220 UTC kw1bp-vscp0040n [Thread 46985195744064, Pid 3709]: XMPP Server-2 : 0.0.0.0
Output of Process status at the time of the issue
kw1bp-vscp0040n contrail]$ date; ps auxw | grep contrail | grep -v grep
Thu Dec 21 18:48:52 UTC 2017
root 7151 0.0 0.0 230324 13240 ? Ss Apr13 77:11 /usr/bin/python /usr/bin/
root 13593 0.0 0.0 335140 28196 ? Sl Apr13 244:28 python /usr/bin/
root 13594 19.2 0.1 1437088 341028 ? Sl Apr13 69952:24 /usr/bin/
-------
From the message log on the affected compute node we notice
Dec 21 17:57:01 kw1bp-vscp0040n-jp1 log_monitor.
Dec 21 18:00:01 kw1bp-vscp0040n-jp1 log_monitor.
Dec 21 19:21:01 kw1bp-vscp0040n-jp1 log_monitor.
The vrouter agent was restarted at Thu Dec 21 19:21:16 and then came back to active
From supervisord-
2017-12-21 19:21:16,702 INFO stopped: contrail-
2017-12-21 19:21:16,744 INFO stopped: contrail-
2017-12-21 19:21:18,750 INFO spawned: 'contrail-
2017-12-21 19:21:18,752 INFO spawned: 'contrail-
2017-12-21 19:21:19,930 INFO success: contrail-
2017-12-21 19:21:24,010 INFO success: contrail-
kw1bp-vscp0040n contrail]$ date; sudo contrail-
Thu Dec 21 19:21:14 UTC 2017
== Contrail vRouter ==
supervisor-vrouter: active
contrail-
contrail-
========Run time service failures=
/var/crashes/
/var/crashes/
contrail-
contrail-
contrail-
contrail-
== Contrail vRouter ==
supervisor-vrouter: active
contrail-
contrail-
On the control nodes, the XMPP hold timer was reported as expired for this compute node
< kw1np-coct0002n /var/log/
-------
2017-12-21 Thu 17:55:07:275.465 UTC kw1np-coct0002n [Thread 140517761337088, Pid 45236]: XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: Event: HoldTimer Expired peer ip: 10.2.128.111 ( kw1bp-vscp0040n ) controller/
2017-12-21 Thu 17:55:07:303.237 UTC kw1np-coct0002n [Thread 140517895554816, Pid 45236]: SANDESH: Ratelimit Drop (100 messages/second): BGP [SYS_DEBUG]: XmppPeerTableLog: XMPP Peer kw1np-coct0002n
-------
< kw1np-coct0003n /var/log/
-------
2017-12-21 Thu 17:55:10:321.834 UTC kw1np-coct0003n [Thread 139650406352640, Pid 15291]: XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: Event: HoldTimer Expired peer ip: 10.2.128.111 ( kw1bp-vscp0040n ) controller/
2017-12-21 Thu 17:55:10:468.395 UTC kw1np-coct0003n [Thread 139650397955840, Pid 15291]: SANDESH: Ratelimit Drop (100 messages/second): BGP [SYS_DEBUG]: XmppPeerTableLog: XMPP Peer kw1np-coct0003n
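To help line up the control-node events with the agent restart, here is a minimal Python sketch that pulls the HoldTimer Expired events out of log lines in the format shown above and measures how long the compute node stayed in that state before the restart at 19:21:16. The function name `parse_hold_timer_expiry`, the regular expression, and the reading of the `275.465` field as milliseconds and microseconds are assumptions made for illustration; none of this is part of Contrail.

```python
import re
from datetime import datetime, timedelta

# Matches control-node lines such as:
# 2017-12-21 Thu 17:55:07:275.465 UTC kw1np-coct0002n [...]: ... Event: HoldTimer Expired peer ip: 10.2.128.111 ( kw1bp-vscp0040n ) ...
# Assumption: the field after the seconds is "<milliseconds>.<microseconds>".
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) \w{3} "
    r"(?P<time>\d{2}:\d{2}:\d{2}):(?P<ms>\d{3})\.(?P<us>\d{3}) UTC "
    r"(?P<host>\S+) .*?Event: HoldTimer Expired peer ip: "
    r"(?P<peer_ip>\S+) \( (?P<peer_name>\S+) \)"
)

# Restart time of the vrouter agent, taken from the report text above.
RESTART_TIME = datetime(2017, 12, 21, 19, 21, 16)


def parse_hold_timer_expiry(line):
    """Return a dict describing a HoldTimer Expired event, or None if the line is something else."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m["date"] + " " + m["time"], "%Y-%m-%d %H:%M:%S")
    ts += timedelta(milliseconds=int(m["ms"]), microseconds=int(m["us"]))
    return {
        "time": ts,
        "control_node": m["host"],
        "peer_ip": m["peer_ip"],
        "peer_name": m["peer_name"],
    }


if __name__ == "__main__":
    sample = [
        "2017-12-21 Thu 17:55:07:275.465 UTC kw1np-coct0002n [Thread 140517761337088, Pid 45236]: "
        "XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: Event: HoldTimer Expired peer ip: "
        "10.2.128.111 ( kw1bp-vscp0040n ) controller/",
        "2017-12-21 Thu 17:55:10:321.834 UTC kw1np-coct0003n [Thread 139650406352640, Pid 15291]: "
        "XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: Event: HoldTimer Expired peer ip: "
        "10.2.128.111 ( kw1bp-vscp0040n ) controller/",
    ]
    for line in sample:
        event = parse_hold_timer_expiry(line)
        if event is not None:
            gap = RESTART_TIME - event["time"]
            print(event["control_node"], event["peer_name"], event["time"], "gap to restart:", gap)
```

Run against the full control-node logs, this would show whether both control nodes dropped the session at about the same time and whether any further expiries occurred before the restart.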
Looks related to https:/
Customer needs an RCA.
information type: | Proprietary → Public |
tags: | added: config contrail-control |
Hi Team,
Can we have an update on this issue?
Best regards,
Vijay Kumar