Agent: XMPP connection not attempted by agent

Bug #1446463 reported by Sandip Dey
This bug affects 1 person
Affects              Status           Importance   Assigned to   Milestone
Juniper Openstack    Status tracked in Trunk
  R2.0               Fix Committed    Medium       Nipa
  R2.20              Fix Committed    Medium       Nipa
  R2.21.x            Fix Released     Medium       Nipa
  Trunk              Fix Committed    Medium       Nipa

Bug Description

Gcore saved at: http://10.204.216.50/Docs/bugs/<bug-id>

Sandip,

  Yes, I don't see any XMPP connection attempts in the logs/traces after April 17th on this setup, although the discovery responses seem to be fine. Please keep the setup as is. I have taken a gcore on the box; I will debug and get back.

-nipa

From: Sandip Dey <email address hidden>
Date: Monday, April 20, 2015 12:26 AM
To: Nipa Kumar <email address hidden>
Cc: Hari Prasad Killi <email address hidden>, Nagabhushana R <email address hidden>
Subject: xmpp not up in agent

Hi Nipa

Could you check the following setup? On nodea29, XMPP is not up.

host1 = 'root@10.204.216.7'
host2 = 'root@10.204.216.14'
host3 = 'root@10.204.216.15'
host4 = 'root@10.204.216.25'
host5 = 'root@10.204.217.75'

ext_routers = [('blr-mx1', '192.168.249.1')]
router_asn = 64510
public_vn_rtgt = 19006
public_vn_subnet = "10.204.219.80/29"

host_build = 'sandipd@10.204.216.4'

env.roledefs = {
    'all': [host1, host2, host3, host4, host5],
    'cfgm': [host1,host2,host3],
    'webui': [host1],
    'openstack': [host1],
    'control': [host2, host3],
    'collector': [host1],
    'database': [host1],
    'compute': [host4, host5],
    'build': [host_build]
}

env.hostnames = {
    'all': ['nodea11', 'nodea18', 'nodea19', 'nodea29', 'nodeg35']
}

Regards
Sandip
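
For reference, the testbed above can be cross-checked to see which address and roles nodea29 has. This is only a minimal sketch: it assumes that env.hostnames['all'] is listed in the same order as env.roledefs['all'] (the report does not state this), and the helper name roles_of is hypothetical.

```python
# Host and role data copied from the testbed in the report.
hosts = [
    'root@10.204.216.7',
    'root@10.204.216.14',
    'root@10.204.216.15',
    'root@10.204.216.25',
    'root@10.204.217.75',
]
hostnames = ['nodea11', 'nodea18', 'nodea19', 'nodea29', 'nodeg35']

roledefs = {
    'cfgm': hosts[0:3],
    'webui': [hosts[0]],
    'openstack': [hosts[0]],
    'control': hosts[1:3],
    'collector': [hosts[0]],
    'database': [hosts[0]],
    'compute': hosts[3:5],
}


def roles_of(name):
    """Hypothetical helper: return (host, roles) for a testbed hostname."""
    # Assumption: hostnames and hosts are listed in the same order.
    host = dict(zip(hostnames, hosts))[name]
    roles = sorted(role for role, members in roledefs.items() if host in members)
    return host, roles


print(roles_of('nodea29'))
print(roles_of('nodea18'))
```

Under that ordering assumption, nodea29 is the compute node at 10.204.216.25, and nodea18 is one of the two control nodes.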

Tags: vrouter xmpp
Sandip Dey (sandipd)
tags: added: xmpp
Changed in juniperopenstack:
importance: Undecided → Medium
Nipa (nipak) wrote :

Logging is set only to "SYS_NOTICE", so the logs cannot capture the actual steps. Please enable at least INFO or DEBUG if this is seen again.

The SYS_NOTICE logs on the server indicate that a connection from the agent was attempted on 4/20, even though introspect shows no connection between the agent and the control-node. The timing on the client and the server seems to be off between the messages below, though.

contrail-control.log.1:2015-04-20 Mon 11:46:51:680.042 IST nodea18 [Thread 139923407603456, Pid 22337]: XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: PassiveOpen in state: Idle peer: 192.168.251.5 controller/src/xmpp/xmpp_state_machine.cc 1328

contrail-control.log.1:2015-04-20 Mon 12:03:45:374.960 IST nodea18 [Thread 139896058955520, Pid 24851]: XMPP [SYS_NOTICE]: XmppEventLog: Mode Server: PassiveOpen in state: Idle peer: 192.168.251.5 controller/src/xmpp/xmpp_state_machine.cc 1328

2015-04-20 22:22:27.376 XmppRxStream: Received xmpp message from: 192.168.250.4 Port 41789 Size: 171 Packet: <stream:stream from="nodea29" <email address hidden>" version="1.0" xml:lang="en" xmlns="jabber:client" xmlns:stream="http://etherx.jabber.org/streams" > $ controller/src/xmpp/xmpp_connection.cc 474
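
To compare timing across such entries, the XmppEventLog lines can be parsed mechanically. The following is a rough sketch: the regular expression was written against the two contrail-control lines quoted above and is not an official log grammar, and the names EVENT_RE and parse_event are hypothetical.

```python
import re
from datetime import datetime

# Pattern derived from the two quoted lines only; other log lines may differ.
EVENT_RE = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) \w{3} (?P<time>\d{2}:\d{2}:\d{2}):\d{3}\.\d{3} \w+ '
    r'(?P<host>\S+) \[Thread \d+, Pid (?P<pid>\d+)\]: XMPP \[SYS_NOTICE\]: '
    r'XmppEventLog: Mode (?P<mode>\w+): (?P<event>\w+) in state: (?P<state>\w+) '
    r'peer: (?P<peer>\S+)'
)

LINES = [
    'contrail-control.log.1:2015-04-20 Mon 11:46:51:680.042 IST nodea18 '
    '[Thread 139923407603456, Pid 22337]: XMPP [SYS_NOTICE]: XmppEventLog: '
    'Mode Server: PassiveOpen in state: Idle peer: 192.168.251.5 '
    'controller/src/xmpp/xmpp_state_machine.cc 1328',
    'contrail-control.log.1:2015-04-20 Mon 12:03:45:374.960 IST nodea18 '
    '[Thread 139896058955520, Pid 24851]: XMPP [SYS_NOTICE]: XmppEventLog: '
    'Mode Server: PassiveOpen in state: Idle peer: 192.168.251.5 '
    'controller/src/xmpp/xmpp_state_machine.cc 1328',
]


def parse_event(line):
    """Return the fields of one XmppEventLog line, or None if it does not match."""
    m = EVENT_RE.search(line)
    if m is None:
        return None
    # Second resolution is enough here; milliseconds are dropped.
    when = datetime.strptime(m['date'] + ' ' + m['time'], '%Y-%m-%d %H:%M:%S')
    return {
        'when': when,
        'host': m['host'],
        'pid': int(m['pid']),
        'event': m['event'],
        'state': m['state'],
        'peer': m['peer'],
    }


events = [parse_event(line) for line in LINES]
gap_seconds = int((events[1]['when'] - events[0]['when']).total_seconds())

for e in events:
    print(e['when'], e['host'], e['pid'], e['event'], e['state'], e['peer'])
print('seconds between the two attempts:', gap_seconds)
```

The extracted timestamps, Pid values, and peer address can then be lined up against agent-side traces once those are captured at INFO or DEBUG level.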

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10870
Submitter: Nipa Kumar (<email address hidden>)

Nischal Sheth (nsheth)
information type: Proprietary → Public
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10870
Committed: http://github.org/Juniper/contrail-controller/commit/7811f4eab136ed45751e4a04d6d7710a78109d8a
Submitter: Zuul
Branch: master

commit 7811f4eab136ed45751e4a04d6d7710a78109d8a
Author: Nipa Kumar <email address hidden>
Date: Tue May 26 14:50:42 2015 -0700

Agent: XMPP connection not attempted by agent

The agent marks its connection to an Xmpp-Server DOWN after several attempts and rediscovers controllers.
If a controller in the list sent by Discovery is already marked DOWN, the agent does not honor that
controller connection. This happens in the time period where Discovery has not yet marked the
controller down (it does so when 3 heartbeats are missing) and still sends the stale list.
Additionally, the discovery client keeps a checksum, so callbacks are throttled at the source;
hence the clients never get called when the controllers are UP, and the information is lost.

The solution is to honor the response from Discovery irrespective of the state of the publisher
(both the Xmpp Server advertised by the controller and the dns daemon).

Test cases added.

Change-Id: I5c6695f04d3dd4a384f0ea7d0da912475966c6eb
Closes-Bug:1446463
Closes-Bug:1457243
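
The failure mode described in the commit message can be illustrated with a small simulation. This is a toy model, not the agent's actual C++ code: the class names, the honor_down_servers flag, the resubscribe step that forgets the previous checksum, and the server name controller-1 are all assumptions made for illustration.

```python
import zlib


class DiscoveryClient:
    """Toy discovery client that suppresses callbacks for unchanged responses."""

    def __init__(self, callback):
        self._callback = callback
        self._last_checksum = None

    def resubscribe(self):
        # Assumption: a fresh subscription forgets the previous checksum.
        self._last_checksum = None

    def on_response(self, servers):
        checksum = zlib.crc32(','.join(servers).encode())
        if checksum == self._last_checksum:
            return  # unchanged response: the callback is throttled
        self._last_checksum = checksum
        self._callback(servers)


class Agent:
    def __init__(self, honor_down_servers):
        self.honor_down_servers = honor_down_servers
        self.down = set()   # servers this agent has marked DOWN
        self.targets = []   # servers the agent will try to connect to

    def mark_down(self, server):
        self.down.add(server)
        self.targets = [s for s in self.targets if s != server]

    def apply_discovery_response(self, servers):
        for server in servers:
            if server in self.down and not self.honor_down_servers:
                continue  # behaviour described as the bug: entry is ignored
            self.down.discard(server)
            if server not in self.targets:
                self.targets.append(server)


def simulate(honor_down_servers):
    agent = Agent(honor_down_servers)
    client = DiscoveryClient(agent.apply_discovery_response)
    client.on_response(['controller-1'])  # initial list from Discovery
    agent.mark_down('controller-1')       # several connection attempts failed
    client.resubscribe()                  # agent rediscovers controllers
    client.on_response(['controller-1'])  # stale list, controller still listed
    client.on_response(['controller-1'])  # controller is UP again, same list
    return agent.targets


print('ignoring DOWN servers:', simulate(False))
print('honoring the response:', simulate(True))
```

In this model, ignoring servers that are locally marked DOWN leaves the agent with no connection target, because the last response is identical to the stale one and never reaches the callback. Honoring the response regardless of local state keeps controller-1 as a target.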

OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/10929
Submitter: Nipa Kumar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10929
Committed: http://github.org/Juniper/contrail-controller/commit/6d845c15ca2bd114ded81d4092aa134b929ec39e
Submitter: Zuul
Branch: R2.20

commit 6d845c15ca2bd114ded81d4092aa134b929ec39e
Author: Nipa Kumar <email address hidden>
Date: Tue May 26 14:50:42 2015 -0700

Agent: XMPP connection not attempted by agent

The agent marks its connection to an Xmpp-Server DOWN after several attempts and rediscovers controllers.
If a controller in the list sent by Discovery is already marked DOWN, the agent does not honor that
controller connection. This happens in the time period where Discovery has not yet marked the
controller down (it does so when 3 heartbeats are missing) and still sends the stale list.
Additionally, the discovery client keeps a checksum, so callbacks are throttled at the source;
hence the clients never get called when the controllers are UP, and the information is lost.

The solution is to honor the response from Discovery irrespective of the state of the publisher
(both the Xmpp Server advertised by the controller and the dns daemon).

Test cases added.

Change-Id: I5c6695f04d3dd4a384f0ea7d0da912475966c6eb
Closes-Bug:1446463
Closes-Bug:1457243

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.0

Review in progress for https://review.opencontrail.org/12160
Submitter: Nipa Kumar (<email address hidden>)

Jeba Paulaiyan (jebap) wrote :

Testcase:

This bug was discovered by, and verified using, the Contrail Sanity test suite.
