Discovery showed a compute node not subscribed to any collector

Bug #1382833 reported by Vedamurthy Joshi
This bug affects 1 person
Affects            Status  Importance  Assigned to      Milestone
Juniper Openstack  New     High        Deepinder Setia

Bug Description

R1.10 Build 59 Ubuntu multi-node running Icehouse

After a couple of parallel sanity runs on this setup, I started seeing the issue below.
Logs on the nodes will be in http://10.204.216.50/Docs/bugs/#

Pasting the mail chain..

From: Sandip Dey <email address hidden>
Date: Saturday, October 18, 2014 at 10:06 PM
To: Vedamurthy Joshi <email address hidden>, Sundaresan Rajangam <email address hidden>, Nagabhushana R <email address hidden>
Subject: Re: One vrouter is not subscribed to collector service

http://nodeh5:8085/Snh_DiscoveryClientSubscriberStatsReq?

subscriber
serviceName  subscribe_sent  subscribe_fail  subscribe_rcvd  subscribe_retries
Collector    19              0               19              0
dns-server   20              0               20              0
xmpp-server  20              0               20              0

http://nodeh5:8085/Snh_CollectorInfoRequest?
ip: 10.204.217.69
port: 8086
status: Established
more: false

http://nodeh5:8085/Snh_SandeshTraceRequest?x=DiscoveryClient also looked normal.

From discovery.log

10/18/2014 04:39:32 PM [nodec22:DiscoveryService:Config:0]: __default__ [SYS_INFO]: discServiceLog: subscribe: service type=Collector, client=VRouterAgent:nodeh5:VRouterAgent, ttl=1186, asked=2 pubs=3/2, subs=0
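
For anyone scanning discovery.log for similar entries, the subscribe line above can be broken into fields with a small parser. This is only a sketch: the regular expression is written against the single line quoted here, the function name is made up for illustration, and the meaning of `pubs=3/2` is not documented in this report, so that value is kept as a raw string.

```python
import re

# Pattern built from the one "subscribe:" line quoted in this report.
# Field semantics are not documented here, so nothing is interpreted
# beyond converting the plain integers.
SUBSCRIBE_RE = re.compile(
    r"subscribe: service type=(?P<service>[^,]+), "
    r"client=(?P<client>[^,]+), "
    r"ttl=(?P<ttl>\d+), "
    r"asked=(?P<asked>\d+) "
    r"pubs=(?P<pubs>[^,\s]+), "
    r"subs=(?P<subs>\d+)"
)


def parse_subscribe_line(line):
    """Return the subscribe fields as a dict, or None if the line does not match."""
    match = SUBSCRIBE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("ttl", "asked", "subs"):
        fields[key] = int(fields[key])
    return fields


line = (
    "10/18/2014 04:39:32 PM [nodec22:DiscoveryService:Config:0]: __default__ "
    "[SYS_INFO]: discServiceLog: subscribe: service type=Collector, "
    "client=VRouterAgent:nodeh5:VRouterAgent, ttl=1186, asked=2 pubs=3/2, subs=0"
)
info = parse_subscribe_line(line)
print(info)
```

Filtering the parsed entries on `client` and `service` makes it easier to see whether the discovery server kept receiving Collector subscribe requests from nodeh5.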

See this crash in contrail-discovery-0.log

10.204.217.7 - - [2014-10-18 16:39:32] "POST /heartbeat HTTP/1.1" 200 121 0.006030
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/gevent/pywsgi.py", line 504, in handle_one_response
    self.run_application()
  File "/usr/lib/pymodules/python2.7/gevent/pywsgi.py", line 491, in run_application
    self.process_result()
  File "/usr/lib/pymodules/python2.7/gevent/pywsgi.py", line 482, in process_result
    self.write(data)
  File "/usr/lib/pymodules/python2.7/gevent/pywsgi.py", line 375, in write
    self._write_with_headers(data)
  File "/usr/lib/pymodules/python2.7/gevent/pywsgi.py", line 395, in _write_with_headers
    self._sendall(towrite)
  File "/usr/lib/pymodules/python2.7/gevent/pywsgi.py", line 350, in _sendall
    self.socket.sendall(data)
  File "/usr/lib/pymodules/python2.7/gevent/socket.py", line 458, in sendall
    data_sent += self.send(_get_memory(data, data_sent), flags)
  File "/usr/lib/pymodules/python2.7/gevent/socket.py", line 435, in send
    return sock.send(data, flags)
error: [Errno 104] Connection reset by peer
{'CONTENT_LENGTH': '81',
 'CONTENT_TYPE': 'application/xml',
 'GATEWAY_INTERFACE': 'CGI/1.1',
 'HTTP_ACCEPT': '*/*',
 'HTTP_HOST': '10.204.217.7:5998',
 'PATH_INFO': '/publish/nodec22',
 'QUERY_STRING': '',
 'REMOTE_ADDR': '10.204.217.7',
 'REMOTE_PORT': '50566',
 'REQUEST_METHOD': 'POST',
 'SCRIPT_NAME': '',
 'SERVER_NAME': 'ip6-localhost',
 'SERVER_PORT': '9110',
 'SERVER_PROTOCOL': 'HTTP/1.0',
 'SERVER_SOFTWARE': 'gevent/1.0 Python/2.7',
 'bottle.app': <bottle.Bottle object at 0x225c6d0>,
 'bottle.request': <LocalRequest: POST http://10.204.217.7:5998/publish/nodec22>,
 'bottle.request.body': <StringIO.StringIO instance at 0x277cb00>,
 'bottle.request.headers': <bottle.WSGIHeaderDict object at 0x26ac890>,
 'bottle.request.urlparts': SplitResult(scheme='http', netloc='10.204.217.7:5998', path='/publish/nodec22', query='', fragment=''),
 'bottle.route': <POST '/publish/<end_point>' <bound method DiscoveryServer.error_handler of <discovery.disc_server.DiscoveryServer instance at 0x24d3050>>>,
 'route.handle': <POST '/publish/<end_point>' <bound method DiscoveryServer.error_handler of <discovery.disc_server.DiscoveryServer instance at 0x24d3050>>>,
 'route.url_args': {'end_point': 'nodec22'},
 'wsgi.errors': <open file '<stderr>', mode 'w' at 0x7f0bb7b35270>,
 'wsgi.input': <StringIO.StringIO instance at 0x277cb00>,
 'wsgi.multiprocess': False,
 'wsgi.multithread': False,
 'wsgi.run_once': False,
 'wsgi.url_scheme': 'http',
 'wsgi.version': (1, 0)} failed with error

Vedu, this looks like a discovery issue; please raise a bug and assign it to Deepinder.

Regards
Sandip

From: Vedamurthy Ananth Joshi <email address hidden>
Date: Saturday, October 18, 2014 8:18 PM
To: Sandip Dey <email address hidden>, Sundaresan Rajangam <email address hidden>, Nagabhushana R <email address hidden>
Subject: One vrouter is not subscribed to collector service

Guys,
On my setup, one of the 3 vrouters is not subscribed to the collector service, so the discovery testcase fails. Are there any known issues here?
This is the 1.2 Build 59 image.

http://10.204.217.7:5998/clients.json << No collector service for nodeh5:VRouterAgent
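
The check against clients.json can be scripted roughly as below. This is a sketch under assumptions: the report does not show the payload, so the entry layout (a list of dicts with `client_id` and `service_type` keys) is assumed, the helper names are hypothetical, and the sample entries are invented to mirror the symptom. The three expected service names are the ones listed in the subscriber stats earlier in this thread.

```python
# Services a vrouter agent subscribes to, per the subscriber stats in this report.
EXPECTED_SERVICES = ("Collector", "dns-server", "xmpp-server")


def services_for_client(entries, client_id):
    """Return the sorted service types that `client_id` is subscribed to.

    `entries` is assumed to be a list of dicts carrying "client_id" and
    "service_type" keys; the real clients.json layout is not shown in
    the report and may differ.
    """
    return sorted(
        {e["service_type"] for e in entries if e.get("client_id") == client_id}
    )


def missing_services(entries, client_id, expected=EXPECTED_SERVICES):
    """Return the expected service types that `client_id` has no subscription for."""
    have = set(services_for_client(entries, client_id))
    return sorted(set(expected) - have)


# Invented sample data for illustration only, not taken from the setup.
sample_entries = [
    {"client_id": "nodeh5:VRouterAgent", "service_type": "xmpp-server"},
    {"client_id": "nodeh5:VRouterAgent", "service_type": "dns-server"},
    {"client_id": "nodeh4:VRouterAgent", "service_type": "Collector"},
    {"client_id": "nodeh4:VRouterAgent", "service_type": "xmpp-server"},
    {"client_id": "nodeh4:VRouterAgent", "service_type": "dns-server"},
]

missing = missing_services(sample_entries, "nodeh5:VRouterAgent")
print(missing)
```

With the real payload, `sample_entries` would be replaced by the entries loaded from the clients.json URL above.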

The vrouter is nodeh5 (10.204.217.109).

The setup is still in the same state.

env.roledefs = {
    'all': [host1, host2, host3, host4, host5],
    'cfgm': [host1,host4,host3],
    'openstack': [host2],
    'control': [host1,host4],
    'compute': [host3,host4,host5],
    'collector': [host1,host4,host3],
    'webui': [host1],
    'database': [host1,host4,host3],
    'build': [host_build],
}

env.hostnames = {
    'all': ['nodec22', 'nodeg30', 'nodeg29', 'nodeh4', 'nodeh5']
}
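
To read the role layout by hostname, the two dictionaries above can be combined as sketched below. Assumptions: the `host1`..`host5` and `host_build` values are placeholders (the real testbed file defines them elsewhere and they are not shown here), plain dictionaries stand in for `env.roledefs` and `env.hostnames`, the helper name is made up, and the `'all'` lists are assumed to be in the same order in both dictionaries.

```python
# Placeholder host strings; the real values are not shown in this report.
host1, host2, host3, host4, host5 = "host1", "host2", "host3", "host4", "host5"
host_build = "host_build"

roledefs = {
    'all': [host1, host2, host3, host4, host5],
    'cfgm': [host1, host4, host3],
    'openstack': [host2],
    'control': [host1, host4],
    'compute': [host3, host4, host5],
    'collector': [host1, host4, host3],
    'webui': [host1],
    'database': [host1, host4, host3],
    'build': [host_build],
}

hostnames = {
    'all': ['nodec22', 'nodeg30', 'nodeg29', 'nodeh4', 'nodeh5']
}


def roles_by_hostname(roledefs, hostnames):
    """Map each hostname to its roles, skipping the 'all' and 'build' entries.

    Assumes roledefs['all'] and hostnames['all'] list the hosts in the same order.
    """
    name_of = dict(zip(roledefs['all'], hostnames['all']))
    roles = {name: [] for name in hostnames['all']}
    for role, hosts in roledefs.items():
        if role in ('all', 'build'):
            continue
        for host in hosts:
            roles[name_of[host]].append(role)
    return roles


roles = roles_by_hostname(roledefs, hostnames)
collectors = [name for name, r in roles.items() if 'collector' in r]
for name, r in roles.items():
    print(name, r)
```

Under that ordering assumption, nodeh5 carries only the compute role, and the collectors run on nodec22, nodeg29 and nodeh4.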

Tags: config