Comment 0 for bug 1779304

Nahian Chowdhury (nahian) wrote :

I am getting the following error. I have tried many times with some changes, but the poll operation from the agents is still not being received.

Exception: Agents failed to join: {'shaker_tderis_slave_0': 'lost', 'shaker_tderis_master_0': 'lost'}

------
Snippet of the debug output:

---
Deployed agent:

2018-06-29 07:13:07.015 3633 DEBUG keystoneauth.session [-] REQ: curl -g -i -X GET http://192.168.2.9:8004/v1/62f36ee8851c42edbaa42be8c828f36a/stacks/shaker_tderis/294edb07-3ad7-498b-911d-549dd53897ef/outputs/shaker_tderis_master_0_ip -H "Accept: application/json" -H "User-Agent: python-heatclient" -H "X-Auth-Token: {SHA1}9b80b66a24e89e2fd8cfec0ddee5c454dd35e733" _http_log_request /root/venv/local/lib/python2.7/site-packages/keystoneauth1/session.py:448
2018-06-29 07:13:07.765 3633 DEBUG keystoneauth.session [-] RESP: [200] Content-Length: 122 Content-Type: application/json Date: Fri, 29 Jun 2018 07:13:07 GMT X-Openstack-Request-Id: req-224074cd-1717-4543-95de-4f29756d0d95 _http_log_response /root/venv/local/lib/python2.7/site-packages/keystoneauth1/session.py:479
2018-06-29 07:13:07.766 3633 DEBUG keystoneauth.session [-] RESP BODY: {"output": {"output_value": "10.0.0.6", "output_key": "shaker_tderis_master_0_ip", "description": "No description given"}} _http_log_response /root/venv/local/lib/python2.7/site-packages/keystoneauth1/session.py:511
2018-06-29 07:13:07.766 3633 DEBUG keystoneauth.session [-] GET call to orchestration for http://192.168.2.9:8004/v1/62f36ee8851c42edbaa42be8c828f36a/stacks/shaker_tderis/294edb07-3ad7-498b-911d-549dd53897ef/outputs/shaker_tderis_master_0_ip used request id req-224074cd-1717-4543-95de-4f29756d0d95 request /root/venv/local/lib/python2.7/site-packages/keystoneauth1/session.py:844
2018-06-29 07:13:07.767 3633 DEBUG shaker.engine.server [-] Deployed agents: {'shaker_tderis_slave_0': {'node': u'kolla-compute1', 'zone': u'nova', 'availability_zone': u'nova:kolla-compute1', 'ip': u'10.0.0.10', 'master': {'node': u'kolla-compute2', 'zone': u'nova', 'availability_zone': u'nova:kolla-compute2', 'ip': u'10.0.0.6', 'mode': 'master', 'slave_id': 'shaker_tderis_slave_0', 'id': 'shaker_tderis_master_0'}, 'mode': 'slave', 'master_id': 'shaker_tderis_master_0', 'id': 'shaker_tderis_slave_0'}, 'shaker_tderis_master_0': {'node': u'kolla-compute2', 'slave': {'node': u'kolla-compute1', 'zone': u'nova', 'availability_zone': u'nova:kolla-compute1', 'ip': u'10.0.0.10', 'mode': 'slave', 'master_id': 'shaker_tderis_master_0', 'id': 'shaker_tderis_slave_0'}, 'zone': u'nova', 'availability_zone': u'nova:kolla-compute2', 'ip': u'10.0.0.6', 'mode': 'master', 'slave_id': 'shaker_tderis_slave_0', 'id': 'shaker_tderis_master_0'}} play_scenario /root/venv/local/lib/python2.7/site-packages/shaker/engine/server.py:182
2018-06-29 07:13:07.767 3633 INFO shaker.engine.quorum [-] Waiting for quorum of agents: set(['shaker_tderis_slave_0', 'shaker_tderis_master_0'])

--
The last lines before the error:
-----

2018-06-29 07:22:58.553 3639 DEBUG shaker.agent.agent [-] Polling task: {'operation': 'poll', 'agent_id': '__heartbeat'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:40
2018-06-29 07:22:58.555 3633 DEBUG shaker.engine.messaging [-] Received request: {'operation': 'poll', 'agent_id': '__heartbeat'} __iter__ /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:51
2018-06-29 07:22:58.557 3633 DEBUG shaker.engine.messaging [-] Sent reply: {'operation': 'none'} reply_handler /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:55
2018-06-29 07:22:58.557 3639 DEBUG shaker.agent.agent [-] Received: {'operation': 'none'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:43
2018-06-29 07:23:08.568 3639 DEBUG shaker.agent.agent [-] Polling task: {'operation': 'poll', 'agent_id': '__heartbeat'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:40
2018-06-29 07:23:08.569 3633 DEBUG shaker.engine.messaging [-] Received request: {'operation': 'poll', 'agent_id': '__heartbeat'} __iter__ /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:51
2018-06-29 07:23:08.570 3633 DEBUG shaker.engine.messaging [-] Sent reply: {'operation': 'none'} reply_handler /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:55
2018-06-29 07:23:08.571 3639 DEBUG shaker.agent.agent [-] Received: {'operation': 'none'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:43
2018-06-29 07:23:18.582 3639 DEBUG shaker.agent.agent [-] Polling task: {'operation': 'poll', 'agent_id': '__heartbeat'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:40
2018-06-29 07:23:18.583 3633 DEBUG shaker.engine.messaging [-] Received request: {'operation': 'poll', 'agent_id': '__heartbeat'} __iter__ /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:51
2018-06-29 07:23:18.584 3633 DEBUG shaker.engine.messaging [-] Sent reply: {'operation': 'none'} reply_handler /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:55
2018-06-29 07:23:18.584 3639 DEBUG shaker.agent.agent [-] Received: {'operation': 'none'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:43
2018-06-29 07:23:28.594 3639 DEBUG shaker.agent.agent [-] Polling task: {'operation': 'poll', 'agent_id': '__heartbeat'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:40
2018-06-29 07:23:28.595 3633 DEBUG shaker.engine.messaging [-] Received request: {'operation': 'poll', 'agent_id': '__heartbeat'} __iter__ /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:51
2018-06-29 07:23:28.596 3633 DEBUG shaker.engine.messaging [-] Sent reply: {'operation': 'none'} reply_handler /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:55
2018-06-29 07:23:28.596 3639 DEBUG shaker.agent.agent [-] Received: {'operation': 'none'} poll_task /root/venv/local/lib/python2.7/site-packages/shaker/agent/agent.py:43

----
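
To make the pattern in these lines explicit, here is a minimal Python 3 sketch that tallies which agent_id values appear in the server's "Received request" entries. The helper name count_polling_agents and the regular expression are illustrative assumptions, not part of Shaker. In the excerpt above only '__heartbeat' shows up; neither of the two agent ids named in the exception appears.

```python
import ast
import re
from collections import Counter

# Matches the flat dict logged after "Received request:" in the excerpt above.
REQUEST_RE = re.compile(r"Received request: (\{.*?\})")


def count_polling_agents(log_lines):
    """Count 'Received request' log entries per agent_id."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        # The logged value is a Python dict repr, so literal_eval can parse it.
        request = ast.literal_eval(match.group(1))
        counts[request.get("agent_id")] += 1
    return counts


# Two server-side lines copied from the log excerpt above.
sample = [
    "2018-06-29 07:22:58.555 3633 DEBUG shaker.engine.messaging [-] "
    "Received request: {'operation': 'poll', 'agent_id': '__heartbeat'} "
    "__iter__ /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:51",
    "2018-06-29 07:23:08.569 3633 DEBUG shaker.engine.messaging [-] "
    "Received request: {'operation': 'poll', 'agent_id': '__heartbeat'} "
    "__iter__ /root/venv/local/lib/python2.7/site-packages/shaker/engine/messaging.py:51",
]

# Agent ids taken from the "Agents failed to join" exception.
expected = {"shaker_tderis_slave_0", "shaker_tderis_master_0"}

counts = count_polling_agents(sample)
missing = expected - set(counts)
print("polling agents:", dict(counts))
print("never polled:", sorted(missing))
```

Running this over the full log file instead of the two sample lines would show whether either agent ever reached the server at any point during the run.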

I thought it was a network issue, so I checked the following:

Ping:

External router -- both instances >> worked
External router -- deploy node (where Shaker is deployed) >> worked
External router -- Internet >> worked
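
The pings above were run from the external router, so they do not show whether the instances themselves can open a TCP connection back to the Shaker server. A minimal Python 3 sketch of such a check, meant to be run from inside one of the instances, could look like the following. The function name can_connect and the host and port in the usage comment are placeholders (assumptions); they would need to be replaced with the server endpoint actually configured for Shaker.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Hypothetical usage from inside an instance; host and port are placeholders:
# print(can_connect("192.0.2.10", 5999))
```

A False result here, while ping from the router works, would point at something between the instances and the deploy node (routing, security groups, or a firewall) rather than at the instances being down.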

What might be the cause, and where should I check for networking issues?

Nodes:

3 controller nodes
2 compute nodes
2 network nodes

These are all in the overcloud.

Thanks in advance.