I used a JSON file to create a cluster on Queens, but the cluster always stays in the "waiting" state. When I check the log, I see:
2018-04-11 19:15:45.572 17182 DEBUG sahara.utils.ssh_remote [req-e4c4ba37-a1f2-447e-843e-430fa1b0ebdb dc765092cbf54cf1a387fcc1daf2460d 89caddb625934909b096895bfe3eff4d - - -] [instance: 02326501-8160-47b7-a750-b08efefc2984, cluster: 5e508219-aadf-409f-8727-6bd9417ebe40] "Executing "ls .ssh/authorized_keys"" took 129.8 seconds to complete _log_command /usr/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:932
2018-04-11 19:15:45.573 17182 DEBUG sahara.service.engine [req-e4c4ba37-a1f2-447e-843e-430fa1b0ebdb dc765092cbf54cf1a387fcc1daf2460d 89caddb625934909b096895bfe3eff4d - - -] [instance: none, cluster: 5e508219-aadf-409f-8727-6bd9417ebe40] Can't login to node, IP: 172.24.4.12, reason error: [Errno 110] Connection timed out
Error ID: 18b133a4-0495-4888-85fc-219351fca325 _is_accessible /usr/lib/python2.7/site-packages/sahara/service/engine.py:130
2018-04-11 19:15:45.581 17182 DEBUG sahara.utils.ssh_remote [req-e4c4ba37-a1f2-447e-843e-430fa1b0ebdb dc765092cbf54cf1a387fcc1daf2460d 89caddb625934909b096895bfe3eff4d - - -] [instance: 02326501-8160-47b7-a750-b08efefc2984, cluster: 5e508219-aadf-409f-8727-6bd9417ebe40] "Executing "ls .ssh/authorized_keys"" took 129.8 seconds to complete _log_command /usr/lib/python2.7/site-packages/sahara/utils/ssh_remote.py:932
2018-04-11 19:15:45.581 17182 DEBUG sahara.service.engine [req-e4c4ba37-a1f2-447e-843e-430fa1b0ebdb dc765092cbf54cf1a387fcc1daf2460d 89caddb625934909b096895bfe3eff4d - - -] [instance: 02326501-8160-47b7-a750-b08efefc2984, cluster: 5e508219-aadf-409f-8727-6bd9417ebe40] Can't login to node, IP: 172.24.4.9, reason error: [Errno 110] Connection timed out
and at the end of the log:
Error ID: 4cd00335-ed6b-4e9c-8195-4dd9af0794b5): ThreadException: An error occurred in thread 'wait-for-ssh-hello-worker-0': 'Operation with name 'Wait for instance accessibility'' timed out after 10800 second(s) and following timeout was violated: wait_until_accessible
Error ID: 07f83f07-62ba-4100-9163-5788b2edaf34
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/sahara/context.py", line 167, in _wrapper
    func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/sahara/utils/cluster_progress_ops.py", line 139, in handler
    add_fail_event(instance, e)
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/lib/python2.7/site-packages/sahara/utils/cluster_progress_ops.py", line 136, in handler
    value = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/sahara/service/engine.py", line 137, in _wait_until_accessible
    self._is_accessible(instance)
  File "/usr/lib/python2.7/site-packages/sahara/utils/poll_utils.py", line 161, in handler
    poll(**poll_description)
  File "/usr/lib/python2.7/site-packages/sahara/utils/poll_utils.py", line 125, in poll
    raise ex.TimeoutException(timeout, operation_name, timeout_name)
TimeoutException: 'Operation with name 'Wait for instance accessibility'' timed out after 10800 second(s) and following timeout was violated: wait_until_accessible
Error ID: 07f83f07-62ba-4100-9163-5788b2edaf34
------------------------------------------------------------------------------------------------
The JSON file:
{
    "name": "hello",
    "plugin_name": "vanilla",
    "hadoop_version": "2.7.1",
    "default_image_id": "ee772c70-1523-411c-bff1-5e5afc28808d",
    "node_groups": [
        {
            "name": "master",
            "node_processes": [
                "namenode",
                "resourcemanager"
            ],
            "flavor_id": "2",
            "floating_ip_pool": "1883e48a-2b46-42ad-b5ba-32fe2a89ef43",
            "use_autoconfig": true,
            "count": 1
        },
        {
            "name": "worker",
            "node_processes": [
                "nodemanager",
                "datanode"
            ],
            "flavor_id": "2",
            "floating_ip_pool": "1883e48a-2b46-42ad-b5ba-32fe2a89ef43",
            "use_autoconfig": true,
            "count": 2
        }
    ],
    "neutron_management_network": "aedc90d6-cd9f-4241-91ad-18a3aacbca93",
    "user_keypair_id": "data-keypair"
}
We don't track bugs on Launchpad anymore; please report this on storyboard.openstack.org.
That said, 99% of the time this is a configuration issue: sahara tries to reach the nodes through the floating IP pool "1883e48a-2b46-42ad-b5ba-32fe2a89ef43", and for some reason that does not work (is 172.24.4.12 associated with the floating IP pool?)
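To narrow down the connection timeout, it may help to test from the host running the sahara service whether the SSH port on each node address is reachable at all. The following is only a minimal sketch, assuming Python 3 and that SSH listens on the default port 22. The helper names can_reach and check_nodes are hypothetical and are not part of sahara.

```python
import socket


def can_reach(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that the port accepts connections. It does not
    authenticate or run any command on the node.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


def check_nodes(ips, port=22, timeout=5.0):
    """Map each IP address to the result of can_reach (hypothetical helper)."""
    return {ip: can_reach(ip, port, timeout) for ip in ips}
```

For example, check_nodes(["172.24.4.12", "172.24.4.9"]) could be called with the two addresses from the log above. If every address maps to False, the problem is more likely in the network path (floating IP association, routing, or security group rules for port 22) than in sahara itself.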