I created a CDH 5.4 cluster on OpenStack Liberty using Sahara, and all the deployment steps finished without any error. Cloudera Manager itself had started successfully. However, after hanging for a while, the cluster status unexpectedly changed to ERROR.
The Sahara engine log is as follows:
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [req-2d62bd63-0f3c-46d3-a416-6d13d2084339 ] [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] Error during operating on cluster (reason: <urlopen error [Errno 111] ECONNREFUSED>)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] Traceback (most recent call last):
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/service/ops.py", line 164, in wrapper
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] f(cluster_id, *args, **kwds)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/service/ops.py", line 268, in _provision_cluster
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] plugin.configure_cluster(cluster)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/plugin.py", line 47, in configure_cluster
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] cluster.hadoop_version).configure_cluster(cluster)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/v5_4_0/versionhandler.py", line 80, in configure_cluster
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] dp.configure_cluster(cluster)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/v5_4_0/deploy.py", line 74, in configure_cluster
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] CU.update_cloudera_password(cluster)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/cloudera_utils.py", line 83, in update_cloudera_password
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] user = api.get_user(self.CM_DEFAULT_USERNAME)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/client/api_client.py", line 128, in get_user
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] return users.get_user(self, username)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/client/users.py", line 38, in get_user
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] '%s/%s' % (USERS_PATH, username), ApiUser)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/client/types.py", line 149, in call
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] ret = method(path, params=params)
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] File "/usr/lib/python2.7/site-packages/sahara/plugins/cdh/client/resource.py", line 126, in get
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] raise e
2016-07-21 08:29:46.205 74685 ERROR sahara.service.ops [instance: none, cluster: 1db030c9-4391-45c3-bad6-2db1f3eceb5d] URLError: <urlopen error [Errno 111] ECONNREFUSED>
Environment:
OpenStack Version: Liberty
Sahara Version: 3.0.0-5
CDH Version: 5.4
Configuration:
use_floating_ips = False
use_neutron = true
use_namespaces = true
use_rootwrap = true
This error reproduces 100% of the time in my environment. I'm not sure whether this is a bug.
cm_api (part of which now lives in the Sahara repository) can't use a proxied HTTP connection to send commands to Cloudera Manager. You should assign a floating IP to the Cloudera Manager node; that will probably help you right now as a workaround.
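To confirm that the problem is plain reachability, you can try opening a TCP connection to Cloudera Manager from the host where the Sahara engine runs. The snippet below is only a minimal sketch, not something taken from Sahara: `can_connect` is a hypothetical helper name, and port 7180 is assumed because it is Cloudera Manager's default port (adjust it if your deployment differs). An `[Errno 111] ECONNREFUSED` like the one in the log would show up here as a `False` result.

```python
import socket

# Assumption: Cloudera Manager listens on its default port, 7180.
CM_PORT = 7180


def can_connect(host, port=CM_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (Errno 111), timeouts, and
        # unreachable-network errors alike.
        return False
```

Calling `can_connect("<manager address>")` with the node's internal address and then again with its floating IP, once one is assigned, should show whether the workaround makes the manager reachable from the Sahara host.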