This bug tracks the backport of the fix for https://bugs.launchpad.net/sahara/+bug/1351624
==============
Original description
==============
I was using stable/icehouse with the Vanilla plugin. When I provisioned a Hadoop cluster on a slow machine, provisioning failed with the following exception:
DEBUG sahara.utils.ssh_remote [testcluster-master-001] Executing "sudo su - -c "hadoop-daemon.sh start namenode" hadoop" from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-master-001] _execute_command took 5.9 seconds to complete from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-master-001] Executing "sudo su - -c "yarn-daemon.sh start resourcemanager" hadoop" from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-master-001] _execute_command took 2.9 seconds to complete from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-worker-001] Executing "sudo su - -c "hadoop-daemon.sh start datanode" hadoop" from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-worker-001] _execute_command took 6.3 seconds to complete from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-worker-002] Executing "sudo su - -c "hadoop-daemon.sh start datanode" hadoop" from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.utils.ssh_remote [testcluster-worker-002] _execute_command took 5.6 seconds to complete from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
INFO sahara.plugins.vanilla.v2_3_0.run_scripts Waiting 2 datanodes to start up
DEBUG sahara.plugins.vanilla.v2_3_0.run_scripts Checking datanode count from (pid=17022) _check_datanodes_count /home/ubuntu/sahara/sahara/sahara/plugins/vanilla/v2_3_0/run_scripts.py:148
DEBUG sahara.utils.ssh_remote [testcluster-master-001] Executing "sudo su -lc "hadoop dfsadmin -report" hadoop | grep 'Datanodes available:' | awk '{print $3}'" from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.openstack.common.periodic_task Running periodic task SaharaPeriodicTasks.update_job_statuses from (pid=17022) run_periodic_tasks /home/ubuntu/sahara/sahara/sahara/openstack/common/periodic_task.py:171
DEBUG sahara.service.periodic Updating job statuses from (pid=17022) update_job_statuses /home/ubuntu/sahara/sahara/sahara/service/periodic.py:62
DEBUG sahara.openstack.common.loopingcall Dynamic looping call sleeping for 44.92 seconds from (pid=17022) _inner /home/ubuntu/sahara/sahara/sahara/openstack/common/loopingcall.py:130
DEBUG sahara.utils.ssh_remote [testcluster-master-001] _execute_command took 19.1 seconds to complete from (pid=17022) _log_command /home/ubuntu/sahara/sahara/sahara/utils/ssh_remote.py:407
DEBUG sahara.plugins.vanilla.v2_3_0.run_scripts Datanode count='' from (pid=17022) _check_datanodes_count /home/ubuntu/sahara/sahara/sahara/plugins/vanilla/v2_3_0/run_scripts.py:153
ERROR sahara.service.api Can't start services for cluster 'testcluster' (reason: invalid literal for int() with base 10: '')
TRACE sahara.service.api Traceback (most recent call last):
TRACE sahara.service.api File "/home/ubuntu/sahara/sahara/sahara/service/api.py", line 220, in _provision_cluster
TRACE sahara.service.api plugin.start_cluster(cluster)
TRACE sahara.service.api File "/home/ubuntu/sahara/sahara/sahara/plugins/vanilla/plugin.py", line 58, in start_cluster
TRACE sahara.service.api cluster.hadoop_version).start_cluster(cluster)
TRACE sahara.service.api File "/home/ubuntu/sahara/sahara/sahara/plugins/vanilla/v2_3_0/versionhandler.py", line 67, in start_cluster
TRACE sahara.service.api run.await_datanodes(cluster)
TRACE sahara.service.api File "/home/ubuntu/sahara/sahara/sahara/plugins/vanilla/v2_3_0/run_scripts.py", line 129, in await_datanodes
TRACE sahara.service.api if _check_datanodes_count(r, datanodes_count):
TRACE sahara.service.api File "/home/ubuntu/sahara/sahara/sahara/plugins/vanilla/v2_3_0/run_scripts.py", line 155, in _check_datanodes_count
TRACE sahara.service.api return exit_code == 0 and int(stdout) == count
TRACE sahara.service.api ValueError: invalid literal for int() with base 10: ''
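The traceback shows `int(stdout)` being called on an empty string: the `dfsadmin -report` pipeline returned exit code 0 but printed nothing, presumably because the namenode had not yet reported any datanodes and `grep` matched no line (a shell pipeline reports the exit status of its last command, here `awk`, unless `pipefail` is set). Below is a minimal sketch of a guarded check, not the actual patch: the helper name `parse_datanode_count` and the standalone signature of `check_datanodes_count` are assumptions made for illustration.

```python
def parse_datanode_count(stdout):
    """Return the datanode count as an int, or None if stdout is not a number.

    Hypothetical helper: an empty or non-numeric string is treated as
    "count not available yet" instead of raising ValueError.
    """
    try:
        return int(stdout.strip())
    except ValueError:
        return None


def check_datanodes_count(exit_code, stdout, count):
    """Return True only if the command succeeded and reported `count` datanodes."""
    return exit_code == 0 and parse_datanode_count(stdout) == count
```

With this guard, an empty result makes the check return `False`, so a caller that polls in a loop would simply try again rather than abort provisioning.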
RPM package sahara has been built for project openstack/sahara
Package version == 2014.1.3, package release == fuel5.1.1.mira2.git.0c123e1.b0661a2
Changeset: https://review.fuel-infra.org/1336
project: openstack/sahara
branch: openstack-ci/fuel-5.1.1/2014.1.1
author: Alexander Ignatov
committer: Alexander Ignatov
subject: Fixed a ValueError on provisioning cluster
status: patchset-created
Files placed on repository:
sahara-2014.1.3-fuel5.1.1.mira2.git.0c123e1.b0661a2.noarch.rpm
NOTE: Changeset is not merged, created temporary package repository.
RPM repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-5.1.1-stable-1336/centos