[IDH] Cluster failed to scale down

Bug #1289055 reported by Andrew Lazarev
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Sahara | Fix Released | High | Andrew Lazarev |

Bug Description

IDH 3.0.2

Trying to scale the cluster down fails.

Stacktrace:

2014-03-06 22:56:23.950 4441 DEBUG savanna.plugins.intel.client.rest [-] Sending GET to URL of https://172.18.168.127:9443/restapi/intelcloud/api/v2/cluster/al-idh/nodes/commands/datanodes/status get /home/ubuntu/savanna/savanna/plugins/intel/client/rest.py:54
2014-03-06 22:56:23.957 4441 ERROR savanna.context [-] Thread 'cluster-scaling-6463d35a-833a-478d-9120-65eb7ac241d1' fails with exception: 'Datanode service is is not installed on node 'al-idh-worker-004.novalocal''
2014-03-06 22:56:23.957 4441 TRACE savanna.context Traceback (most recent call last):
2014-03-06 22:56:23.957 4441 TRACE savanna.context File "/home/ubuntu/savanna/savanna/context.py", line 124, in _wrapper
2014-03-06 22:56:23.957 4441 TRACE savanna.context func(*args, **kwargs)
2014-03-06 22:56:23.957 4441 TRACE savanna.context File "/home/ubuntu/savanna/savanna/service/api.py", line 160, in _provision_scaled_cluster
2014-03-06 22:56:23.957 4441 TRACE savanna.context plugin.decommission_nodes(cluster, instances_to_delete)
2014-03-06 22:56:23.957 4441 TRACE savanna.context File "/home/ubuntu/savanna/savanna/plugins/intel/plugin.py", line 68, in decommission_nodes
2014-03-06 22:56:23.957 4441 TRACE savanna.context cluster.hadoop_version).decommission_nodes(cluster, instances)
2014-03-06 22:56:23.957 4441 TRACE savanna.context File "/home/ubuntu/savanna/savanna/plugins/intel/v3_0_2/versionhandler.py", line 87, in decommission_nodes
2014-03-06 22:56:23.957 4441 TRACE savanna.context ins.decommission_nodes(cluster, instances)
2014-03-06 22:56:23.957 4441 TRACE savanna.context File "/home/ubuntu/savanna/savanna/plugins/intel/v3_0_2/installer.py", line 412, in decommission_nodes
2014-03-06 22:56:23.957 4441 TRACE savanna.context host) == 'Decomissioned':
2014-03-06 22:56:23.957 4441 TRACE savanna.context File "/home/ubuntu/savanna/savanna/plugins/intel/client/services.py", line 112, in get_datanode_status
2014-03-06 22:56:23.957 4441 TRACE savanna.context "Datanode service is is not installed on node '%s'" % datanode)
2014-03-06 22:56:23.957 4441 TRACE savanna.context IntelPluginException: Datanode service is is not installed on node 'al-idh-worker-004.novalocal'

Manual request to manager:
[cloud-user@al-idh-manager-001 ~]$ curl -k -u admin:admin -g https://172.18.168.127:9443/restapi/intelcloud/api/v2/cluster/al-idh/nodes/commands/datanodes/status
{"items":[{"status":"Stopped","hostname":"al-idh-manager-001"},{"status":"Stopped","hostname":"al-idh-master-001"},{"status":"Running","hostname":"al-idh-worker-001"},{"status":"Running","hostname":"al-idh-worker-002"},{"status":"Running","hostname":"al-idh-worker-003"},{"status":"Decomissioned","hostname":"al-idh-worker-004"}]}

It looks like the manager responds with "hostname":"al-idh-worker-004", but we are looking for "al-idh-worker-004.novalocal", so the lookup finds no match.

Not sure if this is a change in behavior since IDH 2.5.1.
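
The hostname mismatch described above can be illustrated with a minimal sketch. This is not the code merged for this bug. The helper names `short_hostname` and `find_datanode_status` are hypothetical, and the sketch assumes short hostnames are unique within the cluster. The response body is copied from the manual request above.

```python
import json

# Response body copied from the manual curl request in this report.
RESPONSE = (
    '{"items":[{"status":"Stopped","hostname":"al-idh-manager-001"},'
    '{"status":"Stopped","hostname":"al-idh-master-001"},'
    '{"status":"Running","hostname":"al-idh-worker-001"},'
    '{"status":"Running","hostname":"al-idh-worker-002"},'
    '{"status":"Running","hostname":"al-idh-worker-003"},'
    '{"status":"Decomissioned","hostname":"al-idh-worker-004"}]}'
)


def short_hostname(name):
    """Return the host part of a name, dropping any domain suffix."""
    return name.split(".", 1)[0]


def find_datanode_status(response_body, datanode):
    """Hypothetical helper: look up a datanode status by short hostname.

    Both sides are reduced to the short hostname, so the lookup works
    whether or not the manager includes the domain in its response.
    """
    wanted = short_hostname(datanode)
    for item in json.loads(response_body)["items"]:
        if short_hostname(item["hostname"]) == wanted:
            return item["status"]
    raise LookupError("No datanode status found for node '%s'" % datanode)


# An exact-match comparison on the FQDN finds nothing in this response,
# while the short-hostname comparison finds the decommissioned worker.
print(find_datanode_status(RESPONSE, "al-idh-worker-004.novalocal"))
```

Comparing short hostnames is only one possible approach; stripping the domain before sending the query, or matching on either form, would be alternatives with different trade-offs.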

Changed in savanna:
assignee: nobody → Andrew Lazarev (alazarev)
importance: Undecided → High
milestone: none → icehouse-rc1
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to savanna (master)

Fix proposed to branch: master
Review: https://review.openstack.org/78876

Changed in savanna:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to sahara (master)

Reviewed: https://review.openstack.org/78876
Committed: https://git.openstack.org/cgit/openstack/sahara/commit/?id=727adec2a593a27d907baf448776cfc7653888c7
Submitter: Jenkins
Branch: master

commit 727adec2a593a27d907baf448776cfc7653888c7
Author: Andrew Lazarev <email address hidden>
Date: Thu Mar 6 22:46:30 2014 -0800

    [IDH] Fixed cluster scale down

    There were two problems:
    1. Datanode status started to return nodes without the domain
    2. Wrong services were used to get processes status

    Change-Id: I34da2e36328f22fff9636aac9729887dd8fd94ce
    Closes-Bug: #1289055

Changed in sahara:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in sahara:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in sahara:
milestone: icehouse-rc1 → 2014.1