Fuel for OpenStack

reboot_plugin: Cluster failed on deploy because nodes are offline

Bug #1624439 reported by ElenaRossokhina on 2016-09-16

Affects		Status	Importance	Assigned to	Milestone
	Fuel for OpenStack	Invalid	High	Alexey Stupnikov	Fuel for OpenStack 9.2

Bug Description

Detailed bug description:
Found on CI https://product-ci.infra.mirantis.net/job/9.x.system_test.ubuntu.fuel_plugin_reboot/62/testReport/(root)/deploy_cluster_with_reboot_plugin/
Steps to reproduce:
    1. Revert snapshot with 5 nodes
    2. Download and install fuel-plugin-builder
    3. Create plugin with reboot task
    4. Build plugin and copy it in var directory
    5. Install plugin to fuel
    6. Create cluster and enable plugin
    7. Provision nodes
    8. Collect timestamps from nodes
    9. Deploy cluster (fail here)
    10. Check if timestamps are changed

Alternatively, run the system test deploy_cluster_with_reboot_plugin.
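
Steps 8 and 10 amount to comparing per-node timestamps taken before and after the deploy. A minimal sketch of that comparison follows; it is not the fuel-qa implementation. The function name `rebooted_nodes`, the dictionary layout, and the timestamp values are all hypothetical.

```python
def rebooted_nodes(before, after):
    """Return sorted names of nodes whose timestamp differs between snapshots.

    `before` and `after` map a node name to a timestamp collected from that
    node (assumed here to be an integer that changes when the node reboots).
    A node missing from `after` is also reported as changed.
    """
    return sorted(name for name, ts in before.items() if after.get(name) != ts)


# Hypothetical values; real timestamps would be read from each node.
before = {"slave-01": 1474000000, "slave-02": 1474000005, "slave-03": 1474000010}
after = {"slave-01": 1474003600, "slave-02": 1474003610, "slave-03": 1474000010}
changed = rebooted_nodes(before, after)
```

With these sample values only slave-01 and slave-02 are reported, which matches a scenario where the reboot task does not run on every node.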

Expected results:
Deploy is successful
Actual result:
Deploy failed with the following message: 'Nodes "slave-02_compute_ceph-osd (id=1, mac=64:2f:76:17:82:c4),slave-01_controller_ceph-osd (id=2, mac=64:c1:99:2c:34:33),slave-03_compute (id=3, mac=64:46:17:fb:91:a7)" are offline. Remove them from environment and try again.'
The time limit for the reboot may have expired.

Revision history for this message

ElenaRossokhina (esolomina) wrote on 2016-09-16:

It's strange that slave-03_compute (id=3, mac=64:46:17:fb:91:a7) was offline: the test assumes that the compute node is not rebooted by this task.

Changed in fuel:
assignee:	nobody → Fuel QA Team (fuel-qa)
milestone:	none → 9.1

ElenaRossokhina (esolomina) wrote on 2016-09-28:

New failed case https://product-ci.infra.mirantis.net/job/9.x.system_test.ubuntu.fuel_plugin_reboot/75/testReport/(root)/deploy_cluster_with_reboot_plugin/

Alexander Kurenyshev (akurenyshev) on 2016-09-28

Changed in fuel:
assignee:	Fuel QA Team (fuel-qa) → Fuel Sustaining (fuel-sustaining-team)
importance:	Undecided → High

Dmitry Pyzhov (dpyzhov) on 2016-09-29

Changed in fuel:
status:	New → Confirmed
milestone:	9.1 → 9.2

Dmitry Belyaninov (dbelyaninov) wrote on 2016-10-04:

https://product-ci.infra.mirantis.net/job/9.x.system_test.ubuntu.fuel_plugin_reboot/82/testReport/(root)/deploy_cluster_with_reboot_plugin/deploy_cluster_with_reboot_plugin/

Alexey Stupnikov (astupnikov) wrote on 2016-11-25:

Bug #1617329 had the same symptoms: slave nodes simply did not finish rebooting in time. The timeout value was increased to fix that issue.
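
To illustrate why the timeout value matters, here is a minimal sketch of a wait-with-deadline loop. It is not Fuel's or fuel-qa's actual code. The helper name `wait_until_online`, the `is_online` callback, and the default interval are assumptions made for the example.

```python
import time


def wait_until_online(is_online, timeout, interval=5.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll `is_online()` until it returns True or `timeout` seconds elapse.

    Returns True if the node came back in time, False otherwise.
    `clock` and `sleep` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_online():
            return True
        if clock() >= deadline:
            # The node is still rebooting when the time limit expires,
            # so the caller would treat it as offline.
            return False
        sleep(interval)
```

If a reboot takes longer than `timeout`, the function returns False even though the node would have come back shortly afterwards, which is the kind of failure a larger timeout avoids.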

Alexey Stupnikov (astupnikov) wrote on 2016-11-25:

I have double-checked everything, and it looks like we now have a different situation (the reported problem was solved): tests are no longer failing because of offline nodes. Right now the failures are caused by a 500 error:

2016-11-24 22:29:09.533 ERROR [7f3540682880] (logger) Response code '500 Internal Server Error' for POST /api/clusters from 10.109.0.1:46530

This bug is invalid; a new one should be opened to report the new issue.