Nodes provision failed after environment reset

Bug #1646497 reported by Mikhail Chernik
This bug affects 3 people
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Vladimir Kozhukalov

Bug Description

MOS 9.2, snapshot #577

Detailed description:

Provisioning of environment nodes fails if the cluster is reset after a deployment (whether it failed or succeeded) and then provisioned again. After provisioning and reboot, the nodes boot into bootstrap again instead of booting from disk.

Error message: "Provision has failed. #<RuntimeError: Could not find any hosts in discovery data provided>"

We ran into this issue in a hardware lab; here is the diagnostic snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2016-12-01_13-20-15.tar

Steps to reproduce:
1. Configure a new cluster
2. Deploy it (it does not matter whether deployment succeeds or not)
3. Reset cluster
4. Provision or deploy the cluster again

Expected result:
Provisioning succeeds; the deployment status is the same as in step 2.

Actual result:
Provisioning fails for every node with the message "Provision has failed. #<RuntimeError: Could not find any hosts in discovery data provided>".

Workaround:
Delete the cluster and create a new one from scratch

Changed in fuel:
importance: Undecided → High
status: New → Confirmed
Vladimir Kozhukalov (kozhukalov) wrote :

The issue appears when a user resets an environment and then immediately starts redeployment, without waiting for the reset to finish.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-astute (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/410658

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-astute (stable/mitaka)

Reviewed: https://review.openstack.org/410658
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=17500148daf4068ee8920438e8bc4be619c4f4b5
Submitter: Jenkins
Branch: stable/mitaka

commit 17500148daf4068ee8920438e8bc4be619c4f4b5
Author: Vladimir Kozhukalov <email address hidden>
Date: Wed Dec 14 12:56:28 2016 +0300

    Increase reboot timeout to 600

    Sometimes 240 seconds is not enough for a hardware node
    to reboot. If so, a user will encounter the error
    "Time detection (240 sec) for node reboot has expired"
    and reset+reprovision will fail.

    Change-Id: Icd734acfd82cbf6d38bb6dda86a3f3be012dd054
    Closes-Bug: #1646497
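
The timeout change described in this commit message can be pictured as a polling loop with a deadline. What follows is a minimal Python sketch under stated assumptions, not the fuel-astute code (which is written in Ruby): `wait_for_reboot`, `is_node_online`, and the `clock`/`sleep` parameters are hypothetical names chosen for illustration. It shows why a node that needs about 300 seconds to come back is treated as failed under a 240-second limit but passes under a 600-second one.

```python
import time


def wait_for_reboot(is_node_online, timeout=600, interval=5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll is_node_online() until it returns True or `timeout` seconds pass.

    Hypothetical helper for illustration only. `clock` and `sleep` are
    injectable so the loop can be exercised without real waiting.
    Returns True if the node came back in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if is_node_online():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated clock: each "sleep" just advances a counter.
    now = [0.0]

    def fake_clock():
        return now[0]

    def fake_sleep(seconds):
        now[0] += seconds

    def node_back_after_300s():
        # Assumption for the demo: the node needs 300 seconds to reboot.
        return now[0] >= 300

    for limit in (240, 600):
        now[0] = 0.0
        ok = wait_for_reboot(node_back_after_300s, timeout=limit,
                             clock=fake_clock, sleep=fake_sleep)
        print(f"timeout={limit}: node detected = {ok}")
```

A larger limit only adds waiting time in the failure case; a node that returns quickly is still detected on the first successful poll.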

tags: added: in-stable-mitaka
Changed in fuel:
status: Confirmed → Fix Committed
tags: added: on-verification
Ekaterina Shutova (eshutova) wrote :

Verified on 9.2 snapshot #641.

tags: removed: on-verification
Changed in fuel:
status: Fix Committed → Fix Released