Backup & Restore: AIO-DX Controller hangs at "recover-ceph-data"

Bug #1857146 reported by Kristine Bujold on 2019-12-20
This bug affects 1 person
Affects: StarlingX
Importance: Medium
Assigned to: Ovidiu Poncea

Bug Description

Brief Description
-----------------

In an AIO-DX configuration, the restore hangs at "[recover-ceph-data : Update host config data to get ceph-mon size]". This was experienced in SM 5-6 and reproduced 3 times. Logs are located at /folk/cgts_logs/logs/LP-1857146

TASK [restore-platform/restore-more-data : Restart services] ***********************************************************************************
changed: [localhost] => (item=openstack-keystone)
changed: [localhost] => (item=fminit)
changed: [localhost] => (item=fm-api)
changed: [localhost] => (item=sysinv-api)
changed: [localhost] => (item=sysinv-conductor)
changed: [localhost] => (item=sysinv-agent)
changed: [localhost] => (item=openstack-barbican-api)

TASK [restore-platform/restore-more-data : Bring up Maintenance Agent] *************************************************************************
changed: [localhost]

TASK [restore-platform/restore-more-data : Wait for 90 secs before check if services come up] **************************************************
ok: [localhost]

TASK [restore-platform/restore-more-data : Make sure admin-keystone is ready] ******************************************************************
changed: [localhost]

TASK [restore-platform/restore-more-data : Check controller-0 is in online state] **************************************************************
changed: [localhost]

TASK [restore-platform/restore-more-data : Inform user that restore_platform is not successful] ************************************************

TASK [restore-platform/restore-more-data : Check if setup has storage nodes] *******************************************************************
changed: [localhost]

TASK [restore-platform/restore-more-data : Retrieve system mode] *******************************************************************************
changed: [localhost]

TASK [restore-platform/restore-more-data : Fail if system mode is not defined] *****************************************************************

TASK [restore-platform/restore-more-data : Set system mode fact] *******************************************************************************
ok: [localhost]

TASK [restore-platform/restore-more-data : Create flag file in /etc/platform to skip wiping OSDs] **********************************************
changed: [localhost]

TASK [include_role : recover-ceph-data] ********************************************************************************************************

TASK [recover-ceph-data : Restore ceph.conf file] **********************************************************************************************
changed: [localhost]

TASK [recover-ceph-data : Set initial ceph-mon name] *******************************************************************************************
ok: [localhost]

TASK [recover-ceph-data : Update host config data to get ceph-mon size] ************************************************************************
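
The output above stops at a task header with no result line after it, which is what marks the hang point. As a rough aid for scanning longer captures, here is a minimal Python sketch that reports the final task when nothing follows it. The function name `last_unfinished_task` and the set of result prefixes are assumptions made for illustration; they are not part of the restore playbook. Note that a skipped task can also print a bare header, so the result is only a hint when the capture does not end at the hang.

```python
import re

# Matches an Ansible task header such as:
#   TASK [recover-ceph-data : Restore ceph.conf file] *********
TASK_RE = re.compile(r"^TASK \[(?P<name>.+?)\] \*+\s*$")

# Result prefixes assumed for illustration; extend as needed.
RESULT_RE = re.compile(r"^(ok|changed|skipping|failed|fatal|included):")


def last_unfinished_task(log_text):
    """Return the name of the final task if no result line follows it, else None."""
    last_task = None
    has_result = False
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = TASK_RE.match(line)
        if match:
            # A new task header resets the state for the task being tracked.
            last_task = match.group("name")
            has_result = False
        elif RESULT_RE.match(line):
            has_result = True
    return None if has_result else last_task


if __name__ == "__main__":
    sample = (
        "TASK [recover-ceph-data : Set initial ceph-mon name] ****\n"
        "ok: [localhost]\n"
        "\n"
        "TASK [recover-ceph-data : Update host config data to get ceph-mon size] ****\n"
    )
    print(last_unfinished_task(sample))
```
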

Severity
--------
Critical: Unable to restore active controller

Steps to Reproduce
------------------
1. Bring up the AIO-DX system
2. Back up the system using Ansible locally
3. Re-install the controller with the same load
4. Restore the active controller

Expected Behavior
-----------------
The active controller should be successfully restored

Actual Behavior
---------------
Restore hangs

Reproducibility
---------------
Reproducible

System Configuration
--------------------
AIO-DX

Branch/Pull Time/Commit
-----------------------
BUILD_ID="2019-12-18_00-10-00"

Test Activity
-------------
Developer Testing

description: updated
Ghada Khalil (gkhalil) wrote:

Marking as stx.4.0 / medium priority - needs further investigation

tags: added: stx.4.0 stx.update
tags: added: stx.config
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Ovidiu Poncea (ovidiu.poncea)