Ceph redeployment should not use an old cluster's fetch directory

Bug #1824527 reported by John Fulton
This bug affects 1 person
Affects: tripleo
Status: Invalid
Importance: Medium
Assigned to: John Fulton
Milestone: (none)

Bug Description

If LocalCephAnsibleFetchDirectoryBackup [1] contains a backup of the ceph-ansible fetch directory for a Ceph cluster from a previous deployment (one with a different FSID), then that fetch directory should not be used on a new deployment.

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/ceph-ansible/ceph-base.yaml#L177

tags: added: rocky-backport-potential
removed: stein-backport-potential
Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
importance: High → Medium
John Fulton (jfulton-org) wrote :

Make sure you delete your overcloud correctly when testing the proposed fix [0] or you'll get the same FSID for each deployment.

If you delete your overcloud only with Heat, using the command below, the deployment plan is not deleted and the old FSID remains stored in it. Testing the proposed change will then fail, because you'll get the same FSID each time you deploy what should be a new overcloud.

 openstack stack delete $overcloud_name

This is because the FSID is generated by the tripleo client [1] and then stored in the deployment plan. The deployment plan is deleted when you correctly delete your overcloud using a command like this:

 openstack overcloud delete $overcloud_name

[0] https://review.openstack.org/#/c/652062
[1] https://github.com/openstack/tripleo-common/blob/master/tripleo_common/utils/passwords.py#L56
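To illustrate why the delete command matters, here is a minimal sketch of the behaviour described above. It is not the tripleo-common code: the plan is modelled as a plain dict, `get_or_create_fsid` is a hypothetical helper, and generating the FSID with `uuid.uuid4()` is an assumption made only for the example.

```python
import uuid

# Key under which the FSID is kept in the (simulated) deployment plan.
# Used here for illustration only.
FSID_KEY = "CephClusterFSID"


def get_or_create_fsid(plan: dict) -> str:
    """Return the FSID stored in the plan, generating one only if absent.

    Hypothetical helper: it mimics "generate once, then store in the plan".
    """
    if FSID_KEY not in plan:
        # Assumption: a random UUID stands in for the real generator.
        plan[FSID_KEY] = str(uuid.uuid4())
    return plan[FSID_KEY]


# First deployment: the plan is empty, so a new FSID is generated.
plan = {}
first = get_or_create_fsid(plan)

# Like "openstack stack delete": the stack is gone but the plan survives,
# so the next deployment picks up the stored FSID again.
second = get_or_create_fsid(plan)

# Like "openstack overcloud delete": the plan is removed as well,
# so the next deployment generates a fresh FSID.
plan = {}
third = get_or_create_fsid(plan)

print(first == second)  # same FSID reused while the plan exists
print(first == third)   # different FSID once the plan is gone
```

As long as the plan object survives, every "new" deployment sees the same FSID, which is the symptom described in this comment.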

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by John Fulton (<email address hidden>) on branch: master
Review: https://review.opendev.org/652062
Reason: The fetch directory contains subdirectories named after the FSID. Only the directory named after the current FSID will be used during the deployment; the others will be ignored. Though this change could save a little time by skipping tasks to unpack the fetch directory, it could also add extra complication and shouldn't be necessary.

John Fulton (jfulton-org) wrote :

Ceph redeployment will not use an old cluster's fetch directory because the fetch directory is only reused if the FSID matches. This is already the case because the items in the fetch directory are only accessed using the path /path/to/fetch/$FSID/$item. Because the FSID is unique per deployment of Ceph, /path/to/fetch/$FSID/$item will always refer to the fetch directory appropriate for the deployment.
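The FSID-keyed lookup can be sketched as follows. This is only an illustration of the path layout /path/to/fetch/$FSID/$item, not ceph-ansible's code: `fetch_item_path`, both FSID values, and the item name `example.keyring` are made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def fetch_item_path(fetch_root: Path, fsid: str, item: str) -> Optional[Path]:
    """Return fetch_root/fsid/item if it exists, otherwise None.

    Hypothetical helper: items are only ever looked up under the
    subdirectory named after the current cluster's FSID.
    """
    candidate = fetch_root / fsid / item
    return candidate if candidate.exists() else None


def demo() -> dict:
    # Both FSIDs are invented values for the example.
    old_fsid = "11111111-1111-1111-1111-111111111111"
    new_fsid = "22222222-2222-2222-2222-222222222222"
    item = "example.keyring"  # hypothetical item name

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # The restored backup holds data from the previous cluster only.
        (root / old_fsid).mkdir()
        (root / old_fsid / item).write_text("old cluster data\n")

        return {
            "old": fetch_item_path(root, old_fsid, item) is not None,
            "new": fetch_item_path(root, new_fsid, item) is not None,
        }


result = demo()
print(result)  # {'old': True, 'new': False}
```

A deployment with the new FSID finds nothing under its own subdirectory, so the old cluster's files are left in place but never read.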

If the overcloud is not correctly deleted and the deployment plan remains because someone only deleted the stack using Heat, "openstack stack delete overcloud", then that's a mistake. The correct way to delete the overcloud is "openstack overcloud delete overcloud". This also removes the deployment plan so that the FSID in the deployment plan is not reused. Ideally we wouldn't depend on the fetch directory, but it shouldn't harm anything provided that TripleO is used correctly.

Changed in tripleo:
status: In Progress → Invalid