tripleo

ceph redeployment should not use an old cluster's fetch directory

Bug #1824527 reported by John Fulton on 2019-04-12

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
tripleo	Invalid	Medium	John Fulton	tripleo train-1

Bug Description

If LocalCephAnsibleFetchDirectoryBackup [1] contains a backup of the ceph-ansible fetch directory for a Ceph cluster from a previous deployment (with a different FSID), then a new deployment should not use that fetch directory.

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/ceph-ansible/ceph-base.yaml#L177

John Fulton (jfulton-org) on 2019-04-12

tags:

added: rocky-backport-potential
removed: stein-backport-potential

OpenStack Infra (hudson-openstack) on 2019-04-12

Changed in tripleo:
status:	Triaged → In Progress

John Fulton (jfulton-org) on 2019-04-12

Changed in tripleo:
importance:	High → Medium

John Fulton (jfulton-org) wrote on 2019-04-12:

Make sure you delete your overcloud correctly when testing the proposed fix [0] or you'll get the same FSID for each deployment.

If you delete your overcloud with Heat alone, as in the command below, the deployment plan is not deleted and the old FSID remains stored in it. The proposed change will then appear not to work, because you'll get the same FSID each time you deploy what should be a new overcloud.

openstack stack delete $overcloud_name

This is because the FSID is generated by the tripleo client [1] and then stored in the deployment plan. The deployment plan is deleted when you correctly delete your overcloud using a command like this:

openstack overcloud delete $overcloud_name

[0] https://review.openstack.org/#/c/652062
[1] https://github.com/openstack/tripleo-common/blob/master/tripleo_common/utils/passwords.py#L56
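As a rough illustration of the behaviour described in this comment, here is a minimal Python sketch. It is not TripleO code: `PLAN_STORE`, `deploy`, `stack_delete`, and `overcloud_delete` are hypothetical names, and `uuid.uuid4()` is only a stand-in for however the client actually generates the FSID. It models just one point: the FSID lives in the deployment plan, so it is reused until the plan itself is removed.

```python
import uuid

# Hypothetical stand-in for stored deployment plans, keyed by overcloud name.
PLAN_STORE = {}


def deploy(name):
    """Return the FSID for this overcloud, generating one only if the plan lacks it."""
    plan = PLAN_STORE.setdefault(name, {})
    if "fsid" not in plan:
        # Assumption: any unique ID works for the illustration.
        plan["fsid"] = str(uuid.uuid4())
    return plan["fsid"]


def stack_delete(name):
    """Models 'openstack stack delete': the stack goes away, the plan stays."""
    # Intentionally leaves PLAN_STORE untouched.
    return None


def overcloud_delete(name):
    """Models 'openstack overcloud delete': the plan is removed as well."""
    PLAN_STORE.pop(name, None)
```

In this model, calling `deploy` again after `stack_delete` returns the old FSID, while calling it after `overcloud_delete` produces a new one.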

OpenStack Infra (hudson-openstack) wrote on 2019-05-09: Change abandoned on tripleo-heat-templates (master)

Change abandoned by John Fulton (<email address hidden>) on branch: master
Review: https://review.opendev.org/652062
Reason: The fetch directory contains subdirectories named after the FSID. Only the subdirectory named after the current FSID will be used during the deployment; the others will be ignored. Though this change could save a little time by skipping tasks to unpack the fetch directory, it could also add extra complication and shouldn't be necessary.

John Fulton (jfulton-org) wrote on 2019-05-10:

Ceph redeployment will not use an old cluster's fetch directory because the fetch directory is only reused if the FSID matches. This is already the case because the items in the fetch directory are only accessed using the path /path/to/fetch/$FSID/$item. Because the FSID is unique per deployment of Ceph, /path/to/fetch/$FSID/$item will always refer to the fetch directory appropriate for the deployment.
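The FSID-scoped lookup can be sketched in Python as follows. This is an illustration only, not the ceph-ansible or TripleO implementation: `fetch_item_path`, `reusable_items`, `demo`, and the item name `example-item` are hypothetical, and the layout is assumed to be exactly /path/to/fetch/$FSID/$item as described above.

```python
import tempfile
import uuid
from pathlib import Path


def fetch_item_path(fetch_root, fsid, item):
    """Build the FSID-scoped path for one item in the fetch directory."""
    return Path(fetch_root) / fsid / item


def reusable_items(fetch_root, fsid):
    """List items stored under this FSID; other clusters' subdirectories are never read."""
    cluster_dir = Path(fetch_root) / fsid
    if not cluster_dir.is_dir():
        return []
    return sorted(p.name for p in cluster_dir.iterdir())


def demo():
    """Store an item for an old cluster, then look it up with the old and a new FSID."""
    old_fsid = str(uuid.uuid4())
    new_fsid = str(uuid.uuid4())
    with tempfile.TemporaryDirectory() as root:
        old_item = fetch_item_path(root, old_fsid, "example-item")
        old_item.parent.mkdir(parents=True)
        old_item.write_text("data from the old cluster")
        return reusable_items(root, old_fsid), reusable_items(root, new_fsid)
```

A deployment with a new FSID sees nothing to reuse, even though the old cluster's subdirectory is still present in the same fetch directory.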

If the overcloud is not correctly deleted and the deployment plan remains because someone only deleted it using Heat ("openstack stack delete overcloud"), then that's a mistake. The correct way to delete the overcloud is "openstack overcloud delete overcloud". This also removes the deployment plan, so the FSID stored in it is not reused. Ideally we won't depend on the fetch directory, but it shouldn't harm anything provided that TripleO is used correctly.

Changed in tripleo:
status:	In Progress → Invalid
