tripleo

When doing an upgrade, recoverable checks leave the cluster in an unknown state

Bug #1614907 reported by Sofer Athlan-Guyot on 2016-08-19

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
tripleo	Fix Released	High	Sofer Athlan-Guyot	newton-rc1

Bug Description

Hi,

When doing the upgrade, numerous static checks are run during the major pacemaker upgrade step. They occur at various points in the script, for example the check on the rpm-python package, the check on the disk space left on the bootstrap node, and so on.

If any of these checks fails, the cluster is left in a more or less unknown state. One has to go to the controller, check what happened, put the cluster back into shape, fix the detected error, and only then possibly be able to upgrade again.

This is a less than optimal situation.

A better way would be for all those checks to happen at the beginning of the upgrade. Then the operator would only have to fix the detected issue and re-run the upgrade.
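To illustrate the idea, here is a minimal sketch of running every recoverable check up front and refusing to start the upgrade if any of them fails. It is not the actual tripleo-heat-templates code: the names `check_free_disk`, `run_preflight` and `upgrade`, and the 1 GiB threshold, are hypothetical and chosen only for this example.

```python
import shutil
from typing import Callable, List, Optional, Tuple

# A check returns None on success or an error message on failure.
Check = Callable[[], Optional[str]]


def check_free_disk(path: str = "/", min_free_bytes: int = 1 << 30) -> Optional[str]:
    """Hypothetical check: require at least min_free_bytes free under path."""
    free = shutil.disk_usage(path).free
    if free < min_free_bytes:
        return f"only {free} bytes free on {path}, need {min_free_bytes}"
    return None


def run_preflight(checks: List[Tuple[str, Check]]) -> List[str]:
    """Run every check and collect all failures instead of stopping at the first."""
    failures = []
    for name, check in checks:
        error = check()
        if error is not None:
            failures.append(f"{name}: {error}")
    return failures


def upgrade(checks: List[Tuple[str, Check]], do_upgrade: Callable[[], None]) -> bool:
    """Start the upgrade only if every recoverable check passes."""
    failures = run_preflight(checks)
    if failures:
        for line in failures:
            print("preflight failed:", line)
        # Nothing has been changed yet, so fixing the issue and re-running is safe.
        return False
    do_upgrade()
    return True
```

With this shape, the operator sees all detected issues in one pass, and the cluster is untouched whenever `upgrade` returns `False`.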

Sofer Athlan-Guyot (sofer-athlan-guyot) wrote on 2016-08-19:

After re-reading the code, I see that they all happen before any serious change. But it would be nice to refactor them to make that obvious.

OpenStack Infra (hudson-openstack) wrote on 2016-08-19: Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/357750

Changed in tripleo:
assignee:	nobody → Sofer Athlan-Guyot (sofer-athlan-guyot)
status:	New → In Progress

Emilien Macchi (emilienm) on 2016-08-23

Changed in tripleo:
milestone:	none → newton-3
importance:	Undecided → High

Steven Hardy (shardy) on 2016-08-31

Changed in tripleo:
milestone:	newton-3 → newton-rc1

Sofer Athlan-Guyot (sofer-athlan-guyot) on 2016-08-31

tags:	removed: update-bugs

Emilien Macchi (emilienm) on 2016-09-16

Changed in tripleo:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2016-09-16: Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/357750
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=575e42b0287e37d3ef261c040fb3d331d3419801
Submitter: Jenkins
Branch: master

commit 575e42b0287e37d3ef261c040fb3d331d3419801
Author: Sofer Athlan-Guyot <email address hidden>
Date: Thu Aug 25 11:58:56 2016 +0200

Refactor upgrade checks.

We make it clear that recoverable checks happen before starting the
upgrade to be able to run the upgrade after the offending error has been
manually corrected.

Add new check for the pcsd cluster status.

Add new check for galera password file: BZ 1357112

Closes-Bug: 1614907
Change-Id: If736c79121e1ffe0eaeb814bdb73ccbc0b64edcd
