M/N upgrades - A few major-upgrade issues
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | Critical | Michele Baldessari |
Bug Description
We have a bunch of smaller problems in the major-upgrade logic currently:
1. We now explicitly disable/stop and then remove the resources that are moving to systemd. We do this because we want to make sure they are all stopped before doing a yum upgrade, which would otherwise take ages because rabbitmq and galera are down. It is best to do this via pcs during the HA Full -> HA NG migration, because that makes it simpler to ensure all the services are stopped at that stage. For extra safety we can still do a check by hand. Doing it via pacemaker guarantees that all the migrated services are already down when we stop the cluster (which happens to be a synchronization point between all controller nodes). That way we can be certain that they are all down on all nodes before starting the yum upgrade process.
2. We actually need to start the systemd services in major_upgrade_
3. We need to use the proper bash variable name
4. Use is_bootstrap_node everywhere to make the code more consistent
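
The ordering described in points 1 and 4 could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names, the 600 second timeout, and the `is_bootstrap_node` stand-in (which compares a hypothetical `BOOTSTRAP_NODEID` variable with the node name) are made up for this example and are not taken from the actual tripleo scripts. The only external commands used are `pcs resource disable --wait`, `pcs resource delete`, and `systemctl is-active`.

```shell
#!/bin/bash
# Sketch only: function names, the 600s timeout, and the is_bootstrap_node
# stand-in below are assumptions for illustration, not the real tripleo code.

# Hypothetical stand-in: treats the node named in BOOTSTRAP_NODEID as the
# bootstrap node. The real helper may be implemented differently.
is_bootstrap_node() {
    [ "${BOOTSTRAP_NODEID:-}" = "$(uname -n)" ]
}

# Disable a pacemaker resource cluster-wide, wait for it to stop, then
# delete it so the service can be handed over to systemd.
stop_and_delete_resource() {
    local resource="$1"
    local timeout="${2:-600}"
    pcs resource disable "$resource" --wait="$timeout" || return 1
    pcs resource delete "$resource" || return 1
}

# Phase 1 (before the cluster is stopped): only the bootstrap node touches
# the cluster configuration; pacemaker stops the services on every node.
remove_migrated_resources() {
    local service
    is_bootstrap_node || return 0
    for service in "$@"; do
        stop_and_delete_resource "$service" || return 1
    done
}

# Phase 2 (after the cluster is stopped, before yum upgrade): the extra
# hand check, run on every node.
check_services_stopped() {
    local service
    for service in "$@"; do
        if systemctl is-active --quiet "$service"; then
            echo "ERROR - $service is still active" >&2
            return 1
        fi
    done
}
```

Every controller would call `remove_migrated_resources httpd` before the cluster stop and `check_services_stopped httpd` after it. Because only the bootstrap node changes the cluster configuration and pacemaker itself stops the services everywhere, no node needs to run `systemctl stop` while the resource is still pacemaker-managed.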
Regarding 1: another reason it is best to stop them via pcs is that if they are stopped via systemd on non-bootstrap nodes before the corresponding pcs resource is deleted, check_resource_systemd will barf with something like:
Fri Sep 23 16:54:45 UTC 2016 1a9879f7-3e6c-457f-8e1f-3a9a16d52193 tripleo-upgrade overcloud-controller-1 Going to systemctl stop httpd 3e6c-457f-8e1f-3a9a16d52193 tripleo-upgrade overcloud-controller-1 Going to check_resource_systemd for httpd to be stopped\nERROR - httpd not found to be systemd managed.
Fri Sep 23 16:54:46 UTC 2016 1a9879f7-