Ensure non-controller nodes are usable after upgrade and before converge.

Bug #1708115 reported by Sofer Athlan-Guyot
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Marios Andreou

Bug Description

Hi,

in the previous iteration we had this mechanism in place, https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/tripleo_upgrade_node.sh#L61-L69, to ensure that non-controller nodes were working after the upgrade and before the converge.

This is especially critical for compute nodes, which should be able to receive VMs before the convergence step.

For compute nodes we also have to ensure that the RPC pin/unpin happens within the nova_compute container, using the UpgradeLevelNovaCompute parameter.

Thanks,

also discussed at https://bugzilla.redhat.com/show_bug.cgi?id=1477962

description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-common (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/490847

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/490848

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/490852

Revision history for this message
Marios Andreou (marios-b) wrote :

Hello, as we discussed I spent some time looking here today. My current proposal is:

* continue to use the tripleo_upgrade_node.sh to essentially replace the upgrade_tasks (i.e. stop the things, as we have started adding already).

* Add execution of the ansible playbook to upgrade-non-controller.sh. Shardy's reviews, especially I96ec09bc788836584c4b39dcce5bf9b80e914c71, make the deploy-steps available as a stack output. We can then use the config download to execute them on the undercloud (where upgrade-non-controller.sh is running anyway).

Relevant reviews here are https://review.openstack.org/490847 (download and execute the deploy-steps) and https://review.openstack.org/490852 so we can specify a named path.

An alternative we might consider is to get actual upgrade_tasks instead of using the tripleo_upgrade_node.sh. However, the biggest obstacle to that right now is that we'd need to rewrite the upgrade_tasks and break the current composable-steps upgrade workflow. I proposed https://review.openstack.org/490848 for the discussion anyway.

Revision history for this message
Marios Andreou (marios-b) wrote :

apologies, "shardy reviews, esp I96ec09bc788836584c4b39dcce5bf9b80e914c71" is here https://review.openstack.org/#/c/485731/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/491749

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-tripleoclient (master)

Change abandoned by Marios Andreou (<email address hidden>) on branch: master
Review: https://review.openstack.org/490852
Reason: don't think this is really necessary right now (maybe nice to have later) abandon for now to make the review chain clearer

Revision history for this message
Marios Andreou (marios-b) wrote :

Slight change in the proposal:

With the help of a utility function in https://review.openstack.org/#/c/491749/ (python-tripleoclient) we can use the upgrade_tasks playbook generated by the tripleo-heat-templates at https://review.openstack.org/#/c/490848/ (note: this depends on a few shardy tht reviews see shortlog).

So, in the upgrade-non-controller.sh script, we add download and execution of both the upgrade_tasks and deploy_steps playbooks with https://review.openstack.org/#/c/490847/ (tripleo-common).

The generated playbooks look like https://paste.fedoraproject.org/paste/gUi5Ckq2qoTT~ed5kItxRw/raw (while it lasts)... it seems most of the things we need for the compute and swift nodes are in the upgrade_tasks (e.g. stopping openstack-nova-compute, which we recently had to add to the tripleo_upgrade_node.sh).

Reviews:
     (tripleo-common): https://review.openstack.org/#/c/490847/ "Download and run upgrade/deploy_steps_playbooks for upgrade"
     |
     |Depends-On:
     |
     -->(tripleo-heat-templates): https://review.openstack.org/#/c/490848/ "Also write an upgrade_(batch)_tasks playbook" (&see shortlog!)
        |
        |Depends-On:
        |
        -->(python-tripleoclient): https://review.openstack.org/#/c/491749/ "Adds when in upgrade_tasks playbook written by config download"
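
For readers following the proposal above, here is a minimal sketch of how a wrapper could run the two downloaded playbooks once per step. It is not the upgrade-non-controller.sh or tripleo-common implementation: the function name, the file names, the inventory path and the step range are all assumptions made for illustration. Only the `ansible-playbook` options `-i` and `-e` are real. The sketch builds the command lines without executing them.

```python
def build_playbook_commands(inventory, playbooks, steps):
    """Build one ansible-playbook command per (playbook, step) pair.

    Hypothetical helper: playbooks run in the order given, and each
    playbook is run once for every step, with the step number passed
    as an extra var so that "when: step == N" style conditions can
    select the matching tasks.
    """
    commands = []
    for playbook in playbooks:
        for step in steps:
            commands.append([
                "ansible-playbook",
                "-i", inventory,
                playbook,
                "-e", "step=%d" % step,
            ])
    return commands


if __name__ == "__main__":
    # File names, inventory path and step range are illustrative assumptions.
    for cmd in build_playbook_commands(
        "inventory.ini",
        ["upgrade_tasks_playbook.yaml", "deploy_steps_playbook.yaml"],
        range(1, 4),
    ):
        print(" ".join(cmd))
```

Running all steps of the upgrade playbook before the deploy steps playbook mirrors the ordering described in this thread, where the deploy steps follow the upgrade_tasks.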

Revision history for this message
Lee Yarwood (lyarwood) wrote :

A quick comment on c#8, maybe this is a known issue at the moment but the example playbook above only lists the upgrade_tasks from the original puppet service template [1] and not the newer docker service templates where we are disabling the original services on the host [2].

[1] https://github.com/openstack/tripleo-heat-templates/blob/6976b8f6502394b09fb502666a47c0b2fcbc5304/puppet/services/nova-compute.yaml#L214-L232
[2] https://github.com/openstack/tripleo-heat-templates/blob/6976b8f6502394b09fb502666a47c0b2fcbc5304/docker/services/nova-compute.yaml#L144-L147

Revision history for this message
Marios Andreou (marios-b) wrote :

13:01 < lyarwood> marios: when you get back, I've added a note in https://bugs.launchpad.net/tripleo/+bug/1708115 re the tasks we end up with, seems we are missing the docker/services/*.yaml upgrade_tasks.
13:03 < marios> lyarwood: yeah you'll get the 'right' ones when you run the upgrade with the docker env files... i.e. upgrade_tasks are collected for each service, and each is resolved in the registry/env files as to whether it will point to puppet/services/foo.yaml or docker/services/foo.yaml
13:03 < marios> lyarwood: and the upgrade_tasks will be the 'right' ones
13:04 < lyarwood> marios: cool, assumed it was something like that but just wanted to be sure
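
The resolution marios describes in the IRC log above can be illustrated with a small sketch. It is not Heat's or tripleo-heat-templates' actual registry code: the function name and the dict-based registries are assumptions. The idea shown is simply that environment files are applied in order, so a docker environment passed later overrides the puppet mapping for the same service.

```python
def resolve_service_templates(services, registries):
    """Map each service to the template chosen by the last registry that names it.

    Hypothetical helper: `registries` is a list of dicts applied in
    order, standing in for the resource_registry sections of the
    environment files passed to the upgrade command.
    """
    merged = {}
    for registry in registries:
        merged.update(registry)  # later environment files win
    return {name: merged[name] for name in services if name in merged}


if __name__ == "__main__":
    service = "OS::TripleO::Services::NovaCompute"
    puppet_env = {service: "puppet/services/nova-compute.yaml"}
    docker_env = {service: "docker/services/nova-compute.yaml"}

    # Without the docker environment the puppet template is selected.
    print(resolve_service_templates([service], [puppet_env]))
    # With the docker environment applied last, the docker template wins,
    # so its upgrade_tasks are the ones that get collected.
    print(resolve_service_templates([service], [puppet_env, docker_env]))
```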

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/491749
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=c2ea613919170b13c1d39456b417f6536ac9c81b
Submitter: Jenkins
Branch: master

commit c2ea613919170b13c1d39456b417f6536ac9c81b
Author: marios <email address hidden>
Date: Tue Aug 8 13:34:48 2017 +0300

    Adds when in upgrade_tasks playbook written by config download

    As part of the bug below, this can be used to make sure the
    playbooks generated from the heat output can be iterated over
    using the loop variable.

    This adds a pre-processor for upgrade_tasks that adds a "when
    step == N" condition based on the value of the tags. That is
    "tags: step1" becomes "when: step == 1". When there is an
    existing when statement the new step condition is appended.

    Change-Id: Ief593dc758a2ffe33c1cbcbda9289393fcf023e4
    Related-Bug: 1708115
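
The pre-processing described in this commit message can be sketched roughly as follows. This is not the python-tripleoclient code: the function name `add_step_condition` and the dict-based task representation are assumptions for illustration. It relies on the fact that Ansible treats a list under `when` as conditions that must all hold.

```python
import re

# Matches tags such as "step1" or "step3" and captures the number.
STEP_TAG = re.compile(r"^step(\d+)$")


def add_step_condition(task):
    """Return a copy of a task with a step condition derived from its tags.

    Hypothetical helper: `task` is a plain dict, as one upgrade_tasks
    entry might look after being loaded from YAML.
    """
    tags = task.get("tags", [])
    if isinstance(tags, str):
        tags = [tags]

    step = None
    for tag in tags:
        match = STEP_TAG.match(tag)
        if match:
            step = int(match.group(1))
            break

    result = dict(task)
    if step is None:
        return result  # no stepN tag, leave the task untouched

    condition = "step == %d" % step
    existing = result.get("when")
    if existing is None:
        result["when"] = condition
    elif isinstance(existing, list):
        result["when"] = list(existing) + [condition]
    else:
        # Append to the existing condition; both must then be true.
        result["when"] = [existing, condition]
    return result


if __name__ == "__main__":
    print(add_step_condition({"name": "Stop nova-compute", "tags": "step1"}))
    print(add_step_condition(
        {"name": "Example", "tags": "step2", "when": "some_var is defined"}))
```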

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/490848
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=060ff37c4f64f2baf3dca89f81cd1d4d0a278416
Submitter: Jenkins
Branch: master

commit 060ff37c4f64f2baf3dca89f81cd1d4d0a278416
Author: marios <email address hidden>
Date: Fri Aug 4 14:55:48 2017 +0300

    Also write an upgrade_tasks_playbook

    To get this to work upgrade_tasks need to be rewritten with 'when'
    statements like the update tasks (in parent review from shardy).
    So that we don't break the existing upgrades workflow, we add these
    as part of the config download see the depends on

    Related-Bug: 1708115
    Depends-On: Ief593dc758a2ffe33c1cbcbda9289393fcf023e4
    Change-Id: Ib01b96a2c26721747d81d98e3d57c4c388663004

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/490847
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=c61618fd576638e88f5a01bf094739c47feab4b6
Submitter: Jenkins
Branch: master

commit c61618fd576638e88f5a01bf094739c47feab4b6
Author: marios <email address hidden>
Date: Fri Aug 4 15:24:40 2017 +0300

    Download and run upgrade/deploy_steps_playbooks for upgrade

    For non controller upgrade add download and execution of the
    upgrade and deploy steps playbooks

    We may want to remove the tripleo_upgrade_node.sh completely

    Related-Bug: 1708115
    Depends-On: Ib01b96a2c26721747d81d98e3d57c4c388663004
    Change-Id: I534eb282ab5c32f62965930924f791fd2da755b1

Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Changed in tripleo:
assignee: nobody → Marios Andreou (marios-b)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/498776

Revision history for this message
Marios Andreou (marios-b) wrote :

I also just posted https://review.openstack.org/498776 to disable the puppet config run and related workarounds in the tripleo_upgrade_node.sh script. If you are testing, you'll also need to apply this to your tripleo-heat-templates before running the major-upgrade-composable-steps-docker.yaml stage of the overcloud upgrade.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/498776
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=4c5b9c5c967105536106fa4a7e1ec2352b14b08c
Submitter: Jenkins
Branch: master

commit 4c5b9c5c967105536106fa4a7e1ec2352b14b08c
Author: marios <email address hidden>
Date: Tue Aug 29 14:29:37 2017 +0300

    Remove puppet run and workarounds from tripleo_upgrade_node.sh

    For bug 1708115 and the O..P upgrade, and for the upgrade of
    'non-controllers' we are now generating ansible playbooks from
    collected service upgrade_tasks and these are executed instead
    of the legacy tripleo_upgrade_node.sh.

    To clarify, by 'non-controllers' it is meant any node for which
    the corresponding roles_data.yaml role has the
    disable_upgrade_deployment flag set True.

    As a first pass, I am removing the workarounds from the script but
    keeping its delivery mechanism for now in case it is needed still.
    We can either update here to remove it or keep it until next cycle

    The most important part for now is that we no longer 'manually'
    run puppet here. Instead the post_deploy_steps are also collected
    into a playbook and will be executed after the upgrade_tasks
    (see the bug for discussion of the mechanism and related reviews)

    Change-Id: Ib017b0ab435ca9558cf8659d434489cdf01df955
    Related-Bug: 1708115
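
The definition of 'non-controllers' given in this commit message can be expressed as a short filter. This is only a sketch: the function name and the sample roles are assumptions, and the list of dicts stands in for what loading roles_data.yaml would produce.

```python
def non_controller_roles(roles_data):
    """Return the names of roles with disable_upgrade_deployment set True.

    Hypothetical helper: `roles_data` is a list of role dicts, as
    roles_data.yaml might look once parsed. Roles that do not set the
    flag are treated as if it were False.
    """
    return [
        role["name"]
        for role in roles_data
        if role.get("disable_upgrade_deployment", False)
    ]


if __name__ == "__main__":
    # Sample data for illustration only, not a real roles_data.yaml.
    sample_roles = [
        {"name": "Controller"},
        {"name": "Compute", "disable_upgrade_deployment": True},
        {"name": "ObjectStorage", "disable_upgrade_deployment": True},
    ]
    print(non_controller_roles(sample_roles))
```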

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/499625

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/pike)

Reviewed: https://review.openstack.org/499625
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=2b07da8aaaf27f6ce9fb33acd597a19522e29b67
Submitter: Jenkins
Branch: stable/pike

commit 2b07da8aaaf27f6ce9fb33acd597a19522e29b67
Author: marios <email address hidden>
Date: Tue Aug 29 14:29:37 2017 +0300

    Remove puppet run and workarounds from tripleo_upgrade_node.sh

    For bug 1708115 and the O..P upgrade, and for the upgrade of
    'non-controllers' we are now generating ansible playbooks from
    collected service upgrade_tasks and these are executed instead
    of the legacy tripleo_upgrade_node.sh.

    To clarify, by 'non-controllers' it is meant any node for which
    the corresponding roles_data.yaml role has the
    disable_upgrade_deployment flag set True.

    As a first pass, I am removing the workarounds from the script but
    keeping its delivery mechanism for now in case it is needed still.
    We can either update here to remove it or keep it until next cycle

    The most important part for now is that we no longer 'manually'
    run puppet here. Instead the post_deploy_steps are also collected
    into a playbook and will be executed after the upgrade_tasks
    (see the bug for discussion of the mechanism and related reviews)

    Change-Id: Ib017b0ab435ca9558cf8659d434489cdf01df955
    Related-Bug: 1708115
    (cherry picked from commit 4c5b9c5c967105536106fa4a7e1ec2352b14b08c)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/500498

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/499540
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=53f79082ca89937cf5997e62f472f436cf886884
Submitter: Jenkins
Branch: master

commit 53f79082ca89937cf5997e62f472f436cf886884
Author: Marius Cornea <email address hidden>
Date: Thu Aug 31 11:39:16 2017 +0200

    Convert step to integer in when statement for upgrade tasks

    Currently the when conditionals in the upgrade tasks aren't evaluated
    correctly and as a result the upgrade tasks are skipped. This change
    converts the step variable in the when statement to an integer to get
    it evaluated properly.

    Related-Bug: 1708115
    Change-Id: I4ee1a2729d74442570f1b1f38b0d03a95ea7793f
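
The commit message does not say why the conditionals were evaluated incorrectly. One plausible explanation, stated here as an assumption, is that the step value reaches the playbook as a string (for example when passed as a key=value extra var), so comparing it with an integer is always false. The Python sketch below mimics that comparison with and without the conversion; the function names are hypothetical.

```python
def step_matches(step_value, wanted):
    """Compare without conversion, analogous to "when: step == 1"."""
    return step_value == wanted


def step_matches_int(step_value, wanted):
    """Compare after conversion, analogous to "when: step|int == 1"."""
    return int(step_value) == wanted


if __name__ == "__main__":
    print(step_matches("1", 1))      # False: a string never equals an int
    print(step_matches_int("1", 1))  # True once the value is converted
    print(step_matches_int(1, 1))    # True: already an int, conversion is harmless
```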

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-tripleoclient (master)

Change abandoned by Marios Andreou (<email address hidden>) on branch: master
Review: https://review.openstack.org/500498
Reason: thanks for checking it, mcornea; abandoning in favour of the original at https://review.openstack.org/#/c/499517/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/499517
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=e2f00ef1dc98140087c81e202a520f549f9a0970
Submitter: Jenkins
Branch: master

commit e2f00ef1dc98140087c81e202a520f549f9a0970
Author: Marius Cornea <email address hidden>
Date: Thu Aug 31 10:32:30 2017 +0200

    Allow upgrade tasks to run when looping through steps

    Currently for non controller upgrades we're looping through the
    upgrade steps and run the upgrade tasks based on when conditionals
    including the step number and the existing upgrade task condition.
    Some of tasks fail because the variables used in when conditionals
    are not available through all steps. This change adds default values
    to these vars where possible or creates them for all steps to avoid
    failures.

    Related-Bug: 1708115
    Change-Id: I5c731043cec8e31fc82ca98972a301baa7294c4f

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/500596

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/500749

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/500751

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/500752

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/500787

Changed in tripleo:
milestone: pike-rc2 → queens-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/pike)

Reviewed: https://review.openstack.org/500596
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f2d0901270b6781a5b8ed37aab47da51b79e907c
Submitter: Jenkins
Branch: stable/pike

commit f2d0901270b6781a5b8ed37aab47da51b79e907c
Author: Marius Cornea <email address hidden>
Date: Thu Aug 31 10:32:30 2017 +0200

    Allow upgrade tasks to run when looping through steps

    Currently for non controller upgrades we're looping through the
    upgrade steps and run the upgrade tasks based on when conditionals
    including the step number and the existing upgrade task condition.
    Some of tasks fail because the variables used in when conditionals
    are not available through all steps. This change adds default values
    to these vars where possible or creates them for all steps to avoid
    failures.

    Related-Bug: 1708115
    Change-Id: I5c731043cec8e31fc82ca98972a301baa7294c4f
    (cherry picked from commit e2f00ef1dc98140087c81e202a520f549f9a0970)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (master)

Reviewed: https://review.openstack.org/500749
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=fc159bc8d98a06549b6b34928897a092a751928b
Submitter: Jenkins
Branch: master

commit fc159bc8d98a06549b6b34928897a092a751928b
Author: marios <email address hidden>
Date: Tue Sep 5 12:28:27 2017 +0300

    Fix py27 tests - expand the regex when adding 'when' to playbook

    In I4ee1a2729d74442570f1b1f38b0d03a95ea7793f the 'when' condition
    written to the upgrade_tasks playbook got an |int but we didn't
    expand the search for an existing 'when' to take this into account

    Change-Id: I69f28a0fabd75eb19c0eedfcbdc037094a9ddb50
    Related-Bug: 1708115
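
The kind of mismatch this commit message describes can be shown with two regular expressions. Both patterns and the helper name are assumptions for illustration, not the expressions used in python-tripleoclient: the first only recognises the plain form of the step condition, the second also accepts the form with the `|int` filter.

```python
import re

# Hypothetical patterns, for illustration only.
PLAIN_STEP = re.compile(r"step\s*==\s*\d+")
STEP_WITH_OPTIONAL_INT = re.compile(r"step(\s*\|\s*int)?\s*==\s*\d+")


def has_step_condition(when, pattern):
    """Return True if any condition in `when` matches the given pattern.

    `when` may be a single condition string or a list of them.
    """
    conditions = when if isinstance(when, list) else [when]
    return any(pattern.search(str(cond)) for cond in conditions)


if __name__ == "__main__":
    converted = "step|int == 1"
    # The narrower pattern misses the condition once "|int" is present.
    print(has_step_condition(converted, PLAIN_STEP))              # False
    # The expanded pattern recognises both forms.
    print(has_step_condition(converted, STEP_WITH_OPTIONAL_INT))  # True
    print(has_step_condition("step == 1", STEP_WITH_OPTIONAL_INT))  # True
```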

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (stable/pike)

Reviewed: https://review.openstack.org/500751
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=874ceb62d5975d5f189422ea565193ab46db7886
Submitter: Jenkins
Branch: stable/pike

commit 874ceb62d5975d5f189422ea565193ab46db7886
Author: marios <email address hidden>
Date: Tue Aug 8 13:34:48 2017 +0300

    Adds when in upgrade_tasks playbook written by config download

    As part of the bug below, this can be used to make sure the
    playbooks generated from the heat output can be iterated over
    using the loop variable.

    This adds a pre-processor for upgrade_tasks that adds a "when
    step == N" condition based on the value of the tags. That is
    "tags: step1" becomes "when: step == 1". When there is an
    existing when statement the new step condition is appended.

    Change-Id: Ief593dc758a2ffe33c1cbcbda9289393fcf023e4
    Related-Bug: 1708115
    (cherry picked from commit c2ea613919170b13c1d39456b417f6536ac9c81b)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/500787
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=7227574d7625e89d6cbd9edb1d5109aab4231cc7
Submitter: Jenkins
Branch: stable/pike

commit 7227574d7625e89d6cbd9edb1d5109aab4231cc7
Author: marios <email address hidden>
Date: Tue Sep 5 12:28:27 2017 +0300

    Fix py27 tests - expand the regex when adding 'when' to playbook

    In I4ee1a2729d74442570f1b1f38b0d03a95ea7793f the 'when' condition
    written to the upgrade_tasks playbook got an |int but we didn't
    expand the search for an existing 'when' to take this into account

    Change-Id: I69f28a0fabd75eb19c0eedfcbdc037094a9ddb50
    Related-Bug: 1708115
    (cherry picked from commit fc159bc8d98a06549b6b34928897a092a751928b)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/500752
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=b1a22aeaa80f532720d0cff9da4a604bd8162438
Submitter: Jenkins
Branch: stable/pike

commit b1a22aeaa80f532720d0cff9da4a604bd8162438
Author: Marius Cornea <email address hidden>
Date: Thu Aug 31 11:39:16 2017 +0200

    Convert step to integer in when statement for upgrade tasks

    Currently the when conditionals in the upgrade tasks aren't evaluated
    correctly and as a result the upgrade tasks are skipped. This change
    converts the step variable in the when statement to an integer to get
    it evaluated properly.

    Related-Bug: 1708115
    Change-Id: I4ee1a2729d74442570f1b1f38b0d03a95ea7793f
    (cherry picked from commit 53f79082ca89937cf5997e62f472f436cf886884)

Revision history for this message
Alex Schultz (alex-schultz) wrote :

I see a bunch of commits on this bug, is there still work to be done around this or is it completed?

Changed in tripleo:
status: Triaged → Fix Released
status: Fix Released → Fix Committed
status: Fix Committed → Fix Released