Minor update of HA services doesn't work if container image name changes

Bug #1854730 reported by Damien Ciabrini
This bug affects 1 person
Affects   Status         Importance   Assigned to       Milestone
tripleo   Fix Released   High         Damien Ciabrini

Bug Description

In HA overclouds, the container image used by an HA service is
configured in a pacemaker resource definition. This information is
shared across all the nodes of a cluster <node A, node B, node C>,
so changing the container image information restarts the service on
all nodes in the cluster.

To implement container update without service disruption, the HA
service doesn't directly use the container image configured by Heat,
e.g. <registry/imagename:tagX>. Instead, it uses a "floating" image
tag <registry/imagename:pcmklatest>, which points to the image that is
currently in use on the node (i.e. <registry/imagename:tagX>).

That way, when a service needs to be updated, the update task
pulls a new image <registry/imagename:tagY> locally on the node,
then retags <registry/imagename:pcmklatest> to point to the pulled
image <registry/imagename:tagY>, and restarts the container locally on
the node.
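
The per-node retagging described above can be modelled with a small
simulation. This is a minimal sketch, not the TripleO implementation:
the Node class, its methods, and the dictionary-based image store are
hypothetical stand-ins for the container runtime on each node.

```python
class Node:
    """Hypothetical stand-in for one cluster node's local image store."""

    def __init__(self, name):
        self.name = name
        self.tags = {}        # image reference -> image ID
        self.running = None   # image ID used by the running container

    def pull(self, ref, image_id):
        # Pulling makes the reference available on this node only.
        self.tags[ref] = image_id

    def retag(self, floating_ref, target_ref):
        # Point the floating tag at an image already present locally.
        self.tags[floating_ref] = self.tags[target_ref]

    def restart(self, configured_ref):
        # The container starts from whatever the configured reference
        # resolves to locally; a KeyError would mean "image not found".
        self.running = self.tags[configured_ref]


# The pacemaker resource always references the floating tag,
# and that reference is shared by all nodes.
RESOURCE_IMAGE = "registry/imagename:pcmklatest"

nodes = [Node(n) for n in ("A", "B", "C")]
for node in nodes:
    node.pull("registry/imagename:tagX", "id-X")
    node.retag(RESOURCE_IMAGE, "registry/imagename:tagX")
    node.restart(RESOURCE_IMAGE)

# Minor update on node A only: the shared resource definition is untouched.
a = nodes[0]
a.pull("registry/imagename:tagY", "id-Y")
a.retag(RESOURCE_IMAGE, "registry/imagename:tagY")
a.restart(RESOURCE_IMAGE)

print([(n.name, n.running) for n in nodes])
```

In this model, node A ends up running id-Y while nodes B and C keep
running id-X, because the resource definition they share never changed.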

The limit of that approach is that only the tag part of the container
image is allowed to change during a minor update. If the name
prefix is changed, the minor update will fail when it is run on the
first cluster node (e.g. node A):

  . the new image <registry/NEWIMAGE:tagZ> is pulled on node A

  . tag <registry/NEWIMAGE:pcmklatest> is created and points to
    <registry/NEWIMAGE:tagZ> on node A

  . pacemaker resource config is changed from <registry/imagename:pcmklatest>
    to <registry/NEWIMAGE:pcmklatest> globally

  . pacemaker triggers a restart of the container on all the cluster
    nodes.

At this point, nodes B and C try to start the container with a
non-existent image <registry/NEWIMAGE:pcmklatest>, which exists only
as a local tag on node A and can't be pulled from the registry. This
leaves the resource in error in the cluster and makes the minor
update fail.
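
The failure sequence above can be replayed with a few sets. This is an
illustrative sketch only: the local_tags mapping and the resolvable()
helper are hypothetical and simply record which image references exist
locally on each node.

```python
def resolvable(local_tags, configured_ref):
    """Report, per node, whether configured_ref exists in its local store."""
    return {node: configured_ref in refs for node, refs in local_tags.items()}


# Before the update, every node has the old image and the old floating tag.
old_refs = {"registry/imagename:tagX", "registry/imagename:pcmklatest"}
local_tags = {"A": set(old_refs), "B": set(old_refs), "C": set(old_refs)}

# Steps 1 and 2 happen on node A only: pull the new image, create the tag.
local_tags["A"] |= {"registry/NEWIMAGE:tagZ", "registry/NEWIMAGE:pcmklatest"}

# Step 3: the pacemaker resource config changes globally.
configured = "registry/NEWIMAGE:pcmklatest"

# Step 4: pacemaker restarts the container on every node.
status = resolvable(local_tags, configured)
print(status)  # {'A': True, 'B': False, 'C': False}
```

Only node A can resolve the newly configured reference, which matches
the error state reported for nodes B and C.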

Related: rhbz#1749988 rhbz#1655250

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.opendev.org/697959

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/696869
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=a166ec6bcaae078a4f7ed91feb8e431fe031e0cb
Submitter: Zuul
Branch: master

commit a166ec6bcaae078a4f7ed91feb8e431fe031e0cb
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 2 13:01:45 2019 +0100

    HA: minor update of arbitrary container image name

    HA services get their container image name from a pacemaker
    resource configuration. This image name is shared between
    all cluster nodes.

    To achieve image update without service disruption, a pacemaker
    resource is configured to use an intermediate image name
    "<registry>/<namespace>/<servicename>:pcmklatest" pointing to
    the real image name configured in Heat. This tag can then be
    updated independently on every node during the minor update.

    In order to support the same rolling update when the <namespace>
    changes in the container image, we need a similar floating
    approach for the prefix part of the container image.

    Introduce a new Heat parameter ClusterCommonTag that, when enabled,
    sets the intermediate image name to
    "cluster-common-tag/<servicename>:pcmklatest". By default, this
    parameter is disabled and the original naming scheme is conserved.

    Note: by introducing this new naming scheme, we stop seeing a
    meaningful image name prefix when doing a "pcs status", but since
    we already can't tell what image ID the :pcmklatest tag points to,
    we don't lose much information really.

    Related-Bug: #1854730

    Change-Id: Id369154d147cd5cf0a6f997bf806084fc7580e01
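
The two naming schemes described in this commit message can be compared
with a short sketch. It is not the code from tripleo-heat-templates:
the function intermediate_image_name() and its cluster_common_tag
argument are hypothetical names chosen to mirror the ClusterCommonTag
parameter, and image references pinned by digest are not handled.

```python
def intermediate_image_name(image, cluster_common_tag=False):
    """Return the floating image name a pacemaker resource would reference.

    `image` is the real image name configured in Heat, e.g.
    "<registry>/<namespace>/<servicename>:<tag>".
    """
    # Look for the tag separator only after the last "/", so that a
    # registry port such as "host:5000" is not mistaken for a tag.
    last_slash = image.rfind("/")
    colon = image.find(":", last_slash + 1)
    name = image if colon == -1 else image[:colon]

    if cluster_common_tag:
        # New scheme: drop registry and namespace, keep the service name.
        service = name[last_slash + 1:]
        return "cluster-common-tag/" + service + ":pcmklatest"

    # Original scheme: keep the full prefix, replace only the tag.
    return name + ":pcmklatest"


old = "registry/ns-old/mariadb:tagX"
new = "registry/ns-new/mariadb:tagZ"

print(intermediate_image_name(old), intermediate_image_name(new))
print(intermediate_image_name(old, True), intermediate_image_name(new, True))
```

With the original scheme the two references yield different floating
names, so the shared resource definition would have to change. With the
new scheme both map to cluster-common-tag/mariadb:pcmklatest.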

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/698274

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/697959
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=d4c1c84561b14109ad2b239ad0b5c15b395cf977
Submitter: Zuul
Branch: master

commit d4c1c84561b14109ad2b239ad0b5c15b395cf977
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 9 09:45:43 2019 +0100

    HA: enable cluster-common-tag naming scheme by default

    Since Id369154d147cd5cf0a6f997bf806084fc7580e01, HA services
    can now be configured to use a floating container image name
    "cluster-common-tag/<servicename>:pcmklatest" to allow
    minor update to a new image without service disruption, even
    when the namespace changes in the image URI configured in Heat.

    Enable that new naming scheme by default for pacemaker deployments.

    Closes-Bug: #1854730

    Change-Id: I7a63e8e2d9457c5025f3d70aeed6922e24958049

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/698274
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=44b6e6b8520bc9660dd1c6ac44d91523c7c7c84a
Submitter: Zuul
Branch: stable/train

commit 44b6e6b8520bc9660dd1c6ac44d91523c7c7c84a
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 2 13:01:45 2019 +0100

    HA: minor update of arbitrary container image name

    Related-Bug: #1854730

    Change-Id: Id369154d147cd5cf0a6f997bf806084fc7580e01
    (cherry picked from commit a166ec6bcaae078a4f7ed91feb8e431fe031e0cb)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/699511

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (stable/train)

Change abandoned by Emilien Macchi (<email address hidden>) on branch: stable/train
Review: https://review.opendev.org/699511
Reason: Clearing the gate now, see https://bugs.launchpad.net/tripleo/+bug/1856864
Do not restore the patch yet, I'll take care of it when the gate is back online.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/699511
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=3b2f7e603826ce8ed6723c89afac8c9c6ccc5092
Submitter: Zuul
Branch: stable/train

commit 3b2f7e603826ce8ed6723c89afac8c9c6ccc5092
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 9 09:45:43 2019 +0100

    HA: enable cluster-common-tag naming scheme by default

    Closes-Bug: #1854730

    Change-Id: I7a63e8e2d9457c5025f3d70aeed6922e24958049
    (cherry picked from commit d4c1c84561b14109ad2b239ad0b5c15b395cf977)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.3.1

This issue was fixed in the openstack/tripleo-heat-templates 11.3.1 release.

Changed in tripleo:
milestone: none → ussuri-1
importance: Medium → High
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.1.0

This issue was fixed in the openstack/tripleo-heat-templates 12.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/713406

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/713411

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.opendev.org/713412

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/stein)

Reviewed: https://review.opendev.org/713406
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=f366adbd151193ba524d70c4f904be35f1d048a0
Submitter: Zuul
Branch: stable/stein

commit f366adbd151193ba524d70c4f904be35f1d048a0
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 2 13:01:45 2019 +0100

    HA: minor update of arbitrary container image name

    Related-Bug: #1854730

    Change-Id: Id369154d147cd5cf0a6f997bf806084fc7580e01
    (cherry picked from commit a166ec6bcaae078a4f7ed91feb8e431fe031e0cb)
    (cherry picked from commit 44b6e6b8520bc9660dd1c6ac44d91523c7c7c84a)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.opendev.org/713411
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=2487becc6a6d263e55c9a642d51994a72c1cb9aa
Submitter: Zuul
Branch: stable/rocky

commit 2487becc6a6d263e55c9a642d51994a72c1cb9aa
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 2 13:01:45 2019 +0100

    HA: minor update of arbitrary container image name

    Related-Bug: #1854730

    Change-Id: Id369154d147cd5cf0a6f997bf806084fc7580e01
    (cherry picked from commit a166ec6bcaae078a4f7ed91feb8e431fe031e0cb)
    (cherry picked from commit 44b6e6b8520bc9660dd1c6ac44d91523c7c7c84a)
    (cherry picked from commit f366adbd151193ba524d70c4f904be35f1d048a0)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/queens)

Reviewed: https://review.opendev.org/713412
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=9441ca86d5a55e46e688145f9a8db05a72ddbafa
Submitter: Zuul
Branch: stable/queens

commit 9441ca86d5a55e46e688145f9a8db05a72ddbafa
Author: Damien Ciabrini <email address hidden>
Date: Mon Dec 2 13:01:45 2019 +0100

    HA: minor update of arbitrary container image name

    Related-Bug: #1854730

    Change-Id: Id369154d147cd5cf0a6f997bf806084fc7580e01
    (cherry picked from commit 2487becc6a6d263e55c9a642d51994a72c1cb9aa)

tags: added: in-stable-queens