Running an OpenSearch upgrade where the container will not change causes shard allocation to remain disabled

Bug #2049512 reported by Matt Crees
This bug affects 1 person
Affects          Status                       Importance   Assigned to   Milestone
kolla-ansible    Status tracked in Caracal
  Antelope       Fix Committed                Medium       Unassigned
  Bobcat         Fix Released                 Medium       Unassigned
  Caracal        Fix Released                 Medium       Matt Crees
  Zed            Fix Released                 Medium       Unassigned

Bug Description

Shard allocation is disabled at the start of the OpenSearch upgrade task: https://github.com/openstack/kolla-ansible/blob/77c18fa615cc592976ae65a52c0198a14e054876/ansible/roles/opensearch/tasks/upgrade.yml#L2C9-L2C33.
This is applied as a transient setting, meaning it will be removed once the containers are restarted. However, if there is no change to the OpenSearch container, it will not be restarted, so the cluster is left in a broken state: unable to allocate shards.

This can be hit, for example, when all services are being upgraded and a service other than OpenSearch fails. An operator will want to rerun the full upgrade, assuming this is safe because the tasks are idempotent, so that the remaining services are still upgraded in order.
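
For a cluster already left in this state, one possible recovery is to clear the transient override through the OpenSearch cluster settings API. The following is a minimal sketch, not part of kolla-ansible: the helper names and the `endpoint` argument are hypothetical, and it assumes the API is reachable over plain HTTP without authentication, which may not match a given deployment.

```python
import json
import urllib.request

# The setting that the upgrade task sets to "none" as a transient override.
SETTING = "cluster.routing.allocation.enable"


def build_reset_body() -> dict:
    """Body for PUT /_cluster/settings that removes the transient override.

    A null value resets the setting, so the cluster falls back to its
    persistent or default value.
    """
    return {"transient": {SETTING: None}}


def is_allocation_disabled(flat_settings: dict) -> bool:
    """Inspect the JSON returned by GET /_cluster/settings?flat_settings=true."""
    value = flat_settings.get("transient", {}).get(SETTING)
    return value is not None and value != "all"


def reset_allocation(endpoint: str) -> dict:
    """Send the reset request. `endpoint` is a hypothetical base URL such as
    the internal OpenSearch address of the deployment (an assumption)."""
    request = urllib.request.Request(
        f"{endpoint}/_cluster/settings",
        data=json.dumps(build_reset_body()).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Checking `is_allocation_disabled` first avoids touching a cluster whose settings are already clean; `reset_allocation` only needs to be called when it returns `True`.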

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)
Changed in kolla-ansible:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/905851
Committed: https://opendev.org/openstack/kolla-ansible/commit/e502b65ba1ef7ae0b321ed001948d96d29c57f08
Submitter: "Zuul (22348)"
Branch: master

commit e502b65ba1ef7ae0b321ed001948d96d29c57f08
Author: Matt Crees <email address hidden>
Date: Wed Jan 17 10:54:05 2024 +0000

    Fix OpenSearch upgrade tasks idempotency

    Shard allocation is disabled at the start of the OpenSearch upgrade
    task. This is set as a transient setting, meaning it will be removed
    once the containers are restarted. However, if there is not change in
    the OpenSearch container it will not be restarted so the cluster is left
    in a broken state: unable to allocate shards.

    This patch moves the pre-upgrade tasks to within the handlers, so shard
    allocation and the flush are only performed when the OpenSearch
    container is going to be restarted.

    Closes-Bug: #2049512
    Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
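
The change in control flow described by this commit message can be illustrated with a small model. This is a sketch only, not the actual Ansible tasks or handlers: the `Cluster` class and both functions are hypothetical, and the restart is modelled as clearing the transient setting, following the description in the bug report.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for an OpenSearch cluster (hypothetical)."""
    allocation_enabled: bool = True
    log: list = field(default_factory=list)

    def disable_allocation(self) -> None:
        self.allocation_enabled = False
        self.log.append("disable shard allocation")

    def flush(self) -> None:
        self.log.append("flush")

    def restart_container(self) -> None:
        # Per the bug report, the transient setting is removed on restart.
        self.allocation_enabled = True
        self.log.append("restart container")


def upgrade_before_fix(cluster: Cluster, container_changed: bool) -> None:
    # Pre-upgrade steps always run, even when no restart follows.
    cluster.disable_allocation()
    cluster.flush()
    if container_changed:
        cluster.restart_container()


def upgrade_after_fix(cluster: Cluster, container_changed: bool) -> None:
    # Pre-upgrade steps run only as part of the restart path.
    if container_changed:
        cluster.disable_allocation()
        cluster.flush()
        cluster.restart_container()


if __name__ == "__main__":
    old, new = Cluster(), Cluster()
    upgrade_before_fix(old, container_changed=False)
    upgrade_after_fix(new, container_changed=False)
    print(old.allocation_enabled, new.allocation_enabled)
```

In this model, rerunning the upgrade with an unchanged container leaves allocation disabled under the old ordering and untouched under the new one.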

Changed in kolla-ansible:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/906493

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/906493
Committed: https://opendev.org/openstack/kolla-ansible/commit/d30ee119fc6089120fab5e4efa90c118408adde3
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit d30ee119fc6089120fab5e4efa90c118408adde3
Author: Matt Crees <email address hidden>
Date: Wed Jan 17 10:54:05 2024 +0000

    Fix OpenSearch upgrade tasks idempotency

    Shard allocation is disabled at the start of the OpenSearch upgrade
    task. This is set as a transient setting, meaning it will be removed
    once the containers are restarted. However, if there is not change in
    the OpenSearch container it will not be restarted so the cluster is left
    in a broken state: unable to allocate shards.

    This patch moves the pre-upgrade tasks to within the handlers, so shard
    allocation and the flush are only performed when the OpenSearch
    container is going to be restarted.

    Closes-Bug: #2049512
    Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
    (cherry picked from commit e502b65ba1ef7ae0b321ed001948d96d29c57f08)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 17.2.0

This issue was fixed in the openstack/kolla-ansible 17.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/914208

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/kolla-ansible/+/914209

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/914209
Committed: https://opendev.org/openstack/kolla-ansible/commit/4a43107ca92b2613e893cff1281a24df2747f21f
Submitter: "Zuul (22348)"
Branch: stable/zed

commit 4a43107ca92b2613e893cff1281a24df2747f21f
Author: Matt Crees <email address hidden>
Date: Wed Jan 17 10:54:05 2024 +0000

    Fix OpenSearch upgrade tasks idempotency

    Shard allocation is disabled at the start of the OpenSearch upgrade
    task. This is set as a transient setting, meaning it will be removed
    once the containers are restarted. However, if there is not change in
    the OpenSearch container it will not be restarted so the cluster is left
    in a broken state: unable to allocate shards.

    This patch moves the pre-upgrade tasks to within the handlers, so shard
    allocation and the flush are only performed when the OpenSearch
    container is going to be restarted.

    Closes-Bug: #2049512
    Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
    (cherry picked from commit e502b65ba1ef7ae0b321ed001948d96d29c57f08)

OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/kolla-ansible/+/914208
Committed: https://opendev.org/openstack/kolla-ansible/commit/d019bd13a3497736516c39d03c040668616b4efd
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit d019bd13a3497736516c39d03c040668616b4efd
Author: Matt Crees <email address hidden>
Date: Wed Jan 17 10:54:05 2024 +0000

    Fix OpenSearch upgrade tasks idempotency

    Shard allocation is disabled at the start of the OpenSearch upgrade
    task. This is set as a transient setting, meaning it will be removed
    once the containers are restarted. However, if there is not change in
    the OpenSearch container it will not be restarted so the cluster is left
    in a broken state: unable to allocate shards.

    This patch moves the pre-upgrade tasks to within the handlers, so shard
    allocation and the flush are only performed when the OpenSearch
    container is going to be restarted.

    Closes-Bug: #2049512
    Change-Id: Ia03ba23bfbde7d50a88dc16e4f117dec3c98a448
    (cherry picked from commit e502b65ba1ef7ae0b321ed001948d96d29c57f08)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible zed-eom

This issue was fixed in the openstack/kolla-ansible zed-eom release.
