Missing "z" flag for resources created by pacemaker

Bug #1943459 reported by Cédric Jeanneret
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Cédric Jeanneret
Milestone: (none listed)

Bug Description

We're missing the "z" flag on most of the rw mounts nested in pacemaker-managed containers.

This affects every branch from master down to Train, the release in which we introduced podman.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (master)
Changed in tripleo:
status: Triaged → In Progress
Cédric Jeanneret (cjeanner) wrote :

Some updates:
- https://review.opendev.org/c/openstack/puppet-tripleo/+/808774 will target only master and wallaby
- a new, train-only patch is needed to add an FFU task that recursively relabels some of the locations (at least /var/lib/openvswitch/ovn)

This was discussed with the PIDONE and Upgrades folks, in order to avoid an outage during day-2 operations on train (osp-16.x).

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/808774
Committed: https://opendev.org/openstack/puppet-tripleo/commit/e91aac28229f7b4eb5b054f3b88d9839ffc50fdb
Submitter: "Zuul (22348)"
Branch: master

commit e91aac28229f7b4eb5b054f3b88d9839ffc50fdb
Author: Cédric Jeanneret <email address hidden>
Date: Mon Sep 13 15:54:34 2021 +0200

    Add missing "z" flag for specific mounts

    Depending on the host history, some directory content may not have
    the correct SELinux type. This has been seen with the OVN service
    during a Queens -> Train FFU:

    while the /var/lib/openvswitch/ovn directory had the correct
    container_file_t type, some files in this location were typed with
    openvswitch_var_lib_t, leading to errors during the deploy part of the
    upgrade (after the OS upgrade, when the deploy is running on the cleaned
    host).
    The specific issue depends on which files carry the wrong label, but
    it usually involves a container crash/error, leading to a deploy error
    and a manual intervention to correct the SELinux type in the
    location.

    This situation may happen when the host was first deployed on Queens,
    since Queens used Docker. For the record, the Docker daemon was then
    configured with SELinux support disabled, so it didn't really care about
    labels; the situation is different with Podman, where we have full
    SELinux support at all levels of the OS, leading to the issue.

    For the record, tripleo-heat-templates as well as tripleo-ansible
    set "setype: container_file_t" on the directories, but we don't
    use "recurse: true", in order to avoid performance issues: some
    locations might be huge, and it would take too much time to relabel
    everything via ansible.

    This patch aims to converge all the mounts to the same options, and
    to ensure no SELinux denial can prevent the container from starting
    and functioning.

    Change-Id: Ic3e427156fc82c524c763d1896937fcc3c49fabb
    Closes-Bug: #1943459
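
To illustrate what "converging the mounts to the same options" means in practice, here is a minimal Python sketch, not the puppet-tripleo change itself. It assumes podman's `source:dest[:options]` volume syntax; the helper name `with_relabel_flag` is hypothetical. With podman, the lowercase "z" option asks for the content to be relabeled with a shared container label, while uppercase "Z" requests a private one, so the sketch leaves a mount alone if either is already present.

```python
def with_relabel_flag(mount: str, flag: str = "z") -> str:
    """Return a podman-style volume spec with the SELinux relabel flag.

    `mount` is expected to look like "source:dest" or "source:dest:opt1,opt2".
    The spec is returned unchanged if it already carries "z" or "Z".
    """
    parts = mount.split(":")
    if len(parts) < 2:
        raise ValueError(f"not a source:dest[:options] spec: {mount!r}")

    source, dest = parts[0], parts[1]
    # The third field, when present and non-empty, is a comma-separated list.
    options = parts[2].split(",") if len(parts) > 2 and parts[2] else []

    # Only append the flag when no relabel option is set yet.
    if "z" not in options and "Z" not in options:
        options.append(flag)

    return ":".join([source, dest, ",".join(options)])


if __name__ == "__main__":
    ovn = "/var/lib/openvswitch/ovn:/var/lib/openvswitch/ovn:rw"
    print(with_relabel_flag(ovn))
```

Relabeling at container start is what makes this approach heavier than a one-off fix: every file under the mount is visited, which is cheap for a small directory such as the OVN one and potentially slow for large ones.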

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/puppet-tripleo/+/809132

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/808964
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/bb3e0234a4588325455705f23f33c66883618564
Submitter: "Zuul (22348)"
Branch: stable/train

commit bb3e0234a4588325455705f23f33c66883618564
Author: Cédric Jeanneret <email address hidden>
Date: Tue Sep 14 16:28:42 2021 +0200

    [TRAIN-ONLY] Ensure OVN directory content is podman-compatible

    When running an FFU from an OVN-enabled Queens (osp-13) environment,
    some files in /var/lib/openvswitch/ovn may be labeled
    openvswitch_var_lib_t instead of container_file_t.

    While most of the other mounts are mounted from other containers, mostly
    managed via tripleo-heat-templates, that specific location seems to be
    used only by pacemaker-managed services. Those services are missing the
    "z" flag that allows the content to be relabeled.

    While https://review.opendev.org/c/openstack/puppet-tripleo/+/808774
    adds this missing flag for master and stable/wallaby, we can't do this
    for stable/train, since modifying the pacemaker resources would cause
    a complete outage.
    To avoid such an issue, we'd rather silently relabel things.

    This is possible for OVN since the recursion depth is only 1 level, and
    the number of files located there is very low (fewer than a dozen).

    Also, doing this during step_2 should ensure we don't block any host
    preparation, and that everything is ready in time for the actual data
    usage.

    Change-Id: I9b73a5833276fac080615d6f01d5b813631a662f
    Resolve-Bug: #1943459
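
As a rough illustration of the one-level check described in this commit message, the Python sketch below lists entries whose SELinux type differs from container_file_t. It is not the tripleo-heat-templates task. The function names are hypothetical, and it assumes a Linux host where the context is exposed through the `security.selinux` extended attribute, in the usual `user:role:type:level` form.

```python
import os

EXPECTED_TYPE = "container_file_t"


def selinux_type(context: str) -> str:
    """Extract the type (third field) from a user:role:type[:level] context."""
    fields = context.split(":")
    if len(fields) < 3:
        raise ValueError(f"unexpected SELinux context: {context!r}")
    return fields[2]


def read_context(path: str) -> str:
    """Read the SELinux context of `path` from its extended attribute (Linux only)."""
    raw = os.getxattr(path, "security.selinux")
    # The kernel usually returns the value with a trailing NUL byte.
    return raw.rstrip(b"\0").decode()


def mislabeled(contexts: dict[str, str], expected: str = EXPECTED_TYPE) -> list[str]:
    """Return the sorted paths whose SELinux type is not `expected`."""
    return sorted(
        path for path, context in contexts.items()
        if selinux_type(context) != expected
    )


def scan_one_level(directory: str) -> list[str]:
    """Check the directory itself and its direct children, without recursing."""
    contexts = {entry.path: read_context(entry.path) for entry in os.scandir(directory)}
    contexts[directory] = read_context(directory)
    return mislabeled(contexts)


if __name__ == "__main__":
    # Illustrative contexts only; the file names are made up for the example.
    sample = {
        "/var/lib/openvswitch/ovn": "system_u:object_r:container_file_t:s0",
        "/var/lib/openvswitch/ovn/example_nb.db": "system_u:object_r:openvswitch_var_lib_t:s0",
        "/var/lib/openvswitch/ovn/example_sb.db": "system_u:object_r:container_file_t:s0",
    }
    print(mislabeled(sample))
```

Keeping the scan to a single level mirrors the reasoning above: it stays cheap because the directory is shallow and holds only a handful of files, which is not true of every mounted location.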

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/puppet-tripleo/+/809132
Committed: https://opendev.org/openstack/puppet-tripleo/commit/848f2acd5b176dc4d0d82bc2518a9e0e330f9c56
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 848f2acd5b176dc4d0d82bc2518a9e0e330f9c56
Author: Cédric Jeanneret <email address hidden>
Date: Mon Sep 13 15:54:34 2021 +0200

    Add missing "z" flag for specific mounts

    Depending on the host history, some directory content may not have
    the correct SELinux type. This has been seen with the OVN service
    during a Queens -> Train FFU:

    while the /var/lib/openvswitch/ovn directory had the correct
    container_file_t type, some files in this location were typed with
    openvswitch_var_lib_t, leading to errors during the deploy part of the
    upgrade (after the OS upgrade, when the deploy is running on the cleaned
    host).
    The specific issue depends on which files carry the wrong label, but
    it usually involves a container crash/error, leading to a deploy error
    and a manual intervention to correct the SELinux type in the
    location.

    This situation may happen when the host was first deployed on Queens,
    since Queens used Docker. For the record, the Docker daemon was then
    configured with SELinux support disabled, so it didn't really care about
    labels; the situation is different with Podman, where we have full
    SELinux support at all levels of the OS, leading to the issue.

    For the record, tripleo-heat-templates as well as tripleo-ansible
    set "setype: container_file_t" on the directories, but we don't
    use "recurse: true", in order to avoid performance issues: some
    locations might be huge, and it would take too much time to relabel
    everything via ansible.

    This patch aims to converge all the mounts to the same options, and
    to ensure no SELinux denial can prevent the container from starting
    and functioning.

    Change-Id: Ic3e427156fc82c524c763d1896937fcc3c49fabb
    Closes-Bug: #1943459
    (cherry picked from commit e8c4e9304fa803288b38bb81a96c6183e88fdd93)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/809427
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/938370e59fd56d34078906ae2fa062db954cdfc0
Submitter: "Zuul (22348)"
Branch: stable/train

commit 938370e59fd56d34078906ae2fa062db954cdfc0
Author: Cédric Jeanneret <email address hidden>
Date: Thu Sep 16 17:31:03 2021 +0200

    [TRAIN-ONLY] Correct typo in argument

    Correct the typo in:
    https://review.opendev.org/808964

    Related-Bug: #1943459

    Change-Id: I946f071ac0e3371557bfece82deffb2a8a5e1b9a

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 16.0.0

This issue was fixed in the openstack/puppet-tripleo 16.0.0 release.
