master/wallaby deployments are failing with "SELinux boolean os_enable_vtpm does not exist."

Bug #1977873 reported by Ronelle Landy
This bug affects 3 people
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

master and wallaby tests are failing deployment on check/gate/integration lines with the following error:

2022-06-07 11:32:28.999284 | fa163e21-bda1-59d5-acc2-0000000007d9 | FATAL | Enable os_enable_vtpm SELinux boolean for vTPM | standalone | error={"changed": false, "msg": "SELinux boolean os_enable_vtpm does not exist."}

https://codesearch.opendev.org/?q=os_enable_vtpm&i=nope&literal=nope&files=&excludeFiles=&repos=

finds this variable referenced in:

https://codesearch.opendev.org/?q=os_enable_vtpm&i=nope&literal=nope&files=&excludeFiles=&repos=

and

https://codesearch.opendev.org/?q=os_enable_vtpm&i=nope&literal=nope&files=&excludeFiles=&repos=

Example failing jobs:

https://logserver.rdoproject.org/openstack-component-security/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-standalone-security-master/f647294/logs/undercloud/home/zuul/standalone_deploy.log.txt.gz

https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-9-standalone&skip=0

The failures started on 2022-06-07.

Ronelle Landy (rlandy)
Changed in tripleo:
milestone: none → zed-1
importance: Undecided → Critical
status: New → Triaged
tags: added: promotion-blocker
Ronelle Landy (rlandy) wrote :

Possible fallout from the SELinux library update:

https://composes.stream.centos.org/production/latest-CentOS-Stream/compose/BaseOS/x86_64/os/Packages/?C=M;O=A

libselinux-3.4-1.el9.i686.rpm           2022-06-06 09:46   93K
libselinux-3.4-1.el9.x86_64.rpm         2022-06-06 09:46   86K
libselinux-utils-3.4-1.el9.x86_64.rpm   2022-06-06 09:46  183K
libsemanage-3.4-1.el9.i686.rpm          2022-06-06 09:41  128K
libsemanage-3.4-1.el9.x86_64.rpm        2022-06-06 09:41  119K
libsepol-3.4-1.1.el9.i686.rpm           2022-06-06 09:44  332K
libsepol-3.4-1.1.el9.x86_64.rpm         2022-06-06 09:44  316K

Ronelle Landy (rlandy) wrote :

Passing jobs have:

libselinux.x86_64 3.3-2.el9 @baseos
libselinux-ruby.x86_64 3.3-2.el9 @quickstart-centos-appstreams
libselinux-utils.x86_64 3.3-2.el9 @baseos
libsemanage.x86_64 3.3-3.el9 @baseos
libsepol.x86_64 3.3-2.el9 @baseos
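
The comparison between passing and failing jobs can be automated. The following is a minimal sketch, not part of any TripleO tooling: the helper names `parse_listing` and `suspect_packages` are hypothetical, the `(3, 4)` threshold is only the working theory from this thread, and the version handling is a naive numeric comparison rather than RPM's full version-ordering algorithm.

```python
# Package listing copied from a passing job (see the comment above).
PASSING_JOB_LISTING = """\
libselinux.x86_64 3.3-2.el9 @baseos
libselinux-ruby.x86_64 3.3-2.el9 @quickstart-centos-appstreams
libselinux-utils.x86_64 3.3-2.el9 @baseos
libsemanage.x86_64 3.3-3.el9 @baseos
libsepol.x86_64 3.3-2.el9 @baseos
"""


def parse_listing(listing):
    """Map package name -> upstream version tuple, e.g. 'libselinux' -> (3, 3)."""
    versions = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name = fields[0].rsplit(".", 1)[0]  # drop the ".x86_64" arch suffix
        # Drop an optional "epoch:" prefix and the "-release" suffix.
        upstream = fields[1].split(":")[-1].split("-", 1)[0]
        versions[name] = tuple(int(part) for part in upstream.split("."))
    return versions


def suspect_packages(listing, first_bad=(3, 4)):
    """Return sorted package names whose version is at or above first_bad."""
    parsed = parse_listing(listing)
    return sorted(name for name, version in parsed.items() if version >= first_bad)


print(suspect_packages(PASSING_JOB_LISTING))  # prints []
```

Feeding it a listing from a failing node would be expected to return the 3.4 packages named in the earlier comment.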

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/845016

Lon Hohberger (lhh) wrote :

The new libselinux-3.4 seems to be the culprit, but I'm not entirely sure how at this point.

1. The same error occurs on freshly installed hosts.
2. Rebuilding all policy modules and installing them works (and the missing booleans appear), BUT
3. The neutron module does not compile with the new libselinux 3.4, and produces no actionable errors:

[root@standalone openstack-selinux-0.8.31]# make os-neutron.pp
make -f /usr/share/selinux/devel/Makefile os-neutron.pp
make[1]: Entering directory '/root/rpmbuild/BUILD/openstack-selinux-0.8.31'
Compiling targeted os-neutron module
Creating targeted os-neutron.pp policy package
/usr/bin/semodule_package: Error while reading policy module from tmp/os-neutron.mod
make[1]: *** [/usr/share/selinux/devel/include/Makefile:165: os-neutron.pp] Error 1
rm tmp/os-neutron.mod tmp/os-neutron.mod.fc
make[1]: Leaving directory '/root/rpmbuild/BUILD/openstack-selinux-0.8.31'
make: *** [Makefile:15: os-neutron.pp] Error 2

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tripleo-heat-templates (master)

Change abandoned by "Ronelle Landy <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/845016
Reason: not the right fix

Lon Hohberger (lhh) wrote :

Rafael Castillo notes that removing the import of self:process setpgid fixes the build on 3.4.

Doing this: https://github.com/redhat-openstack/openstack-selinux/pull/91

... and rebuilding the openstack-selinux package results in a clean installation with all booleans present.

Lon Hohberger (lhh) wrote :

The principal reason for this bug is that we need to rebuild openstack-selinux due to breaking changes between libselinux-3.3 and libselinux-3.4. However, the above pull request needs to merge before openstack-selinux will build cleanly on libselinux-3.4.

Lon Hohberger (lhh) wrote :

Other credits: Cedric Jeanneret is the one who suggested checking the libselinux version.

Julie Pichon (jpichon) wrote :

I wonder if perhaps class/type checking was improved and picked up on an existing problem.

Looking at the access vectors, setpgid should have been defined as "class process" [0] but it was mistakenly set to "class capability" [1] like dac_override [2] and setpcap [3].

Once I updated libselinux to the "bad" version, the neutron module failed to build. But if I change the definition to "class process setpgid;" it builds again.

[0] https://github.com/fedora-selinux/selinux-policy/blob/0846d11/policy/flask/access_vectors#L356
[1] https://github.com/redhat-openstack/openstack-selinux/blob/8d0bf6c851aad1cedcc4b38f1c6fda4c8e62ba81/os-neutron.te#L23
[2] https://github.com/fedora-selinux/selinux-policy/blob/0846d11/policy/flask/access_vectors#L144
[3] https://github.com/fedora-selinux/selinux-policy/blob/0846d11/policy/flask/access_vectors#L151
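
The class mismatch described above can be caught with a small lint pass before building. This is a rough sketch under stated assumptions: `misplaced_permissions` is a hypothetical helper, `KNOWN_PERMISSION_CLASS` is a three-entry excerpt based only on the access vectors cited in this comment, and the sample policy text is illustrative rather than the actual content of os-neutron.te. A real policy can define the same permission name in more than one class, so a full check would need to read the complete access_vectors file.

```python
import re

# Tiny excerpt of "permission -> owning class", taken from the access vectors
# cited in this thread. The real access_vectors file is far larger.
KNOWN_PERMISSION_CLASS = {
    "setpgid": "process",
    "dac_override": "capability",
    "setpcap": "capability",
}

# Matches "class NAME { perm1 perm2 };" and "class NAME perm;".
CLASS_STATEMENT = re.compile(r"\bclass\s+(\w+)\s+(?:\{([^}]*)\}|(\w+))\s*;")


def misplaced_permissions(policy_text):
    """Return (permission, declared_class, expected_class) for each mismatch."""
    without_comments = re.sub(r"#.*", "", policy_text)
    problems = []
    for match in CLASS_STATEMENT.finditer(without_comments):
        declared_class = match.group(1)
        braced, single = match.group(2), match.group(3)
        perms_text = braced if braced is not None else single
        for perm in perms_text.split():
            expected = KNOWN_PERMISSION_CLASS.get(perm)
            if expected is not None and expected != declared_class:
                problems.append((perm, declared_class, expected))
    return problems


# Illustrative require block, not copied from os-neutron.te.
SAMPLE = """
require {
    class capability { dac_override setpcap setpgid };
}
"""
print(misplaced_permissions(SAMPLE))  # prints [('setpgid', 'capability', 'process')]
```

Permissions missing from the excerpt are silently ignored, so an empty result only means that no known mismatch was found.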

Cédric Jeanneret (cjeanner) wrote :

Good catch, Julie!

So, 2 things:

- Why does it work (better) when we remove that inclusion? Is it something that is present by default in the build namespace, or injected by the tooling used to build the policy?

- Do you want to make a follow-up patch re-adding setpgid as a class process permission, just to be consistent and clear?

I think the SELinux maintainers did add some better error catching in the 3.4 release [1], but at the same time I'm seeing some reverts happening, leading to a 3.4-2 (at least downstream). I guess it's not stable yet, and we may face some other issues in the near future :/.

[1] https://github.com/SELinuxProject/selinux/releases/tag/3.4

Julie Pichon (jpichon) wrote :

We usually don't need to define classes and most files don't have them, just the source and target types. We could add a follow-up patch with 'class process' but I don't think it's necessary personally, though I wouldn't block it either.

I've been looking through the commits for 3.4 and finding a lot of "validation"-type commits too, so I think that makes sense.

Julie Pichon (jpichon) wrote :

I also opened a bug upstream [1] with the minimum viable reproducer to see if it might be possible to improve the error message for easier debugging, when such failures happen.

[1] https://github.com/SELinuxProject/selinux/issues/356

Steve Baker (steve-stevebaker) wrote :

I'm not sure if this is related, but diskimage-builder currently installs upstream libselinux to work around issues in the centos-stream-9 distro packaged version:

https://review.opendev.org/c/openstack/diskimage-builder/+/845189/4/diskimage_builder/elements/redhat-common/post-install.d/05-selinux-9-stream

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ci (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/tripleo-ci/+/845573

Soniya Murlidhar Vyas (svyas) wrote :

All the following jobs are green now:
1. periodic-tripleo-ci-centos-9-standalone-security-master
2. tripleo-ci-centos-9-standalone
3. periodic-tripleo-ci-centos-9-standalone-full-tempest-api-compute-master

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ci (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ci/+/845573
Committed: https://opendev.org/openstack/tripleo-ci/commit/b48f287493088b70f87de436d61b2c6283617fe3
Submitter: "Zuul (22348)"
Branch: master

commit b48f287493088b70f87de436d61b2c6283617fe3
Author: Chandan Kumar (raukadah) <email address hidden>
Date: Mon Jun 13 15:09:08 2022 +0530

    Make sure 05-selinux-9-stream is executable

    In FS02 CS9 job, 05-selinux-9-stream script is ignored due
    to non-executable files.

    It was added to workaround libselinux-3.4-1 issue[1].

    This patch adds the workaround to make it executable for
    CentOS-Stream-9.
    It will fix image build issue in all ovb jobs.

    Closes-Bug: #1978456
    Related-Bug: #1977873

    [1]. https://review.opendev.org/c/openstack/diskimage-builder/+/845189
    Signed-off-by: Chandan Kumar (raukadah) <email address hidden>
    Change-Id: I8957aa5181835194d98df4ee2e4d3100ef50f027

Ronelle Landy (rlandy)
Changed in tripleo:
status: Triaged → Fix Released
Ananya Banerjee (frenzyfriday) wrote :

Seeing this again in master, wallaby c9 component line standalone jobs

Julie Pichon (jpichon) wrote :

This is a red herring that only means the openstack-selinux package wasn't installed properly. There are debugging tips at https://github.com/redhat-openstack/openstack-selinux/blob/master/doc/TROUBLESHOOTING.md#how-to-resolve-selinux-boolean-os_enable_vtpm-does-not-exist that can help to debug and find the real issue.
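
As a quick first check before working through that guide, one can confirm whether the boolean is registered in the loaded policy at all. The sketch below assumes SELinux is enabled and selinuxfs is mounted at /sys/fs/selinux, where each loaded boolean appears as a file under the `booleans` directory. The helper name `missing_booleans` is hypothetical, and the snippet is not taken from the troubleshooting document.

```python
from pathlib import Path


def missing_booleans(names, booleans_dir="/sys/fs/selinux/booleans"):
    """Return the names that have no entry in the selinuxfs booleans directory."""
    root = Path(booleans_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    absent = missing_booleans(["os_enable_vtpm"])
    if absent:
        # Per the comment above, this usually points at openstack-selinux
        # not having installed properly, not at the boolean itself.
        print("missing SELinux booleans:", ", ".join(absent))
    else:
        print("os_enable_vtpm is present in the loaded policy")
```

The `booleans_dir` parameter exists only so the function can be pointed at a different directory, for example in a test.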
