Sorry, user {cinder,nova,heat} is not allowed to execute '/usr/sbin/ss -ntuap' as ... on controller-0.

Bug #1860569 reported by Cédric Jeanneret
This bug affects 1 person
Affects   Status        Importance   Assigned to         Milestone
tripleo   In Progress   Medium       Cédric Jeanneret

Bug Description

(copied from RHBZ#1778881)

On OSP16 /usr/sbin/ss is not found in rootwrap:

undercloud.ctlplane.localdomain:8787/rh-osbs/rhosp16-openstack-cinder-scheduler:20191126.1, name=cinder_scheduler)
Dec 1 03:31:13 controller-0 podman[868579]: Sorry, user cinder is not allowed to execute '/usr/sbin/ss -ntuap' as cinder on controller-0.
Dec 1 03:31:13 controller-0 systemd[1]: Started cinder_scheduler healthcheck.

This install is using RHOS_TRUNK-16.0-RHEL-8-20191126.n.2

============

For the record, OSP16 is based on Train. More discussion is available on the original BZ.

Proposal: run the healthchecks as root by adding "--user root" to the command in the systemd unit.
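
To make the proposal concrete, here is a minimal sketch of how the healthcheck command line could be assembled with the extra flag. The helper name `healthcheck_exec_cmd`, its parameters, and the default healthcheck path are assumptions for illustration only and are not taken from paunch; only `podman exec --user` is an existing CLI option.

```python
def healthcheck_exec_cmd(container, test="/openstack/healthcheck",
                         user="root", runtime="podman"):
    """Build the argv a systemd unit could use to run a container healthcheck.

    Hypothetical helper: the name, the default test path and the defaults
    are illustrative assumptions, not the actual paunch code.
    """
    cmd = [runtime, "exec"]
    if user:
        # Force the user inside the container instead of relying on the
        # image's default user (cinder, nova, heat, ...).
        cmd += ["--user", user]
    cmd += [container, test]
    return cmd


if __name__ == "__main__":
    print(" ".join(healthcheck_exec_cmd("cinder_scheduler")))
```

With the defaults above, the joined command reads `podman exec --user root cinder_scheduler /openstack/healthcheck`; passing `user=None` reproduces the behaviour that triggers the error in this report, since the check then runs as the image's default user.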

OpenStack Infra (hudson-openstack) wrote : Fix proposed to paunch (master)

Fix proposed to branch: master
Review: https://review.opendev.org/703816

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/703818

OpenStack Infra (hudson-openstack) wrote : Fix merged to paunch (master)

Reviewed: https://review.opendev.org/703816
Committed: https://git.openstack.org/cgit/openstack/paunch/commit/?id=3012fe75aa1385896e45abf797e6adb2ee5c72ae
Submitter: Zuul
Branch: master

commit 3012fe75aa1385896e45abf797e6adb2ee5c72ae
Author: Cédric Jeanneret <email address hidden>
Date: Wed Jan 22 16:11:21 2020 +0100

    Execute healthchecks as root

    Some containers don't have the "default" user set to root (which is
    good). This leads the healthcheck_port() function to return a message
    because the non-root user isn't allowed to call the "ss" command as
    itself.

    Running the healthchecks as root will also allow us to stop
    duplicating some commands, making them faster and lighter on the
    system.

    This was discovered and discussed on Red Hat bugzilla first, then ported
    to Launchpad.

    Change-Id: I2e49d4dd5b385237f4f79929c70365424f6fa22d
    Closes-Bug: 1860569
    Related: https://bugzilla.redhat.com/show_bug.cgi?id=1778881
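
For readers unfamiliar with the check mentioned in the commit message, the following is a rough sketch of the idea, assuming the helper looks in `ss -ntuap` output for a socket on one of the expected ports that is owned by one of the service's PIDs. The function name `port_in_use_by` and the sample line are made up for illustration; this is not the code from the healthcheck scripts.

```python
import re


def port_in_use_by(ss_output, pids, ports):
    """Return True if a line of `ss -ntuap` output shows one of `ports`
    on a socket owned by one of `pids`.

    Illustrative assumption of what a port-based healthcheck verifies.
    """
    port_alt = "|".join(str(p) for p in ports)
    pid_alt = "|".join(str(p) for p in pids)
    # ss prints the owning process as users:(("name",pid=N,fd=M)), which is
    # only visible when the caller is allowed to see that process.
    pattern = re.compile(r":(%s)\b.*,pid=(%s)," % (port_alt, pid_alt))
    return any(pattern.search(line) for line in ss_output.splitlines())


# Hypothetical sample line, shaped like `ss -ntuap` output.
SAMPLE = (
    'tcp ESTAB 0 0 172.17.1.10:45678 172.17.1.20:5672 '
    'users:(("cinder-schedule",pid=4242,fd=9))\n'
)

if __name__ == "__main__":
    print(port_in_use_by(SAMPLE, pids=[4242], ports=[5672]))
```

When `ss` cannot be executed at all, as in the "Sorry, user cinder is not allowed to execute" message above, there is no output to match, so a check of this kind fails even though the service itself may be fine.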

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to paunch (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/704270

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/703818
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=21787448dee4029cccc1b46bd9a6203f486d72c1
Submitter: Zuul
Branch: master

commit 21787448dee4029cccc1b46bd9a6203f486d72c1
Author: Cédric Jeanneret <email address hidden>
Date: Wed Jan 22 16:19:03 2020 +0100

    Execute healthchecks as root

    Some containers don't have the "default" user set to root (which is
    good). This leads the healthcheck_port() function to return a message
    because the non-root user isn't allowed to call the "ss" command as
    itself.

    Running the healthchecks as root will also allow us to stop
    duplicating some commands, making them faster and lighter on the
    system.

    This was discovered and discussed on Red Hat bugzilla first, then ported
    to Launchpad.

    This patch is the port of I2e49d4dd5b385237f4f79929c70365424f6fa22d to
    tripleo-ansible "container-manage" role.

    Change-Id: I0e6883cd86157b73f18ab63f96f633a8a05e82bf
    Related-Bug: 1860569
    Related: https://bugzilla.redhat.com/show_bug.cgi?id=1778881

OpenStack Infra (hudson-openstack) wrote : Fix merged to paunch (stable/train)

Reviewed: https://review.opendev.org/704270
Committed: https://git.openstack.org/cgit/openstack/paunch/commit/?id=592dab7a847140171e534bba395fcb1be42ce44e
Submitter: Zuul
Branch: stable/train

commit 592dab7a847140171e534bba395fcb1be42ce44e
Author: Cédric Jeanneret <email address hidden>
Date: Wed Jan 22 16:11:21 2020 +0100

    Execute healthchecks as root

    Some containers don't have the "default" user set to root (which is
    good). This leads the healthcheck_port() function to return a message
    because the non-root user isn't allowed to call the "ss" command as
    itself.

    Running the healthchecks as root will also allow us to stop
    duplicating some commands, making them faster and lighter on the
    system.

    This was discovered and discussed on Red Hat bugzilla first, then ported
    to Launchpad.

    Change-Id: I2e49d4dd5b385237f4f79929c70365424f6fa22d
    Closes-Bug: 1860569
    Related: https://bugzilla.redhat.com/show_bug.cgi?id=1778881
    (cherry picked from commit 3012fe75aa1385896e45abf797e6adb2ee5c72ae)

tags: added: in-stable-train
Cédric Jeanneret (cjeanner) wrote :

Apparently there's another issue, needs some more debugging and digging.

Changed in tripleo:
status: Fix Released → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-ansible (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/706360

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/706360
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=ae0f7d25988938c1e55e6cc4a33eec9b22f38d1e
Submitter: Zuul
Branch: stable/train

commit ae0f7d25988938c1e55e6cc4a33eec9b22f38d1e
Author: Cédric Jeanneret <email address hidden>
Date: Wed Jan 22 16:19:03 2020 +0100

    Execute healthchecks as root

    Some containers don't have the "default" user set to root (which is
    good). This leads the healthcheck_port() function to return a message
    because the non-root user isn't allowed to call the "ss" command as
    itself.

    Running the healthchecks as root will also allow us to stop
    duplicating some commands, making them faster and lighter on the
    system.

    This was discovered and discussed on Red Hat bugzilla first, then ported
    to Launchpad.

    This patch is the port of I2e49d4dd5b385237f4f79929c70365424f6fa22d to
    tripleo-ansible "container-manage" role.

    Change-Id: I0e6883cd86157b73f18ab63f96f633a8a05e82bf
    Related-Bug: 1860569
    Related: https://bugzilla.redhat.com/show_bug.cgi?id=1778881
    (cherry picked from commit 21787448dee4029cccc1b46bd9a6203f486d72c1)

tags: added: stein-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to paunch (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/707400

OpenStack Infra (hudson-openstack) wrote : Fix merged to paunch (stable/stein)

Reviewed: https://review.opendev.org/707400
Committed: https://git.openstack.org/cgit/openstack/paunch/commit/?id=754c7885f4e86405c6206e339dbb12fc380368b9
Submitter: Zuul
Branch: stable/stein

commit 754c7885f4e86405c6206e339dbb12fc380368b9
Author: Cédric Jeanneret <email address hidden>
Date: Wed Jan 22 16:11:21 2020 +0100

    Execute healthchecks as root

    Some containers don't have the "default" user set to root (which is
    good). This leads the healthcheck_port() function to return a message
    because the non-root user isn't allowed to call the "ss" command as
    itself.

    Running the healthchecks as root will also allow us to stop
    duplicating some commands, making them faster and lighter on the
    system.

    This was discovered and discussed on Red Hat bugzilla first, then ported
    to Launchpad.

    Change-Id: I2e49d4dd5b385237f4f79929c70365424f6fa22d
    Closes-Bug: 1860569
    Related: https://bugzilla.redhat.com/show_bug.cgi?id=1778881
    (cherry picked from commit 3012fe75aa1385896e45abf797e6adb2ee5c72ae)
    (cherry picked from commit 592dab7a847140171e534bba395fcb1be42ce44e)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/paunch 6.0.1

This issue was fixed in the openstack/paunch 6.0.1 release.

Wojciech (suzumushi) wrote :

Using Docker with a fresh Train deployment on CentOS 7, we see exactly the same problem:

"Sorry, user neutron is not allowed to execute '/usr/sbin/ss -ntuap' as neutron on NODE"

As a result, many containers end up in unhealthy status:
nova, cinder, heat, neutron

Regards

W

Cédric Jeanneret (cjeanner) wrote :

Hello Wojciech,

even with everything up-to-date? Care to ensure you have the proper paunch version in train? If so, we might need to backport the following down to Train as well: https://review.opendev.org/#/c/708339/9

Thank you for your feedback!

C.

Wojciech (suzumushi) wrote :

looks like so

(undercloud) [stack@tripleo]$ rpm -qf /usr/share/openstack-tripleo-common/healthcheck/common.sh
openstack-tripleo-common-11.3.3-0.20200321061241.da2cc62.el7.noarch

The undercloud was installed following the latest deployment manual
and the latest tripleo-repos from the Train release.

I will apply the patch from https://review.opendev.org/#/c/708339/9
and give you feedback.

regards

w

Wojciech (suzumushi) wrote :

Overcloud nodes
python2-paunch-5.3.2-0.20200320163249.ebc49c4.el7.noarch
paunch-services-5.3.2-0.20200320163249.ebc49c4.el7.noarch

undercloud node
python2-paunch-5.3.2-0.20200320163249.ebc49c4.el7.noarch
paunch-services-5.3.2-0.20200320163249.ebc49c4.el7.noarch

Those are the latest available in stable as of now.

But we still see exactly the same problem with the
nova/neutron/cinder/etcd containers on DCN nodes and the
octavia/neutron/nova/cinder/heat containers on control nodes.

Please let me know if I can debug it further; it would be great to know how those containers are built during deployment.

regards

w

wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-3 → ussuri-rc3
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
Changed in tripleo:
milestone: victoria-1 → victoria-3
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/paunch stein-eol

This issue was fixed in the openstack/paunch stein-eol release.
