armada-api container not using the correct user

Bug #1924579 reported by Marcus Secato
Affects: StarlingX
Status: Fix Released
Importance: Low
Assigned to: Marcus Secato
Milestone: (none)

Bug Description

Brief Description
-----------------
Commands run in the armada-api container via 'kubectl exec' are not executed as the proper user.

Severity
--------
Minor: System/Feature is usable with minor issue

Steps to Reproduce
------------------
Run 'kubectl exec -n armada <armada-api container name> -- ps -eaf'. Notice that the processes are running as user 'nobody'.
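
A quick way to confirm the effective user directly (the container name is a placeholder, and 'id' must be present in the image):

    kubectl exec -n armada <armada-api container name> -- id
    # Before the fix this reports something like:
    # uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup)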

Expected Behavior
------------------
The 'armada' user should be used for all operations in armada-api

Actual Behavior
----------------
The 'nobody' user is used for all operations in armada-api

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Seen on an AIO-SX subcloud, but would occur in any other configuration.

Branch/Pull Time/Commit
-----------------------
Since armada was migrated to Kubernetes.

Last Pass
---------
Seen since armada started being deployed in the Kubernetes cluster. Previously, the 'armada' user was always enforced using the 'docker exec' command.
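
For reference, 'docker exec' enforces the user explicitly via its --user/-u flag; a sketch of the old behavior (the container name is a placeholder):

    # Runs the command inside the container as the 'armada' user.
    docker exec -u armada <armada container name> ps -eaf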

Timestamp/Logs
--------------
UID PID PPID C STIME TTY TIME CMD
nobody 1 0 0 10:43 ? 00:00:00 uwsgi -b 32768 --die-on-term --http :8000 --http-timeout 3600 --enable-threads -L --lazy-apps --master --paste config:/etc/armada/api-paste.ini --pyargv --config-file /etc/armada/armada.conf --threads 1 --workers 4
nobody 6 1 0 10:43 ? 00:00:01 uwsgi -b 32768 --die-on-term --http :8000 --http-timeout 3600 --enable-threads -L --lazy-apps --master --paste config:/etc/armada/api-paste.ini --pyargv --config-file /etc/armada/armada.conf --threads 1 --workers 4
nobody 7 1 0 10:43 ? 00:00:01 uwsgi -b 32768 --die-on-term --http :8000 --http-timeout 3600 --enable-threads -L --lazy-apps --master --paste config:/etc/armada/api-paste.ini --pyargv --config-file /etc/armada/armada.conf --threads 1 --workers 4
nobody 8 1 0 10:43 ? 00:00:01 uwsgi -b 32768 --die-on-term --http :8000 --http-timeout 3600 --enable-threads -L --lazy-apps --master --paste config:/etc/armada/api-paste.ini --pyargv --config-file /etc/armada/armada.conf --threads 1 --workers 4
nobody 9 1 0 10:43 ? 00:00:01 uwsgi -b 32768 --die-on-term --http :8000 --http-timeout 3600 --enable-threads -L --lazy-apps --master --paste config:/etc/armada/api-paste.ini --pyargv --config-file /etc/armada/armada.conf --threads 1 --workers 4
nobody 10 1 0 10:43 ? 00:00:00 uwsgi -b 32768 --die-on-term --http :8000 --http-timeout 3600 --enable-threads -L --lazy-apps --master --paste config:/etc/armada/api-paste.ini --pyargv --config-file /etc/armada/armada.conf --threads 1 --workers 4

Test Activity
-------------
Developer testing

Workaround
----------
N/A

OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/integ/+/786511

Changed in starlingx:
status: New → In Progress
Marcus Secato (mviniciu)
Changed in starlingx:
assignee: nobody → Marcus Secato (mviniciu)
Ghada Khalil (gkhalil) wrote :

Screening: marking as Low as there doesn't appear to be any negative system impact. Can be fixed in stx master, but not required for the r/stx.5.0 release branch.

Changed in starlingx:
importance: Undecided → Low
tags: added: stx.containers
OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (master)

Reviewed: https://review.opendev.org/c/starlingx/integ/+/786511
Committed: https://opendev.org/starlingx/integ/commit/3924cfe7ae390678ae4df9b544acf8b373440183
Submitter: "Zuul (22348)"
Branch: master

commit 3924cfe7ae390678ae4df9b544acf8b373440183
Author: Marcus Secato <email address hidden>
Date: Thu Apr 15 17:52:58 2021 -0400

    Set proper user ID for armada-api container

    Since the armada application moved to the Kubernetes cluster,
    processes and commands are no longer executed as the 'armada' user
    in the armada-api container. Previously, when armada ran as a
    separate container, the user was enforced through 'docker exec'.

    Closes-Bug: 1924579

    Signed-off-by: Marcus Secato <email address hidden>
    Change-Id: I5600974c0b9c3ade73a58dae300e8f3b18c6aefd
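
The review above contains the exact change; purely as illustration (not necessarily how this fix was implemented), a container's user can be enforced either in the image or in the pod spec:

    # Illustrative sketch only, not the actual StarlingX change.
    # In the image (Dockerfile), run all processes as 'armada':
    #   USER armada
    # Or in the Kubernetes pod spec, assuming the 'armada' user has
    # UID 1000 in the image:
    securityContext:
      runAsUser: 1000
      runAsGroup: 1000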

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/c/starlingx/integ/+/793754

OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (f/centos8)

Reviewed: https://review.opendev.org/c/starlingx/integ/+/793754
Committed: https://opendev.org/starlingx/integ/commit/a13966754d4e19423874ca31bf1533f057380c52
Submitter: "Zuul (22348)"
Branch: f/centos8

commit b310077093fd567944c6a46b7d0adcabe1f2b4b9
Author: Mihnea Saracin <email address hidden>
Date: Sat May 22 18:19:54 2021 +0300

    Fix resize of filesystems in puppet logical_volume

    After a system reinstall there is stale data on the disk,
    and puppet fails when resizing, reporting incorrect filesystem
    types. In our case docker-lv was reported as drbd when
    it should have been xfs.

    This problem was already solved in some cases, e.g.
    when doing a live fs resize we wipe the last 10MB
    at the end of the partition:
    https://opendev.org/starlingx/stx-puppet/src/branch/master/puppet-manifests/src/modules/platform/manifests/filesystem.pp#L146

    Our issue happened here:
    https://opendev.org/starlingx/stx-puppet/src/branch/master/puppet-manifests/src/modules/platform/manifests/filesystem.pp#L65
    A resize can happen at unlock, when a bigger size is detected
    for the filesystem, and 'logical_volume' will resize it.
    To fix this, we wipe the last 10MB of the partition after the
    'lvextend' command in the 'logical_volume' module.

    Tested the following scenarios:

    B&R on SX with default sizes of filesystems and cgts-vg.

    B&R on SX with docker-lv of size 50G, backup-lv also 50G and
    cgts-vg with additional physical volumes:

    - name: cgts-vg
      physicalVolumes:
      - path: /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0
        size: 50
        type: partition
      - path: /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0
        size: 30
        type: partition
      - path: /dev/disk/by-path/pci-0000:00:0d.0-ata-3.0
        type: disk

    B&R on DX system with backup of size 70G and cgts-vg
    with additional physical volumes:

    physicalVolumes:
    - path: /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0
      size: 50
      type: partition
    - path: /dev/disk/by-path/pci-0000:00:0d.0-ata-1.0
      size: 30
      type: partition
    - path: /dev/disk/by-path/pci-0000:00:0d.0-ata-3.0
      type: disk

    Closes-Bug: 1926591
    Change-Id: I55ae6954d24ba32e40c2e5e276ec17015d9bba44
    Signed-off-by: Mihnea Saracin <email address hidden>
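
A rough sketch of the wipe described above (the LV path and the 'blockdev'/'dd' approach are illustrative assumptions, not the actual puppet code):

    # Wipe the last 10MB of the logical volume after 'lvextend' so a
    # stale filesystem signature is not detected on the next resize.
    DEV=/dev/cgts-vg/docker-lv   # hypothetical LV path
    SIZE_MB=$(( $(blockdev --getsize64 "$DEV") / 1024 / 1024 ))
    dd if=/dev/zero of="$DEV" bs=1M seek=$(( SIZE_MB - 10 )) count=10 conv=fsync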

commit 3225570530458956fd642fa06b83360a7e4e2e61
Author: Mihnea Saracin <email address hidden>
Date: Thu May 20 14:33:58 2021 +0300

    Execute once the ceph services script on AIO

    The MTC client manages ceph services via ceph.sh which
    is installed on all node types in
    /etc/service.d/{controller,worker,storage}/ceph.sh

    Since the AIO controllers have both controller and worker
    personalities, the MTC client will execute the ceph script
    twice (/etc/service.d/worker/ceph.sh,
    /etc/service.d/controller/ceph.sh).
    This double execution generates issues.

    We fix this by exiting the ceph script if it is the one from
    /etc/services.d/worker on AIO systems.

    Closes-Bug: 1928934
    Change-Id: I3e4dc313cc3764f870b8f6c640a60338...
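
A sketch of the guard described above (the platform.conf check for an AIO node is an assumption, not the actual script):

    # Exit early when this copy of ceph.sh is the worker one and the
    # node carries both controller and worker subfunctions (AIO).
    if [[ "$0" == */worker/* ]] && \
       grep -q 'subfunction=.*controller.*worker' /etc/platform/platform.conf
    then
        exit 0
    fi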

tags: added: in-f-centos8