stx-openstack application apply fails due to auth issue

Bug #1819720 reported by Bart Wensley
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Angie Wang
Milestone: (none listed)

Bug Description

Title
-----
stx-openstack application apply fails due to auth issue

Brief Description
-----------------
When the stx-openstack application is applied and the ceph-pools-audit pod happens to be scheduled on controller-1, the pod gets stuck attempting to pull its image. The pull is rejected by the registry at 192.168.204.2:9001 with an authentication failure, as the following logs show:

2019-03-12T13:27:14.254 controller-1 dockerd[1127]: info time="2019-03-12T13:27:14.253778750Z" level=info msg="Attempting next endpoint for pull after error: Get https://192.168.204.2:9001/v2/docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required"
2019-03-12T13:27:14.255 controller-1 dockerd[1127]: info time="2019-03-12T13:27:14.253868934Z" level=error msg="Handler for POST /v1.37/images/create returned error: Get https://192.168.204.2:9001/v2/docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required"
2019-03-12T13:27:14.256 controller-1 kubelet[23323]: info E0312 13:27:14.255791 23323 remote_image.go:112] PullImage "192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3" from image service failed: rpc error: code = Unknown desc = Error response from daemon: Get https://192.168.204.2:9001/v2/docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required
2019-03-12T13:27:14.256 controller-1 kubelet[23323]: info E0312 13:27:14.255891 23323 kuberuntime_image.go:51] Pull image "192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3" failed: rpc error: code = Unknown desc = Error response from daemon: Get https://192.168.204.2:9001/v2/docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required
2019-03-12T13:27:14.256 controller-1 kubelet[23323]: info E0312 13:27:14.256047 23323 kuberuntime_manager.go:744] container start failed: ErrImagePull: rpc error: code = Unknown desc = Error response from daemon: Get https://192.168.204.2:9001/v2/docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required
2019-03-12T13:27:14.256 controller-1 kubelet[23323]: info E0312 13:27:14.256114 23323 pod_workers.go:186] Error syncing pod f77d92ad-44c3-11e9-ae62-0800277d25e7 ("ceph-pools-audit-1552394400-f29p9_openstack(f77d92ad-44c3-11e9-ae62-0800277d25e7)"), skipping: failed to "StartContainer" for "ceph-pools-audit-ceph-store" with ErrImagePull: "rpc error: code = Unknown desc = Error response from daemon: Get https://192.168.204.2:9001/v2/docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required"
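
To check quickly which hosts and images are affected, the kubelet lines above can be filtered with a short script. This is only a sketch: the regular expression is an assumption based on the log format shown in this report, and the function name `find_unauthorized_pulls` is hypothetical.

```python
import re

# Assumed pattern, derived from the kubelet "PullImage ... failed" lines
# quoted in this report; other log layouts may need a different expression.
PULL_FAILURE = re.compile(
    r'^(?P<timestamp>\S+) (?P<host>\S+) kubelet\[\d+\]: .*?'
    r'PullImage "(?P<image>[^"]+)" from image service failed: .*'
    r'unauthorized: authentication required'
)


def find_unauthorized_pulls(lines):
    """Return (host, image) pairs for image pulls rejected with an auth error."""
    hits = []
    for line in lines:
        match = PULL_FAILURE.search(line)
        if match:
            hits.append((match.group("host"), match.group("image")))
    return hits


# One kubelet line and one dockerd line copied from the logs in this report.
sample = [
    '2019-03-12T13:27:14.256 controller-1 kubelet[23323]: info E0312 '
    '13:27:14.255791 23323 remote_image.go:112] PullImage '
    '"192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3" from '
    'image service failed: rpc error: code = Unknown desc = Error response '
    'from daemon: Get https://192.168.204.2:9001/v2/docker.io/port/'
    'ceph-config-helper/manifests/v1.10.3: unauthorized: authentication required',
    '2019-03-12T13:27:14.254 controller-1 dockerd[1127]: info '
    'time="2019-03-12T13:27:14.253778750Z" level=info msg="Attempting next '
    'endpoint for pull after error: Get https://192.168.204.2:9001/v2/'
    'docker.io/port/ceph-config-helper/manifests/v1.10.3: unauthorized: '
    'authentication required"',
]

for host, image in find_unauthorized_pulls(sample):
    print(host, image)
```

Only the kubelet `PullImage` line is reported; the dockerd line describes the same failure but does not carry the image reference in the same form, so the sketch skips it.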

Angie investigated and thinks that the problem is likely a missing imagePullSecrets entry in the ceph-pools-audit helm chart.
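
As a rough illustration of that diagnosis, the sketch below inspects a CronJob manifest (already loaded into a Python dict) and reports where its pod could obtain registry credentials. The field path `spec.jobTemplate.spec.template.spec` follows the Kubernetes CronJob API; the manifest contents, the secret name `registry-key`, the service account name and the function names are all hypothetical and are not taken from the actual ceph-pools-audit chart.

```python
def pod_spec_of(cronjob):
    """Return the pod spec nested inside a CronJob manifest dict."""
    return cronjob["spec"]["jobTemplate"]["spec"]["template"]["spec"]


def pull_credentials(cronjob):
    """Report the pull secrets and service account named in the pod spec.

    An empty secret list is not conclusive on its own: Kubernetes can also
    attach the imagePullSecrets listed on the pod's service account.
    """
    spec = pod_spec_of(cronjob)
    secrets = [entry["name"] for entry in spec.get("imagePullSecrets", [])]
    return {
        "imagePullSecrets": secrets,
        "serviceAccountName": spec.get("serviceAccountName"),
    }


# Hypothetical manifest with no credentials at all, resembling the failure.
without_credentials = {
    "kind": "CronJob",
    "spec": {"jobTemplate": {"spec": {"template": {"spec": {
        "containers": [{
            "name": "ceph-pools-audit-ceph-store",
            "image": "192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3",
        }],
    }}}}},
}

# Hypothetical manifest that names both a pull secret and a service account.
with_credentials = {
    "kind": "CronJob",
    "spec": {"jobTemplate": {"spec": {"template": {"spec": {
        "serviceAccountName": "ceph-pools-audit",  # hypothetical name
        "imagePullSecrets": [{"name": "registry-key"}],  # hypothetical name
        "containers": [{
            "name": "ceph-pools-audit-ceph-store",
            "image": "192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3",
        }],
    }}}}},
}

print(pull_credentials(without_credentials))
print(pull_credentials(with_credentials))
```

A pod spec with neither field set falls back to the namespace's default service account, so whether the pull succeeds depends on what credentials that account carries.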

Severity
--------
Major

Steps to Reproduce
------------------
Apply the stx-openstack application.

Expected Behavior
------------------
The application is installed.

Actual Behavior
----------------
The application installation fails.

Reproducibility
---------------
Intermittent

System Configuration
--------------------
Any system with two controllers. I saw it in a 2+2 system.

Branch/Pull Time/Commit
-----------------------
SW_VERSION="19.01"
BUILD_TARGET="Unknown"
BUILD_TYPE="Informal"
BUILD_ID="n/a"

JOB="n/a"
BUILD_BY="bwensley"
BUILD_NUMBER="n/a"
BUILD_HOST="yow-bwensley-lx-vm2"
BUILD_DATE="2019-03-11 12:58:14 -0500"

BUILD_DIR="/"
WRS_SRC_DIR="/localdisk/designer/bwensley/starlingx-0/cgcs-root"
WRS_GIT_BRANCH="HEAD"
CGCS_SRC_DIR="/localdisk/designer/bwensley/starlingx-0/cgcs-root/stx"
CGCS_GIT_BRANCH="HEAD"

Timestamp/Logs
--------------
See above

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Angie Wang (angiewang)
tags: added: stx.containers
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/642886

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/642886
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=90182333622edef4cfb60ef2ab0fabde5d3100d8
Submitter: Zuul
Branch: master

commit 90182333622edef4cfb60ef2ab0fabde5d3100d8
Author: Angie Wang <email address hidden>
Date: Tue Mar 12 14:25:33 2019 -0400

    Create a service account for ceph-pools-audit cron job

    A service account for ceph-pools-audit cron job should be
    created to provide identity.
    The commit updates the ceph pools audit chart to leverage
    helm-toolkit to create its service account which also
    provides the image credentials.

    Change-Id: I2da135047ee9fa47274c49a989d6b905104f3ffd
    Closes-Bug: 1819720
    Signed-off-by: Angie Wang <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
tags: added: stx.2019.05
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-config (master)

Fix proposed to branch: master
Review: https://review.openstack.org/642943

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/642943
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=f54af2a7d9a07dcf997cd8ba3b336ef822417db6
Submitter: Zuul
Branch: master

commit f54af2a7d9a07dcf997cd8ba3b336ef822417db6
Author: Angie Wang <email address hidden>
Date: Tue Mar 12 22:57:48 2019 -0400

    Fix the dependencies in the ceph-pools-audit chart

    The static dependencies were missing in the previous commit
    https://review.openstack.org/#/c/642886/.

    Change-Id: I4f9135046178cdfa244755e3280ce1e5257ab3e1
    Closes-Bug: 1819720
    Signed-off-by: Angie Wang <email address hidden>

Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05