AIO-DX: Failed to pull and unpack snapshot-controller image

Bug #2051844 reported by Gabriel de Araújo Cabral
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Gabriel de Araújo Cabral

Bug Description

Brief Description
-----------------
After force-rebooting the active controller and waiting for the standby to become the active controller, repeating the reboot on the new active controller leaves the "volume-snapshot-controller-0" pod in "ErrImagePull" status.

Severity
--------
Major

Steps to Reproduce
------------------
Reboot the active controller and wait until the standby becomes active.

Repeat the reboot on the current active controller and check the status of the pod (volume-snapshot-controller-0).
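
The status check in the second step could be scripted roughly as follows. This is a minimal sketch, not part of the original report: it assumes the output of `kubectl get pod volume-snapshot-controller-0 -n kube-system -o json` is available as text, and the helper name `waiting_reasons` and the sample data are hypothetical.

```python
import json


def waiting_reasons(pod_json: str) -> list:
    """Return the waiting reason of each container that is stuck in a waiting state."""
    pod = json.loads(pod_json)
    # containerStatuses may be missing or null while the pod is still being set up.
    statuses = pod.get("status", {}).get("containerStatuses") or []
    reasons = []
    for status in statuses:
        waiting = status.get("state", {}).get("waiting")
        if waiting:
            reasons.append(waiting.get("reason", "Unknown"))
    return reasons


# Hand-written, trimmed sample shaped like `kubectl get pod -o json` output
# (illustrative only, not captured from the affected system).
FAILING_POD = json.dumps({
    "metadata": {"name": "volume-snapshot-controller-0", "namespace": "kube-system"},
    "status": {
        "containerStatuses": [
            {"name": "snapshot-controller", "state": {"waiting": {"reason": "ErrImagePull"}}}
        ]
    },
})

RUNNING_POD = json.dumps({
    "status": {
        "containerStatuses": [
            {"name": "snapshot-controller", "state": {"running": {}}}
        ]
    },
})

print(waiting_reasons(FAILING_POD))
print(waiting_reasons(RUNNING_POD))
```

An empty list means no container is waiting; a value such as "ErrImagePull" or "ImagePullBackOff" matches the failure described in this report.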

Expected Behavior
------------------
The reboot should complete without errors and all pods should be up and running.

Actual Behavior
----------------
After rebooting the active controller, the pod (volume-snapshot-controller-0) is in "ErrImagePull" status.

Reproducibility
---------------
100% reproducible

System Configuration
--------------------
Two-node system

Timestamp/Logs
--------------
Events:
  Type Reason Age From Message
  ---- ------ ---- ---- -------
  Normal Scheduled 9h default-scheduler Successfully assigned kube-system/volume-snapshot-controller-0 to controller-1
  Normal AddedInterface 9h multus Add eth0 [172.16.166.179/32] from chain
  Warning Failed 9h (x4 over 9h) kubelet Failed to pull image "registry.local:9001/registry.k8s.io/sig-storage/snapshot-controller:v6.1.0": rpc error: code = Unknown desc = failed to pull and unpack image "registry.local:9001/registry.k8s.io/sig-storage/snapshot-controller:v6.1.0": failed to resolve reference "registry.local:9001/registry.k8s.io/sig-storage/snapshot-controller:v6.1.0": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
  Warning Failed 9h (x4 over 9h) kubelet Error: ErrImagePull
  Warning Failed 9h (x6 over 9h) kubelet Error: ImagePullBackOff
  Normal Pulling 152m (x82 over 9h) kubelet Pulling image "registry.local:9001/registry.k8s.io/sig-storage/snapshot-controller:v6.1.0"
  Normal BackOff 2m53s (x2415 over 9h) kubelet Back-off pulling image "registry.local:9001/registry.k8s.io/sig-storage/snapshot-controller:v6.1.0"

Test Activity
-------------
Regression Testing

Changed in starlingx:
assignee: nobody → Gabriel de Araújo Cabral (g-cabral)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ansible-playbooks (master)
Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to ansible-playbooks (master)

Reviewed: https://review.opendev.org/c/starlingx/ansible-playbooks/+/907322
Committed: https://opendev.org/starlingx/ansible-playbooks/commit/504167c3acd5e0d69a2d4fc351741a45fcb6b936
Submitter: "Zuul (22348)"
Branch: master

commit 504167c3acd5e0d69a2d4fc351741a45fcb6b936
Author: Gabriel de Araújo Cabral <email address hidden>
Date: Wed Jan 31 10:04:31 2024 -0300

    Fix for snapshot-controller image pull error

    After rebooting the active controller, the snapshot-controller pod
    is no longer able to pull the image, changing to "ErrImagePull"
    status.

    To fix the error, the 'imagePullSecrets' field was added to the
    service account of the snapshot controller from K8s 1.25 to 1.27,
    specifying the secret required to authenticate to the registry
    and pull the image.

    Test plan:
     PASS: AIO-DX fresh install
     PASS: Reboot the active controller, checking when it returns that
           the snapshot-controller pod is running
     PASS: Reboot both controllers, checking when they return that the
           snapshot-controller pod is running

    Closes-bug: 2051844

    Change-Id: I68f6014898e193e561774f298c7997a3ce89a4eb
    Signed-off-by: Gabriel de Araújo Cabral <email address hidden>
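
The commit message above describes adding an 'imagePullSecrets' field to the snapshot controller's service account. The shape of that change can be sketched as below. This is an illustration under stated assumptions, not the code from the merged change: the service account name `snapshot-controller`, the secret name `example-registry-secret`, and the helper `with_image_pull_secret` are all hypothetical; only the `kube-system` namespace and the `imagePullSecrets` field come from the report.

```python
import json


def with_image_pull_secret(service_account: dict, secret_name: str) -> dict:
    """Return a copy of a ServiceAccount manifest that lists secret_name in imagePullSecrets."""
    updated = dict(service_account)
    # Copy the existing list so the input manifest is left unchanged.
    secrets = list(updated.get("imagePullSecrets") or [])
    if {"name": secret_name} not in secrets:
        secrets.append({"name": secret_name})
    updated["imagePullSecrets"] = secrets
    return updated


# Hypothetical manifest: the real service account name may differ.
service_account = {
    "apiVersion": "v1",
    "kind": "ServiceAccount",
    "metadata": {"name": "snapshot-controller", "namespace": "kube-system"},
}

# "example-registry-secret" is a placeholder for the secret that holds
# the credentials for registry.local:9001.
patched = with_image_pull_secret(service_account, "example-registry-secret")
print(json.dumps(patched, indent=2))
```

Pods that run under a service account with `imagePullSecrets` use those secrets when pulling images, which is why the kubelet can authenticate to the local registry again after the controller reboot.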

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → High
tags: added: stx.9.0 stx.containers