[CaaS] Magnum clusters can't pull outdated images

Bug #2069638 reported by Matt Verran
This bug affects 1 person
Affects: OpenStack Snap
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Sunbeam 2024.1 (536)

When deploying Magnum, the resulting cluster has trouble pulling multiple images. Judging by https://github.com/kubernetes/k8s.io/blob/main/registry.k8s.io/images/k8s-staging-provider-os/images.yaml, the image tags being requested appear to be too old and are no longer offered.

$ kubectl --kubeconfig $KUBECONFIG get pods -A
E0617 18:54:09.772818 2108337 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:09.777305 2108337 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:09.782513 2108337 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:09.787281 2108337 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-56448757b9-4ngtf 1/1 Running 0 86m
kube-system coredns-56448757b9-jbcs5 1/1 Running 0 86m
kube-system dashboard-metrics-scraper-67f57ff746-8gwf7 0/1 Pending 0 86m
kube-system k8s-keystone-auth-2pq7m 0/1 ImagePullBackOff 0 86m
kube-system kube-dns-autoscaler-6d5b5dc777-nsvp2 0/1 Pending 0 86m
kube-system kube-flannel-ds-6brqt 1/1 Running 0 86m
kube-system kube-flannel-ds-m8stw 1/1 Running 0 82m
kube-system kubernetes-dashboard-7b88d986b4-fw5bc 0/1 Pending 0 86m
kube-system magnum-metrics-server-6c4c77844b-j9kr7 0/1 Pending 0 86m
kube-system openstack-cloud-controller-manager-d74rn 0/1 ImagePullBackOff 0 86m

$ kubectl --kubeconfig $KUBECONFIG describe pod k8s-keystone-auth-2pq7m -n kube-system | grep Image:
E0617 18:54:15.340561 2108612 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:15.347795 2108612 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:15.355951 2108612 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:15.362955 2108612 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
    Image: registry.k8s.io/provider-os/k8s-keystone-auth:v1.18.0

$ kubectl --kubeconfig $KUBECONFIG describe pod openstack-cloud-controller-manager-d74rn -n kube-system | grep Image:
E0617 18:54:31.713179 2109997 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:31.720333 2109997 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:31.729922 2109997 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0617 18:54:31.736720 2109997 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
    Image: registry.k8s.io/provider-os/openstack-cloud-controller-manager:v1.23.1
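
For reference, the tags that the new registry location actually serves can be listed directly; a minimal check, assuming skopeo is installed (the images.yaml file linked above reflects the same list):

$ skopeo list-tags docker://registry.k8s.io/provider-os/k8s-keystone-auth
$ skopeo list-tags docker://registry.k8s.io/provider-os/openstack-cloud-controller-manager

If v1.18.0 and v1.23.1 are missing from that output, the ImagePullBackOff above is explained.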

Revision history for this message
Matt Verran (mv-2112) wrote :

Based on the info at https://hub.docker.com/r/k8scloudprovider/k8s-keystone-auth, I suspect the move to the new registry had a cutoff. The default is to pull Kubernetes 1.27.x, which needs the new registry location, but the image tags requested are old ones that are no longer offered there.

I suspect this is technically not a Sunbeam bug, but rather one for the charm/magnum space.

summary: - Magnum clusters can't pull outdated images
+ [CaaS] Magnum clusters can't pull outdated images
Revision history for this message
Matt Verran (mv-2112) wrote :

Mitigation: Update the image definitions based on a version listed here: https://github.com/kubernetes/k8s.io/blob/main/registry.k8s.io/images/k8s-staging-provider-os/images.yaml.

kubectl --kubeconfig $KUBECONFIG edit ds -n kube-system openstack-cloud-controller-manager
kubectl --kubeconfig $KUBECONFIG edit ds -n kube-system k8s-keystone-auth
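
For reference, a non-interactive equivalent of the edits above; a sketch only, using the v1.24.6 tag from the proof of concept below, with '*' so every container in each daemonset gets the new image:

$ kubectl --kubeconfig $KUBECONFIG -n kube-system set image ds/openstack-cloud-controller-manager '*=registry.k8s.io/provider-os/openstack-cloud-controller-manager:v1.24.6'
$ kubectl --kubeconfig $KUBECONFIG -n kube-system set image ds/k8s-keystone-auth '*=registry.k8s.io/provider-os/k8s-keystone-auth:v1.24.6'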

In the output below, both were updated to v1.24.6 as a proof of concept, although we should probably match the supported levels.

$ kubectl --kubeconfig $KUBECONFIG get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-56448757b9-gcwvl 1/1 Running 0 168m
kube-system coredns-56448757b9-zrxn5 1/1 Running 0 168m
kube-system dashboard-metrics-scraper-67f57ff746-7bngc 1/1 Running 0 168m
kube-system k8s-keystone-auth-vqrh8 1/1 Running 0 114s
kube-system kube-dns-autoscaler-6d5b5dc777-c2s8g 1/1 Running 0 168m
kube-system kube-flannel-ds-4qhj2 1/1 Running 0 164m
kube-system kube-flannel-ds-pjnpz 1/1 Running 0 168m
kube-system kubernetes-dashboard-7b88d986b4-8lrhw 1/1 Running 0 168m
kube-system magnum-metrics-server-6c4c77844b-jzsj5 1/1 Running 0 168m
kube-system npd-z885t 1/1 Running 0 2m6s
kube-system openstack-cloud-controller-manager-jz5j5 1/1 Running 0 2m32s

Revision history for this message
Matt Verran (mv-2112) wrote :

Looks like more of a cascade: enabling Cinder as a volume driver adds more failing pods:

kube-system csi-cinder-controllerplugin-6df7854d74-kkz2s 5/6 ImagePullBackOff 0 12m
kube-system csi-cinder-nodeplugin-5vpzg 2/3 ImagePullBackOff 0 12m
kube-system csi-cinder-nodeplugin-mksx4 2/3 ImagePullBackOff 0 8m19s

Attempting to bump the COE version beyond 1.18.x also fails when labels are used to try to bring other versions of Kubernetes in.
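
For a less manual workaround, Magnum exposes labels on the cluster template that are meant to control these add-on image tags; something along these lines might pin them to tags that are still published (a sketch only; label names such as cloud_provider_tag, k8s_keystone_auth_tag and cinder_csi_plugin_tag should be checked against the Magnum documentation for the release in use, and the other template arguments are omitted here):

$ openstack coe cluster template create <template-name> ... \
    --labels cloud_provider_tag=v1.24.6,k8s_keystone_auth_tag=v1.24.6,cinder_csi_plugin_tag=v1.24.6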
