Unable to redeploy a k8s application with the same name as before

Bug #1990369 reported by Tom Haddon
This bug affects 1 person
Affects: Canonical Juju
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

We have a Juju k8s model (2.9.32) attached to an OpenStack HA controller, in which we had a number of applications deployed, including two deployments of https://charmhub.io/redis-k8s, named `redis-cache` and `redis-broker` in this model.

The initial deployment was using pod-spec charms, and we wanted to switch to sidecar charms (which are published in the edge channel of this charm). In local testing, we found that we couldn't just run a charm upgrade, but needed to remove the application and then redeploy.

However, when we came to do this on our staging instance, we found ourselves unable to redeploy the applications. The units are stuck in "installing agent" status and no pods are deployed in Kubernetes.

We have the same issue with both redis-k8s applications that we've deployed (redis-cache and redis-broker). We can deploy them fine if we use a different application name (e.g. redis-cache2 or redis-broker2).

We have done some live debugging with Joe Phillips (thx!) and I'll attach some engine reports from the controllers for reference.

We're not able to reproduce this yet in another model, but are happy to provide any information or access to the model via a screenshare as needed to help debug.

Tags: canonical-is
Tom Haddon (mthaddon) wrote :
tags: added: canonical-is
Tom Haddon (mthaddon) wrote :
description: updated
Tom Haddon (mthaddon) wrote :

It turned out there was a stale Kubernetes service with the name we were trying to deploy with. The service was left behind because its deletion had failed with the following error: https://paste.ubuntu.com/p/q4XyRRHmWM/. This was fixed by adding the "load-balancer_observer" role to the relevant account, after which the service deletion completed. We could then redeploy the application with the same name.

As discussed with Ian Booth, we think Juju should include a more useful error message explaining why it can't provision the new application (e.g. because there's a k8s service with a conflicting name).
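
For anyone hitting the same symptom, the check for a leftover service can be sketched roughly as below. This is only an illustration, not anything Juju does itself: the helper name `find_conflicting_services` is made up, it assumes the input is the JSON printed by `kubectl get services -n <namespace> -o json`, and the name matching (exact application name, or the application name followed by `-`) is a guess at how application services are named.

```python
from __future__ import annotations

import json


def find_conflicting_services(services_json: str, app_name: str) -> list[dict]:
    """Return services whose name may collide with a Juju application name.

    Hypothetical helper. `services_json` is assumed to be the output of
    `kubectl get services -n <namespace> -o json` (a List with "items").
    """
    items = json.loads(services_json).get("items", [])
    conflicts = []
    for svc in items:
        meta = svc.get("metadata", {})
        name = meta.get("name", "")
        # Assumption: a conflict is the exact app name or "<app>-<suffix>".
        if name == app_name or name.startswith(app_name + "-"):
            conflicts.append(
                {
                    "name": name,
                    # A deletionTimestamp on an object that still exists means
                    # deletion was requested but has not completed.
                    "terminating": "deletionTimestamp" in meta,
                    "finalizers": meta.get("finalizers", []),
                }
            )
    return conflicts
```

An entry with `terminating` set to true and a non-empty `finalizers` list would suggest a service whose deletion was requested but is still blocked, which looks like the kind of stale service we ran into here.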

Changed in juju:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Joseph Phillips (manadart)
John A Meinel (jameinel)
Changed in juju:
assignee: Joseph Phillips (manadart) → nobody
milestone: none → 2.9-next
Ian Booth (wallyworld)
Changed in juju:
milestone: 2.9-next → none