Unable to remove offers when 2 endpoints are offered with the same application

Bug #1873472 reported by Camille Rodriguez
This bug affects 2 people
Affects: Canonical Juju
Status: Fix Released
Importance: High
Assigned to: Ian Booth

Bug Description

I offered 2 endpoints from the prometheus application. When I then tried to remove the offers, the removal failed. Removing the prometheus application with --force fails as well. Removing the whole model with --force hangs forever, with the prometheus application and its offers refusing to be deleted.

canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-jobs -
prometheus-target -

canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-offer prometheus-jobs --force --no-wait
ERROR option provided but not defined: --no-wait
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-offer prometheus-jobs --force
WARNING! This command will remove offers: admin/lma.prometheus-jobs
This includes all relations to those offers.

Continue [y/N]? y
canonicalcr@dbnk8dev01:~/cpe-deployments/config$
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-jobs -
prometheus-target -

canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-offer prometheus-jobs --debug
10:19:10 INFO juju.cmd supercommand.go:83 running juju [2.7.6 gc go1.10.4]
10:19:10 DEBUG juju.cmd supercommand.go:84 args: []string{"/snap/juju/11285/bin/juju", "remove-offer", "prometheus-jobs", "--debug"}
10:19:10 INFO juju.juju api.go:67 connecting to API addresses: [10.1.232.13:17070 10.1.232.14:17070 10.1.232.15:17070]
10:19:10 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://10.1.232.15:17070/api"
10:19:10 INFO juju.api apiclient.go:624 connection established to "wss://10.1.232.15:17070/api"
10:19:10 DEBUG juju.api monitor.go:35 RPC connection died
ERROR cannot delete application offer "prometheus-jobs": state changing too quickly; try again soon
10:19:10 DEBUG cmd supercommand.go:519 error stack:
/build/juju/parts/juju/go/src/github.com/juju/juju/apiserver/params/params.go:103: cannot delete application offer "prometheus-jobs": state changing too quickly; try again soon
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-application prometheus --force
removing application prometheus failed: another user was updating application; please try again
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-application prometheus --force
removing application prometheus failed: another user was updating application; please try again
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-application prometheus --force --no-wait
removing application prometheus failed: another user was updating application; please try again
canonicalcr@dbnk8dev01:~/cpe-deployments/config$
canonicalcr@dbnk8dev01:~/cpe-deployments/config$
canonicalcr@dbnk8dev01:~/cpe-deployments/config$
canonicalcr@dbnk8dev01:~/cpe-deployments/config$
canonicalcr@dbnk8dev01:~/cpe-deployments/config$
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-application prometheus --force
removing application prometheus failed: another user was updating application; please try again
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-application prometheus --force --debug
10:21:26 INFO juju.cmd supercommand.go:83 running juju [2.7.6 gc go1.10.4]
10:21:26 DEBUG juju.cmd supercommand.go:84 args: []string{"/snap/juju/11285/bin/juju", "remove-application", "prometheus", "--force", "--debug"}
10:21:26 INFO juju.juju api.go:67 connecting to API addresses: [10.1.232.13:17070 10.1.232.14:17070 10.1.232.15:17070]
10:21:26 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://10.1.232.15:17070/model/4b641038-add9-4708-8213-19e44d87f232/api"
10:21:26 INFO juju.api apiclient.go:624 connection established to "wss://10.1.232.15:17070/model/4b641038-add9-4708-8213-19e44d87f232/api"
10:21:26 INFO juju.juju api.go:302 API endpoints changed from [10.1.232.15:17070 10.1.232.14:17070 10.1.232.13:17070] to [10.1.232.15:17070 10.1.232.13:17070 10.1.232.14:17070]
10:21:26 INFO cmd removeapplication.go:247 removing application prometheus failed: another user was updating application; please try again
10:21:26 DEBUG juju.api monitor.go:35 RPC connection died
10:21:27 INFO cmd supercommand.go:525 command finished
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-jobs -
prometheus-target -

canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-offer prometheus-jobs
ERROR cannot delete application offer "prometheus-jobs": state changing too quickly; try again soon
canonicalcr@dbnk8dev01:~/cpe-deployments/config$ juju remove-offer prometheus-jobs --debug
10:21:47 INFO juju.cmd supercommand.go:83 running juju [2.7.6 gc go1.10.4]
10:21:47 DEBUG juju.cmd supercommand.go:84 args: []string{"/snap/juju/11285/bin/juju", "remove-offer", "prometheus-jobs", "--debug"}
10:21:47 INFO juju.juju api.go:67 connecting to API addresses: [10.1.232.14:17070 10.1.232.13:17070 10.1.232.15:17070]
10:21:47 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://10.1.232.15:17070/api"
10:21:47 INFO juju.api apiclient.go:624 connection established to "wss://10.1.232.15:17070/api"
10:21:47 DEBUG juju.api monitor.go:35 RPC connection died
ERROR cannot delete application offer "prometheus-jobs": state changing too quickly; try again soon
10:21:47 DEBUG cmd supercommand.go:519 error stack:
/build/juju/parts/juju/go/src/github.com/juju/juju/apiserver/params/params.go:103: cannot delete application offer "prometheus-jobs": state changing too quickly; try again soon

canonicalcr@dbnk8dev01:~$ juju remove-application prometheus
removing application prometheus
canonicalcr@dbnk8dev01:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
lma foundation_vsphere vsphere/Dearborn-PRD 2.7.6 unsupported 10:36:50-04:00 attempt 26 to destroy model failed (will retry): model not empty, found 1 application (model not empty)

App Version Status Scale Charm Store Rev OS Notes
prometheus waiting 0 prometheus2 jujucharms 14 ubuntu

canonicalcr@dbnk8dev01:~$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
lma foundation_vsphere vsphere/Dearborn-PRD 2.7.6 unsupported 10:37:07-04:00 attempt 26 to destroy model failed (will retry): model not empty, found 1 application (model not empty)

App Version Status Scale Charm Store Rev OS Notes
prometheus waiting 0 prometheus2 jujucharms 14 ubuntu

canonicalcr@dbnk8dev01:~$ juju remove-application prometheus --force
removing application prometheus failed: another user was updating application; please try again

canonicalcr@dbnk8dev01:~$ juju remove-application prometheus --force --debug
10:38:36 INFO juju.cmd supercommand.go:83 running juju [2.7.6 gc go1.10.4]
10:38:36 DEBUG juju.cmd supercommand.go:84 args: []string{"/snap/juju/11285/bin/juju", "remove-application", "prometheus", "--force", "--debug"}
10:38:36 INFO juju.juju api.go:67 connecting to API addresses: [10.1.232.15:17070 10.1.232.13:17070 10.1.232.14:17070]
10:38:36 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://10.1.232.14:17070/model/4b641038-add9-4708-8213-19e44d87f232/api"
10:38:36 INFO juju.api apiclient.go:624 connection established to "wss://10.1.232.14:17070/model/4b641038-add9-4708-8213-19e44d87f232/api"
10:38:36 INFO juju.juju api.go:302 API endpoints changed from [10.1.232.14:17070 10.1.232.15:17070 10.1.232.13:17070] to [10.1.232.14:17070 10.1.232.13:17070 10.1.232.15:17070]
10:38:36 INFO cmd removeapplication.go:247 removing application prometheus failed: another user was updating application; please try again
10:38:36 DEBUG juju.api monitor.go:35 RPC connection died
10:38:36 INFO cmd supercommand.go:525 command finished

Running juju destroy-model lma --force --no-wait does not work either. Please advise on how to proceed.

tags: added: cpe-onsite
Ian Booth (wallyworld) wrote :

Without more information like a model dump, it's difficult to advise exactly how to proceed.
If you really do want to simply destroy the model, you may need to manually delete the application offers using the mongo client. This should unblock the application removal and hence the model deletion.

Changed in juju:
milestone: none → 2.8-beta1
status: New → Triaged
importance: Undecided → High
Ian Booth (wallyworld) wrote :

I'm having trouble reproducing this. On 2.7.6 I did:

$ juju deploy prometheus
$ juju offer prometheus:scrape prometheus-scrape
$ juju offer prometheus:scrape prometheus-target
$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-scrape -
prometheus-target -

$ juju remove-offer prometheus-target --force
WARNING! This command will remove offers: admin/controller.prometheus-target
This includes all relations to those offers.

Continue [y/N]? y
$

Works without --force also.

I can also

$ juju remove-application prometheus

(--force not needed because there are no active offers with connections).

From the juju offers output printed in the bug report, there were no connections to the offers when the remove was done. Had there been previous connections?
What endpoint was offered for the prometheus-jobs offer? There's no jobs endpoint on the charm from what I can see.

Can you provide exact reproduction steps, including what was related to the offers etc?

Or a model-dump of the model where things have gone wrong?

Changed in juju:
status: Triaged → Incomplete
Changed in juju:
milestone: 2.8-beta1 → 2.8-rc1
Tim Penhey (thumper)
tags: added: cross-model
Changed in juju:
milestone: 2.8-rc1 → none
Ryan Farrell (whereisrysmind) wrote :

I believe I have encountered an issue similar to this one, if not the same.

In my case I'm using two different localhost controllers+models:

Model1 "local"
ubuntu
  - telegraf
prometheus2
grafana

Model2 "localtest"
ubuntu
  - telegraf

With Model2 as current context:

$ juju offer telegraf:dashboards
$ juju offer telegraf:prometheus-client
ERROR cannot add application offer "telegraf": application offer already exists

^ I appear to have been blocked from creating the second offer from the same application.

juju switch local
juju consume localtest:admin/default.telegraf r-telegraf #OK
juju add-relation r-telegraf:dashboards grafana:dashboards #OK
juju remove-relation r-telegraf:dashboards grafana:dashboards #OK
juju remove-saas r-telegraf

juju switch localtest
juju remove-offer telegraf
ERROR cannot delete application offer "telegraf": offer has 1 relation
juju remove-offer --debug telegraf
19:48:26 INFO juju.cmd supercommand.go:83 running juju [2.7.6 gc go1.10.4]
19:48:26 DEBUG juju.cmd supercommand.go:84 args: []string{"/snap/juju/11454/bin/juju", "remove-offer", "--debug", "telegraf"}
19:48:26 INFO juju.juju api.go:67 connecting to API addresses: [10.132.251.163:17070]
19:48:26 DEBUG juju.api apiclient.go:1092 successfully dialed "wss://10.132.251.163:17070/api"
19:48:26 INFO juju.api apiclient.go:624 connection established to "wss://10.132.251.163:17070/api"
19:48:26 DEBUG juju.api monitor.go:35 RPC connection died
ERROR cannot delete application offer "telegraf": offer has 1 relation
19:48:26 DEBUG cmd supercommand.go:519 error stack:
/build/juju/parts/juju/go/src/github.com/juju/juju/apiserver/params/params.go:103: cannot delete application offer "telegraf": offer has 1 relation

juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
telegraf admin 1 joined dashboards grafana-dashboard provider 10.132.251.130/32

juju switch local
19:50:18:rfarr@fiddlestyx:~/sandbox/crashdump (master)$ juju list-offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets

I grabbed a juju-crashdump, if that helps:
https://private-fileshare.canonical.com/~rfarr/lp1873472/

Ryan Farrell (whereisrysmind) wrote :

To be clear, "local" and "localtest" are controllers, not models ^

Ian Booth (wallyworld) wrote :

A couple of clarifications below, please let me know if I have missed anything.

RE:
$ juju offer telegraf:dashboards
$ juju offer telegraf:prometheus-client
ERROR cannot add application offer "telegraf": application offer already exists

^ I appeared to have been blocked on creating the second offer from the same application.
---

By default juju will name the offer after the application. If you want 2 different offers for the same application, you need to supply the offer name as an optional argument (see juju help offer). Note also that there's not necessarily any need to create 2 offers: you can have a single offer contain multiple endpoints, e.g.

$ juju offer telegraf:dashboards,prometheus

The reason you would want 2 different offers is so that you could set up different permissions on who could use each offer.
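
As a rough sketch of the naming point above: the helper below creates one separately named offer per endpoint. The function name offer_each_endpoint and the "<application>-<endpoint>" naming scheme are made up for illustration; only the juju offer <application>:<endpoint> <offer-name> form comes from this thread.

```shell
# Sketch: create one named offer per endpoint of the same application.
# Usage: offer_each_endpoint <application> <endpoint>...
# The "<application>-<endpoint>" offer name is an assumption made here,
# not a juju convention.
offer_each_endpoint() {
    app="$1"
    shift
    for endpoint in "$@"; do
        # An explicit offer name avoids the default (the application name),
        # which is what triggers "application offer already exists".
        juju offer "${app}:${endpoint}" "${app}-${endpoint}" || return 1
    done
}
```

For example, offer_each_endpoint telegraf dashboards prometheus-client would run juju offer twice, once per endpoint, each with its own offer name.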

RE:
juju remove-offer telegraf
ERROR cannot delete application offer "telegraf": offer has 1 relation
juju remove-offer --debug telegraf
---

If there's a consumer using the offer, you can't remove it unless you use --force.

You can see that the offer has a consumer connected by running juju offers (as you did):

juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
telegraf admin 1 joined dashboards grafana-dashboard provider 10.132.251.130/32

So you need to "juju remove-relation 1" to terminate the consumer relation to the offer, or use --force, if you want to remove the offer.
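
A minimal sketch of that check, assuming the tabular layout quoted in this thread (a header line, then one row per offer whose second column is "-" when nothing is connected). The helper name offers_in_use is hypothetical, and continuation rows for offers with several connections are not handled.

```shell
# Print the names of offers that still have a consumer relation.
# Reads `juju offers` tabular output on stdin.
# Assumption: each data row starts with the offer name, and the second
# column is "-" when no consumer is connected (as in the output above).
offers_in_use() {
    awk 'NR > 1 && NF >= 2 && $2 != "-" { print $1 }'
}
```

It would be used as juju offers | offers_in_use. Fed the telegraf row quoted above it prints telegraf; fed the two prometheus rows from the original report it prints nothing.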

Camille Rodriguez (camille.rodriguez) wrote :

I am working on reproducing this issue. Here are my bundles:

1st model (prometheus) :

$ cat prometheus.yaml
series: bionic
applications:
  prometheus:
    charm: cs:prometheus2
    num_units: 1

$ cat prometheus-offers.yaml
applications:
  prometheus:
    offers:
      prometheus-target:
        endpoints:
        - target
        acl:
          admin: admin
      prometheus-jobs:
        endpoints:
        - manual-jobs
        acl:
          admin: admin

2nd model :
$ cat consumers.yaml
applications:
  kubernetes-master:
    charm: cs:~containers/kubernetes-master
    num_units: 2
    options:
      allow-privileged: "true"
      channel: 1.17/stable
      # XXX: bug 1841800
      authorization-mode: "RBAC,Node"

relations:
- - kubernetes-master:kube-api-endpoint
  - prometheus-target:target
- - kubernetes-master:prometheus
  - prometheus-jobs:manual-jobs

saas:
  prometheus-target:
    url: openstack-serverstack:admin/prometheus.prometheus-target
  prometheus-jobs:
    url: openstack-serverstack:admin/prometheus.prometheus-jobs

After deploying both models, I tried removing both offers.
Removing the 1st offer, prometheus-jobs, went well.

$ juju remove-offer prometheus-jobs
ERROR cannot delete application offer "prometheus-jobs": offer has 1 relation
ubuntu@camille-bastion:~/offers-bug$ juju remove-offer prometheus-jobs --force
WARNING! This command will remove offers: admin/prometheus.prometheus-jobs
This includes all relations to those offers.

Continue [y/N]? y
$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-target admin 2 joined target http requirer

We can see that there is only prometheus-target left now. Next step is to remove the offer for prometheus-target. This gets trickier.

$ juju remove-offer prometheus-target --force
WARNING! This command will remove offers: admin/prometheus.prometheus-target
This includes all relations to those offers.

Continue [y/N]? y

$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-target admin 2 joined target http requirer

$ juju remove-offer prometheus-target --force
WARNING! This command will remove offers: admin/prometheus.prometheus-target
This includes all relations to those offers.

Continue [y/N]? y
$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets
prometheus-target admin 2 joined target http requirer

$ juju remove-offer prometheus-target --force --debug
15:21:03 INFO juju.cmd supercommand.go:91 running juju [2.8.0 0 d816abe62fbf6787974e5c4e140818ca08586e44 gc go1.14.4]
15:21:03 DEBUG juju.cmd supercommand.go:92 args: []string{"/snap/juju/12370/bin/juju", "remove-offer", "prometheus-target", "--force", "--debug"}
WARNING! This command will remove offers: admin/prometheus.prometheus-target
This includes all relations to those offers.

Continue [y/N]? y
15:21:05 INFO juju.juju api.go:67 connecting to API addresses: [10.5.0.12:17070 252.0.12.1:17070]
15:21:05 DEBUG juju.api apiclient.go:1105 successfully dial...


Camille Rodriguez (camille.rodriguez) wrote :

juju 2.8.0

tags: removed: cpe-onsite
John A Meinel (jameinel) wrote :

It seems likely that the relation isn't being torn down cleanly because one of the units on the relation is in error (and thus isn't acknowledging that the relation is going away).

Usually that is what '--force' represents (continue even though the units are not responding), though that doesn't seem to be how --force was used in this case. (I'm also surprised to see that you supply --force and then Juju still prompts you to confirm.)

I would guess that running 'juju resolved --no-retry prometheus/0' would let it progress to a point where it can get the next relation-broken hook and be able to finally clean up the database.
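
A minimal sketch of that suggestion, assuming the units stuck in error are already known by name. The helper name resolve_no_retry is hypothetical; only juju resolved --no-retry <unit> comes from the comment above.

```shell
# Mark each named unit as resolved without re-running the failed hook,
# so the unit can move on to the next hook (e.g. relation-broken).
# Usage: resolve_no_retry <unit>...
resolve_no_retry() {
    for unit in "$@"; do
        juju resolved --no-retry "$unit" || return 1
    done
}
```

For the case in this report that would be resolve_no_retry prometheus/0, followed by another look at juju offers to see whether the relation has gone away.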

Side note: I would have thought we would have prompting *or* force, not force *and* prompting. Having --force only mean 'remove the offer even if it has relations' means we lose the ability to have --force mean 'remove the offer even if relations aren't responding'.

Note that force is usually a bad choice, since it inherently means "ignore some of the safety checks" regardless. I do realize that remove-offer is a bit special, because you may not have access to the models that are consuming your offer, but you still might need to tear down your application.
I wonder if we should have used a different flag for it.

Launchpad Janitor (janitor) wrote :

[Expired for juju because there has been no activity for 60 days.]

Changed in juju:
status: Incomplete → Expired
Márton Kiss (marton-kiss) wrote :

I see the same with a CMR LMA/OpenStack deployment, juju version 2.8.5.

It was impossible to remove the graylog application (other applications could be removed only with the --force flag, but those at least got removed):
$ juju status
Model Controller Cloud/Region Version SLA Timestamp Notes
lma foundations-maas maas_cloud/default 2.8.5 unsupported 18:49:56Z attempt 37 to destroy model failed (will retry): model not empty, found 1 application (model not empty)

App Version Status Scale Charm Store Rev OS Notes
graylog unknown 0 graylog jujucharms 44 ubuntu

$ juju remove-application graylog --force
removing application graylog failed: another user was updating application; please try again

juju destroy-model failed to destroy the model even with the --force and --no-wait options:
$ juju destroy-model lma --force --no-wait
WARNING! This command will destroy the "lma" model.
This includes all machines, applications, data and other resources.

Continue [y/N]? y
Destroying model
Waiting for model to be removed, 1 application(s)...............................
................................................................................
............................................................

The juju offers output was empty in this case:

$ juju offers
Offer User Relation id Status Endpoint Interface Role Ingress subnets

The only solution was to tear down the whole juju controller and redeploy everything from scratch.

Changed in juju:
status: Expired → New
Márton Kiss (marton-kiss) wrote :

By controller tear-down I mean deleting the juju controller VMs, because the controller destroy attempts failed as well:

juju destroy-controller foundations-maas --destroy-all-models
WARNING! This command will destroy the "foundations-maas" controller.
This includes all machines, applications, data and other resources.

Continue? (y/N):y
Destroying controller
Waiting for hosted model resources to be reclaimed
Waiting on 2 models, 3 applications
Waiting on 2 models, 3 applications
Waiting on 1 model, 3 applications
Waiting on 1 model, 3 applications
Waiting on 1 model, 2 applications
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
...

juju kill-controller foundations-maas
WARNING! This command will destroy the "foundations-maas" controller.
This includes all machines, applications, data and other resources.

Continue? (y/N):y
Destroying controller "foundations-maas"
Waiting for resources to be reclaimed
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application
Waiting on 1 model, 1 application, will kill machines directly in 4m25s
...
Waiting on 1 model, 1 application, will kill machines directly in 7s
Waiting on 1 model, 1 application, will kill machines directly in 2s
Killing admin/lma directly
  done
All hosted models destroyed, cleaning up controller machines

The juju kill-controller command destroyed the controller after 5 minutes.

Pen Gale (pengale) wrote :

Dropping this into the 3.0.0 milestone, as part of the work to improve teardown.

Changed in juju:
milestone: none → 3.0.0
Ian Booth (wallyworld)
Changed in juju:
status: New → Triaged
tags: added: destroy-model remove-application
Ian Booth (wallyworld) wrote :

The core issue of not being able to remove an offer if there's more than one offer against the same underlying application should be fixed by this pull request:

https://github.com/juju/juju/pull/13337

Changed in juju:
milestone: 3.0.0 → 2.9.15
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → In Progress
Ian Booth (wallyworld) wrote :

Marked as fixed. If there are additional issues, we can open a new bug.

Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released