ceilometer-upgrade action does not work as expected

Bug #1920620 reported by Cedric Lemarchand
This bug affects 2 people
Affects                     Status   Importance  Assigned to  Milestone
OpenStack Ceilometer Charm  Invalid  Undecided   Unassigned

Bug Description

I am facing an issue when deploying an OpenStack telemetry bundle: the ceilometer charm is stuck in blocked status, waiting for the ceilometer-upgrade action, which does nothing when triggered.

What has been tried so far without success:

    * deploying both the openstack-charmers and openstack-charmers-next charm versions of ceilometer and gnocchi
    * inspecting the logs (ceilometer, gnocchi and juju agent)
    * running “ceilometer-upgrade” inside the unit:
        # ceilometer-upgrade --debug
        2021-03-17 15:35:40.581 27392 DEBUG ceilometer.cmd.storage [-] Upgrading Gnocchi resource types upgrade /usr/lib/python3/dist-packages/ceilometer/cmd/storage.py:42

As far as I can tell, the bundle seems correct (it is based on the somewhat official https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-telemetry/bundle.yaml; see attachment).

Revision history for this message
Cedric Lemarchand (cedric-lemarchand) wrote :
description: updated
Alex Kavanagh (ajkavanagh) wrote :

Hi Cedric

In order to help we need some additional information.

1. Ubuntu version (or other)
2. OpenStack version
3. charms version (e.g. current stable - 21.01, or charm revision numbers)
4. The output of "juju status"
5. If possible, attachments of the ceilometer logs from install up to the point where the actions didn't work.
6. The result of the ceilometer action command.

Thanks.

Changed in charm-ceilometer:
status: New → Incomplete
Alex Kavanagh (ajkavanagh) wrote :

For some odd reason, the following seemed to only be sent to me from Cedric (odd launchpad hiccup??) (with better formatting than the previous comment):

Hi Alex,

1. 20.04/Focal
2. Ussuri (openstack-origin: distro)
3. latest from openstack-charmers (see juju status)
4.

```
Model       Controller       Cloud/Region          Version  SLA          Timestamp
liquid-rmv  juju-controller  brane-liquid/default  2.8.9    unsupported  08:54:14+01:00

App                                 Version  Status   Scale  Charm                  Store       Rev  OS      Notes
aodh                                10.0.0   active   1      aodh                   jujucharms  46   ubuntu
aodh-mysql-router                   8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
ceilometer                          14.0.0   blocked  1      ceilometer             jujucharms  389  ubuntu
ceilometer-agent                    14.0.0   active   1      ceilometer-agent       jujucharms  340  ubuntu
ceph-mon                            15.2.8   active   3      ceph-mon               jujucharms  53   ubuntu
ceph-osd                            15.2.8   active   3      ceph-osd               jujucharms  308  ubuntu
ceph-radosgw                        15.2.8   active   1      ceph-radosgw           jujucharms  294  ubuntu
cinder                              16.2.1   active   1      cinder                 jujucharms  308  ubuntu
cinder-ceph                         16.2.1   active   1      cinder-ceph            jujucharms  260  ubuntu
cinder-mysql-router                 8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
glance                              20.0.1   active   1      glance                 jujucharms  303  ubuntu
glance-mysql-router                 8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
gnocchi                             4.3.4    active   1      gnocchi                jujucharms  128  ubuntu
gnocchi-mysql-router                8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
keystone                            17.0.0   active   1      keystone               jujucharms  321  ubuntu
keystone-mysql-router               8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
memcached                                    active   1      memcached              jujucharms  32   ubuntu
mysql-innodb-cluster                8.0.23   active   3      mysql-innodb-cluster   jujucharms  5    ubuntu
neutron-api                         16.2.0   active   1      neutron-api            jujucharms  292  ubuntu
neutron-api-mysql-router            8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
neutron-gateway                     16.2.0   active   1      neutron-gateway        jujucharms  289  ubuntu
neutron-openvswitch                 16.2.0   active   1      neutron-openvswitch    jujucharms  280  ubuntu
nova-cloud-controller               21.1.1   active   1      nova-cloud-controller  jujucharms  353  ubuntu
nova-cloud-controller-mysql-router  8.0.23   active   1      mysql-router           jujucharms  6    ubuntu
nova-compute                        21.1.1 ...

Alex Kavanagh (ajkavanagh) wrote :

Cedric

Thanks for the additional info (although strangely it didn't appear in the bug, only in my email??)

Anyway, there's still not enough information to go on. What does:

juju run-action ceilometer/0 ceilometer-upgrade --wait

return?

Does the command fail, or does it simply do nothing? Beforehand, please turn on as much logging on the charm as possible:

juju model-config -m logging-config=debug

Also, check the output from the ceilometer unit with:

juju debug-log --replay -i unit-ceilometer-0

Thanks
Alex.

Cedric Lemarchand (cedric-lemarchand) wrote :

Previously the action returned nothing: the unit went into the "executing" state, then fell back to "blocked".

With the "--wait" option added, the action actually succeeds, and ceilometer is now happy.

```
juju run-action ceilometer/0 ceilometer-upgrade --wait
unit-ceilometer-0:
  UnitId: ceilometer/0
  id: "6"
  results:
    Stderr: |
      E: Unable to locate package openstack-release
      E: Unable to locate package openstack-release
      E: Unable to locate package openstack-release
      E: Unable to locate package openstack-release
      E: Unable to locate package openstack-release
    Stdout: "Reading package lists...\nBuilding dependency tree...\nReading state
      information...\nReading package lists...\nBuilding dependency tree...\nReading
      state information...\nReading package lists...\nBuilding dependency tree...\nReading
      state information...\n2021-03-24 10:41:33.525 463491 DEBUG ceilometer.cmd.storage
      [-] Upgrading Gnocchi resource types upgrade /usr/lib/python3/dist-packages/ceilometer/cmd/storage.py:42\e[00m\nReading
      package lists...\nBuilding dependency tree...\nReading state information...\nReading
      package lists...\nBuilding dependency tree...\nReading state information...\nactive\nOpenStack
      Release: {}\nOpenStack Release: {}\nOpenStack Release: {}\nOpenStack Release:
      {}\nOpenStack Release: {}\n"
    outcome: success, ceilometer-upgrade completed.
  status: completed
  timing:
    completed: 2021-03-24 10:41:37 +0000 UTC
    enqueued: 2021-03-24 10:41:29 +0000 UTC
    started: 2021-03-24 10:41:30 +0000 UTC
```
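
When the action is run with `--wait`, the two fields that matter in the YAML above are `status` and `outcome`. A minimal sketch for pulling them out of saved output with `sed` follows. The helper name `action_field` is an assumption made for illustration, and the pattern only handles simple `key: value` lines like the ones shown.

```shell
# Hypothetical helper: print the value of a simple "key: value" line read
# from stdin (for example, saved `juju run-action ... --wait` output).
# $1 is the key to look for, e.g. "status" or "outcome".
action_field() {
    sed -n "s/^ *$1: *//p"
}
```

Piping the output shown above through `action_field status` prints `completed`.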

Is the option "--wait" mandatory for this specific ceilometer action ?

Let me know if you need further information.

Cedric Lemarchand (cedric-lemarchand) wrote :

Side note for completeness: "juju model-config -m logging-config=debug" fails:

```
ERROR opening API connection: model name "clemarch/logging-config=debug" not valid
```

Juju client and controller running v2.8.9.

Cheers

Alex Kavanagh (ajkavanagh) wrote :

Side note for completeness: "juju model-config -m logging-config=debug"

Should be

juju model-config -m model-name logging-config=debug

(if model-name is not the current model - the one with the * next to it in the output of "juju models")

otherwise:

juju model-config logging-config=debug

(sorry, there was a typo in my original command)
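
The two corrected forms can be wrapped in one small helper. This is only a sketch: the `JUJU` variable and the function name `set_debug_logging` are assumptions for illustration, and the only juju invocations used are the two quoted above.

```shell
# Sketch only: set JUJU=echo for a dry run that prints the arguments instead
# of calling juju.
JUJU="${JUJU:-juju}"

# Hypothetical helper: enable debug logging on the current model, or on the
# model named in $1 when one is given.
set_debug_logging() {
    if [ -n "${1:-}" ]; then
        "$JUJU" model-config -m "$1" logging-config=debug
    else
        "$JUJU" model-config logging-config=debug
    fi
}
```

Calling `set_debug_logging` with no argument targets the current model; `set_debug_logging liquid-rmv` targets the named one.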

Alex Kavanagh (ajkavanagh) wrote :

> Previously the action returned nothing, the unit went in "executing" state then fails back to "blocked".

> Adding the "--wait" option actually succeed, ceilometer is now happy.

So "all" the --wait option does is wait for the command to fail or succeed. The run-action command without the --wait option simply queues the action to run. The "actions" command shows the list of actions and their ids. The "show-action-output" and "show-action-status" commands then show the state and output for those action ids. --wait just simplifies the user experience by waiting for an action to complete (fail/success).

So, no, --wait shouldn't make a difference. I'm curious as to what might have happened, but that ship may have sailed now if there's nothing in the debug log that indicates what went wrong.
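
The queue-then-inspect flow described above could be scripted roughly as follows. This is a sketch under assumptions: the `JUJU` variable and the function names are illustrative, and only juju subcommands quoted in this thread are used.

```shell
# Sketch only: set JUJU=echo for a dry run that prints the arguments instead
# of calling juju.
JUJU="${JUJU:-juju}"

# Queue the ceilometer-upgrade action on unit $1 without waiting for it.
# juju reports the id of the queued action.
queue_upgrade() {
    "$JUJU" run-action "$1" ceilometer-upgrade
}

# Show the state and the output of the queued action with id $1.
action_report() {
    "$JUJU" show-action-status "$1"
    "$JUJU" show-action-output "$1"
}
```

For example, `queue_upgrade ceilometer/leader` followed by `action_report` with the reported id gives the same information that `--wait` would have printed once the action finished.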

Unfortunately, juju now defaults to 'info' for logging which means that the error may never have been logged (debug logging tends to give more logging).

I'm pleased it's now working for you; however, I'm leaving the bug as incomplete as I'm not sure that the information is available in the debug-log. If you could attach the result from "juju debug-log --replay -i unit-ceilometer-0" that *may* provide some info.

Thanks.

Cedric Lemarchand (cedric-lemarchand) wrote :

I reproduced the issue on a freshly deployed bundle:

```
juju model-config logging-config
debug

juju run ceilometer-upgrade --unit ceilometer/leader

juju show-action-status
actions:
- action: juju-run
  completed at: "2021-03-24 15:12:58"
  id: "2"
  status: completed
  unit: ceilometer/0

juju show-action-output 2
UnitId: ceilometer/0
id: "2"
results: {}
status: completed
timing:
  completed: 2021-03-24 15:12:58 +0000 UTC
  enqueued: 2021-03-24 15:12:46 +0000 UTC
  started: 2021-03-24 15:12:46 +0000 UTC

juju status ceilometer
Model       Controller       Cloud/Region          Version  SLA          Timestamp
liquid-rmv  juju-controller  brane-liquid/default  2.8.9    unsupported  16:28:03+01:00

App         Version  Status   Scale  Charm       Store       Rev  OS      Notes
ceilometer  14.0.0   blocked  1      ceilometer  jujucharms  280  ubuntu

Unit           Workload  Agent  Machine  Public address  Ports  Message
ceilometer/0*  blocked   idle   4        10.140.10.71           Run the ceilometer-upgrade action on the leader to initialize ceilometer and gnocchi

Machine  State    DNS           Inst id  Series  AZ       Message
4        started  10.140.10.71  VM-25    focal   default  Deployed
```

Unfortunately, "juju debug-log --replay -i unit-ceilometer/0" does not return anything.

Hope it helps to track down the issue.

Cedric Lemarchand (cedric-lemarchand) wrote :

I finally got the debug log for ceilometer; see the attached file.

Alex Kavanagh (ajkavanagh) wrote :

Cedric, Thanks for the additional info; I'll take a look at the debug log.

FYI, for "reasons" the command for the juju debug-log to include (-i) the unit-ceilometer/0 is:

juju debug-log --replay -i unit-ceilometer-0

Note the '-' rather than the '/'. It catches me out all the time.
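
To avoid mistyping the tag, the conversion from a unit name to the form given above can be done mechanically. This is a sketch only: `unit_tag`, `replay_unit_log` and the `JUJU` variable are hypothetical names, and the tag format is simply the one quoted in this comment.

```shell
# Sketch only: set JUJU=echo for a dry run that prints the arguments instead
# of calling juju.
JUJU="${JUJU:-juju}"

# Turn a unit name such as "ceilometer/0" into "unit-ceilometer-0".
unit_tag() {
    printf 'unit-%s\n' "$(printf '%s' "$1" | tr '/' '-')"
}

# Replay the debug log for unit $1 only.
replay_unit_log() {
    "$JUJU" debug-log --replay -i "$(unit_tag "$1")"
}
```

With this, `replay_unit_log ceilometer/0` runs the same command as shown above.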

Alex Kavanagh (ajkavanagh) wrote :

I'm a bit confused about the timings in the debug-log. It seems to start at 16:23, yet the action was run at 15:12. Do you have the logs that overlap with when the action was run? The /var/log/juju/machine-lock.log file indicates what ran, and when, on the machine that ceilometer/0 is on.

Anyway, I'll deploy it this evening and have a poke around and see what happens.

Cedric Lemarchand (cedric-lemarchand) wrote :

This is weird: on my side (2.8.9-bigsur-amd64), "juju debug-log -i ceilometer/0" works fine, whereas "juju debug-log -i ceilometer-0" didn't work. Maybe another glitch then.

Thanks for your time; let me know if you need anything.

Alex Kavanagh (ajkavanagh) wrote :

So I tried to reproduce it using the stable charm but my version of the command:

juju run-action ceilometer/leader ceilometer-upgrade

did run the upgrade, and it completed okay.

My equivalent debug log is:

unit-ceilometer-0: 20:32:34 DEBUG juju.worker.uniter.operation preparing operation "run action 2" for ceilometer/0
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.uniter.operation executing operation "run action 2" for ceilometer/0
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.uniter [AGENT-STATUS] executing: running action ceilometer-upgrade
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.uniter.runner running action "ceilometer-upgrade" on 1
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.uniter.runner starting jujuc server {unix @/var/lib/juju/agents/unit-ceilometer-0/agent.socket <nil>}
unit-ceilometer-0: 20:32:35 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading package lists...
unit-ceilometer-0: 20:32:35 DEBUG unit.ceilometer/0.ceilometer-upgrade Building dependency tree...
unit-ceilometer-0: 20:32:35 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading state information...
unit-ceilometer-0: 20:32:35 WARNING unit.ceilometer/0.ceilometer-upgrade E: Unable to locate package openstack-release
unit-ceilometer-0: 20:32:36 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading package lists...
unit-ceilometer-0: 20:32:36 DEBUG unit.ceilometer/0.ceilometer-upgrade Building dependency tree...
unit-ceilometer-0: 20:32:36 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading state information...
unit-ceilometer-0: 20:32:36 WARNING unit.ceilometer/0.ceilometer-upgrade E: Unable to locate package openstack-release
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer/0.ceilometer-upgrade none
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading package lists...
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer/0.ceilometer-upgrade Building dependency tree...
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading state information...
unit-ceilometer-0: 20:32:38 WARNING unit.ceilometer/0.ceilometer-upgrade E: Unable to locate package openstack-release
unit-ceilometer-0: 20:32:39 DEBUG unit.ceilometer/0.ceilometer-upgrade 2021-03-24 20:32:39.890 23930 DEBUG ceilometer.cmd.storage [-] Upgrading Gnocchi resource types upgrade /usr/lib/python3/dist-packages/ceilometer/cmd/storage.py:42
unit-ceilometer-0: 20:32:50 DEBUG juju.worker.uniter.remotestate got leader settings change for ceilometer/0: ok=true
unit-ceilometer-0: 20:32:50 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading package lists...
unit-ceilometer-0: 20:32:51 DEBUG unit.ceilometer/0.ceilometer-upgrade Building dependency tree...
unit-ceilometer-0: 20:32:51 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading state information...
unit-ceilometer-0: 20:32:51 WARNING unit.ceilometer/0.ceilometer-upgrade E: Unable to locate package openstack-release
unit-ceilometer-0: 20:32:51 DEBUG unit.ceilometer/0.ceilometer-upgrade none
unit-ceilometer-0: 20:32:52 DEBUG unit.ceilometer/0.ceilometer-upgrade Reading package lists...
unit-ceilometer-0: 20:32:52 DEBUG unit.ceilometer/0.ceilometer-upgrade Building dependency tree...
unit-ceilometer-0: 20:32:52 DEBUG unit.c...

Cedric Lemarchand (cedric-lemarchand) wrote :

Yes it seems:

juju show-action-status 8
actions:
- action: juju-run
  completed at: "2021-03-25 10:31:40"
  id: "8"
  status: completed
  unit: ceilometer/0

unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter.remotestate got action change for ceilometer/0: [8] ok=true
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter.operation running operation run action 8 for ceilometer/0
unit-ceilometer-0: 10:31:37 DEBUG juju.machinelock acquire machine lock for ceilometer/0 uniter (run action 8)
unit-ceilometer-0: 10:31:37 DEBUG juju.machinelock machine lock acquired for ceilometer/0 uniter (run action 8)
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter.operation preparing operation "run action 8" for ceilometer/0
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter.operation executing operation "run action 8" for ceilometer/0
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter [AGENT-STATUS] executing: running action juju-run
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter.runner juju-run action is running
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter.runner starting jujuc server {unix @/var/lib/juju/agents/unit-ceilometer-0/agent.socket <nil>}
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.uniter.operation committing operation "run action 8" for ceilometer/0
unit-ceilometer-0: 10:31:40 DEBUG juju.machinelock machine lock released for ceilometer/0 uniter (run action 8)
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.uniter.operation lock released for ceilometer/0
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.uniter no operations in progress; waiting for changes
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.uniter [AGENT-STATUS] idle:

The unit remains in the "blocked" state:

juju status ceilometer
Model       Controller       Cloud/Region          Version  SLA          Timestamp
liquid-rmv  juju-controller  brane-liquid/default  2.8.9    unsupported  10:32:45Z

App         Version  Status   Scale  Charm       Store       Rev  OS      Notes
ceilometer  14.0.0   blocked  1      ceilometer  jujucharms  280  ubuntu

Unit           Workload  Agent  Machine  Public address  Ports  Message
ceilometer/0*  blocked   idle   4        10.140.10.68           Run the ceilometer-upgrade action on the leader to initialize ceilometer and gnocchi

Machine  State    DNS           Inst id  Series  AZ       Message
4        started  10.140.10.68  VM-24    focal   default  Deployed

I have also attached the related bundle; it could be useful to compare mine with yours.

Alex Kavanagh (ajkavanagh) wrote :

Okay, let's try a different tack. Let's look at the relations between the 3 main components here (with ceilometer). This is from my working system. Let's compare:

juju status ceilometer --relations
Model Controller Cloud/Region Version SLA Timestamp
test tinwood2-serverstack serverstack/serverstack 2.9-rc6 unsupported 18:37:28Z

App Version Status Scale Charm Store Rev OS Notes
ceilometer 14.0.0 active 1 ceilometer local 0 ubuntu

Unit Workload Agent Machine Public address Ports Message
ceilometer/0* active idle 16 172.20.1.5 Unit is ready

Machine State DNS Inst id Series AZ Message
16 started 172.20.1.5 2d140d44-c7f8-4aad-b327-e1649b669b2c focal nova ACTIVE

Relation provider Requirer Interface Type Message
ceilometer:ceilometer-service ceilometer-agent:ceilometer-service ceilometer regular
ceilometer:cluster ceilometer:cluster ceilometer-ha peer
gnocchi:metric-service ceilometer:metric-service gnocchi regular
keystone:identity-credentials ceilometer:identity-credentials keystone-credentials regular
keystone:identity-notifications ceilometer:identity-notifications keystone-notifications regular
rabbitmq-server:amqp ceilometer:amqp rabbitmq regular

juju status gnocchi --relations
Model Controller Cloud/Region Version SLA Timestamp
test tinwood2-serverstack serverstack/serverstack 2.9-rc6 unsupported 18:39:08Z

App Version Status Scale Charm Store Rev OS Notes
gnocchi 4.3.4 active 1 gnocchi jujucharms 46 ubuntu
gnocchi-mysql-router 8.0.23 active 1 mysql-router jujucharms 14 ubuntu

Unit Workload Agent Machine Public address Ports Message
gnocchi/0* active idle 17 172.20.1.20 8041/tcp Unit is ready
  gnocchi-mysql-router/0* active idle 172.20.1.20 Unit is ready

Machine State DNS Inst id Series AZ Message
17 started 172.20.1.20 18aee2ea-2a4f-4985-b44d-0fa643040e6c focal nova ACTIVE

Relation provider Requirer Interface Type Message
ceph-mon:client gnocchi:storage-ceph ceph-client regular
gnocchi-mysql-router:shared-db gnocchi:shared-db mysql-shared subordinate
gnocchi:cluster gnocchi:cluster openstack-ha peer
gnocchi:metric-service ceilometer:metric-service gnocchi regular
keystone:identity-service gnocchi:identity-service keystone regular
memcached:cache gnocchi:coordinator-memcached memcache regular
mysql-innodb-clust...

Cedric Lemarchand (cedric-lemarchand) wrote :

Thanks Alex. So far it seems the relations are pretty much the same (note that we don't run the same controller version):

juju status ceilometer --relations
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/default 2.8.9 unsupported 08:59:26+02:00

App Version Status Scale Charm Store Rev OS Notes
ceilometer 14.0.0 blocked 1 ceilometer jujucharms 280 ubuntu

Unit Workload Agent Machine Public address Ports Message
ceilometer/0* blocked idle 4 10.140.10.71 Run the ceilometer-upgrade action on the leader to initialize ceilometer and gnocchi

Machine State DNS Inst id Series AZ Message
4 started 10.140.10.71 VM-25 focal default Deployed

Relation provider Requirer Interface Type Message
ceilometer:ceilometer-service ceilometer-agent:ceilometer-service ceilometer regular
ceilometer:cluster ceilometer:cluster ceilometer-ha peer
gnocchi:metric-service ceilometer:metric-service gnocchi regular
keystone:identity-credentials ceilometer:identity-credentials keystone-credentials regular
keystone:identity-notifications ceilometer:identity-notifications keystone-notifications regular
rabbitmq-server:amqp ceilometer:amqp rabbitmq regular

quark:admin$ juju status gnocchi --relations
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/default 2.8.9 unsupported 09:01:06+02:00

App Version Status Scale Charm Store Rev OS Notes
gnocchi 4.3.4 active 1 gnocchi jujucharms 46 ubuntu
gnocchi-mysql-router 8.0.23 active 1 mysql-router jujucharms 6 ubuntu

Unit Workload Agent Machine Public address Ports Message
gnocchi/0* active idle 0/lxd/1 10.140.10.75 8041/tcp Unit is ready
  gnocchi-mysql-router/0* active idle 10.140.10.75 Unit is ready

Machine State DNS Inst id Series AZ Message
0 started 10.140.10.1 VM-11 focal default Deployed
0/lxd/1 started 10.140.10.75 juju-2d66be-0-lxd-1 focal default Container started

Relation provider Requirer Interface Type Message
ceph-mon:client gnocchi:storage-ceph ceph-client regular
gnocchi-mysql-router:shared-db gnocchi:shared-db mysql-shared subordinate
gnocchi:cluster gnocchi:cluster openstack-ha peer
gnocchi:metric-service ceilometer:metric-service gnocchi regular
keystone:identity-service gnocchi:identity-service keystone regular
memcached:cache gnocchi:coordinator-memcached memcache regular
mysql-innodb-cluster:db-router gnocchi...

Alex Kavanagh (ajkavanagh) wrote :

I must admit to being completely stumped. If you are still having the problem, let's start again.

If you could try again, but this time use https://github.com/juju/juju-crashdump to grab a crashdump of the entire model, that might help in tracking it down. Do this with the failing version of the action (i.e. without --wait), and then check that the action has completed.

There must be something weird going on, and it would be good to get to the bottom of it.

Many thanks.

Cedric Lemarchand (cedric-lemarchand) wrote :

So here it is:

juju run ceilometer-upgrade --unit ceilometer/leader

juju show-action-status
actions:
- action: juju-run
  completed at: "2021-04-10 18:38:13"
  id: "2"
  status: completed
  unit: ceilometer/0

juju show-action-output 2
UnitId: ceilometer/0
id: "2"
results: {}
status: completed
timing:
  completed: 2021-04-10 18:38:13 +0000 UTC
  enqueued: 2021-04-10 18:38:01 +0000 UTC
  started: 2021-04-10 18:38:01 +0000 UTC

File generated by juju-crashdump attached.

Thanks

Aurelien Lourot (aurelien-lourot) wrote :

@Cedric thanks for reporting. I know it's not intuitive, but for next time you'll want to set the bug back to New (from Incomplete) when you answer otherwise we may just miss your reply. Doing it now.

Changed in charm-ceilometer:
status: Incomplete → New
Cedric Lemarchand (cedric-lemarchand) wrote :

Thanks Aurélien, ack.

Alex Kavanagh (ajkavanagh) wrote :

> So here it is:

> juju run ceilometer-upgrade --unit ceilometer/leader

Is that a typo? i.e. it should be "juju run-action ceilometer/leader ceilometer-upgrade".

"juju run" (on juju < 3.0) is for running commands (not actions) as root on the unit as though you had "juju ssh" and then "command" on that unit from the CLI.
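
The difference between the two commands can be sketched side by side. The `JUJU` variable and the function names below are assumptions for illustration. The `action: juju-run` marker that the last helper looks for is taken from the `juju show-action-status` output posted earlier in this thread.

```shell
# Sketch only: set JUJU=echo for a dry run that prints the arguments instead
# of calling juju.
JUJU="${JUJU:-juju}"

# juju < 3.0: run the shell command $2 as root on unit $1.
# This is NOT a charm action, so the charm's action code never runs.
run_command_on_unit() {
    "$JUJU" run "$2" --unit "$1"
}

# Run the charm action named $2 on unit $1 and wait for the result.
run_charm_action() {
    "$JUJU" run-action "$1" "$2" --wait
}

# Read `juju show-action-status` output on stdin; succeed if any listed
# operation was queued by `juju run` rather than `juju run-action`.
queued_by_juju_run() {
    grep -q 'action: juju-run'
}
```

A status listing that shows `action: juju-run` for the operation is a hint that `juju run` was used where `juju run-action` was intended.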

Cedric Lemarchand (cedric-lemarchand) wrote :

Yes it seems it was a typo, I am unable to reproduce it, you can close this.

Thanks for your time.

Alex Kavanagh (ajkavanagh) wrote :

Hi Cedric, thanks for the confirmation and I'm also pleased that you've got it sorted.

Changed in charm-ceilometer:
status: New → Invalid
Patrik Arlos (pal-arlos) wrote :

Hej,

sorry to add to this. I've just deployed the openstack+telemetry bundle, and seem to hit a similar issue.

I've tried to repeat the actions done above, but no success. Rather, I get errors from the run-action.

juju run-action ceilometer/leader ceilometer-upgrade --wait
unit-ceilometer-0:
  UnitId: ceilometer/0
  id: "20"
  message: 'ceilometer-upgrade resulted in an unexpected error: Command ''[''ceilometer-upgrade'',
    ''--debug'', ''--retry'', ''10'']'' returned non-zero exit status 1.'
  results:
    Stdout: "2021-05-27 06:59:43.091 93061 DEBUG ceilometer.cmd.storage [-] Upgrading
      Gnocchi resource types upgrade /usr/lib/python3/dist-packages/ceilometer/cmd/storage.py:42\e[00m\n2021-05-27
      06:59:43.110 93061 WARNING keystoneauth.identity.generic.base [-] Failed to
      discover available identity versions when contacting https://A.B.X.9:5000.
      Attempting to parse version from URL.: keystoneauth1.exceptions.connection.SSLError:
      SSL exception connecting to https://A.B.X.9:5000: HTTPSConnectionPool(host='A.B.X.9',
      port=5000): Max retries exceeded with url: / (Caused by SSLError(SSLError(\"bad
      handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate
      verify failed')])\")))\e[00m\n2021-05-27 06:59:43.112 93061 CRITICAL ceilometer
      [-] Unhandled error: keystoneauth1.exceptions.discovery.DiscoveryFailure: Could
      not find versioned identity endpoints when attempting to authenticate. Please
      check that your auth_url is correct. SSL exception connecting to https://A.B.X.9:5000:
      HTTPSConnectionPool(host='A.B.X.9', port=5000): Max retries exceeded with
      url: / (Caused by SSLError(SSLError(\"bad handshake: Error([('SSL routines',
      'tls_process_server_certificate', 'certificate verify failed')])\")))\n2021-05-27
      06:59:43.112 93061 ERROR ceilometer Traceback (most recent call last):\n2021-05-27
      06:59:43.112 93061 ERROR ceilometer File \"/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py\",
      line 485, in wrap_socket\n2021-05-27 06:59:43.112 93061 ERROR ceilometer cnx.do_handshake()\n2021-05-27
      06:59:43.112 93061 ERROR ceilometer File \"/usr/lib/python3/dist-packages/OpenSSL/SSL.py\",
      line 1915, in do_handshake\n2021-05-27 06:59:43.112 93061 ERROR ceilometer self._raise_ssl_error(self._ssl,
      result)\n2021-05-27 06:59:43.112 93061 ERROR ceilometer File \"/usr/lib/python3/dist-packages/OpenSSL/SSL.py\",
      line 1647, in _raise_ssl_error\n2021-05-27 06:59:43.112 93061 ERROR ceilometer
      \ _raise_current_error()\n2021-05-27 06:59:43.112 93061 ERROR ceilometer
      \ File \"/usr/lib/python3/dist-packages/OpenSSL/_util.py\", line 54, in exception_from_error_queue\n2021-05-27
      06:59:43.112 93061 ERROR ceilometer raise exception_type(errors)\n2021-05-27
      06:59:43.112 93061 ERROR ceilometer OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate',
      'certificate verify failed')]\n2021-05-27 06:59:43.112 93061 ERROR ceilometer
      \n2021-05-27 06:59:43.112 93061 ERROR ceilometer During handling of the above
      exception, another exception occurred:\n2021-05...

Alex Kavanagh (ajkavanagh) wrote :

Hi Patrik

This doesn't look like a "ceilometer-upgrade" issue, but rather a TLS/endpoint issue. It'll be worth verifying that the ceilometer unit has the correct certificate info, i.e. keystone is expecting SSL connections and ceilometer isn't presenting valid credentials.

Have you got the relation for notifications between keystone and ceilometer set up? "juju status ceilometer --relations" will give you the list. If it's missing then:

juju add-relation ceilometer:identity-notifications keystone:identity-notifications

will add it.
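
The check-then-add step could be scripted along these lines. This is a sketch under assumptions: the `JUJU` variable and the function names are illustrative, and the relation line it looks for matches the `--relations` output posted earlier in this thread.

```shell
# Sketch only: set JUJU=echo for a dry run that prints the arguments instead
# of calling juju.
JUJU="${JUJU:-juju}"

# Read `juju status ceilometer --relations` output on stdin; succeed if the
# keystone -> ceilometer identity-notifications relation is listed.
has_identity_notifications() {
    grep -q 'keystone:identity-notifications  *ceilometer:identity-notifications'
}

# Hypothetical wrapper: add the relation only when it is missing.
ensure_identity_notifications() {
    if "$JUJU" status ceilometer --relations | has_identity_notifications; then
        echo "relation already present"
    else
        "$JUJU" add-relation ceilometer:identity-notifications keystone:identity-notifications
    fi
}
```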

Thanks, Alex.

Patrik Arlos (pal-arlos) wrote :

Hi Alex,

I will check when I deploy the telemetry OpenStack bundle again. I assumed (perhaps incorrectly) that a bundle with ceilometer would at least get the relations right.

BR/Patrik

Patrik Arlos (pal-arlos) wrote :

Hi Alex,

it seems that the bundle does add the relationships.
.....
- add relation vault:certificates - mysql-innodb-cluster:certificates
- add relation ceilometer-agent:ceilometer-service - ceilometer:ceilometer-service
- add relation ceilometer:identity-notifications - keystone:identity-notifications
- add relation ceilometer:identity-credentials - keystone:identity-credentials
- add relation ceilometer-agent:nova-ceilometer - nova-compute:nova-ceilometer
- add relation ceilometer-agent:amqp - rabbitmq-server:amqp
- add relation ceilometer:amqp - rabbitmq-server:amqp
- add relation aodh-mysql-router:db-router - mysql-innodb-cluster:db-router
.....

All right, I guess I have to figure out why ceilometer can't talk to vault, as I suspect that's the entity that hands out the SSL certificates.

BR/Patrik

Aurelien Lourot (aurelien-lourot) wrote :

If I understand correctly, this has been solved (partly thanks to [0]) but now we're hitting #1906623

[0] https://github.com/openstack-charmers/openstack-bundles/pull/214
