ceilometer-upgrade action does not work as expected
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Ceilometer Charm | Invalid | Undecided | Unassigned |
Bug Description
I am facing an issue when deploying an OpenStack telemetry bundle: the ceilometer charm is stuck in blocked status, waiting for the ceilometer-upgrade action, which does nothing when triggered.
What has been tried so far without success:
* deploy openstack-charmers and openstack-
* inspection of logs (ceilometer, gnocchi and juju agent)
* running the “ceilometer-
# ceilometer-upgrade --debug
2021-03-17 15:35:40.581 27392 DEBUG ceilometer.
As far as I can understand the bundle seems correct (based on somewhat official https:/
Cedric Lemarchand (cedric-lemarchand) wrote : | #1 |
description: | updated |
Alex Kavanagh (ajkavanagh) wrote : | #2 |
Changed in charm-ceilometer: | |
status: | New → Incomplete |
Alex Kavanagh (ajkavanagh) wrote : | #5 |
For some odd reason, the following seemed to only be sent to me from Cedric (odd launchpad hiccup??) (with better formatting than the previous comment):
Hi Alex,
1. 20.04/Focal
2. Ussuri (openstack-origin: distro)
3. latest from openstack-charmers (see juju status)
4.
```
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/
App Version Status Scale Charm Store Rev OS Notes
aodh 10.0.0 active 1 aodh jujucharms 46 ubuntu
aodh-mysql-router 8.0.23 active 1 mysql-router jujucharms 6 ubuntu
ceilometer 14.0.0 blocked 1 ceilometer jujucharms 389 ubuntu
ceilometer-agent 14.0.0 active 1 ceilometer-agent jujucharms 340 ubuntu
ceph-mon 15.2.8 active 3 ceph-mon jujucharms 53 ubuntu
ceph-osd 15.2.8 active 3 ceph-osd jujucharms 308 ubuntu
ceph-radosgw 15.2.8 active 1 ceph-radosgw jujucharms 294 ubuntu
cinder 16.2.1 active 1 cinder jujucharms 308 ubuntu
cinder-ceph 16.2.1 active 1 cinder-ceph jujucharms 260 ubuntu
cinder-mysql-router 8.0.23 active 1 mysql-router jujucharms 6 ubuntu
glance 20.0.1 active 1 glance jujucharms 303 ubuntu
glance-mysql-router 8.0.23 active 1 mysql-router jujucharms 6 ubuntu
gnocchi 4.3.4 active 1 gnocchi jujucharms 128 ubuntu
gnocchi-
keystone 17.0.0 active 1 keystone jujucharms 321 ubuntu
keystone-
memcached active 1 memcached jujucharms 32 ubuntu
mysql-innodb-
neutron-api 16.2.0 active 1 neutron-api jujucharms 292 ubuntu
neutron-
neutron-gateway 16.2.0 active 1 neutron-gateway jujucharms 289 ubuntu
neutron-openvswitch 16.2.0 active 1 neutron-openvswitch jujucharms 280 ubuntu
nova-cloud-
nova-cloud-
nova-compute 21.1.1 ...
```
Alex Kavanagh (ajkavanagh) wrote : | #6 |
Cedric
Thanks for the additional info (although strangely it didn't appear in the bug, only in my email??)
Anyway, there's still not enough information to go on. What does the following return:
juju run-action ceilometer/0 ceilometer-upgrade --wait
Does the command fail, or does it simply do nothing? Beforehand, please turn on logging on the charm as much as possible:
juju model-config -m logging-
Also, check the output from the ceilometer unit with:
juju debug-log --replay -i unit-ceilometer-0
Thanks
Alex.
Cedric Lemarchand (cedric-lemarchand) wrote : | #7 |
Previously the action returned nothing; the unit went into the "executing" state, then fell back to "blocked".
Adding the "--wait" option actually succeeded; ceilometer is now happy.
```
juju run-action ceilometer/0 ceilometer-upgrade --wait
unit-ceilometer-0:
UnitId: ceilometer/0
id: "6"
results:
Stderr: |
E: Unable to locate package openstack-release
E: Unable to locate package openstack-release
E: Unable to locate package openstack-release
E: Unable to locate package openstack-release
E: Unable to locate package openstack-release
Stdout: "Reading package lists...\nBuilding dependency tree...\nReading state
informati
state information.
state information.
[-] Upgrading Gnocchi resource types upgrade /usr/lib/
package lists...\nBuilding dependency tree...\nReading state information.
package lists...\nBuilding dependency tree...\nReading state information.
Release: {}\nOpenStack Release: {}\nOpenStack Release: {}\nOpenStack Release:
{}\nOpenStack Release: {}\n"
outcome: success, ceilometer-upgrade completed.
status: completed
timing:
completed: 2021-03-24 10:41:37 +0000 UTC
enqueued: 2021-03-24 10:41:29 +0000 UTC
started: 2021-03-24 10:41:30 +0000 UTC
```
Is the "--wait" option mandatory for this specific ceilometer action?
Let me know if you need further information.
Cedric Lemarchand (cedric-lemarchand) wrote : | #8 |
Side note for completeness: "juju model-config -m logging-
```
ERROR opening API connection: model name "clemarch/
```
Juju client and controller running v2.8.9.
Cheers
Alex Kavanagh (ajkavanagh) wrote : | #9 |
> Side note for completeness: "juju model-config -m logging-
Should be
juju model-config -m model-name logging-
(if model-name is not the current model - the one with the * next to it in the output of "juju models")
otherwise:
juju model-config logging-
(sorry, there was a typo in my original command)
Alex Kavanagh (ajkavanagh) wrote : | #10 |
> Previously the action returned nothing; the unit went into the "executing" state, then fell back to "blocked".
> Adding the "--wait" option actually succeeded; ceilometer is now happy.
So "all" the --wait option does is wait for the command to fail or succeed. The run-action command without the --wait option simply queues the command to run. The "actions" command shows the list of actions and their ids. The "show-action-
So no, --wait shouldn't make a difference. I'm curious as to what might have made the difference, but that ship may have sailed now if there's nothing in the debug log that indicates what might be wrong.
Unfortunately, juju now defaults to 'info' for logging which means that the error may never have been logged (debug logging tends to give more logging).
I'm pleased it's now working for you; however, I'm leaving the bug as incomplete as I'm not sure that the information is available in the debug-log. If you could attach the result from "juju debug-log --replay -i unit-ceilometer-0" that *may* provide some info.
Thanks.
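For reference, the queue-then-inspect flow described in this comment could be sketched roughly as follows. This is only an illustration, not part of the charm: the helper names `action_id_from_output` and `queue_and_show` are hypothetical, and it assumes that `juju run-action` on Juju 2.x prints a line of the form `Action queued with id: "6"`.

```shell
#!/bin/sh
# Sketch only: queue an action without --wait, then look it up by id.
# Assumption: `juju run-action` (Juju 2.x) prints a line such as
#   Action queued with id: "6"
# Both function names below are hypothetical helpers.

# Read the "Action queued" line on stdin and print the numeric id.
action_id_from_output() {
    sed -n 's/.*id: *"*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Queue an action on a unit, then print its status and its output.
queue_and_show() {
    unit=$1
    action=$2
    id=$(juju run-action "$unit" "$action" | action_id_from_output)
    if [ -z "$id" ]; then
        echo "could not determine the action id" >&2
        return 1
    fi
    # Without --wait the action may still be pending or running here,
    # so these two commands may need to be repeated.
    juju show-action-status "$id"
    juju show-action-output "$id"
}
```

Calling `queue_and_show ceilometer/leader ceilometer-upgrade` would then queue the action and print whatever state it has reached so far.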
Cedric Lemarchand (cedric-lemarchand) wrote : | #11 |
I reproduced the issue on a freshly deployed bundle:
```
juju model-config logging-config
debug
juju run ceilometer-upgrade --unit ceilometer/leader
juju show-action-status
actions:
- action: juju-run
completed at: "2021-03-24 15:12:58"
id: "2"
status: completed
unit: ceilometer/0
juju show-action-output 2
UnitId: ceilometer/0
id: "2"
results: {}
status: completed
timing:
completed: 2021-03-24 15:12:58 +0000 UTC
enqueued: 2021-03-24 15:12:46 +0000 UTC
started: 2021-03-24 15:12:46 +0000 UTC
juju status ceilometer
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/
App Version Status Scale Charm Store Rev OS Notes
ceilometer 14.0.0 blocked 1 ceilometer jujucharms 280 ubuntu
Unit Workload Agent Machine Public address Ports Message
ceilometer/0* blocked idle 4 10.140.10.71 Run the ceilometer-upgrade action on the leader to initialize ceilometer and gnocchi
Machine State DNS Inst id Series AZ Message
4 started 10.140.10.71 VM-25 focal default Deployed
```
Unfortunately "juju debug-log --replay -i unit-ceilometer/0" does not return anything.
I hope this helps to track down the issue.
Cedric Lemarchand (cedric-lemarchand) wrote : | #12 |
- ceilometer_debug.log (285.3 KiB, text/plain)
I finally got the debug log for ceilometer; see the attached file.
Alex Kavanagh (ajkavanagh) wrote : | #13 |
Cedric, Thanks for the additional info; I'll take a look at the debug log.
FYI, for "reasons" the command for the juju debug-log to include (-i) the unit-ceilometer/0 is:
juju debug-log --replay -i unit-ceilometer-0
Note the '-' rather than the '/'. It catches me out all the time.
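A small sketch of that naming rule, for anyone who keeps tripping over it. The helper names `unit_tag` and `replay_unit_log` are made up for illustration; only the `juju debug-log --replay -i` invocation comes from this thread.

```shell
#!/bin/sh
# Sketch only: turn a unit name (ceilometer/0) into the entity tag
# that `juju debug-log -i` expects (unit-ceilometer-0).
# Both function names are hypothetical helpers.

unit_tag() {
    # Replace the '/' in the unit name with '-' and prefix "unit-".
    printf 'unit-%s\n' "$(printf '%s' "$1" | tr '/' '-')"
}

replay_unit_log() {
    # Replay the stored log for a single unit, given its usual name.
    juju debug-log --replay -i "$(unit_tag "$1")"
}
```

With these defined, `replay_unit_log ceilometer/0` runs the same command as above without having to remember the tag form.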
Alex Kavanagh (ajkavanagh) wrote : | #14 |
I'm a bit confused about the timings in the debug-log. It seems to start at 16:23, yet the action was run at 15:12 ... do you have the logs that overlap with when the action was run? The /var/log/
Anyway, I'll deploy it this evening and have a poke around and see what happens.
Cedric Lemarchand (cedric-lemarchand) wrote : | #15 |
This is weird, on my side (2.8.9-
Thanks for your time on this; let me know if you need anything.
Alex Kavanagh (ajkavanagh) wrote : | #16 |
So I tried to reproduce it using the stable charm but my version of the command:
juju run-action ceilometer/leader ceilometer-upgrade
did run the upgrade, and it completed okay.
My equivalent debug log is:
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.uniter [AGENT-STATUS] executing: running action ceilometer-upgrade
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.
unit-ceilometer-0: 20:32:34 DEBUG juju.worker.
unit-ceilometer-0: 20:32:35 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:35 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:35 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:35 WARNING unit.ceilometer
unit-ceilometer-0: 20:32:36 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:36 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:36 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:36 WARNING unit.ceilometer
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:37 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:38 WARNING unit.ceilometer
unit-ceilometer-0: 20:32:39 DEBUG unit.ceilometer
ib/python3/
unit-ceilometer-0: 20:32:50 DEBUG juju.worker.
unit-ceilometer-0: 20:32:50 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:51 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:51 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:51 WARNING unit.ceilometer
unit-ceilometer-0: 20:32:51 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:52 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:52 DEBUG unit.ceilometer
unit-ceilometer-0: 20:32:52 DEBUG unit.c...
Cedric Lemarchand (cedric-lemarchand) wrote : | #17 |
- bundle.yaml (18.2 KiB, text/plain)
Yes it seems:
juju show-action-status 8
actions:
- action: juju-run
completed at: "2021-03-25 10:31:40"
id: "8"
status: completed
unit: ceilometer/0
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.
unit-ceilometer-0: 10:31:37 DEBUG juju.machinelock acquire machine lock for ceilometer/0 uniter (run action 8)
unit-ceilometer-0: 10:31:37 DEBUG juju.machinelock machine lock acquired for ceilometer/0 uniter (run action 8)
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.uniter [AGENT-STATUS] executing: running action juju-run
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.
unit-ceilometer-0: 10:31:37 DEBUG juju.worker.
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.
unit-ceilometer-0: 10:31:40 DEBUG juju.machinelock machine lock released for ceilometer/0 uniter (run action 8)
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.uniter no operations in progress; waiting for changes
unit-ceilometer-0: 10:31:40 DEBUG juju.worker.uniter [AGENT-STATUS] idle:
The unit remains in the "blocked" state:
juju status ceilometer
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/
App Version Status Scale Charm Store Rev OS Notes
ceilometer 14.0.0 blocked 1 ceilometer jujucharms 280 ubuntu
Unit Workload Agent Machine Public address Ports Message
ceilometer/0* blocked idle 4 10.140.10.68 Run the ceilometer-upgrade action on the leader to initialize ceilometer and gnocchi
Machine State DNS Inst id Series AZ Message
4 started 10.140.10.68 VM-24 focal default Deployed
I have also attached the related bundle; it could be good to compare mine and yours.
Alex Kavanagh (ajkavanagh) wrote : | #18 |
Okay, let's try a different tack. Let's look at the relations between the 3 main components here (with ceilometer). This is from my working system. Let's compare:
juju status ceilometer --relations
Model Controller Cloud/Region Version SLA Timestamp
test tinwood2-
App Version Status Scale Charm Store Rev OS Notes
ceilometer 14.0.0 active 1 ceilometer local 0 ubuntu
Unit Workload Agent Machine Public address Ports Message
ceilometer/0* active idle 16 172.20.1.5 Unit is ready
Machine State DNS Inst id Series AZ Message
16 started 172.20.1.5 2d140d44-
Relation provider Requirer Interface Type Message
ceilometer:
ceilometer:cluster ceilometer:cluster ceilometer-ha peer
gnocchi:
keystone:
keystone:
rabbitmq-
juju status gnocchi --relations
Model Controller Cloud/Region Version SLA Timestamp
test tinwood2-
App Version Status Scale Charm Store Rev OS Notes
gnocchi 4.3.4 active 1 gnocchi jujucharms 46 ubuntu
gnocchi-
Unit Workload Agent Machine Public address Ports Message
gnocchi/0* active idle 17 172.20.1.20 8041/tcp Unit is ready
gnocchi-
Machine State DNS Inst id Series AZ Message
17 started 172.20.1.20 18aee2ea-
Relation provider Requirer Interface Type Message
ceph-mon:client gnocchi:
gnocchi-
gnocchi:cluster gnocchi:cluster openstack-ha peer
gnocchi:
keystone:
memcached:cache gnocchi:
mysql-innodb-
Cedric Lemarchand (cedric-lemarchand) wrote : | #19 |
Thanks Alex, so far it seems the relations are pretty much the same (note we don't run the same controller version):
juju status ceilometer --relations
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/
App Version Status Scale Charm Store Rev OS Notes
ceilometer 14.0.0 blocked 1 ceilometer jujucharms 280 ubuntu
Unit Workload Agent Machine Public address Ports Message
ceilometer/0* blocked idle 4 10.140.10.71 Run the ceilometer-upgrade action on the leader to initialize ceilometer and gnocchi
Machine State DNS Inst id Series AZ Message
4 started 10.140.10.71 VM-25 focal default Deployed
Relation provider Requirer Interface Type Message
ceilometer:
ceilometer:cluster ceilometer:cluster ceilometer-ha peer
gnocchi:
keystone:
keystone:
rabbitmq-
quark:admin$ juju status gnocchi --relations
Model Controller Cloud/Region Version SLA Timestamp
liquid-rmv juju-controller brane-liquid/
App Version Status Scale Charm Store Rev OS Notes
gnocchi 4.3.4 active 1 gnocchi jujucharms 46 ubuntu
gnocchi-
Unit Workload Agent Machine Public address Ports Message
gnocchi/0* active idle 0/lxd/1 10.140.10.75 8041/tcp Unit is ready
gnocchi-
Machine State DNS Inst id Series AZ Message
0 started 10.140.10.1 VM-11 focal default Deployed
0/lxd/1 started 10.140.10.75 juju-2d66be-0-lxd-1 focal default Container started
Relation provider Requirer Interface Type Message
ceph-mon:client gnocchi:
gnocchi-
gnocchi:cluster gnocchi:cluster openstack-ha peer
gnocchi:
keystone:
memcached:cache gnocchi:
mysql-innodb-
Alex Kavanagh (ajkavanagh) wrote : | #20 |
I must admit to being completely stumped. If you are still having the problem, let's start again.
If you could try again, but then use: https:/
There must be something weird going on, and it would be good to get to the bottom of it.
Many thanks.
Cedric Lemarchand (cedric-lemarchand) wrote : | #21 |
- juju-crashdump-fa4673bb-7a3c-489e-ba18-3a5e35f93d4c.tar.xz (182.4 KiB, application/x-tar)
So here it is:
juju run ceilometer-upgrade --unit ceilometer/leader
juju show-action-status
actions:
- action: juju-run
completed at: "2021-04-10 18:38:13"
id: "2"
status: completed
unit: ceilometer/0
juju show-action-output 2
UnitId: ceilometer/0
id: "2"
results: {}
status: completed
timing:
completed: 2021-04-10 18:38:13 +0000 UTC
enqueued: 2021-04-10 18:38:01 +0000 UTC
started: 2021-04-10 18:38:01 +0000 UTC
File generated by juju-crashdump attached.
Thanks
Aurelien Lourot (aurelien-lourot) wrote : | #22 |
@Cedric thanks for reporting. I know it's not intuitive, but for next time you'll want to set the bug back to New (from Incomplete) when you answer otherwise we may just miss your reply. Doing it now.
Changed in charm-ceilometer: | |
status: | Incomplete → New |
Cedric Lemarchand (cedric-lemarchand) wrote : | #23 |
Thanks Aurélien, ack.
Alex Kavanagh (ajkavanagh) wrote : | #24 |
> So here it is:
> juju run ceilometer-upgrade --unit ceilometer/leader
Is that a typo? i.e. should it be "juju run-action ceilometer/leader ceilometer-upgrade"?
"juju run" (on juju < 3.0) is for running commands (not actions) as root on the unit as though you had "juju ssh" and then "command" on that unit from the CLI.
Cedric Lemarchand (cedric-lemarchand) wrote : | #25 |
Yes, it seems it was a typo. I am unable to reproduce it now; you can close this.
Thanks for your time.
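The mix-up is visible in the earlier output: the status entry names the action as "juju-run" rather than "ceilometer-upgrade". A rough check for that could look like the sketch below. The function name `ran_expected_action` is hypothetical, and it assumes that a real charm action appears under its own name in the `action:` field of `juju show-action-status`.

```shell
#!/bin/sh
# Sketch only: read `juju show-action-status` output on stdin and
# succeed only if an entry names the expected action.
# Assumption: real charm actions are listed under their own name in
# the "action:" field, whereas `juju run` shows up as "juju-run".
# The function name is a hypothetical helper.

ran_expected_action() {
    expected=$1
    grep -q "action: *$expected *\$"
}
```

For example, `juju show-action-status 2 | ran_expected_action ceilometer-upgrade || echo "not the charm action"` would have flagged the runs shown in comments #11, #17 and #21.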
Alex Kavanagh (ajkavanagh) wrote : | #26 |
Hi Cedric, thanks for the confirmation and I'm also pleased that you've got it sorted.
Changed in charm-ceilometer: | |
status: | New → Invalid |
Patrik Arlos (pal-arlos) wrote : | #27 |
Hej,
sorry to add to this. I've just deployed the openstack+telemetry bundle, and seem to hit a similar issue.
I've tried to repeat the actions done above, but no success. Rather, I get errors from the run-action.
juju run-action ceilometer/leader ceilometer-upgrade --wait
unit-ceilometer-0:
UnitId: ceilometer/0
id: "20"
message: 'ceilometer-upgrade resulted in an unexpected error: Command ''[''ceilometer
''--debug'', ''--retry'', ''10'']'' returned non-zero exit status 1.'
results:
Stdout: "2021-05-27 06:59:43.091 93061 DEBUG ceilometer.
Gnocchi resource types upgrade /usr/lib/
06:59:43.110 93061 WARNING keystoneauth.
discover available identity versions when contacting https:/
Attempting to parse version from URL.: keystoneauth1.
SSL exception connecting to https:/
port=5000): Max retries exceeded with url: / (Caused by SSLError(
handshake: Error([('SSL routines', 'tls_process_
verify failed'
[-] Unhandled error: keystoneauth1.
not find versioned identity endpoints when attempting to authenticate. Please
check that your auth_url is correct. SSL exception connecting to https:/
HTTPSConn
url: / (Caused by SSLError(
'
06:59:43.112 93061 ERROR ceilometer Traceback (most recent call last):\n2021-05-27
06:59:43.112 93061 ERROR ceilometer File \"/usr/
line 485, in wrap_socket\
06:59:43.112 93061 ERROR ceilometer File \"/usr/
line 1915, in do_handshake\
result)
line 1647, in _raise_
\ _raise_
\ File \"/usr/
06:59:43.112 93061 ERROR ceilometer raise exception_
06:59:43.112 93061 ERROR ceilometer OpenSSL.SSL.Error: [('SSL routines', 'tls_process_
'certificate verify failed'
\n2021-05-27 06:59:43.112 93061 ERROR ceilometer During handling of the above
exception, another exception occurred:
Alex Kavanagh (ajkavanagh) wrote : | #28 |
Hi Patrik
This doesn't look like a "ceilometer-
Have you got the relation for notifications between keystone and ceilometer setup? (juju status ceilometer --relations) will give you the list. If it's missing then:
juju add-relation ceilometer:
will add it.
Thanks, Alex.
Patrik Arlos (pal-arlos) wrote : | #29 |
Hi Alex,
will check when I deploy the telemetry OpenStack again. I (incorrectly, perhaps) assumed that a bundle with ceilometer at least got the relationships correct.
BR/Patrik
Patrik Arlos (pal-arlos) wrote : | #30 |
Hi Alex,
it seems that the bundle does add the relationships.
.....
- add relation vault:certificates - mysql-innodb-
- add relation ceilometer-
- add relation ceilometer:
- add relation ceilometer:
- add relation ceilometer-
- add relation ceilometer-
- add relation ceilometer:amqp - rabbitmq-
- add relation aodh-mysql-
.....
All right, I guess I have to figure out why ceilometer can't talk to vault, as I suspect that's the entity that hands out the SSL certificates.
BR/Patrik
Aurelien Lourot (aurelien-lourot) wrote : | #31 |
If I understand correctly, this has been solved (partly thanks to [0]) but now we're hitting #1906623
[0] https:/
Hi Cedric
In order to help we need some additional information.
1. Ubuntu version (or other)
2. OpenStack version
3. charms version (e.g. current stable - 21.01, or charm revision numbers)
4. The output of "juju status"
5. If possible, attachments of the ceilometer logs from install -> where the actions didn't work.
6. The result of the ceilometer action command.
Thanks.