Failure in reporting metrics

Bug #1875770 reported by Mark Beierl
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| OpenStack Ceilometer Charm | Expired | Undecided | Unassigned | |

Bug Description

With all metrics enabled in the charm and an active VM deployed, ceilometer fails to report a number of metrics, and the following warnings appear in the logs:

2020-04-28 21:15:56.341 6436 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: cpu, volume: 342530000000, resource_id: 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff, timestamp: 2020-04-28T21:15:56.143561>,)
2020-04-28 21:20:56.409 6439 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: network.outgoing.bytes, volume: 11150478, resource_id: instance-00000001-6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-tap70b63909-42, timestamp: 2020-04-28T21:20:56.199023>,)
2020-04-28 21:20:56.444 6436 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: disk.device.read.requests, volume: 11878, resource_id: 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-vda, timestamp: 2020-04-28T21:20:56.196966>,)
2020-04-28 21:25:56.290 6439 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: cpu, volume: 353540000000, resource_id: 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff, timestamp: 2020-04-28T21:25:56.145695>,)
2020-04-28 21:25:56.481 6436 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: network.incoming.packets, volume: 43907, resource_id: instance-00000001-6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-tap70b63909-42, timestamp: 2020-04-28T21:25:56.160256>,)
2020-04-28 21:25:56.671 6436 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: disk.device.read.bytes, volume: 289910272, resource_id: 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-vda, timestamp: 2020-04-28T21:25:56.178590>,)
2020-04-28 21:40:56.537 6442 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: network.outgoing.bytes, volume: 11150610, resource_id: instance-00000001-6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-tap70b63909-42, timestamp: 2020-04-28T21:40:56.193747>,)
2020-04-28 22:00:56.340 6436 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: network.incoming.bytes, volume: 132975065, resource_id: instance-00000001-6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-tap70b63909-42, timestamp: 2020-04-28T22:00:56.135131>,)
2020-04-28 22:30:56.369 6433 WARNING ceilometer.transformer.conversions [-] dropping sample with no predecessor: (<name: disk.device.write.requests, volume: 78557, resource_id: 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff-vda, timestamp: 2020-04-28T22:30:56.136984>,)

Mark Beierl (mbeierl) wrote :

Here is the list of all metrics; those with None as the unit are measurements that have failed to be reported:

| id | archive_policy/name | name | unit | resource_id |
| f1959f0e-57f6-440b-935e-d85a9607daac | low | bandwidth | None | 13086211-e9d6-4248-91f5-327815cacfe2 |
| 7f83499e-7d41-48b2-92a2-a89af7437a58 | low | compute.instance.booting.time | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| 009e0eb6-1643-49f1-ae1d-5977ba054bc6 | low | cpu | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| 7d653c96-d0cc-44a7-a16e-425c62bb2822 | low | cpu.delta | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| 4c719dd4-c890-4b5f-850d-1a78d241799e | low | cpu_l3_cache | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| 7e868dbe-ce50-4f9e-9322-be2b655cd380 | low | cpu_util | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| 49a6142f-a137-4743-8868-2b5b8e88ea98 | low | disk.allocation | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| 121e8a49-7023-4d96-a5fc-21b839ab4885 | low | disk.capacity | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| c98b11ef-fd57-4734-8f21-a75acb7edbb2 | low | disk.device.allocation | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 63e07976-dab8-4ac0-8096-50e4e690b849 | low | disk.device.capacity | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| be229417-19e8-426d-bdb7-a8e87c80410a | low | disk.device.iops | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 15f91a8a-3937-434b-b24a-c478062a56fa | low | disk.device.latency | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| e957842c-b69a-4743-b9b7-6641d6359d0c | low | disk.device.read.bytes | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 05b1fdb3-9d0c-4991-a3fd-9ad91fd3a41e | low | disk.device.read.bytes.rate | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 40d03227-2ffe-42ba-b137-ce0eca1813e6 | low | disk.device.read.latency | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 32af7b02-79e6-4388-8e97-efb2aa93143b | low | disk.device.read.requests | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 784a4d4a-984a-43e5-a489-61513b365fc6 | low | disk.device.read.requests.rate | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 938c36aa-d8e3-410d-9604-584a6fff70af | low | disk.device.usage | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| e4142ea5-c336-4737-b26a-cd6947a96c81 | low | disk.device.write.bytes | B | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| 55575acb-b8d3-43f3-a4f5-0c31fd3f700e | low | disk.device.write.bytes.rate | None | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
| d84a4dd4-0bcf-49aa-b899-ecd639326e6d | low | disk.device.write.late...
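
To pick the affected metrics out of a listing like the one above, a small filter can help. This is a minimal sketch, assuming the pipe-delimited five-column layout shown above; `metrics_without_unit` is a hypothetical helper name, not part of any Gnocchi or OpenStack client, and the sample text reuses two rows from the listing.

```python
def metrics_without_unit(table_text):
    """Return (name, resource_id) pairs for rows whose unit column is 'None'."""
    missing = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Expect exactly: id, archive_policy/name, name, unit, resource_id.
        # Header rows and truncated rows are skipped.
        if len(cells) != 5 or cells[0] == "id":
            continue
        _id, _policy, name, unit, resource_id = cells
        if unit == "None":
            missing.append((name, resource_id))
    return missing


sample = """\
| id | archive_policy/name | name | unit | resource_id |
| 009e0eb6-1643-49f1-ae1d-5977ba054bc6 | low | cpu | None | 6fe8de79-5f3d-41b7-85bd-f958cdedd9ff |
| e4142ea5-c336-4737-b26a-cd6947a96c81 | low | disk.device.write.bytes | B | 1697b45f-e901-5a44-b8b8-c15a1e3cda4f |
"""

for name, resource_id in metrics_without_unit(sample):
    print(name, resource_id)
```

Rows with fewer than five cells, such as a truncated last row, are ignored rather than guessed at.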


Dmitrii Shcherbakov (dmitriis) wrote :

Hi Mark, thank you for the bug report.

Could you please share more details about the environment? Namely, a sanitized bundle in use (`juju export-bundle` with any secrets redacted would work) for us to understand the OpenStack version used and the components and relations present.

Changed in charm-ceilometer:
status: New → Incomplete
Mark Beierl (mbeierl) wrote :

Sorry, I thought I already posted this, but I cannot find it anymore:

mark@lxd-os:~/openstack-on-lxd$ git diff
diff --git a/bundle-bionic-rocky.yaml b/bundle-bionic-rocky.yaml
index 119bc88..340c900 100644
--- a/bundle-bionic-rocky.yaml
+++ b/bundle-bionic-rocky.yaml
@@ -116,6 +116,11 @@ services:
   ceilometer:
     charm: cs:ceilometer
     num_units: 1
+    options:
+      enable-all-pollsters: true
+      polling-interval: 60
+      gnocchi-archive-policy: low
+      debug: true
   ceilometer-agent:
     charm: cs:ceilometer-agent
   ceph-mon:
@@ -188,6 +193,7 @@ services:
     num_units: 1
     options:
       openstack-origin: cloud:bionic-rocky
+      debug: true
   heat:
     charm: cs:heat
     num_units: 1

Mark Beierl (mbeierl) wrote :

I also just found these errors in the ceilometer logs:

2020-04-29 19:14:40.681 26178 WARNING ceilometer.publisher.gnocchi [-] filtered project not found in keystone, ignoring the filter_project option: NotFound: 404 (HTTP 404)
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi [-] fail to retrieve filtered project : NoUniqueMatch: ClientException
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi Traceback (most recent call last):
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi File "/usr/lib/python2.7/dist-packages/ceilometer/publisher/gnocchi.py", line 255, in gnocchi_project_id
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi name=self.filter_project)
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi File "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 75, in func
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi return f(*args, **new_kwargs)
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi File "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 447, in find
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi raise ksc_exceptions.NoUniqueMatch
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi NoUniqueMatch: ClientException
2020-04-29 19:14:40.704 26178 ERROR ceilometer.publisher.gnocchi
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample [-] Pipeline meter_sink: Continue after error from publisher <ceilometer.publisher.gnocchi.GnocchiPublisher object at 0x7f32e0fc07d0>: NoUniqueMatch: ClientException
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample Traceback (most recent call last):
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample File "/usr/lib/python2.7/dist-packages/ceilometer/pipeline/sample.py", line 157, in _publish_samples
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample p.publish_samples(transformed_samples)
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample File "/usr/lib/python2.7/dist-packages/ceilometer/publisher/gnocchi.py", line 294, in publish_samples
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample data = [s for s in data if not self._is_gnocchi_activity(s)]
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample File "/usr/lib/python2.7/dist-packages/ceilometer/publisher/gnocchi.py", line 278, in _is_gnocchi_activity
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample return (self.filter_project and self.gnocchi_project_id and (
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample File "/usr/lib/python2.7/dist-packages/ceilometer/publisher/gnocchi.py", line 255, in gnocchi_project_id
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample name=self.filter_project)
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample File "/usr/lib/python2.7/dist-packages/keystoneclient/base.py", line 75, in func
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample return f(*args, **new_kwargs)
2020-04-29 19:14:40.756 26178 ERROR ceilometer.pipeline.sample File "/usr/lib/python2.7/dist-packages/ke...
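
The traceback above ends in a `find` call that raises `NoUniqueMatch`. As a rough illustration of that kind of lookup contract (this is not keystoneclient's code), a find-by-name is expected to return exactly one match, so both zero matches and several matches are errors. The exception classes, the `find_one` helper, and the project data below are hypothetical stand-ins; the idea that the same project name exists in two domains is only an assumption about one possible cause.

```python
class NotFound(Exception):
    """Stand-in for a 'no match' lookup error (hypothetical, for illustration)."""


class NoUniqueMatch(Exception):
    """Stand-in for a 'more than one match' lookup error (hypothetical)."""


def find_one(items, **criteria):
    """Return the single item matching all criteria, else raise."""
    matches = [
        item for item in items
        if all(item.get(key) == value for key, value in criteria.items())
    ]
    if not matches:
        raise NotFound(criteria)
    if len(matches) > 1:
        raise NoUniqueMatch(criteria)
    return matches[0]


# Made-up data: the same project name exists in two domains.
projects = [
    {"name": "services", "domain": "default"},
    {"name": "services", "domain": "service_domain"},
    {"name": "admin", "domain": "default"},
]

try:
    find_one(projects, name="services")
except NoUniqueMatch:
    print("ambiguous: name alone matches more than one project")

print(find_one(projects, name="services", domain="service_domain"))
```

Under that assumption, narrowing the lookup by a second attribute is what makes the match unique again.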


Mark Beierl (mbeierl) wrote :

And with debug on, from the nova-compute ceilometer-agent logs:

2020-04-29 19:15:06.037 13682 DEBUG ceilometer.agent [-] Config file: {'sources': [{'name': 'some_pollsters', 'interval': 300, 'meters': ['cpu', 'cpu_l3_cache', 'memory.usage', 'network.incoming.bytes', 'network.incoming.packets', 'network.outgoing.bytes', 'network.outgoing.packets', 'disk.device.read.bytes', 'disk.device.read.requests', 'disk.device.write.bytes', 'disk.device.write.requests', 'hardware.cpu.util', 'hardware.memory.used', 'hardware.memory.total', 'hardware.memory.buffer', 'hardware.memory.cached', 'hardware.memory.swap.avail', 'hardware.memory.swap.total', 'hardware.system_stats.io.outgoing.blocks', 'hardware.system_stats.io.incoming.blocks', 'hardware.network.ip.incoming.datagrams', 'hardware.network.ip.outgoing.datagrams']}]} load_config /usr/lib/python3/dist-packages/ceilometer/agent.py:70
2020-04-29 19:15:06.044 13682 DEBUG ceilometer.compute.virt.libvirt.utils [-] Connecting to libvirt: qemu:///system new_libvirt_connection /usr/lib/python3/dist-packages/ceilometer/compute/virt/libvirt/utils.py:87
2020-04-29 19:15:06.045 13682 DEBUG ceilometer.polling.manager [-] Skip pollster network.incoming.packets, no resources found this cycle poll_and_notify /usr/lib/python3/dist-packages/ceilometer/polling/manager.py:189
2020-04-29 19:15:06.045 13682 DEBUG ceilometer.polling.manager [-] Skip pollster disk.device.read.bytes, no resources found this cycle poll_and_notify /usr/lib/python3/dist-packages/ceilometer/polling/manager.py:189
2020-04-29 19:15:06.045 13682 DEBUG ceilometer.polling.manager [-] Skip pollster disk.device.read.requests, no resources found this cycle poll_and_notify /usr/lib/python3/dist-packages/ceilometer/polling/manager.py:189
2020-04-29 19:15:06.046 13682 DEBUG ceilometer.polling.manager [-] Skip pollster network.incoming.bytes, no resources found this cycle poll_and_notify /usr/lib/python3/dist-packages/ceilometer/polling/manager.py:189
2020-04-29 19:15:06.046 13682 DEBUG ceilometer.polling.manager [-] Skip pollster network.outgoing.bytes, no resources found this cycle poll_and_notify /usr/lib/python3/dist-packages/ceilometer/polling/manager.py:189
2020-04-29 19:15:06.046 13682 DEBUG ceilometer.polling.manager [-] Skip pollster disk.device.write.requests, no resources found this cycle poll_and_notify /usr/lib/python3/dist-packages/ceilometer/polling/manager.py:189

Mark Beierl (mbeierl) wrote :

I meant to ask: why is the interval 300 when the polling interval in the charm was set to 60?
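
To make that comparison repeatable, the interval can be read straight out of the agent's "Config file:" debug line. This is a minimal sketch, assuming the line has the same shape as the one quoted above (a Python dict literal followed by the `load_config` caller info); `source_intervals` is a hypothetical helper, and the meter list in the sample line is trimmed for brevity.

```python
import ast


def source_intervals(log_line):
    """Map each polling source name to its interval from a 'Config file:' debug line."""
    marker = "Config file: "
    start = log_line.index(marker) + len(marker)
    # The dict literal ends at the last closing brace; caller info follows it.
    end = log_line.rindex("}") + 1
    config = ast.literal_eval(log_line[start:end])
    return {src["name"]: src["interval"] for src in config["sources"]}


# Shortened copy of the debug line quoted above.
line = (
    "2020-04-29 19:15:06.037 13682 DEBUG ceilometer.agent [-] Config file: "
    "{'sources': [{'name': 'some_pollsters', 'interval': 300, "
    "'meters': ['cpu', 'memory.usage']}]} "
    "load_config /usr/lib/python3/dist-packages/ceilometer/agent.py:70"
)

expected = 60  # polling-interval set in the bundle diff
for name, interval in source_intervals(line).items():
    status = "ok" if interval == expected else "mismatch"
    print(f"{name}: interval={interval} expected={expected} -> {status}")
```

`ast.literal_eval` is used instead of `eval` so that only plain literals from the log line are evaluated.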

Dmitrii Shcherbakov (dmitriis) wrote :

So the messages you were getting in the original bug report were warnings from the transformer code in Ceilometer. This code was removed in Stein but is present in Rocky, which is what you have.

https://github.com/openstack/ceilometer/commit/9db5c6c9bfc66018aeb78c4a262e1bfa9b326798#diff-442aa054b89a947ed2e9aa559fadfbdeL96

 git --no-pager branch --contains=9db5c6c9bfc66018aeb78c4a262e1bfa9b326798
* master
  stable/stein
  stable/train
  stable/ussuri
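
For context on the warning itself: a rate-style value is derived from two consecutive cumulative samples of the same meter and resource, so the first sample seen has nothing to be compared against. A simplified sketch of that idea follows. It is not Ceilometer's transformer code; the class name, the cache layout, and the `vm-1` resource id are made up for illustration, while the two `cpu` volumes are taken from the warnings in the bug description, which are 600 seconds apart.

```python
class RateOfChange:
    """Turn cumulative samples into a per-second rate, per (meter, resource)."""

    def __init__(self):
        self._previous = {}  # (name, resource_id) -> (timestamp_s, volume)

    def handle(self, name, resource_id, timestamp_s, volume):
        key = (name, resource_id)
        previous = self._previous.get(key)
        self._previous[key] = (timestamp_s, volume)
        if previous is None:
            # Nothing to diff against yet, so no rate can be produced.
            print(f"dropping sample with no predecessor: {name} {resource_id}")
            return None
        prev_ts, prev_volume = previous
        elapsed = timestamp_s - prev_ts
        if elapsed <= 0:
            return None
        return (volume - prev_volume) / elapsed


transformer = RateOfChange()
first = transformer.handle("cpu", "vm-1", 0.0, 342530000000)
second = transformer.handle("cpu", "vm-1", 600.0, 353540000000)
print(first, second)
```

In this sketch the cache lives inside one object. If consecutive samples for a resource were handled by different worker processes, each with its own cache, every sample could look like a first one; the differing process IDs in the warnings would be consistent with that, though this is only one plausible reading.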

I believe that in order to get rid of those, workload partitioning needs to be enabled in Ceilometer, and this feature needs to be implemented: https://bugs.launchpad.net/charm-ceilometer/+bug/1768527

However, this seems unrelated to the errors you mention in #5.

To debug further, have you run `juju run-action ceilometer-upgrade` post-deployment as mentioned in the usage guide?
https://github.com/openstack/charm-ceilometer/tree/stable/20.02#usage

Regarding #7, polling.yaml should be updated on config-changed events. Looking at the 20.02 code, this should be done properly. We have a functional test that validates ceilometer.conf change propagation but not for polling.yaml. Could you try toggling debug=True/debug=False to force a config change and see if /etc/ceilometer/polling.yaml gets updated with new values?

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack ceilometer charm because there has been no activity for 60 days.]

Changed in charm-ceilometer:
status: Incomplete → Expired