hook failed: "metric-service-relation-changed" for gnocchi:metric-service

Bug #1746548 reported by Ashley Lai
This bug affects 1 person
Affects: OpenStack Ceilometer Charm
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

cloud:xenial-pike

ceilometer failed in metric-service-relation-changed hook for unit ceilometer/1.

2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer Traceback (most recent call last):
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/bin/ceilometer-upgrade", line 10, in <module>
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer sys.exit(upgrade())
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/ceilometer/cmd/storage.py", line 59, in upgrade
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer gnocchi_client.upgrade_resource_types(conf)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/ceilometer/gnocchi_client.py", line 194, in upgrade_resource_types
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer gnocchi.resource_type.get(name=name)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/gnocchiclient/v1/resource_type.py", line 44, in get
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer headers={'Content-Type': "application/json"}).json()
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/gnocchiclient/v1/base.py", line 37, in _get
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer return self.client.api.get(*args, **kwargs)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 288, in get
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer return self.request(url, 'GET', **kwargs)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/gnocchiclient/client.py", line 35, in request
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer **kwargs)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/keystoneauth1/adapter.py", line 192, in request
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer return self.session.request(url, method, **kwargs)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/positional/__init__.py", line 101, in inner
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer return wrapped(*args, **kwargs)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 703, in request
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer resp = send(**kwargs)
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 795, in _send_request
2018-01-31 16:04:11 DEBUG metric-service-relation-changed 2018-01-31 16:04:11.987 239143 ERROR ceilometer **kwargs)

Ashley Lai (alai) wrote :
James Page (james-page) wrote :

Question: it looks like the gnocchi units and the mongodb units are deployed in the same LXD containers. Is that correct?

James Page (james-page) wrote :

  gnocchi:
    charm: cs:~openstack-charmers-next/xenial/gnocchi
    num_units: 3
    bindings:
      "": *oam-space
      public: *public-space
      admin: *admin-space
      internal: *internal-space
      shared-db: *internal-space
      storage-ceph: *ceph-public-space
      coordinator-memcached: *internal-space
    options:
      worker-multiplier: *worker-multiplier
      openstack-origin: *openstack-origin
      region: *openstack-region
      vip: *gnocchi-vip
      haproxy-server-timeout: *haproxy-server-timeout
      haproxy-client-timeout: *haproxy-client-timeout
      haproxy-queue-timeout: *haproxy-queue-timeout
      haproxy-connect-timeout: *haproxy-connect-timeout
      use-internal-endpoints: True
    to:
    - mongodb/0
    - mongodb/1
    - mongodb/2

James Page (james-page) wrote :

The bundle excerpt above is from another bug report, but I think it confirms that gnocchi is placed in the mongodb containers.

I'm not sure whether that's related, but I'd suggest not co-locating the two.

tags: added: foundations-engine
removed: cpe-foundations
Nobuto Murata (nobuto) wrote :

I was hit by similar behavior in the field. I had no time to investigate it further, but as far as I tested:

- the gnocchi and gnocchi-hacluster charms do not configure any pacemaker resources when multiple vips are specified in the charm config
- other reactive charms such as aodh configure vip resources correctly with a similar multiple-vip config
- no resource was configured even after redeploying gnocchi in a dedicated container, not shared with mongodb.

Nobuto Murata (nobuto) wrote :

LP: #1748286 and LP: #1746548 might be related to each other.

Narinder Gupta (narindergupta) wrote :

Even after deploying in a different LXD container we are facing the same issue. Here is the log from the failing unit.

http://paste.ubuntu.com/p/cjvpC6MPSf

/ceilometer/2* error idle 20/lxd/1 10.181.6.52 8777/tcp hook failed: "metric-service-relation-changed" for gnocchi:metric-service

The error message is the same.

Jason Hobbs (jason-hobbs) wrote :

Why would gnocchi and mongodb in the same container cause issues?

Dmitrii Shcherbakov (dmitriis) wrote :

Jason, it would be good in general to collect more info on pacemaker and corosync status:

$ sudo corosync-cfgtool -s
# sample status
Printing ring status.
Local node ID 303938909
RING ID 0
  id = 10.0.1.1
  status = ring 0 active with no faults
RING ID 1
  id = 192.168.42.1
  status = ring 1 active with no faults

$ sudo corosync-cmapctl

$ sudo crm status

~~~

// What I see based on data from comment #2

➜ crashdump-gnocchi-ceilometer-mongo find . -name 'syslog' | xargs grep -RiP res_gnocchi_eth1_vip_monitor_0 | pastebinit
http://paste.ubuntu.com/p/YBCfwcPYjN/

https://paste.ubuntu.com/p/kHY5BkmrXc/
Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Transition aborted by res_gnocchi_haproxy_monitor_0 'create' on juju-c187ba-18-lxd-3: Event failed (magic=0:0;7:465:7:306017e9-60bd-479b-bdaa-56b18735c47b, cib=0.172.6, source=match_graph_event:381, 0)
...
Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Operation res_gnocchi_haproxy_monitor_0: ok (node=juju-c187ba-16-lxd-3, call=1891, rc=0, cib-update=3289, confirmed=true)
Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Operation res_gnocchi_eth2_vip_monitor_0: not running (node=juju-c187ba-16-lxd-3, call=1886, rc=7, cib-update=3290, confirmed=true)
Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Operation res_gnocchi_eth1_vip_monitor_0: not running (node=juju-c187ba-16-lxd-3, call=1882, rc=7, cib-update=3291, confirmed=true)
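To make excerpts like the one above easier to scan across several syslog files, here is a minimal Python sketch that groups the crmd "Operation ..." lines with a non-zero rc by node. The regular expression is only an assumption based on the three lines quoted above, and `summarize_operations` is a hypothetical helper name, not part of any charm or pacemaker tool. Note that rc=7 on a `*_monitor_0` probe only says the resource was not running on that node when it was probed, so this narrows things down rather than proving a failure.

```python
import re
from collections import defaultdict

# Pattern assumed from the crmd lines quoted in this comment:
#   "... crmd[PID]: notice: Operation <op>: <result> (node=<node>, call=N, rc=N, ...)"
OPERATION_RE = re.compile(
    r"crmd\[\d+\]:\s+notice:\s+Operation\s+(?P<op>\S+):\s+(?P<result>[^(]+?)\s+"
    r"\(node=(?P<node>[^,]+),.*?\brc=(?P<rc>\d+)"
)

# Lines copied from the syslog excerpt above.
SAMPLE = [
    "Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Operation res_gnocchi_haproxy_monitor_0: ok (node=juju-c187ba-16-lxd-3, call=1891, rc=0, cib-update=3289, confirmed=true)",
    "Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Operation res_gnocchi_eth2_vip_monitor_0: not running (node=juju-c187ba-16-lxd-3, call=1886, rc=7, cib-update=3290, confirmed=true)",
    "Jan 31 16:30:03 juju-c187ba-16-lxd-3 crmd[99583]: notice: Operation res_gnocchi_eth1_vip_monitor_0: not running (node=juju-c187ba-16-lxd-3, call=1882, rc=7, cib-update=3291, confirmed=true)",
]


def summarize_operations(lines):
    """Return {node: [(operation, rc, result), ...]} for operations with rc != 0."""
    non_ok = defaultdict(list)
    for line in lines:
        match = OPERATION_RE.search(line)
        if match is None:
            continue  # not an "Operation ..." line, e.g. "Transition aborted ..."
        rc = int(match.group("rc"))
        if rc != 0:
            non_ok[match.group("node")].append(
                (match.group("op"), rc, match.group("result"))
            )
    return dict(non_ok)


if __name__ == "__main__":
    for node, operations in summarize_operations(SAMPLE).items():
        for op, rc, result in operations:
            print("{}: {} rc={} ({})".format(node, op, rc, result))
```

Feeding it the lines from every `syslog` in the crashdump would show at a glance whether the vip resources were ever reported as running on any node.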

Jason Hobbs (jason-hobbs) wrote :

@Dmitrii,

I agree more info on pacemaker and corosync status would be good. What is the appropriate place to collect it? Should the hacluster charm implement collecting that info, maybe as part of its update-status hook?

Dmitrii Shcherbakov (dmitriis) wrote :

@jason-hobbs,

I think during crashdump collection, if that's possible.
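As a rough illustration of what such a collection step could look like, here is a small Python sketch that runs the three status commands mentioned earlier in this thread and keeps their output in a dictionary. It is not how juju-crashdump or the hacluster charm actually work; `collect_cluster_status` and its `run` parameter are hypothetical names, and the sketch assumes it runs as root on a unit where corosync and crmsh are installed.

```python
import subprocess

# The commands suggested earlier in this thread (run as root, so no sudo here).
DIAGNOSTIC_COMMANDS = [
    ["corosync-cfgtool", "-s"],
    ["corosync-cmapctl"],
    ["crm", "status"],
]


def collect_cluster_status(run=subprocess.run, commands=DIAGNOSTIC_COMMANDS):
    """Run each command and return {"command line": output or error note}.

    `run` is injectable so the function can be exercised without a cluster.
    """
    report = {}
    for cmd in commands:
        key = " ".join(cmd)
        try:
            proc = run(
                cmd,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # keep error text next to normal output
                universal_newlines=True,
                timeout=30,
            )
            report[key] = proc.stdout
        except (OSError, subprocess.SubprocessError) as exc:
            # A missing binary or a hung command should not stop the collection.
            report[key] = "could not run: {}".format(exc)
    return report


if __name__ == "__main__":
    for command, output in collect_cluster_status().items():
        print("$ " + command)
        print(output)
```

Recording a failure note instead of raising keeps the collection going on units where one of the tools is missing.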

James Page (james-page) wrote :

I'm going to mark this bug as Invalid; we changed the way ceilometer does its upgrade task to be action-oriented, as it's pretty much impossible to get an in-hook upgrade to work with so many downstream charm dependencies:

   gnocchi
     |----- keystone ---- mysql
     |----- mysql
     |----- ceph-mon ---- ceph-osd
     |----- memcached

Hence it's an external, action-driven task now.

We might get to this once goal-state is fully adopted, but that's not going to be for a few cycles yet.

James Page (james-page)
Changed in charm-ceilometer:
status: New → Invalid