bootstrap_gnocchi fails during deploy (rocky)

Bug #1803522 reported by Marek Grudzinski
This bug affects 3 people
Affects        Status   Importance  Assigned to  Milestone
kolla          Expired  Undecided   Unassigned
kolla-ansible  Expired  Undecided   Unassigned

Bug Description

Deployed with kolla-ansible==7.0.0, openstack_release="rocky"
Kolla-ansible is being run in a venv.

Deploying a very vanilla multinode setup: external ceph (mimic 13.2.2), all core services and some logging. In addition, all telemetry services are enabled, and these are the ones that aren't deploying properly.

#globals.yml

kolla_base_distro: "ubuntu"
kolla_install_type: "source"
openstack_release: "rocky"
node_custom_config: "/etc/kolla/config"
kolla_internal_vip_address: "192.168.9.8"
kolla_external_vip_address: "192.168.9.9"
network_interface: "eno1"
api_interface: "eno1"
storage_interface: "ens4.20"
tunnel_interface: "ens4.40"
neutron_external_interface: "ens4"
neutron_plugin_agent: "openvswitch"
enable_neutron_provider_networks: "yes"
kolla_enable_tls_external: "yes"
kolla_external_fqdn_cert: "{{ node_config_directory }}/certificates/haproxy.pem"
enable_aodh: "yes"
enable_ceilometer: "yes"
enable_central_logging: "yes"
enable_ceph: "no"
enable_chrony: "yes"
enable_cinder: "yes"
enable_cinder_backup: "no"
enable_cinder_backend_hnas_iscsi: "no"
enable_cinder_backend_hnas_nfs: "no"
enable_cinder_backend_iscsi: "no"
enable_cinder_backend_lvm: "no"
enable_cinder_backend_nfs: "no"
enable_fluentd: "yes"
enable_gnocchi: "yes"
enable_haproxy: "yes"
enable_heat: "yes"
enable_horizon: "yes"
enable_openvswitch: "{{ neutron_plugin_agent != 'linuxbridge' }}"
enable_panko: "yes"
enable_redis: "yes"
external_ceph_cephx_enabled: "yes"
enable_ceph_dashboard: "yes"
enable_ceph_rgw_keystone: "yes"
keystone_token_provider: 'fernet'
glance_backend_ceph: "yes"
panko_database_type: "mysql"
gnocchi_backend_storage: "ceph"
gnocchi_pool_name: "metrics"
gnocchi_incoming_storage: "{{ 'redis' if enable_redis | bool else '' }}"
cinder_backend_ceph: "yes"
nova_backend_ceph: "yes"
nova_compute_virt_type: "kvm"

All goes well until kolla-ansible gets to: TASK [gnocchi : Running gnocchi bootstrap container]

This fails with:

2018-11-15 10:58:14,406 [17] INFO gnocchi.service: Gnocchi version 4.3.0
2018-11-15 10:58:14,724 [17] WARNING py.warnings: /var/lib/kolla/venv/local/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)

2018-11-15 10:58:14,898 [17] INFO gnocchi.cli.manage: Upgrading indexer SQLAlchemyIndexer: mysql+pymysql://gnocchi:cti3WaxzusGwKNbVIvaZUB8xPPCC5VMdNtew3ACB@192.168.9.8:3306/gnocchi
2018-11-15 10:58:15,004 [17] INFO gnocchi.common.ceph: Ceph storage backend use 'cradox' python library
2018-11-15 10:58:15,021 [17] INFO gnocchi.cli.manage: Upgrading storage CephStorage: 73f8c334-2b4d-456a-b13c-d691ddd3ce37
2018-11-15 10:58:15,024 [17] ERROR gnocchi.utils: Unable to initialize incoming driver
Traceback (most recent call last):
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/tenacity/__init__.py", line 333, in call
    result = fn(*args, **kwargs)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/incoming/__init__.py", line 268, in get_driver
    conf.incoming, conf.metricd.greedy)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/incoming/redis.py", line 59, in __init__
    self._client, self._scripts = redis.get_client(conf, self._SCRIPTS)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/redis/client.py", line 986, in __getitem__
    raise KeyError(name)

I have also tried using tooz as the coordinator, but that fails as well. I will try to post those logs a bit later.

I believe the solution is to upgrade the installed gnocchi pip package, gnocchi==4.3.0 (in the /var/lib/kolla/venv virtualenv).
Upgrading this to gnocchi==4.3.2 and rerunning gnocchi-upgrade fixes the problem.
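
As a rough illustration of the version comparison described above, here is a minimal sketch that decides whether an installed gnocchi release is older than 4.3.2. The helper names `parse_version` and `needs_upgrade` are hypothetical and not part of gnocchi or kolla, and the parsing only handles plain dotted numeric versions.

```python
def parse_version(text):
    """Turn a dotted release string such as '4.3.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(installed, fixed="4.3.2"):
    """Return True when `installed` is older than the `fixed` release."""
    return parse_version(installed) < parse_version(fixed)


if __name__ == "__main__":
    # 4.3.0 is the version from the log above; 4.3.2 is the one reported to work.
    for version in ("4.3.0", "4.3.2"):
        print("%s needs upgrade: %s" % (version, needs_upgrade(version)))
```

The installed version can be read with `pip show gnocchi`, run with the pip that belongs to the container's virtualenv.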

Can someone confirm? I'm still new to kolla-ansible & kolla. I will try to build a new base-image later.

Marek Grudzinski (ivve)
summary: - bootstrap_fails during deploy
+ bootstrap_gnocchi fails during deploy
Marek Grudzinski (ivve) wrote : Re: bootstrap_gnocchi fails during deploy

When not using redis and instead using tooz as the coordinator for gnocchi, deployment succeeds, but the gnocchi_metricd container floods its log with:

2018-11-15 11:59:26,339 [48] ERROR gnocchi.cli.metricd: Unexpected error during processing job
Traceback (most recent call last):
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/cli/metricd.py", line 85, in run
    self._run_job()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/cli/metricd.py", line 246, in _run_job
    self.coord.update_capabilities(self.GROUP_ID, self.store.statistics)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/tooz/coordination.py", line 592, in update_capabilities
    raise tooz.NotImplemented
NotImplemented

The same fix applies here: upgrading the gnocchi package in the /var/lib/kolla/venv virtualenv solves the issue.

Marek Grudzinski (ivve) wrote :

Good evening.

I just finished re-deploying with new containers.
The problem is solved by rebuilding these containers with the following template overrides:

Containers:
  - gnocchi-api
  - gnocchi-metricd
  - gnocchi-statsd

Template overrides used with kolla-build:

{% extends parent_template %}

{% set gnocchi_base_pip_packages_append = ['gnocchi==4.3.2'] %}

Marek Grudzinski (ivve)
summary: - bootstrap_gnocchi fails during deploy
+ bootstrap_gnocchi fails during deploy (rocky)
Marek Grudzinski (ivve) wrote :

It also seems this error can happen directly after deployment:

2018-11-16 14:44:18,664 [249780] ERROR cotyledon._utils: Unhandled exception
Traceback (most recent call last):
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/cotyledon/_utils.py", line 95, in exit_on_exception
    yield
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/cotyledon/_service.py", line 139, in _run
    self.run()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/cli/metricd.py", line 78, in run
    self._configure()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/tenacity/__init__.py", line 241, in wrapped_f
    return self.call(f, *args, **kw)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/tenacity/__init__.py", line 330, in call
    start_time=start_time)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/tenacity/__init__.py", line 279, in iter
    return fut.result()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/concurrent/futures/_base.py", line 455, in result
    return self.__get_result()
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/tenacity/__init__.py", line 333, in call
    result = fn(*args, **kwargs)
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/cli/metricd.py", line 162, in _configure
    self.fallback_tasks = list(self.incoming.iter_sacks())
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/incoming/__init__.py", line 248, in iter_sacks
    return (self._make_sack(i) for i in six.moves.range(self.NUM_SACKS))
  File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/gnocchi/incoming/__init__.py", line 119, in NUM_SACKS
    raise SackDetectionError(e)
SackDetectionError: int() argument must be a string or a number, not 'NoneType'

It is solved by running gnocchi-upgrade.
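
One plausible reading of the traceback above, offered as an assumption rather than a confirmed diagnosis, is that the sack count had not yet been written to incoming storage, so the value read back was None and the int() conversion failed. The sketch below only mimics that shape: `SackDetectionError` is a stand-in class and `detect_num_sacks` is a hypothetical helper, not gnocchi's actual code.

```python
class SackDetectionError(Exception):
    """Stand-in for the gnocchi exception of the same name."""


def detect_num_sacks(stored_value):
    """Convert the sack count read from incoming storage to an int.

    `stored_value` is hypothetical: it represents whatever the incoming
    driver returns for the sack count, or None when nothing was written.
    """
    try:
        return int(stored_value)
    except (TypeError, ValueError) as e:
        # Wrap the conversion failure, as the traceback suggests gnocchi does.
        raise SackDetectionError(e)


if __name__ == "__main__":
    # A value that was written to storage converts cleanly.
    print(detect_num_sacks("128"))

    # A missing value fails in the same way as the log above.
    try:
        detect_num_sacks(None)
    except SackDetectionError as e:
        print("SackDetectionError: %s" % e)
```

Under this reading, initializing the storage would write the missing value, which would be consistent with the error disappearing afterwards.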

description: updated
Mark Goddard (mgoddard) wrote :

Is there something we need to do in kolla?

Michal Nasiadka (mnasiadka) wrote :

Is this still a bug with latest code? Can you reproduce this and upload new logs?

Changed in kolla:
status: New → Incomplete
Changed in kolla-ansible:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for kolla because there has been no activity for 60 days.]

Changed in kolla:
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for kolla-ansible because there has been no activity for 60 days.]

Changed in kolla-ansible:
status: Incomplete → Expired