Handling of tooz coordinators is messy

Bug #1840070 reported by Radosław Piliszek
This bug affects 1 person
Affects: kolla-ansible
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned

Bug Description

We seem to handle the tooz coordinator URL for:
- cinder
- cloudkitty
- gnocchi
- mistral

However, in each place we do it differently and, for example, it may not be obvious that enabling etcd suddenly reconfigures cinder to use it as its coordinator.

This bug is to track relevant fixes. Starting with cinder now.

This might evolve into a blueprint dealing with how to properly handle the coordinator config across the board.

Changed in kolla-ansible:
assignee: Pili (pili) → nobody
assignee: nobody → Radosław Piliszek (yoctozepto)
milestone: none → 9.0.0
tags: added: coordinator etcd redis tooz
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/676261

OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/676261
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=03b4c706fae54a381b78c0f516499bfac986fb26
Submitter: Zuul
Branch: master

commit 03b4c706fae54a381b78c0f516499bfac986fb26
Author: Radosław Piliszek <email address hidden>
Date: Tue Aug 13 20:22:54 2019 +0200

    Allow cinder coordination backend to be configured

    This is to allow operator to prevent enabling redis and/or
    etcd from magically configuring cinder coordinator.

    Note this change is backwards-compatible.

    Change-Id: Ie10be55968e43e3b9cc347b1b58771c1f7b1b910
    Related-Bug: #1840070
    Signed-off-by: Radosław Piliszek <email address hidden>
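
The behaviour this commit describes can be sketched roughly as follows. This is only an illustration, not the actual kolla-ansible template logic: the function name and its parameters are hypothetical, and the Redis-before-etcd order is taken from the later comment in this report that Cinder prefers Redis and falls back to etcd.

```python
from typing import Optional


def resolve_cinder_coordination_backend(
    explicit: Optional[str],
    redis_enabled: bool,
    etcd_enabled: bool,
) -> str:
    """Return the coordination backend name, or "" for no coordinator.

    Hypothetical helper: `explicit` stands for an operator-provided
    setting, where None means "not set" and "" means "none, please".
    """
    # An explicit operator choice always wins, including "" to opt out.
    if explicit is not None:
        return explicit
    # Otherwise keep the previous implicit behaviour (backwards-compatible):
    # Redis is preferred, etcd is the fallback.
    if redis_enabled:
        return "redis"
    if etcd_enabled:
        return "etcd"
    return ""


if __name__ == "__main__":
    # Enabling etcd alone still selects it unless the operator opts out.
    print(resolve_cinder_coordination_backend(None, False, True))
    print(repr(resolve_cinder_coordination_backend("", False, True)))
```

The point of the sketch is that the default is unchanged, while an operator who enables etcd for another purpose can now say so explicitly.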

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/677060

OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/stein)

Reviewed: https://review.opendev.org/677060
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=439aed8852ba54092d20d8c2dc1d5f1ebeaf012a
Submitter: Zuul
Branch: stable/stein

commit 439aed8852ba54092d20d8c2dc1d5f1ebeaf012a
Author: Radosław Piliszek <email address hidden>
Date: Tue Aug 13 20:22:54 2019 +0200

    Allow cinder coordination backend to be configured

    This is to allow operator to prevent enabling redis and/or
    etcd from magically configuring cinder coordinator.

    Note this change is backwards-compatible.

    Change-Id: Ie10be55968e43e3b9cc347b1b58771c1f7b1b910
    Related-Bug: #1840070
    Signed-off-by: Radosław Piliszek <email address hidden>
    (cherry picked from commit 03b4c706fae54a381b78c0f516499bfac986fb26)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/682095
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=9cae608392802ce4b80e83105b60dbd3611cd7b9
Submitter: Zuul
Branch: master

commit 9cae608392802ce4b80e83105b60dbd3611cd7b9
Author: Joseph M <email address hidden>
Date: Fri Sep 13 11:50:21 2019 -0400

    [designate] Add coordination backend for designate workers

    Add coordination backend configuration to designate.conf which is
    required in multinode environments. Fixes warning from designate:

    WARNING designate.coordination [-] No coordination backend configured,
    assuming we are the only worker. Please configure a coordination backend

    Change-Id: I23c4d2de7e3f9368795c423000a4f9a6c3a431e2
    Closes-Bug: #1843842
    Related-Bug: #1840070

Mark Goddard (mgoddard) wrote :

I don't think we're going to do any more on this in Train (it's kind of a feature). Shall we move to Ussuri?

Radosław Piliszek (yoctozepto) wrote :

Yeah, looks like it. I think those services need different approaches to tooz coordination anyway so it might be ok to leave it the way it is now as at least cinder is fixed.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/694476

Radosław Piliszek (yoctozepto) wrote :

So the default has bitten someone in Stein already: https://bugs.launchpad.net/kolla-ansible/+bug/1852086

I think I might sit on this a bit more to squeeze it out for Train so that we have that cleaned. Feature or not, this can cause needless trouble for operators.

Radosław Piliszek (yoctozepto) wrote :

Ok, so we fixed the issue with etcd3 by replacing it with etcd3gw.

This bug remains open to track the coordinator config.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/697840

OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/stein)

Reviewed: https://review.opendev.org/697840
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=98a0c14eb8dbd95b9b40b408a892c82e6076ad99
Submitter: Zuul
Branch: stable/stein

commit 98a0c14eb8dbd95b9b40b408a892c82e6076ad99
Author: Joseph M <email address hidden>
Date: Fri Sep 13 11:50:21 2019 -0400

    [designate] Add coordination backend for designate workers

    Add coordination backend configuration to designate.conf which is
    required in multinode environments. Fixes warning from designate:

    WARNING designate.coordination [-] No coordination backend configured,
    assuming we are the only worker. Please configure a coordination backend

    Change-Id: I23c4d2de7e3f9368795c423000a4f9a6c3a431e2
    Closes-Bug: #1843842
    Related-Bug: #1840070
    (cherry picked from commit 9cae608392802ce4b80e83105b60dbd3611cd7b9)

Radosław Piliszek (yoctozepto) wrote :

Based on https://bugs.launchpad.net/kolla-ansible/+bug/1872205 - we should check which backends are really supported (based on required features) before throwing in conditionals.

See https://docs.openstack.org/tooz/latest/user/compatibility.html

At the time of writing (Ussuri, unchanged since at least Stein), Redis, Memcached and Zookeeper (via the Kazoo driver) support all the features (grouping, leaders and locking). etcd (all versions) and the others (including sql) support only locking.

We already prefer Redis in 2 services, falling back to etcd: Cinder (fallback to check) and Designate (fallback already known to break, see above).
Mistral forcibly uses Redis and will likely break if left without Redis enabled (to check).
Gnocchi uses Redis conditionally on its availability (both as coordination and incoming driver).
Redis seems to be the preferred osprofiler backend too.

Cloudkitty seems to use mysql for coordination (abomination!).

We don't seem to use Memcached or Zookeeper for coordination (but we deploy them): Memcached obviously for cache, Zookeeper for Kafka (references also seen in Storm, Telegraf and Monasca).

Resilience-wise we deploy Redis in sentinel mode (which is not bad but could be better - to check). Memcached would be a worse choice than Redis (it is really cache-only here). We should in fact aim for Zookeeper.
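
To make the "check required features before adding conditionals" idea concrete, here is a minimal sketch. The feature table only restates the summary above and should be verified against the tooz compatibility page for the release in use; the helper name and the backend keys are hypothetical, not kolla-ansible or tooz identifiers.

```python
from typing import Dict, FrozenSet, Iterable, List

ALL_FEATURES: FrozenSet[str] = frozenset({"grouping", "leaders", "locking"})
LOCKING_ONLY: FrozenSet[str] = frozenset({"locking"})

# Restates the summary in this comment (Stein to Ussuri); not read from tooz.
BACKEND_FEATURES: Dict[str, FrozenSet[str]] = {
    "redis": ALL_FEATURES,
    "memcached": ALL_FEATURES,
    "zookeeper": ALL_FEATURES,
    "etcd": LOCKING_ONLY,
    "sql": LOCKING_ONLY,
}


def usable_backends(required: Iterable[str], enabled: Iterable[str]) -> List[str]:
    """Keep only enabled backends that provide every required feature.

    Unknown backends are treated as providing no features at all.
    """
    needed = frozenset(required)
    return [
        backend
        for backend in enabled
        if needed <= BACKEND_FEATURES.get(backend, frozenset())
    ]


if __name__ == "__main__":
    # Designate needs group membership, so etcd alone leaves it with nothing.
    print(usable_backends({"grouping"}, ["etcd"]))
    print(usable_backends({"grouping"}, ["redis", "etcd"]))
```

With a table like this, a fallback to etcd would only be offered to services that need nothing beyond locking.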

PS: There are other issues with etcd drivers that we already consider(ed):

- etcd-compatible tooz drivers do not support multiple endpoints here (verified in Stein, Train)
- we must use etcd3gw (aka etcd3+http) because the alternative driver (etcd3) has issues with eventlet (as used by designate and cinder)
  see https://bugs.launchpad.net/kolla-ansible/+bug/1854932
  and https://review.opendev.org/466098 for details
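
For illustration, the two URL shapes discussed here could be built like this. It is a sketch under assumptions: the `etcd3+http` scheme comes from the comment above, while the `sentinel` and `sentinel_fallback` query options reflect my understanding of the tooz Redis driver and should be checked against its documentation. The function names, addresses, ports and the master name `kolla` are all hypothetical.

```python
from typing import Optional, Sequence, Tuple
from urllib.parse import quote, urlencode

Endpoint = Tuple[str, int]


def etcd3gw_url(host: str, port: int) -> str:
    """Build a single-endpoint etcd3gw URL (multiple endpoints are not supported)."""
    return f"etcd3+http://{host}:{port}"


def redis_sentinel_url(
    sentinels: Sequence[Endpoint],
    master_name: str,
    password: Optional[str] = None,
) -> str:
    """Build a Redis URL that points at the first sentinel and lists the rest as fallbacks."""
    if not sentinels:
        raise ValueError("at least one sentinel endpoint is required")
    (first_host, first_port), *rest = sentinels
    # Percent-encode the password so characters like "@" do not break the URL.
    auth = f":{quote(password, safe='')}@" if password else ""
    query = [("sentinel", master_name)]
    query += [("sentinel_fallback", f"{host}:{port}") for host, port in rest]
    # Keep ":" readable in host:port values.
    return f"redis://{auth}{first_host}:{first_port}?{urlencode(query, safe=':')}"


if __name__ == "__main__":
    print(etcd3gw_url("10.0.0.1", 2379))
    print(redis_sentinel_url([("10.0.0.1", 26379), ("10.0.0.2", 26379)], "kolla"))
```

The asymmetry is the point: the Redis form can name every sentinel, whereas the etcd form can only name one host.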

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/719583

OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/719583
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=3c234603a9c8e443dc3d38989e3e78147757d1da
Submitter: Zuul
Branch: master

commit 3c234603a9c8e443dc3d38989e3e78147757d1da
Author: Radosław Piliszek <email address hidden>
Date: Mon Apr 13 17:33:02 2020 +0200

    Fix Designate not to use etcd coordination backend

    etcd via tooz does not support group membership required by
    Designate coordination.
    The best k-a can do is not to configure etcd in Designate.

    Change-Id: I2f64f928e730355142ac369d8868cf9f65ca357e
    Closes-bug: #1872205
    Related-bug: #1840070

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/723253

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/stein)

Related fix proposed to branch: stable/stein
Review: https://review.opendev.org/723255

OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/train)

Reviewed: https://review.opendev.org/723253
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=8c7afb73bcc1dc5e8982660b7c3774686f658356
Submitter: Zuul
Branch: stable/train

commit 8c7afb73bcc1dc5e8982660b7c3774686f658356
Author: Radosław Piliszek <email address hidden>
Date: Mon Apr 13 17:33:02 2020 +0200

    Fix Designate not to use etcd coordination backend

    etcd via tooz does not support group membership required by
    Designate coordination.
    The best k-a can do is not to configure etcd in Designate.

    Change-Id: I2f64f928e730355142ac369d8868cf9f65ca357e
    Closes-bug: #1872205
    Related-bug: #1840070
    (cherry picked from commit 3c234603a9c8e443dc3d38989e3e78147757d1da)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/stein)

Reviewed: https://review.opendev.org/723255
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=66a2f493589bd737d8a4302ff27c0c430371ee68
Submitter: Zuul
Branch: stable/stein

commit 66a2f493589bd737d8a4302ff27c0c430371ee68
Author: Radosław Piliszek <email address hidden>
Date: Mon Apr 13 17:33:02 2020 +0200

    Fix Designate not to use etcd coordination backend

    etcd via tooz does not support group membership required by
    Designate coordination.
    The best k-a can do is not to configure etcd in Designate.

    Change-Id: I2f64f928e730355142ac369d8868cf9f65ca357e
    Closes-bug: #1872205
    Related-bug: #1840070
    (cherry picked from commit 3c234603a9c8e443dc3d38989e3e78147757d1da)

no longer affects: kolla-ansible/stein
no longer affects: kolla-ansible/train
no longer affects: kolla-ansible/ussuri
Changed in kolla-ansible:
status: Fix Committed → Triaged
assignee: Radosław Piliszek (yoctozepto) → nobody
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/735520

OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (stable/rocky)

Reviewed: https://review.opendev.org/735520
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=4cd14150279d939774347c5d7e1ae903ac0824cc
Submitter: Zuul
Branch: stable/rocky

commit 4cd14150279d939774347c5d7e1ae903ac0824cc
Author: Radosław Piliszek <email address hidden>
Date: Tue Aug 13 20:22:54 2019 +0200

    Allow cinder coordination backend to be configured

    This is to allow operator to prevent enabling redis and/or
    etcd from magically configuring cinder coordinator.

    Note this change is backwards-compatible.

    Backporting to Rocky to allow users to fix their
    coordination configs.
    It is not possible to fix it (unset) via an override.
    See [1].

    [1] https://launchpad.net/bugs/1883310

    Change-Id: Ie10be55968e43e3b9cc347b1b58771c1f7b1b910
    Related-Bug: #1840070
    Closes-Bug: #1883310
    Signed-off-by: Radosław Piliszek <email address hidden>
    (cherry picked from commit 03b4c706fae54a381b78c0f516499bfac986fb26)
    (cherry picked from commit 439aed8852ba54092d20d8c2dc1d5f1ebeaf012a)

tags: added: in-stable-rocky