New control node is not added to etcd cluster (ZUN/KURYR)

Bug #2017669 reported by ICT
Affects: kolla-ansible (Status: New, Importance: Undecided, Assigned to: Unassigned)

Bug Description

Hi,

I'm running kolla-ansible version 14.8.1 on a YOGA cluster with ZUN enabled. When adding a new control node, it is not added to the etcd cluster. In my test environment I have 3 control nodes:

control01-03

I added a new control node control04 and afterwards removed control03, following the official kolla-ansible guide https://docs.openstack.org/kolla-ansible/yoga/user/adding-and-removing-hosts.html. I would have expected the new control node control04 to be joined to the etcd cluster, but that is not the case:

(etcd)[etcd@control01 /]$ etcdctl -C http://192.168.20.142:2379 cluster-health
failed to check the health of member c6e74a554ab6931 on http://192.168.20.143:2379: Get http://192.168.20.143:2379/health: dial tcp 192.168.20.143:2379: connect: connection refused
member c6e74a554ab6931 is unreachable: [http://192.168.20.143:2379] are all unreachable
member 52148ade03e00536 is healthy: got healthy result from http://192.168.20.142:2379
member e4fbe16b914befd2 is healthy: got healthy result from http://192.168.20.141:2379
cluster is healthy

Maybe this is intended, but I think the new control node should be joined to the cluster.
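A quick way to confirm the symptom is to check whether the new controller's address appears in the member list at all. The sketch below runs that check against the captured cluster-health output from above; on a live cluster one would pipe the output of `etcdctl -C http://192.168.20.142:2379 member list` instead, and the address used for control04 here is a hypothetical placeholder.

```shell
# Captured cluster-health output from above (stand-in for live
# `etcdctl -C http://192.168.20.142:2379 member list` output):
health_output='member 52148ade03e00536 is healthy: got healthy result from http://192.168.20.142:2379
member e4fbe16b914befd2 is healthy: got healthy result from http://192.168.20.141:2379'

new_node="192.168.20.144"   # hypothetical address of control04

# If the address is absent, the new controller was never joined to etcd.
if echo "$health_output" | grep -q "$new_node"; then
  echo "control04 is an etcd cluster member"
else
  echo "control04 missing from etcd cluster"
fi
```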

Cheers,
Oliver

**Environment**:
* OS (e.g. from /etc/os-release):

NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"

* Kernel (e.g. `uname -a`):

Linux control01 4.18.0-425.10.1.el8_7.x86_64 #1 SMP Thu Jan 12 16:32:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

* Docker version if applicable (e.g. `docker version`):

Docker version 23.0.4, build f480fb1

* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release):

14.8.1

* Docker image Install type (source/binary):

source

* Docker image distribution:

kolla_base_distro: "centos"

* Are you using official images from Docker Hub or self built?

official

Revision history for this message
Peter Struys (peterstruys) wrote :

Hi,

Old bug, but I had a similar experience with openstack-ansible and zun. I was able to solve it by changing a parameter in the file /etc/default/etcd in the zun container on the new controller. This file contains a parameter ETCD_INITIAL_CLUSTER_STATE, which was set to "new". As a consequence, the newly installed controller was forming a cluster on its own instead of joining the existing etcd cluster. After changing this parameter to "existing", and double-checking the other parameters against the same file on the other controllers, the etcd service was able to start and the playbook continued to a successful end. But this was openstack-ansible, not kolla, so YMMV.

regards
Peter
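The change Peter describes can be sketched as follows. The file path and the "new"/"existing" values come from his comment; everything else is an assumption. The sketch edits a temporary copy so it can be run anywhere, whereas on a real deployment the file would be /etc/default/etcd inside the container on the new controller.

```shell
# Demonstration on a temporary copy; on the new controller the real path
# (per the comment above) would be /etc/default/etcd inside the container.
conf=$(mktemp)
printf 'ETCD_INITIAL_CLUSTER_STATE="new"\n' > "$conf"

# "new" makes the member bootstrap its own cluster; "existing" makes it
# join the already-running etcd cluster instead.
sed -i 's/^ETCD_INITIAL_CLUSTER_STATE=.*/ETCD_INITIAL_CLUSTER_STATE="existing"/' "$conf"

cat "$conf"   # ETCD_INITIAL_CLUSTER_STATE="existing"
```

After the change, the remaining parameters (initial cluster list, peer URLs) should still be compared against the same file on the other controllers before restarting the etcd service, as Peter notes.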
