kolla_toolbox fails to deploy after switching from kolla-ansible 16.1.0 to master tag

Bug #2038914 reported by Marcel Hauptmann
This bug affects 1 person
Affects        Status  Importance  Assigned to  Milestone
kolla-ansible  New     Undecided   Unassigned

Bug Description

**Bug Report**

We successfully deployed OpenStack with kolla-ansible version 16.1.0. Unfortunately, the OpenSearch container does not work in this version: it fails to start and always crashes.

We would therefore like to deploy OpenStack with kolla-ansible master. With that version, however, the kolla_toolbox command is not found inside the kolla_toolbox container during deployment, and the deployment fails on all nodes:

----------------------------

Error in the GitLab pipeline:

fatal: [10.21.20.2]: FAILED! => {"Command": "dumb-init --single-child -- kolla_start", "Created": 1696926350, "HostConfig": {"NetworkMode": "host"}, "Id": "395e3ee92d5146ae6e0f3a2dab16194f7499f898d7e00fb7a2847ee72d44e2b7", "Image": "quay.io/openstack.kolla/kolla-toolbox:2023.1-rocky-9", "ImageID": "sha256:5b8102090b7995252a753d2836c7b10545e40939572cfb8e7ab82c28d901e8f3", "Labels": {"build-date": "20230920", "io.buildah.version": "1.30.0", "kolla_version": "16.1.1", "maintainer": "Kolla Project (https://launchpad.net/kolla)", "name": "kolla-toolbox"}, "Mounts": [{"Destination": "/dev", "Mode": "rw", "Propagation": "rprivate", "RW": true, "Source": "/dev", "Type": "bind"}, {"Destination": "/etc/localtime", "Mode": "ro", "Propagation": "rprivate", "RW": false, "Source": "/etc/localtime", "Type": "bind"}, {"Destination": "/run", "Mode": "shared", "Propagation": "shared", "RW": true, "Source": "/run", "Type": "bind"}, {"Destination": "/var/lib/kolla/config_files", "Mode": "ro", "Propagation": "rprivate", "RW": false, "Source": "/etc/kolla/kolla-toolbox", "Type": "bind"}, {"Destination": "/var/log/kolla", "Driver": "local", "Mode": "rw", "Name": "kolla_logs", "Propagation": "", "RW": true, "Source": "/var/lib/docker/volumes/kolla_logs/_data", "Type": "volume"}], "Names": ["/kolla_toolbox"], "NetworkSettings": {"Networks": {"host": {"Aliases": null, "DriverOpts": null, "EndpointID": "", "Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "IPAMConfig": null, "IPAddress": "", "IPPrefixLen": 0, "IPv6Gateway": "", "Links": null, "MacAddress": "", "NetworkID": "8a180824a28e73a21f8806bebeb90a64aedde3789fbe1365e8fad3a9ac9e8bd2"}}}, "Ports": [], "State": "exited", "Status": "Exited (127) 23 seconds ago", "changed": true, "msg": "Container timed out"}
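
The failure payload above is easier to read once the few relevant fields are pulled out. The following is a minimal sketch, not part of kolla-ansible: `summarize_failure` is a hypothetical helper, and `sample` is a trimmed copy of fields taken from the error above. It surfaces the image tag, the `kolla_version` label baked into the image, and the container exit code.

```python
import json
import re


def summarize_failure(payload: str) -> dict:
    """Extract the fields that matter for triage from an Ansible failure payload.

    Hypothetical helper for illustration; it only reads keys that appear in
    the error output quoted in this report.
    """
    data = json.loads(payload)
    labels = data.get("Labels", {})
    # Status looks like "Exited (127) 23 seconds ago" when the container has exited.
    match = re.search(r"Exited \((\d+)\)", data.get("Status", ""))
    return {
        "image": data.get("Image"),
        "kolla_version": labels.get("kolla_version"),
        "state": data.get("State"),
        "exit_code": int(match.group(1)) if match else None,
    }


# Trimmed copy of the payload from the error above.
sample = (
    '{"Image": "quay.io/openstack.kolla/kolla-toolbox:2023.1-rocky-9", '
    '"Labels": {"kolla_version": "16.1.1"}, '
    '"State": "exited", "Status": "Exited (127) 23 seconds ago"}'
)
summary = summarize_failure(sample)
print(summary)
```

By shell convention, exit status 127 means "command not found", which matches the last line of the container log. The summary also makes it visible that the deployment used a 2023.1 image (label `kolla_version` 16.1.1) while the playbooks came from master.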

----------------------------

Output of docker logs kolla_toolbox:

+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /var/lib/rabbitmq/.erlang.cookie
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq-erlang.cookie to /var/lib/rabbitmq/.erlang.cookie
INFO:__main__:Setting permission for /var/lib/rabbitmq/.erlang.cookie
INFO:__main__:Deleting /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Copying /var/lib/kolla/config_files/rabbitmq-env.conf to /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Setting permission for /etc/rabbitmq/rabbitmq-env.conf
INFO:__main__:Deleting /etc/rabbitmq/erl_inetrc
INFO:__main__:Copying /var/lib/kolla/config_files/erl_inetrc to /etc/rabbitmq/erl_inetrc
INFO:__main__:Setting permission for /etc/rabbitmq/erl_inetrc
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/log/kolla/ansible.log
++ cat /run_command
+ CMD=kolla_toolbox
+ ARGS=
+ sudo kolla_copy_cacerts
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ [[ ! -f /var/log/kolla/ansible.log ]]
+++ stat -c %U:%G /var/log/kolla/ansible.log
++ [[ ansible:kolla != \a\n\s\i\b\l\e\:\k\o\l\l\a ]]
+++ stat -c %a /var/log/kolla/ansible.log
++ [[ 664 != \6\6\4 ]]
+ echo 'Running command: '\''kolla_toolbox'\'''
+ exec kolla_toolbox
Running command: 'kolla_toolbox'
/usr/local/bin/kolla_start: line 24: exec: kolla_toolbox: not found
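
The log shows `kolla_set_configs` loading `/var/lib/kolla/config_files/config.json` and `kolla_start` then trying to exec the command it wrote to `/run_command`. A quick check is whether the executable named in that config exists in the image at all. The sketch below assumes the config file carries a top-level `command` string; `command_from_config`, `is_on_path`, and `sample_config` are hypothetical names and data for illustration, modelled on the `CMD=kolla_toolbox` line in the log.

```python
import json
import shlex
import shutil


def command_from_config(config_text: str) -> str:
    """Return the executable (first token) of the "command" entry.

    Assumption: config.json has a top-level "command" string.
    """
    config = json.loads(config_text)
    return shlex.split(config["command"])[0]


def is_on_path(executable: str, which=shutil.which) -> bool:
    """True if the executable can be resolved on PATH in this environment."""
    return which(executable) is not None


# Hypothetical config content, modelled on CMD=kolla_toolbox from the log.
sample_config = '{"command": "kolla_toolbox", "config_files": []}'
exe = command_from_config(sample_config)
print(exe, "found" if is_on_path(exe) else "not found on PATH")
```

Run inside the container, a "not found" result would be consistent with the `exec: kolla_toolbox: not found` error, that is, with a config generated by one kolla-ansible version pointing at a command the deployed image does not ship.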

----------------------------

How can we fix kolla_toolbox in kolla-ansible master?

Thanks in advance!

**Environment**
- OS: Rocky Linux 9.2
- OpenStack Release: 2023.1
- Kernel: 5.14.0-284.25.1.el9_2.x86_64
- Docker version: 24.0.6
- Kolla-ansible version: switched from 16.1.0 to master
- Official images from Docker Hub or self-built: official images

summary: - kolla_toolbox fails to deploy after switching from kolla-ansible 16.0.1
+ kolla_toolbox fails to deploy after switching from kolla-ansible 16.1.0
to master tag
description: updated
Satish Patel (satish-txt) wrote :

I am seeing a similar issue on Ubuntu 22.04 with kolla-ansible tag 16.1.9; almost all images throw this error:

RUNNING HANDLER [common : Restart fluentd container] *****************************************************************************************************************************
fatal: [os1-bos-ctrl02]: FAILED! => {"Command": "dumb-init --single-child -- kolla_start", "Created": 1698865177, "HostConfig": {"NetworkMode": "host"}, "Id": "ae17d4d276fd278d15c5df03fdee70eafbff66fecbc78ae0bbae64274daa910b", "Image": "registry.example.com/kolla/fluentd:2023.1-ubuntu-jammy", "ImageID": "sha256:41752d28100fd85074d112942f843ce144df8bb023d7bdf664df0ef7be9a50ea", "Labels": {"build-date": "20231029", "kolla_version": "16.1.1", "maintainer": "Kolla Project (https://launchpad.net/kolla)", "name": "fluentd", "org.opencontainers.image.ref.name": "ubuntu", "org.opencontainers.image.version": "22.04"}, "Mounts": [{"Destination": "/etc/localtime", "Mode": "ro", "Propagation": "rprivate", "RW": false, "Source": "/etc/localtime", "Type": "bind"}, {"Destination": "/etc/timezone", "Mode": "ro", "Propagation": "rprivate", "RW": false, "Source": "/etc/timezone", "Type": "bind"}, {"Destination": "/var/lib/fluentd/data", "Driver": "local", "Mode": "rw", "Name": "fluentd_data", "Propagation": "", "RW": true, "Source": "/var/lib/docker/volumes/fluentd_data/_data", "Type": "volume"}, {"Destination": "/var/lib/kolla/config_files", "Mode": "ro", "Propagation": "rprivate", "RW": false, "Source": "/etc/kolla/fluentd", "Type": "bind"}, {"Destination": "/var/log/kolla", "Driver": "local", "Mode": "rw", "Name": "kolla_logs", "Propagation": "", "RW": true, "Source": "/var/lib/docker/volumes/kolla_logs/_data", "Type": "volume"}], "Names": ["/fluentd"], "NetworkSettings": {"Networks": {"host": {"Aliases": null, "DriverOpts": null, "EndpointID": "", "Gateway": "", "GlobalIPv6Address": "", "GlobalIPv6PrefixLen": 0, "IPAMConfig": null, "IPAddress": "", "IPPrefixLen": 0, "IPv6Gateway": "", "Links": null, "MacAddress": "", "NetworkID": ""}}}, "Ports": [], "State": "created", "Status": "Created", "changed": true, "msg": "Container timed out"}
fatal: [os1-bos-ctrl01]: FAILED! => {"Command": "dumb-init --single-child -- kolla_start", "Created": 1698865177, "HostConfig": {"NetworkMode": "host"}, "Id": "1ca559230483b490830dc2e38791fc8c567702d3280713a6f0c7bc45f2c8a038", "Image": "registry.example.com/kolla/fluentd:2023.1-ubuntu-jammy", "ImageID": "sha256:41752d28100fd85074d112942f843ce144df8bb023d7bdf664df0ef7be9a50ea", "Labels": {"build-date": "20231029", "kolla_version": "16.1.1", "maintainer": "Kolla Project (https://launchpad.net/kolla)", "name": "fluentd", "org.opencontainers.image.ref.name": "ubuntu", "org.opencontainers.image.version": "22.04"}, "Mounts": [{"Destination": "/etc/timezone", "Mode": "ro", "Propagation": "rprivate", "RW": false, "Source": "/etc/timezone", "Type": "bind"}, {"Destination": "/var/lib/fluentd/data", "Driver": "local", "Mode": "rw", "Name": "fluentd_data", "Propagation": "", "RW": true, "Source": "/var/lib/docker/volumes/fluentd_data/_data", "Type": "volume"}, {"Destination": "/var/lib/kolla/config_files", "Mode": "ro", "Propagation": "rprivate"...


Sven Kieske (s-kieske) wrote :

The OpenSearch issue might be fixed by yesterday's(?) new release:

https://docs.openstack.org/releasenotes/kolla-ansible/2023.1.html#relnotes-16-1-0-32-stable-2023-1

Fixes an issue where a Docker health check wasn’t configured for the OpenSearch Dashboards container. See bug 2028362.

Actually, looking at https://bugs.launchpad.net/kolla-ansible/+bug/2028362, it seems to be unrelated.

Do you use the pip releases, or do you install from the git stable branches?

Only the latter is really supported, but the docs have not been updated yet.

It would be good to have two bug reports for this: one for kolla-toolbox not working and one for OpenSearch.

I haven't seen either of those myself yet. I'd assume our CI would be really red if kolla-toolbox did not work on master. This is very weird.

Changed in kolla-ansible:
status: New → Incomplete
Marcel Hauptmann (marcelhauptmann) wrote :

Sorry for the late reply. We install all services from the git stable branches. It is really weird that kolla-toolbox fails when using the master branch...

Changed in kolla-ansible:
status: Incomplete → New