Description
===========
After a fresh installation of TripleO Stein (stable) on 5 nodes (3 HA controllers and 2 computes),
the rabbitmq bundle and some other resources are in a failed state in Pacemaker.
Steps to reproduce
==================
1- Install the undercloud
2- Install the overcloud with this command:
openstack overcloud deploy \
--control-flavor control \
--compute-flavor compute \
--templates ~/openstack-tripleo-heat-templates \
-r /home/stack/roles_data.yaml \
-e /home/stack/containers-prepare-parameter.yaml \
-e environment.yaml \
-e ~/openstack-tripleo-heat-templates/environments/services/neutron-ovn-dvr-ha.yaml \
-e ~/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e ~/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ~/openstack-tripleo-heat-templates/environments/network-environment.yaml \
--timeout 360 \
--ntp-server pool.ntp.org
I got the same result without network isolation and the custom network environment, using completely default settings.
Expected result
===============
Fresh healthy HA OpenStack.
Actual result
=============
pcs status output is as follows:
Full list of resources:
 Docker container set: rabbitmq-bundle [192.168.24.1:8787/tripleostein/centos-binary-rabbitmq:pcmklatest]
   rabbitmq-bundle-0 (ocf::heartbeat:rabbitmq-cluster): FAILED overcloud-controller-0 (Monitoring)
   rabbitmq-bundle-1 (ocf::heartbeat:rabbitmq-cluster): Stopped overcloud-controller-1
   rabbitmq-bundle-2 (ocf::heartbeat:rabbitmq-cluster): Stopped overcloud-controller-2
 Docker container set: galera-bundle [192.168.24.1:8787/tripleostein/centos-binary-mariadb:pcmklatest]
   galera-bundle-0 (ocf::heartbeat:galera): Master overcloud-controller-0
   galera-bundle-1 (ocf::heartbeat:galera): Master overcloud-controller-1
   galera-bundle-2 (ocf::heartbeat:galera): Master overcloud-controller-2
 Docker container set: redis-bundle [192.168.24.1:8787/tripleostein/centos-binary-redis:pcmklatest]
   redis-bundle-0 (ocf::heartbeat:redis): Master overcloud-controller-0
   redis-bundle-1 (ocf::heartbeat:redis): Slave overcloud-controller-1
   redis-bundle-2 (ocf::heartbeat:redis): Slave overcloud-controller-2
 ip-192.168.24.10 (ocf::heartbeat:IPaddr2): Started overcloud-controller-0
 ip-X.X.X.X (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
 ip-172.16.2.175 (ocf::heartbeat:IPaddr2): Started overcloud-controller-2
 ip-172.16.2.41 (ocf::heartbeat:IPaddr2): Started overcloud-controller-1
 ip-172.16.1.166 (ocf::heartbeat:IPaddr2): Stopped
 ip-172.16.3.10 (ocf::heartbeat:IPaddr2): Stopped
 Docker container set: haproxy-bundle [192.168.24.1:8787/tripleostein/centos-binary-haproxy:pcmklatest]
   haproxy-bundle-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-1
   haproxy-bundle-docker-1 (ocf::heartbeat:docker): Started overcloud-controller-2
   haproxy-bundle-docker-2 (ocf::heartbeat:docker): Started overcloud-controller-0
 Docker container set: ovn-dbs-bundle [192.168.24.1:8787/tripleostein/centos-binary-ovn-northd:pcmklatest]
   ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master overcloud-controller-1
   ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Slave overcloud-controller-2
   ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave overcloud-controller-0
 Docker container: openstack-cinder-volume [192.168.24.1:8787/tripleostein/centos-binary-cinder-volume:pcmklatest]
   openstack-cinder-volume-docker-0 (ocf::heartbeat:docker): Started overcloud-controller-0
Failed Resource Actions:
* ip-172.16.1.166_start_0 on overcloud-controller-0 'unknown error' (1): call=89, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:01 2020', queued=0ms, exec=111ms
* ip-172.16.3.10_start_0 on overcloud-controller-0 'unknown error' (1): call=95, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:42 2020', queued=0ms, exec=103ms
* ip-172.16.1.166_start_0 on overcloud-controller-1 'unknown error' (1): call=87, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:00 2020', queued=1ms, exec=147ms
* ip-172.16.3.10_start_0 on overcloud-controller-1 'unknown error' (1): call=93, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:41 2020', queued=0ms, exec=99ms
* ip-172.16.1.166_start_0 on overcloud-controller-2 'unknown error' (1): call=87, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:00 2020', queued=0ms, exec=105ms
* ip-172.16.3.10_start_0 on overcloud-controller-2 'unknown error' (1): call=93, status=complete, exitreason='[findif] failed',
    last-rc-change='Sun Apr 26 17:19:42 2020', queued=0ms, exec=93ms
* rabbitmq_start_0 on rabbitmq-bundle-1 'unknown error' (1): call=2121, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:18:42 2020', queued=0ms, exec=200049ms
* rabbitmq_start_0 on rabbitmq-bundle-2 'unknown error' (1): call=1979, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:04:59 2020', queued=0ms, exec=200031ms
* rabbitmq_monitor_10000 on rabbitmq-bundle-0 'unknown error' (1): call=2298, status=Timed Out, exitreason='',
    last-rc-change='Sun Apr 26 18:36:59 2020', queued=0ms, exec=40036ms
* ovndb_servers_monitor_30000 on ovn-dbs-bundle-2 'not running' (7): call=23, status=complete, exitreason='',
    last-rc-change='Sun Apr 26 17:33:03 2020', queued=0ms, exec=1806ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
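A possible check for the '[findif] failed' errors on the two stopped VIPs, offered only as a rough sketch: as far as I understand, the IPaddr2 agent can fail this way when it cannot find a local interface whose subnet contains the address. The helper names below (ip_to_int, ip_in_cidr, ifaces_for_vip) are hypothetical and not part of pcs or TripleO, and the /24 prefixes in the comments are an assumption, since the real prefixes are not shown in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they only read local interface data and change nothing.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    local IFS=.
    read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 lies inside CIDR $2 (for example 172.16.1.0/24).
ip_in_cidr() {
    local ip_int net_int prefix mask
    ip_int=$(ip_to_int "$1")
    net_int=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    (( (ip_int & mask) == (net_int & mask) ))
}

# Print every local interface whose IPv4 subnet contains VIP $1.
# `ip -o -4 addr show` prints one line per address:
#   <index>: <iface> inet <addr>/<prefix> ...
ifaces_for_vip() {
    local vip=$1 iface cidr
    ip -o -4 addr show | while read -r _ iface _ cidr _; do
        ip_in_cidr "$vip" "$cidr" && echo "$iface $cidr"
    done
}
```

Run on each controller, `ifaces_for_vip 172.16.1.166` and `ifaces_for_vip 172.16.3.10` would list the matching interfaces. Empty output would suggest that no interface on that node carries an address in the VIP's network, which would narrow the problem to the network configuration rather than to Pacemaker itself.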
The cluster appears to be completely unhealthy. Even running these commands doesn't help:
pcs resource restart rabbitmq-bundle
pcs resource cleanup rabbitmq-bundle
Neither does restarting the whole cluster, or restarting all nodes after deleting the RabbitMQ mnesia directory.
All requests on the overcloud are extremely slow; Horizon takes one minute for each refresh.
Adding additional services such as Octavia causes the overcloud installation to fail with a 504 Gateway Timeout.
Environment
===========
TripleO OpenStack Stable/Stein
Logs & Configs
==============
I can provide any required log or config.