Tripleo 16.1 Beta Train Octavia service returns 503
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
tripleo | Fix Released | High | Unassigned | |
Bug Description
Description
===========
TripleO 16.1 Beta Centos7 Train/Current deploys Octavia on overcloud using the supplied template
-e /usr/share/
The configuration is 3 Controllers, 3 Compute, 3 Ceph-Storage with SSL Public Endpoints and OVN (ovn-ha-dvr).
However, once deployed, all loadbalancer commands result in an HTTP 503 Service Unavailable response. Here are the logs from driver-agent.log and octavia.log from any of the three controllers.
/var/log/
2020-07-15 11:00:47.730 24 INFO octavia.
2020-07-15 11:00:47.730 24 INFO octavia.
2020-07-15 11:00:47.774 25 INFO octavia.
2020-07-15 11:00:47.774 25 INFO octavia.
2020-07-15 11:03:12.962 25 ERROR octavia.
2020-07-15 11:03:12.977 24 ERROR octavia.
2020-07-15 11:03:13.100 24 INFO octavia.
2020-07-15 11:03:13.100 24 INFO octavia.
2020-07-15 11:03:13.106 25 INFO octavia.
2020-07-15 11:03:13.107 25 INFO octavia.
2020-07-15 11:05:30.297 24 ERROR octavia.
2020-07-15 11:05:30.298 25 ERROR octavia.
/var/log/
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
2020-07-15 11:59:34.673 32 ERROR octavia.
Steps to reproduce
==================
1) Deploy Undercloud configuration of Tripleo 16.1 Beta (upstream current) for Train on CentOS7.
2) Deploy Overcloud with Octavia by adding
-e /usr/share/
3) Deployment has 3 controllers, 3 compute, 3 ceph-storage nodes; manila, barbican, octavia, ironic and sahara projects.
THT=/usr/
openstack overcloud deploy --templates $THT \
-e $THT/environmen
-e /home/stack/
-e $THT/environmen
-e $THT/environmen
-e $THT/environmen
-e $THT/environmen
-e $THT/environmen
-e /home/stack/
-e $THT/environmen
-r /home/stack/
-n /home/stack/
-e /home/stack/
-e /home/stack/
-e /home/stack/
-e /home/stack/
-e /home/stack/
-e /home/stack/
-e $THT/environmen
-e $THT/environmen
-e $THT/environmen
-e $THT/environmen
-e $THT/environmen
-e /home/stack/
-e /home/stack/
-e $THT/environmen
-e $THT/environmen
-e /home/stack/
-e /home/stack/
-e $THT/environmen
--libvirt-type kvm \
--timeout 210
4) From the Undercloud, source overcloudrc, then run "openstack loadbalancer create lb1 ..."
5) HTTP 503 Service Unavailable is returned.
Expected result
===============
The command should result in the creation of load balancer lb1 with default settings.
Actual result
=============
Service Unavailable HTTP 503.
ADDITIONAL INFORMATION
- I have tracked the bug to the octavia_
- The docker container USER is "octavia", hence the "[Errno 13] Permission Denied".
- The docker image is centos-
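To confirm the permission problem on each controller, it can help to count the "[Errno 13]" lines in a log file. This is only a minimal sketch: the function name count_errno13 is hypothetical, and the log path is passed in as an argument rather than assumed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TripleO or Octavia): count the lines
# containing "[Errno 13]" in the log file given as the first argument.
count_errno13() {
    local logfile="$1"
    # -F matches a fixed string, so the square brackets are literal.
    # grep -c prints 0 and exits 1 when nothing matches; "|| true" keeps
    # the exit status clean for callers running under "set -e".
    grep -cF -e '[Errno 13]' -- "$logfile" || true
}
```

Usage would be something like count_errno13 path/to/driver-agent.log on each of the three controllers; a non-zero count suggests the container user still cannot write where it needs to.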
Changed in tripleo: | |
milestone: | none → victoria-3 |
status: | New → Triaged |
importance: | Undecided → High |
tags: | added: queens-backport-potential train-backport-potential ussuri-backport-potential |
Changed in tripleo: | |
milestone: | victoria-3 → wallaby-1 |
Changed in tripleo: | |
milestone: | wallaby-1 → wallaby-2 |
Changed in tripleo: | |
milestone: | wallaby-2 → wallaby-3 |
Changed in tripleo: | |
milestone: | wallaby-3 → wallaby-rc1 |
Changed in tripleo: | |
status: | Triaged → Fix Released |
Okay, I have workarounds for all the bugs that stopped Octavia (OpenStack Loadbalancer) from coming up on a production-capable TripleO 16.1 Beta (Train) installation on CentOS7. I hate having to answer all my own questions.
OCTAVIA BUGS OR MISCONFIGURATIONS
1. /var/run/octavia has incorrect ownership of root.root. It needs to be octavia.octavia. This prevents the new driver_agent from running, so nothing else runs either.
WORKAROUND
Go to the Docker Container octavia_driver_agent and change the ownership to octavia.octavia then restart all octavia Docker Containers. Notice, we are all Docker here because Podman does not work with Pacemaker HA on CentOS7.
You have to execute these on all controllers.
-----------------COMMANDS
sudo docker exec -ti --user root octavia_driver_agent /bin/bash # Root on the container
chown octavia:octavia /var/run/octavia
exit
sudo docker restart octavia_driver_agent
sudo docker restart octavia_worker
sudo docker restart octavia_housekeeping
sudo docker restart octavia_health_manager
------------------
I did try changing the docker image in the container-image-prepare command, but somewhere in the YAML deployment scripts /var/run/octavia gets reset to root.root ownership.
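Since the ownership can get reset by a later deployment, it may be worth verifying it after each run. The following is a sketch only, assuming GNU stat is available where it runs; the function name check_owner is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the path given as $1 is owned by
# the user:group given as $2.
check_owner() {
    local path="$1" expected="$2" actual
    # GNU stat: %U is the owner name, %G is the group name.
    actual="$(stat -c '%U:%G' "$path")" || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK $path $actual"
    else
        echo "MISMATCH $path is $actual, expected $expected"
        return 1
    fi
}
```

For this bug the call would be check_owner /var/run/octavia octavia:octavia, run inside the container on every controller. A MISMATCH line reporting root:root means the workaround has to be applied again.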
2. No Octavia certs get generated automatically.
The logic within the deployment Jinja2 script /usr/share/openstack-tripleo-heat-templates/deployment/octavia/octavia-deployment-config.j2.yaml means that Octavia certs are only generated if OctaviaGenerateCerts is true and, IMPORTANTLY, the Stack is at state CREATE.
WORKAROUND
So it only works if your TripleO deploys completely the first time round. If you are looking at multiple Controllers and Computes, the sheer number of moving parts makes this highly unlikely. I think the logic should be changed so that generate_certs is true when OctaviaGenerateCerts is true and Stack is CREATE or Stack is UPDATE.
Is there a problem with regenerating the certs each time you deploy??? (it's better than not creating them at all)
My TripleO never deploys completely the first time; I always have at least one restart, even if only owing to docker.io pull bandwidth issues. So I changed the StackAction value from CREATE to UPDATE. I'll change it to CREATE or UPDATE (the best option) on my next redeployment.
-------------------------------- /usr/share/openstack-tripleo-heat-templates/deployment/octavia/octavia-deployment-config.j2.yaml
  generate_certs:
    and:
    - get_param: OctaviaGenerateCerts
    - equals:
      - get_param: StackAction
      # - CREATE
      - UPDATE
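The "CREATE or UPDATE" variant could be written with Heat's or condition function, roughly as sketched here. This is untested against this deployment and is not the fix that was eventually released; the indentation assumes the fragment sits under the template's conditions section.

```yaml
  # Sketch only: generate certs when OctaviaGenerateCerts is true and the
  # stack action is either CREATE or UPDATE.
  generate_certs:
    and:
    - get_param: OctaviaGenerateCerts
    - or:
      - equals:
        - get_param: StackAction
        - CREATE
      - equals:
        - get_param: StackAction
        - UPDATE
```

If StackAction can only ever be CREATE or UPDATE (worth checking in the parameter's constraints), the or branch is always true and the condition reduces to the OctaviaGenerateCerts parameter on its own.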
3. No Octavia roles are generated by deployment scripts. This is a strange one. The policy file is generated and in place but no roles are set. Octavia cannot work without the 'load-balancer_member' and 'load-balancer_admin' roles being set for users. So we might as well generate the roles then.
This was an abandoned update (possible fix shown below).
WORKAROUND
-------------------------COMMANDS
source overcloudrc
openstack role create load-balancer_member
openstack role create load-balancer_admin
openstack role create load-balancer_observer
openstack role create load-balancer_global_observer
openstack role create load-balancer_quota_admin
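The same role creation can be wrapped in a loop so that it is safe to re-run after a partial deployment. This is a sketch: the function name, the OCTAVIA_ROLES variable and the RUN dry-run switch are hypothetical, and the role names are taken from the listing above, so check them against your Octavia policy file.

```shell
#!/usr/bin/env bash
# Role names as read from the listing above (an assumption to verify).
OCTAVIA_ROLES="load-balancer_member load-balancer_admin load-balancer_observer load-balancer_global_observer load-balancer_quota_admin"

# Hypothetical helper. Set RUN=echo to print the commands instead of
# executing them; leave RUN unset to really call the openstack client.
create_octavia_roles() {
    local role
    for role in $OCTAVIA_ROLES; do
        # --or-show returns the existing role instead of failing when the
        # role was already created by an earlier run.
        ${RUN:-} openstack role create --or-show "$role"
    done
}
```

Run it after "source overcloudrc", for example with RUN=echo first to review the five commands, then without RUN to apply them.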
For USER and...