[wallaby] With Ceph Dashboard enabled, Deploy hangs and fails at 'Create containers managed by Podman for /var/lib/tripleo-config/container-startup-config/step_3'
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | New | Undecided | Unassigned |
Bug Description
Description
===========
On wallaby, deploying an HA cluster with Ceph Dashboard enabled fails with a timeout while waiting for containers to start at 'TASK | Create containers managed by Podman for /var/lib/
The haproxy container fails to start and exits with 'Starting proxy ceph_dashboard: cannot bind socket [10.100.4.40:8444]' when trying to bind to the ctlplane VIP.
ceph-mgr prevents haproxy from binding to port 8444 on the VIP because it has bound the dashboard port to all interfaces:
Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp LISTEN 0 5 *:8444 *:* users:(
The IP address is set
[ceph: root@overcloud-
10.100.7.163
However, ceph-mgr binds to all interfaces
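The bind conflict described above can be reproduced in isolation. The following is only a minimal sketch, not part of the deployment: it assumes IPv4, uses the loopback address as a stand-in for the ctlplane VIP, and lets the kernel pick a free port instead of 8444. The function name is made up for illustration.

```python
import errno
import socket


def specific_bind_blocked_by_wildcard() -> bool:
    """Return True if a wildcard listener blocks a bind to one specific address."""
    # Stand-in for ceph-mgr: listen on every IPv4 address, on a free port.
    wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    wildcard.bind(("0.0.0.0", 0))
    wildcard.listen(5)
    port = wildcard.getsockname()[1]

    # Stand-in for haproxy: bind the same port on a single address.
    specific = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        specific.bind(("127.0.0.1", port))
    except OSError as exc:
        # The kernel refuses because the wildcard listener already owns the port.
        return exc.errno == errno.EADDRINUSE
    else:
        return False
    finally:
        specific.close()
        wildcard.close()


if __name__ == "__main__":
    print(specific_bind_blocked_by_wildcard())
```

If ceph-mgr were bound only to its own storage address, the second bind would target a different address and would not collide.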
Steps to reproduce
==================
Deploy an openstack HA cluster with ceph dashboard enabled
Expected result
===============
Deployment does not hang and eventually time out at 'TASK | Create containers managed by Podman for /var/lib/
ceph-mgr binds port 8444 to the correct, specific IP address.
The haproxy container is able to start because it is able to bind to port 8444 on the ctlplane VIP.
Actual result
=============
Deployment hangs and eventually times out at 'TASK | Create containers managed by Podman for /var/lib/
ceph-mgr binds port 8444 to all interfaces.
The haproxy container exits immediately because it is unable to bind to port 8444 on the ctlplane VIP.
Environment
===========
wallaby
python3-
ceph-ansible.noarch 6.0.7-1.el8s
ceph-mgr image: quay.io/
haproxy image: quay.io/
Logs & Configs
==============
+------
| Name | Fixed IP Addresses |
+------
| control_virtual_ip | ip_address=
| storage_virtual_ip | ip_address=
| overcloud-
| overcloud-
| overcloud-
| overcloud-
| overcloud-
| overcloud-
+------
[ceph: root@overcloud-
10.100.7.163
[ceph: root@overcloud-
10.100.7.157
[ceph: root@overcloud-
10.100.7.156
[ceph: root@overcloud-
{
"dashboard": "http://
"prometheus": "http://
}
[ceph: root@overcloud-
{
"dashboard": "http://
"prometheus": "http://
}
[ceph: root@overcloud-
{
"dashboard": "http://
"prometheus": "http://
}
http://
http://
It turns out the IP address is being set under an invalid configuration key, so ceph ignores it and falls back to its bad default behavior of binding to all interfaces.
It is setting mgr/dashboard/overcloud-controller-0-lozwge/server_addr, but it needs to be setting mgr/dashboard/overcloud-controller-0.lozwge/server_addr.
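The mismatch can be illustrated with Python's `re.sub`, which is what Ansible's `regex_replace` filter uses underneath. This is a sketch under assumptions: the container name below is hypothetical (the fsid is made up, and the `ceph-<fsid>-mgr-<host>-<suffix>` layout is assumed), while the daemon id is the one from this report.

```python
import re

# Hypothetical podman container name; the fsid is invented for illustration.
container_name = (
    "ceph-00000000-0000-0000-0000-000000000000-mgr-overcloud-controller-0-lozwge"
)
# Daemon name as ceph itself knows it (note the dot before the suffix).
daemon_name = "mgr.overcloud-controller-0.lozwge"

# The pattern used by the current task: strip everything up to "-mgr" plus one character.
PATTERN = r"^ceph-?(.*)-mgr."

from_regex = re.sub(PATTERN, "", container_name)
from_daemon = daemon_name[len("mgr."):]

print(from_regex)
print(from_daemon)
print(from_regex == from_daemon)
```

Because the container name carries hyphens where the daemon id has a dot, no amount of prefix stripping can recover the real id from it.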
I would suggest using this patch, which simply gets the correct name directly from ceph instead of relying on the brittle regex voodoo, which doesn't work correctly.
index dc083e42..9567e42e 100644
--- a/tripleo_ansible/roles/tripleo_cephadm/tasks/dashboard/configure_dashboard_backends.yml
+++ b/tripleo_ansible/roles/tripleo_cephadm/tasks/dashboard/configure_dashboard_backends.yml
@@ -15,21 +15,30 @@
 # under the License.

 - name: Get the current mgr
-  command: |
-    {{ container_cli }} ps -a -f 'name=ceph-?(.*)-mgr.*' --format \{\{\.Names\}\}
+  shell: |
+    {{ tripleo_cephadm_bin }} ls --no-detail | jq -r '.[]|.name|select(startswith("mgr."))[4:]'
   register: ceph_mgr
   become: true
+  until: ceph_mgr.stdout|length > 0
+  retries: "24"
+  delay: "5"
+  ignore_errors: "false"
   delegate_to: "{{ dashboard_backend }}"

+- name: Fail if mgr daemon is not running
+  fail:
+    msg: "mgr daemon is not running"
+  when: ceph_mgr is undefined or ceph_mgr.stdout is undefined or ceph_mgr.stdout|length == 0
+
 - name: Check the resulting mgr container instance
   debug:
-    msg: "{{ ceph_mgr.stdout | regex_replace('^ceph-?(.*)-mgr.', '') }}"
+    msg: "'the mgr daemon id is ' returned {{ ceph_mgr.stdout }}"
   when: tripleo_cephadm_verbose

 - name: config the current dashboard backend
   command: |
     {{ tripleo_cephadm_ceph_cli }} config set \
-    mgr mgr/dashboard/{{ ceph_mgr.stdout | regex_replace('^ceph-?(.*)-mgr.', '') }}/server_addr \
+    mgr mgr/dashboard/{{ ceph_mgr.stdout }}/server_addr \
     {{ hostvars[dashboard_backend][tripleo_ceph_dashboard_net] }}
   become: true
   changed_when: false
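For readers less familiar with jq, the filter in the patch (keep names starting with "mgr.", then drop the first four characters) can be sketched in Python as follows. The sample JSON is a hypothetical, heavily trimmed stand-in for `cephadm ls --no-detail` output, and `mgr_daemon_ids` is an illustrative helper name, not something from tripleo-ansible.

```python
import json
from typing import List

# Hypothetical, trimmed sample; real output has more fields per daemon.
sample = """
[
  {"name": "mon.overcloud-controller-0"},
  {"name": "mgr.overcloud-controller-0.lozwge"},
  {"name": "crash.overcloud-controller-0"}
]
"""


def mgr_daemon_ids(cephadm_ls_json: str) -> List[str]:
    """Return the ids of all mgr daemons, i.e. their names without the 'mgr.' prefix."""
    daemons = json.loads(cephadm_ls_json)
    return [
        daemon["name"][len("mgr."):]
        for daemon in daemons
        if daemon["name"].startswith("mgr.")
    ]


if __name__ == "__main__":
    print(mgr_daemon_ids(sample))
```

The id comes straight from the daemon name that ceph reports, so it keeps the dot that the config key mgr/dashboard/<id>/server_addr requires.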