Activity log for bug #1959668

Date Who What changed Old value New value Message
2022-02-01 11:40:05 Yusuf Güngör bug added bug
2022-02-01 11:41:21 Yusuf Güngör description

Old value:

**Bug Report**

What happened:

Hi,

We are trying to implement masakari with hacluster. After installation, the following WARNING messages appear in the "masakari_hostmonitor" container logs:

```
2022-01-31 11:29:33.063 7 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] Corosync communication using 'bond0.api' is normal.
2022-01-31 11:30:03.111 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] Exception caught: Unexpected error while running command.
Command: crmadmin -S r2-osp-test-controller-02.mycompany.dmz
Exit code: 124
Stdout: ''
Stderr: 'No messages received in 30 seconds.. aborting\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] 'r2-osp-test-controller-02.mycompany.dmz' is unstable state on cluster.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] hostmonitor skips monitoring hosts.
```

Masakari HA does not work because of this.

It seems that the hacluster task creates pacemaker resources with short hostnames, but Masakari uses the Python socket library to get the hostname, and it gets the FQDN.

All our installations use FQDNs as hostnames, as shown below. It is not easy or safe to change all these hostnames.

```
root@r2-osp-test-controller-02:~# hostname
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -f
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -s
r2-osp-test-controller-02
root@r2-osp-test-controller-02:~#
```

This issue has similarities with https://bugs.launchpad.net/kolla-ansible/+bug/1830023

Our multinode inventory file also consists of FQDNs, as shown below:

```
[control]
r2-osp-test-controller-01.mycompany.dmz
r2-osp-test-controller-02.mycompany.dmz
r2-osp-test-controller-03.mycompany.dmz
```

These FQDNs have DNS records, and Ansible can reach the hosts using them.

We had to change **hacluster_corosync.conf.j2** as follows:

```
vim /etc/kolla/config/hacluster-corosync/corosync.conf:
  ...
  ...
  ...
  nodelist {
  {% for host in groups['hacluster'] | sort %}
      node {
          ring0_addr: {{ 'api' | kolla_address(host) }}
  - name: {{ hostvars[host].ansible_facts.hostname }}
  + name: {{ hostvars[host].inventory_hostname }}
          nodeid: {{ loop.index }}
      }
  {% endfor %}
  }
  ...
  ...
  ...
```

We also updated the **kolla-ansible/ansible/roles/hacluster/tasks/bootstrap_service.yml** file and changed all **{{ ansible_facts.hostname }}** values to **{{ inventory_hostname }}**:

```
---
- name: Ensure stonith is disabled
  vars:
    service: "{{ hacluster_services['hacluster-pacemaker'] }}"
  command: docker exec {{ service.container_name }} crm_attribute --type crm_config --name stonith-enabled --update false
  run_once: true
  become: true
  delegate_to: "{{ groups[service.group][0] }}"
- name: Ensure remote node is added
  vars:
    pacemaker_service: "{{ hacluster_services['hacluster-pacemaker'] }}"
    pacemaker_remote_service: "{{ hacluster_services['hacluster-pacemaker-remote'] }}"
  shell: >
    docker exec {{ pacemaker_service.container_name }}
    cibadmin --modify --scope resources -X '
      <resources>
- <primitive id="{{ ansible_facts.hostname }}" class="ocf" provider="pacemaker" type="remote">
+ <primitive id="{{ inventory_hostname }}" class="ocf" provider="pacemaker" type="remote">
- <instance_attributes id="{{ ansible_facts.hostname }}-instance_attributes">
+ <instance_attributes id="{{ inventory_hostname }}-instance_attributes">
- <nvpair id="{{ ansible_facts.hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
+ <nvpair id="{{ inventory_hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
          </instance_attributes>
          <operations>
- <op id="{{ ansible_facts.hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
+ <op id="{{ inventory_hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
          </operations>
        </primitive>
      </resources>
    '
  become: true
  delegate_to: "{{ groups[pacemaker_service.group][0] }}"
  when:
    - inventory_hostname in groups[pacemaker_remote_service.group]
    - pacemaker_remote_service.enabled | bool
```

After these changes, kolla-ansible creates the pacemaker resources with **short hostnames**, and because our hosts have **FQDN** as **short hostnames**, everything works seamlessly.

What you expected to happen:

Masakari installation should work out of the box even when FQDNs are used as short hostnames.

How to reproduce it (minimal and precise):

Set the short hostnames to FQDNs, then try to install hacluster and masakari.

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS
* Kernel (e.g. `uname -a`): Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* Docker version if applicable (e.g. `docker version`): 20.10.7
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): pip installation, kolla-ansible==12.3.0
* Docker image Install type (source/binary): source
* Docker image distribution: DockerHub Images

New value:

**Bug Report**

What happened:

Hi,

We are trying to implement masakari with hacluster. After installation, the following WARNING messages appear in the "masakari_hostmonitor" container logs:

<code>
2022-01-31 11:29:33.063 7 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] Corosync communication using 'bond0.api' is normal.
2022-01-31 11:30:03.111 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] Exception caught: Unexpected error while running command.
Command: crmadmin -S r2-osp-test-controller-02.mycompany.dmz
Exit code: 124
Stdout: ''
Stderr: 'No messages received in 30 seconds.. aborting\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] 'r2-osp-test-controller-02.mycompany.dmz' is unstable state on cluster.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] hostmonitor skips monitoring hosts.
</code>

Masakari HA does not work because of this.

It seems that the hacluster task creates pacemaker resources with short hostnames, but Masakari uses the Python socket library to get the hostname, and it gets the FQDN.

All our installations use FQDNs as hostnames, as shown below. It is not easy or safe to change all these hostnames.

```
root@r2-osp-test-controller-02:~# hostname
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -f
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -s
r2-osp-test-controller-02
root@r2-osp-test-controller-02:~#
```

This issue has similarities with https://bugs.launchpad.net/kolla-ansible/+bug/1830023

Our multinode inventory file also consists of FQDNs, as shown below:

```
[control]
r2-osp-test-controller-01.mycompany.dmz
r2-osp-test-controller-02.mycompany.dmz
r2-osp-test-controller-03.mycompany.dmz
```

These FQDNs have DNS records, and Ansible can reach the hosts using them.

We had to change **hacluster_corosync.conf.j2** as follows:

```
vim /etc/kolla/config/hacluster-corosync/corosync.conf:
  ...
  ...
  ...
  nodelist {
  {% for host in groups['hacluster'] | sort %}
      node {
          ring0_addr: {{ 'api' | kolla_address(host) }}
  - name: {{ hostvars[host].ansible_facts.hostname }}
  + name: {{ hostvars[host].inventory_hostname }}
          nodeid: {{ loop.index }}
      }
  {% endfor %}
  }
  ...
  ...
  ...
```

We also updated the **kolla-ansible/ansible/roles/hacluster/tasks/bootstrap_service.yml** file and changed all **{{ ansible_facts.hostname }}** values to **{{ inventory_hostname }}**:

```
---
- name: Ensure stonith is disabled
  vars:
    service: "{{ hacluster_services['hacluster-pacemaker'] }}"
  command: docker exec {{ service.container_name }} crm_attribute --type crm_config --name stonith-enabled --update false
  run_once: true
  become: true
  delegate_to: "{{ groups[service.group][0] }}"
- name: Ensure remote node is added
  vars:
    pacemaker_service: "{{ hacluster_services['hacluster-pacemaker'] }}"
    pacemaker_remote_service: "{{ hacluster_services['hacluster-pacemaker-remote'] }}"
  shell: >
    docker exec {{ pacemaker_service.container_name }}
    cibadmin --modify --scope resources -X '
      <resources>
- <primitive id="{{ ansible_facts.hostname }}" class="ocf" provider="pacemaker" type="remote">
+ <primitive id="{{ inventory_hostname }}" class="ocf" provider="pacemaker" type="remote">
- <instance_attributes id="{{ ansible_facts.hostname }}-instance_attributes">
+ <instance_attributes id="{{ inventory_hostname }}-instance_attributes">
- <nvpair id="{{ ansible_facts.hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
+ <nvpair id="{{ inventory_hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
          </instance_attributes>
          <operations>
- <op id="{{ ansible_facts.hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
+ <op id="{{ inventory_hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
          </operations>
        </primitive>
      </resources>
    '
  become: true
  delegate_to: "{{ groups[pacemaker_service.group][0] }}"
  when:
    - inventory_hostname in groups[pacemaker_remote_service.group]
    - pacemaker_remote_service.enabled | bool
```

After these changes, kolla-ansible creates the pacemaker resources with **short hostnames**, and because our hosts have **FQDN** as **short hostnames**, everything works seamlessly.

What you expected to happen:

Masakari installation should work out of the box even when FQDNs are used as short hostnames.

How to reproduce it (minimal and precise):

Set the short hostnames to FQDNs, then try to install hacluster and masakari.

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS
* Kernel (e.g. `uname -a`): Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* Docker version if applicable (e.g. `docker version`): 20.10.7
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): pip installation, kolla-ansible==12.3.0
* Docker image Install type (source/binary): source
* Docker image distribution: DockerHub Images
2022-02-01 11:44:37 Yusuf Güngör description

Old value:

**Bug Report**

What happened:

Hi,

We are trying to implement masakari with hacluster. After installation, the following WARNING messages appear in the "masakari_hostmonitor" container logs:

<code>
2022-01-31 11:29:33.063 7 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] Corosync communication using 'bond0.api' is normal.
2022-01-31 11:30:03.111 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] Exception caught: Unexpected error while running command.
Command: crmadmin -S r2-osp-test-controller-02.mycompany.dmz
Exit code: 124
Stdout: ''
Stderr: 'No messages received in 30 seconds.. aborting\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] 'r2-osp-test-controller-02.mycompany.dmz' is unstable state on cluster.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] hostmonitor skips monitoring hosts.
</code>

Masakari HA does not work because of this.

It seems that the hacluster task creates pacemaker resources with short hostnames, but Masakari uses the Python socket library to get the hostname, and it gets the FQDN.

All our installations use FQDNs as hostnames, as shown below. It is not easy or safe to change all these hostnames.

```
root@r2-osp-test-controller-02:~# hostname
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -f
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -s
r2-osp-test-controller-02
root@r2-osp-test-controller-02:~#
```

This issue has similarities with https://bugs.launchpad.net/kolla-ansible/+bug/1830023

Our multinode inventory file also consists of FQDNs, as shown below:

```
[control]
r2-osp-test-controller-01.mycompany.dmz
r2-osp-test-controller-02.mycompany.dmz
r2-osp-test-controller-03.mycompany.dmz
```

These FQDNs have DNS records, and Ansible can reach the hosts using them.

We had to change **hacluster_corosync.conf.j2** as follows:

```
vim /etc/kolla/config/hacluster-corosync/corosync.conf:
  ...
  ...
  ...
  nodelist {
  {% for host in groups['hacluster'] | sort %}
      node {
          ring0_addr: {{ 'api' | kolla_address(host) }}
  - name: {{ hostvars[host].ansible_facts.hostname }}
  + name: {{ hostvars[host].inventory_hostname }}
          nodeid: {{ loop.index }}
      }
  {% endfor %}
  }
  ...
  ...
  ...
```

We also updated the **kolla-ansible/ansible/roles/hacluster/tasks/bootstrap_service.yml** file and changed all **{{ ansible_facts.hostname }}** values to **{{ inventory_hostname }}**:

```
---
- name: Ensure stonith is disabled
  vars:
    service: "{{ hacluster_services['hacluster-pacemaker'] }}"
  command: docker exec {{ service.container_name }} crm_attribute --type crm_config --name stonith-enabled --update false
  run_once: true
  become: true
  delegate_to: "{{ groups[service.group][0] }}"
- name: Ensure remote node is added
  vars:
    pacemaker_service: "{{ hacluster_services['hacluster-pacemaker'] }}"
    pacemaker_remote_service: "{{ hacluster_services['hacluster-pacemaker-remote'] }}"
  shell: >
    docker exec {{ pacemaker_service.container_name }}
    cibadmin --modify --scope resources -X '
      <resources>
- <primitive id="{{ ansible_facts.hostname }}" class="ocf" provider="pacemaker" type="remote">
+ <primitive id="{{ inventory_hostname }}" class="ocf" provider="pacemaker" type="remote">
- <instance_attributes id="{{ ansible_facts.hostname }}-instance_attributes">
+ <instance_attributes id="{{ inventory_hostname }}-instance_attributes">
- <nvpair id="{{ ansible_facts.hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
+ <nvpair id="{{ inventory_hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
          </instance_attributes>
          <operations>
- <op id="{{ ansible_facts.hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
+ <op id="{{ inventory_hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
          </operations>
        </primitive>
      </resources>
    '
  become: true
  delegate_to: "{{ groups[pacemaker_service.group][0] }}"
  when:
    - inventory_hostname in groups[pacemaker_remote_service.group]
    - pacemaker_remote_service.enabled | bool
```

After these changes, kolla-ansible creates the pacemaker resources with **short hostnames**, and because our hosts have **FQDN** as **short hostnames**, everything works seamlessly.

What you expected to happen:

Masakari installation should work out of the box even when FQDNs are used as short hostnames.

How to reproduce it (minimal and precise):

Set the short hostnames to FQDNs, then try to install hacluster and masakari.

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS
* Kernel (e.g. `uname -a`): Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* Docker version if applicable (e.g. `docker version`): 20.10.7
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): pip installation, kolla-ansible==12.3.0
* Docker image Install type (source/binary): source
* Docker image distribution: DockerHub Images

New value:

**Bug Report**

What happened:

Hi,

We are trying to implement masakari with hacluster. After installation, the following WARNING messages appear in the "masakari_hostmonitor" container logs:

--------------------------------------------------------
2022-01-31 11:29:33.063 7 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] Corosync communication using 'bond0.api' is normal.
2022-01-31 11:30:03.111 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] Exception caught: Unexpected error while running command.
Command: crmadmin -S r2-osp-test-controller-02.mycompany.dmz
Exit code: 124
Stdout: ''
Stderr: 'No messages received in 30 seconds.. aborting\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] 'r2-osp-test-controller-02.mycompany.dmz' is unstable state on cluster.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] hostmonitor skips monitoring hosts.
--------------------------------------------------------

Masakari HA does not work because of this.

It seems that the hacluster task creates pacemaker resources with short hostnames, but Masakari uses the Python socket library to get the hostname, and it gets the FQDN.

All our installations use FQDNs as hostnames, as shown below. It is not easy or safe to change all these hostnames.

--------------------------------------------------------
root@r2-osp-test-controller-02:~# hostname
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -f
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -s
r2-osp-test-controller-02
root@r2-osp-test-controller-02:~#
--------------------------------------------------------

This issue has similarities with https://bugs.launchpad.net/kolla-ansible/+bug/1830023

Our multinode inventory file also consists of FQDNs, as shown below:

--------------------------------------------------------
[control]
r2-osp-test-controller-01.mycompany.dmz
r2-osp-test-controller-02.mycompany.dmz
r2-osp-test-controller-03.mycompany.dmz
--------------------------------------------------------

These FQDNs have DNS records, and Ansible can reach the hosts using them.

We had to change **hacluster_corosync.conf.j2** as follows:

--------------------------------------------------------
vim /etc/kolla/config/hacluster-corosync/corosync.conf:
  ...
  ...
  ...
  nodelist {
  {% for host in groups['hacluster'] | sort %}
      node {
          ring0_addr: {{ 'api' | kolla_address(host) }}
  - name: {{ hostvars[host].ansible_facts.hostname }}
  + name: {{ hostvars[host].inventory_hostname }}
          nodeid: {{ loop.index }}
      }
  {% endfor %}
  }
  ...
  ...
  ...
--------------------------------------------------------

We also updated the **kolla-ansible/ansible/roles/hacluster/tasks/bootstrap_service.yml** file and changed all **{{ ansible_facts.hostname }}** values to **{{ inventory_hostname }}**:

--------------------------------------------------------
- name: Ensure stonith is disabled
  vars:
    service: "{{ hacluster_services['hacluster-pacemaker'] }}"
  command: docker exec {{ service.container_name }} crm_attribute --type crm_config --name stonith-enabled --update false
  run_once: true
  become: true
  delegate_to: "{{ groups[service.group][0] }}"
- name: Ensure remote node is added
  vars:
    pacemaker_service: "{{ hacluster_services['hacluster-pacemaker'] }}"
    pacemaker_remote_service: "{{ hacluster_services['hacluster-pacemaker-remote'] }}"
  shell: >
    docker exec {{ pacemaker_service.container_name }}
    cibadmin --modify --scope resources -X '
      <resources>
- <primitive id="{{ ansible_facts.hostname }}" class="ocf" provider="pacemaker" type="remote">
+ <primitive id="{{ inventory_hostname }}" class="ocf" provider="pacemaker" type="remote">
- <instance_attributes id="{{ ansible_facts.hostname }}-instance_attributes">
+ <instance_attributes id="{{ inventory_hostname }}-instance_attributes">
- <nvpair id="{{ ansible_facts.hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
+ <nvpair id="{{ inventory_hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
          </instance_attributes>
          <operations>
- <op id="{{ ansible_facts.hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
+ <op id="{{ inventory_hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
          </operations>
        </primitive>
      </resources>
    '
  become: true
  delegate_to: "{{ groups[pacemaker_service.group][0] }}"
  when:
    - inventory_hostname in groups[pacemaker_remote_service.group]
    - pacemaker_remote_service.enabled | bool
--------------------------------------------------------

After these changes, kolla-ansible creates the pacemaker resources with **short hostnames**, and because our hosts have **FQDN** as **short hostnames**, everything works seamlessly.

What you expected to happen:

Masakari installation should work out of the box even when FQDNs are used as short hostnames.

How to reproduce it (minimal and precise):

Set the short hostnames to FQDNs, then try to install hacluster and masakari.

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS
* Kernel (e.g. `uname -a`): Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* Docker version if applicable (e.g. `docker version`): 20.10.7
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): pip installation, kolla-ansible==12.3.0
* Docker image Install type (source/binary): source
* Docker image distribution: DockerHub Images
2022-02-01 13:59:35 Yusuf Güngör description

Old value:

**Bug Report**

What happened:

Hi,

We are trying to implement masakari with hacluster. After installation, the following WARNING messages appear in the "masakari_hostmonitor" container logs:

--------------------------------------------------------
2022-01-31 11:29:33.063 7 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] Corosync communication using 'bond0.api' is normal.
2022-01-31 11:30:03.111 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] Exception caught: Unexpected error while running command.
Command: crmadmin -S r2-osp-test-controller-02.mycompany.dmz
Exit code: 124
Stdout: ''
Stderr: 'No messages received in 30 seconds.. aborting\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] 'r2-osp-test-controller-02.mycompany.dmz' is unstable state on cluster.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] hostmonitor skips monitoring hosts.
--------------------------------------------------------

Masakari HA does not work because of this.

It seems that the hacluster task creates pacemaker resources with short hostnames, but Masakari uses the Python socket library to get the hostname, and it gets the FQDN.

All our installations use FQDNs as hostnames, as shown below. It is not easy or safe to change all these hostnames.

--------------------------------------------------------
root@r2-osp-test-controller-02:~# hostname
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -f
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -s
r2-osp-test-controller-02
root@r2-osp-test-controller-02:~#
--------------------------------------------------------

This issue has similarities with https://bugs.launchpad.net/kolla-ansible/+bug/1830023

Our multinode inventory file also consists of FQDNs, as shown below:

--------------------------------------------------------
[control]
r2-osp-test-controller-01.mycompany.dmz
r2-osp-test-controller-02.mycompany.dmz
r2-osp-test-controller-03.mycompany.dmz
--------------------------------------------------------

These FQDNs have DNS records, and Ansible can reach the hosts using them.

We had to change **hacluster_corosync.conf.j2** as follows:

--------------------------------------------------------
vim /etc/kolla/config/hacluster-corosync/corosync.conf:
  ...
  ...
  ...
  nodelist {
  {% for host in groups['hacluster'] | sort %}
      node {
          ring0_addr: {{ 'api' | kolla_address(host) }}
  - name: {{ hostvars[host].ansible_facts.hostname }}
  + name: {{ hostvars[host].inventory_hostname }}
          nodeid: {{ loop.index }}
      }
  {% endfor %}
  }
  ...
  ...
  ...
--------------------------------------------------------

We also updated the **kolla-ansible/ansible/roles/hacluster/tasks/bootstrap_service.yml** file and changed all **{{ ansible_facts.hostname }}** values to **{{ inventory_hostname }}**:

--------------------------------------------------------
- name: Ensure stonith is disabled
  vars:
    service: "{{ hacluster_services['hacluster-pacemaker'] }}"
  command: docker exec {{ service.container_name }} crm_attribute --type crm_config --name stonith-enabled --update false
  run_once: true
  become: true
  delegate_to: "{{ groups[service.group][0] }}"
- name: Ensure remote node is added
  vars:
    pacemaker_service: "{{ hacluster_services['hacluster-pacemaker'] }}"
    pacemaker_remote_service: "{{ hacluster_services['hacluster-pacemaker-remote'] }}"
  shell: >
    docker exec {{ pacemaker_service.container_name }}
    cibadmin --modify --scope resources -X '
      <resources>
- <primitive id="{{ ansible_facts.hostname }}" class="ocf" provider="pacemaker" type="remote">
+ <primitive id="{{ inventory_hostname }}" class="ocf" provider="pacemaker" type="remote">
- <instance_attributes id="{{ ansible_facts.hostname }}-instance_attributes">
+ <instance_attributes id="{{ inventory_hostname }}-instance_attributes">
- <nvpair id="{{ ansible_facts.hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
+ <nvpair id="{{ inventory_hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
          </instance_attributes>
          <operations>
- <op id="{{ ansible_facts.hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
+ <op id="{{ inventory_hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
          </operations>
        </primitive>
      </resources>
    '
  become: true
  delegate_to: "{{ groups[pacemaker_service.group][0] }}"
  when:
    - inventory_hostname in groups[pacemaker_remote_service.group]
    - pacemaker_remote_service.enabled | bool
--------------------------------------------------------

After these changes, kolla-ansible creates the pacemaker resources with **short hostnames**, and because our hosts have **FQDN** as **short hostnames**, everything works seamlessly.

What you expected to happen:

Masakari installation should work out of the box even when FQDNs are used as short hostnames.

How to reproduce it (minimal and precise):

Set the short hostnames to FQDNs, then try to install hacluster and masakari.

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS
* Kernel (e.g. `uname -a`): Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* Docker version if applicable (e.g. `docker version`): 20.10.7
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): pip installation, kolla-ansible==12.3.0
* Docker image Install type (source/binary): source
* Docker image distribution: DockerHub Images

New value:

**Bug Report**

What happened:

Hi,

We are trying to implement masakari with hacluster. After installation, the following WARNING messages appear in the "masakari_hostmonitor" container logs:

--------------------------------------------------------
2022-01-31 11:29:33.063 7 INFO masakarimonitors.hostmonitor.host_handler.handle_host [-] Corosync communication using 'bond0.api' is normal.
2022-01-31 11:30:03.111 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] Exception caught: Unexpected error while running command.
Command: crmadmin -S r2-osp-test-controller-02.mycompany.dmz
Exit code: 124
Stdout: ''
Stderr: 'No messages received in 30 seconds.. aborting\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] 'r2-osp-test-controller-02.mycompany.dmz' is unstable state on cluster.: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2022-01-31 11:30:03.112 7 WARNING masakarimonitors.hostmonitor.host_handler.handle_host [-] hostmonitor skips monitoring hosts.
--------------------------------------------------------

Masakari HA does not work because of this.

It seems that the hacluster task creates pacemaker resources with short hostnames, but Masakari uses the Python socket library to get the hostname, and it gets the FQDN.

All our installations use FQDNs as hostnames, as shown below. It is not easy or safe to change all these hostnames.

--------------------------------------------------------
root@r2-osp-test-controller-02:~# hostname
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -f
r2-osp-test-controller-02.mycompany.dmz
root@r2-osp-test-controller-02:~# hostname -s
r2-osp-test-controller-02
root@r2-osp-test-controller-02:~#
--------------------------------------------------------

This issue has similarities with https://bugs.launchpad.net/kolla-ansible/+bug/1830023

Our multinode inventory file also consists of FQDNs, as shown below:

--------------------------------------------------------
[control]
r2-osp-test-controller-01.mycompany.dmz
r2-osp-test-controller-02.mycompany.dmz
r2-osp-test-controller-03.mycompany.dmz
--------------------------------------------------------

These FQDNs have DNS records, and Ansible can reach the hosts using them.

We had to change **hacluster_corosync.conf.j2** as follows:

--------------------------------------------------------
vim /etc/kolla/config/hacluster-corosync/corosync.conf:
  ...
  ...
  ...
  nodelist {
  {% for host in groups['hacluster'] | sort %}
      node {
          ring0_addr: {{ 'api' | kolla_address(host) }}
  - name: {{ hostvars[host].ansible_facts.hostname }}
  + name: {{ hostvars[host].inventory_hostname }}
          nodeid: {{ loop.index }}
      }
  {% endfor %}
  }
  ...
  ...
  ...
--------------------------------------------------------

We also updated the **kolla-ansible/ansible/roles/hacluster/tasks/bootstrap_service.yml** file and changed all **{{ ansible_facts.hostname }}** values to **{{ inventory_hostname }}**:

--------------------------------------------------------
- name: Ensure stonith is disabled
  vars:
    service: "{{ hacluster_services['hacluster-pacemaker'] }}"
  command: docker exec {{ service.container_name }} crm_attribute --type crm_config --name stonith-enabled --update false
  run_once: true
  become: true
  delegate_to: "{{ groups[service.group][0] }}"
- name: Ensure remote node is added
  vars:
    pacemaker_service: "{{ hacluster_services['hacluster-pacemaker'] }}"
    pacemaker_remote_service: "{{ hacluster_services['hacluster-pacemaker-remote'] }}"
  shell: >
    docker exec {{ pacemaker_service.container_name }}
    cibadmin --modify --scope resources -X '
      <resources>
- <primitive id="{{ ansible_facts.hostname }}" class="ocf" provider="pacemaker" type="remote">
+ <primitive id="{{ inventory_hostname }}" class="ocf" provider="pacemaker" type="remote">
- <instance_attributes id="{{ ansible_facts.hostname }}-instance_attributes">
+ <instance_attributes id="{{ inventory_hostname }}-instance_attributes">
- <nvpair id="{{ ansible_facts.hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
+ <nvpair id="{{ inventory_hostname }}-instance_attributes-server" name="server" value="{{ 'api' | kolla_address }}"/>
          </instance_attributes>
          <operations>
- <op id="{{ ansible_facts.hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
+ <op id="{{ inventory_hostname }}-monitor" name="monitor" interval="60" timeout="30"/>
          </operations>
        </primitive>
      </resources>
    '
  become: true
  delegate_to: "{{ groups[pacemaker_service.group][0] }}"
  when:
    - inventory_hostname in groups[pacemaker_remote_service.group]
    - pacemaker_remote_service.enabled | bool
--------------------------------------------------------

After these changes, kolla-ansible creates the pacemaker resources with **FQDN hostnames**, and because our hosts have **FQDN** as **short hostnames**, everything works seamlessly.

What you expected to happen:

Masakari installation should work out of the box even when FQDNs are used as short hostnames.

How to reproduce it (minimal and precise):

Set the short hostnames to FQDNs, then try to install hacluster and masakari.

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 20.04.2 LTS
* Kernel (e.g. `uname -a`): Linux 5.4.0-90-generic #101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
* Docker version if applicable (e.g. `docker version`): 20.10.7
* Kolla-Ansible version (e.g. `git head or tag or stable branch` or pip package version if using release): pip installation, kolla-ansible==12.3.0
* Docker image Install type (source/binary): source
* Docker image distribution: DockerHub Images
2022-02-07 08:07:56 Mesut Muhammet Şahin bug added subscriber Mesut Muhammet Şahin
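
The hostname mismatch described in the report can be illustrated with a minimal Python sketch. It assumes, as the report suggests, that the monitor looks up the local node by the name returned from the socket library, while the cluster knows each node under whatever name it was registered with. The helper names `short_name` and `match_cluster_node` are hypothetical and are not part of masakari-monitors or kolla-ansible; whether the real code performs exactly this comparison is not confirmed by the report.

```python
import socket
from typing import List, Optional


def short_name(hostname: str) -> str:
    """Return the label before the first dot, similar to `hostname -s`."""
    return hostname.split(".", 1)[0]


def match_cluster_node(local_name: str, cluster_nodes: List[str]) -> Optional[str]:
    """Return local_name if the cluster knows a node by exactly that name.

    Hypothetical stand-in for a lookup by node name: a node registered
    under its short name is not found when it is queried by its FQDN.
    """
    return local_name if local_name in cluster_nodes else None


if __name__ == "__main__":
    # Hostname taken from the report; on that host the plain hostname is the FQDN.
    fqdn = "r2-osp-test-controller-02.mycompany.dmz"

    # Cluster nodes registered with short names: the FQDN lookup fails.
    print(match_cluster_node(fqdn, [short_name(fqdn)]))  # None

    # Cluster nodes registered with the inventory FQDN: the lookup succeeds.
    print(match_cluster_node(fqdn, [fqdn]))

    # The name the Python socket library reports on the machine running this.
    print(socket.gethostname())
```

Under these assumptions, registering the nodes with `inventory_hostname` only helps when the inventory names match what the hosts themselves report as their hostname, which is the case in the environment described in the report.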