OSA-AIO deploy failed cause haproxy didn't start

Bug #1775584 reported by Ilya Popov
This bug affects 4 people
Affects: OpenStack-Ansible
Status: Won't Fix
Importance: High
Assigned to: Jean-Philippe Evrard
Milestone: (none)

Bug Description

Error message appeared during openstack-ansible setup-infrastructure.yml execution

TASK [pip_install : Install PIP (fall back mode)] ***********************************************************************************************************************************************

Thursday 07 June 2018 10:51:49 +0000 (0:00:44.432) 0:19:25.544 *********

FAILED - RETRYING: Install PIP (fall back mode) (3 retries left).

FAILED - RETRYING: Install PIP (fall back mode) (2 retries left).

FAILED - RETRYING: Install PIP (fall back mode) (1 retries left).

fatal: [aio1_galera_container-db2a6e1e]: FAILED! => {"attempts": 3, "changed": false, "cmd": ["python", "/opt/get-pip.py", "--isolated", "--constraint", "http://<IP address>/gitweb/index.cgi?p=requirements.git;a=blob_plain;f=upper-constraints.txt;hb=6ea2d17f0a90985aa05c809c3e536abb7ea74b60", "pip", "setuptools", "wheel"], "delta": "0:00:09.335921", "end": "2018-06-07 10:52:34.012001", "failed": true, "rc": 1, "start": "2018-06-07 10:52:24.676080", "stderr": " Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f8062c27b10>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /simple/pip/\n Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f8062c27990>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /simple/pip/\n Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f8062c27a50>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /simple/pip/\n Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f8062c27b50>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /simple/pip/\n Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7f8062c27cd0>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /simple/pip/\n Could not 
find a version that satisfies the requirement pip (from versions: )\nNo matching distribution found for pip", "stderr_lines": [" Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by

This error happened because haproxy didn't start correctly: TCP port 80 was already in use, so haproxy could not bind to it:

journalctl -u haproxy.service

Jun 07 10:50:01 ubuntu1604 systemd[1]: Started HAProxy Load Balancer.

Jun 07 10:50:01 ubuntu1604 haproxy-systemd-wrapper[20945]: haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds

Jun 07 10:50:01 ubuntu1604 haproxy-systemd-wrapper[20945]: [ALERT] 157/105001 (20949) : Starting frontend horizon-redirect-front-1: cannot bind socket [<IP address>:80]

Jun 07 10:50:01 ubuntu1604 haproxy-systemd-wrapper[20945]: haproxy-systemd-wrapper: exit, haproxy RC=256

Jun 07 10:50:01 ubuntu1604 systemd[1]: haproxy.service: Service hold-off time over, scheduling restart.

Jun 07 10:50:01 ubuntu1604 systemd[1]: Stopped HAProxy Load Balancer.

Jun 07 10:50:01 ubuntu1604 systemd[1]: haproxy.service: Start request repeated too quickly.

Jun 07 10:50:01 ubuntu1604 systemd[1]: Failed to start HAProxy Load Balancer.

root@ubuntu1604:~# systemctl status haproxy.service

* haproxy.service - HAProxy Load Balancer

   Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)

   Active: inactive (dead) since Thu 2018-06-07 10:50:01 UTC; 11min ago

     Docs: man:haproxy(1)

           file:/usr/share/doc/haproxy/configuration.txt.gz

  Process: 20945 ExecStart=/usr/sbin/haproxy-systemd-wrapper -f ${CONFIG} -p /run/haproxy.pid $EXTRAOPTS (code=exited, status=0/SUCCESS)

  Process: 20943 ExecStartPre=/usr/sbin/haproxy -f ${CONFIG} -c -q (code=exited, status=0/SUCCESS)

Main PID: 20945 (code=exited, status=0/SUCCESS)

Jun 07 10:50:01 ubuntu1604 systemd[1]: haproxy.service: Service hold-off time over, scheduling restart.

Jun 07 10:50:01 ubuntu1604 systemd[1]: Stopped HAProxy Load Balancer.

Jun 07 10:50:01 ubuntu1604 systemd[1]: haproxy.service: Start request repeated too quickly.

Jun 07 10:50:01 ubuntu1604 systemd[1]: Failed to start HAProxy Load Balancer.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

The health of haproxy should be checked after all haproxy tasks have been performed.

Ilya Popov (ilya-p) wrote :

playbook should fail if haproxy isn't started successfully
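
One way to express such a check is sketched below in Python rather than as an Ansible task. It relies on `systemctl is-active`, which prints the unit state and exits non-zero when the unit is not active. The function name `ensure_service_active` and the injectable `run` parameter are hypothetical, introduced only for illustration; this is not how openstack-ansible implements anything.

```python
import subprocess
from typing import Callable


def ensure_service_active(
    unit: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> None:
    """Raise RuntimeError unless systemd reports the unit as active.

    `run` is injectable (hypothetical design choice) so the check can be
    exercised without a real systemd.
    """
    result = run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
    )
    state = result.stdout.strip()
    # is-active exits 0 only when the unit is active.
    if result.returncode != 0 or state != "active":
        raise RuntimeError(
            f"{unit} is not running (state: {state or 'unknown'})"
        )
```

Calling `ensure_service_active("haproxy.service")` right after the haproxy tasks would stop the run at the point of failure instead of letting a later, unrelated task fail.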

Changed in openstack-ansible:
assignee: nobody → Ilya Popov (ilya-p)
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

Will try to reproduce it locally.

Changed in openstack-ansible:
assignee: Ilya Popov (ilya-p) → Jean-Philippe Evrard (jean-philippe-evrard)
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

I didn't reproduce this.

If the configuration was unchanged, the handler doesn't trigger, so the play runs just fine even if it can't manage to start the service.

Is it possible that you ran the haproxy playbook multiple times until it passed, without checking for its failures? Do you have a full run log?

Ilya Popov (ilya-p) wrote :

Dear Jean-Philippe,

Steps to reproduce are the following:

1. On the OSA-AIO host, bind local TCP port 80 (with netcat or an SSL tunnel).
2. Perform the OSA-AIO installation steps from the beginning (scripts/bootstrap-aio.sh).
3. The error will appear.
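
The effect of step 1 can be sketched in Python as a stand-in for netcat: while one socket holds a TCP port, a second bind to the same address fails, which is the condition haproxy reports as "cannot bind socket". The sketch uses an ephemeral port on 127.0.0.1 instead of port 80 so that it runs without root; the helper name `port_is_free` is hypothetical.

```python
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE when another socket holds the port.
            return False
        return True


# Occupy a port, as netcat would in step 1 (ephemeral port, not 80).
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
holder.listen(1)
busy_port = holder.getsockname()[1]

# While the holder is listening, a second bind is refused.
busy_result = port_is_free("127.0.0.1", busy_port)

# Once the holder is closed, the port can be bound again.
holder.close()
free_result = port_is_free("127.0.0.1", busy_port)
```

Running a probe like this against port 80 before the deployment would reveal the conflict up front.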

As you can see from bug description text,

Jun 07 10:50:01 ubuntu1604 haproxy-systemd-wrapper[20945]: [ALERT] 157/105001 (20949) : Starting frontend horizon-redirect-front-1: cannot bind socket [<IP address>:80]

Haproxy couldn't bind port 80 because it was busy, so the haproxy service status became inactive.

Jean-Philippe Evrard (jean-philippe-evrard) wrote :

Tested, and valid.

The systemd unit returns the state as loaded, which is what ansible looks at.
However the substate is dead.
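
The difference between the load state and the sub-state can be inspected with `systemctl show <unit> --property=LoadState,ActiveState,SubState`, which prints `KEY=VALUE` lines. Below is a minimal Python sketch that parses such output and requires all three properties to look healthy. The helper names and the sample text are illustrative assumptions; the sample mirrors the "loaded" and "inactive (dead)" values in the status output from the bug description but was not captured from that host.

```python
from typing import Dict


def parse_unit_properties(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines as printed by `systemctl show`."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without '='
            props[key.strip()] = value.strip()
    return props


def unit_is_running(props: Dict[str, str]) -> bool:
    """A loaded unit counts as healthy only if it is active and running."""
    return (
        props.get("LoadState") == "loaded"
        and props.get("ActiveState") == "active"
        and props.get("SubState") == "running"
    )


# Illustrative sample: unit file loaded, but the service is dead.
sample = "LoadState=loaded\nActiveState=inactive\nSubState=dead\n"
props = parse_unit_properties(sample)
healthy = unit_is_running(props)
```

Checking only `LoadState` would accept this sample, while the combined check rejects it.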

It is due to a haproxy packaging problem. Sadly, we are just consuming the packages, not building them. I will check which version fixes this problem without requiring too large a version change.

Jean-Philippe Evrard (jean-philippe-evrard) wrote :

I tried multiple versions of haproxy, and only a recent version from vbernat's haproxy 1.8 PPA has the fix.

I will evaluate this further, to see whether there is any other alternative.

Changed in openstack-ansible:
status: New → Confirmed
importance: Undecided → High
Jose Gaitan (vchjgaitan) wrote :

Hello.

This issue is still surfacing on OpenStack-Ansible installations as of 10/25/2022. What would be the best way to override the haproxy version with a newer or working one?

As of this writing, the latest stable version is 2.5.9.

Dmitriy Rabotyagov (noonedeadpunk) wrote :

Hi, Roger

At the moment we test and support only the default haproxy versions provided by the default distro repositories.

We have never tested the config against the latest haproxy.

Also, I'm not sure you're hitting the exact same issue, as the role and task mentioned in the bug description are no longer part of the deployment at all: the pip_install role was replaced with python_venv_build back in Stein.

So it would be great if you could clarify which openstack-ansible version you are running.

Dmitriy Rabotyagov (noonedeadpunk) wrote :

I will mark this bug as Won't Fix because it was created against an old version of OSA that is EOL as of today. Please feel free to open a new bug report, with more details about the issue, if you still see problems with haproxy on current versions.

Changed in openstack-ansible:
status: Confirmed → Won't Fix