podman/upgrade: haproxy exits and never restart

Bug #1806733 reported by Emilien Macchi on 2018-12-04
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
tripleo
High
Emilien Macchi

Bug Description

Environment: master (Stein)

How to reproduce:

1) Deploy an undercloud with "container_cli = podman" in undercloud.conf
2) Run "openstack undercloud upgrade"

Results: it hangs at step3 when starting containers.

Looking at containers states, HAproxy isn't running.
[root@undercloud ~]# podman logs haproxy
+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/haproxy/haproxy.cfg
INFO:__main__:Copying /var/lib/kolla/config_files/src/etc/haproxy/haproxy.cfg to /etc/haproxy/haproxy.cfg
INFO:__main__:Copying /var/lib/kolla/config_files/src-tls/etc/pki/tls/private/overcloud_endpoint.pem to /etc/pki/tls/private/overcloud_endpoint.pem
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/lib/haproxy
++ cat /run_command
+ CMD='/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg'
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
+ echo 'Running command: '\''/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg'\'''
Running command: '/usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg'
+ exec /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg
<7>haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds
[WARNING] 337/132458 (12) : Setting tune.ssl.default-dh-param to 1024 by default, if your workload permits it you should set it to at least 2048. Please set a value >= 1024 to make this warning disappear.
<7>haproxy-systemd-wrapper: SIGTERM -> 13.
<5>haproxy-systemd-wrapper: exit, haproxy RC=0

We need to investigate why the HAproxy container receives a SIGPIPE signal.

Note: the bug can't be reproduced when running a re-deploy with "openstack undercloud install".

description: updated
description: updated
Emilien Macchi (emilienm) wrote :

I found the issue. We are still running upgrade_tasks on the host that cleanup systemd services. The problem is that when we upgrade from podman to podman, HAproxy is wiped-out on the host (in systemd) so the container never restart (remember, we count on systemd to restart containers when they are stopped by podman).

I'll fix it.

Changed in tripleo:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/622578
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=e4ee042a2aaf880e9863461a7423549be3bb0aa1
Submitter: Zuul
Branch: master

commit e4ee042a2aaf880e9863461a7423549be3bb0aa1
Author: Emilien Macchi <email address hidden>
Date: Tue Dec 4 14:36:12 2018 -0500

    upgrade: remove tasks that stop and disable services

    We don't need upgrade_tasks that stop systemd services since all
    services are now containerized.
    However, we decided to keep the tasks that remove the rpms in case some
    of deployments didn't cleanup them in previous releases, they can still
    do it now.

    Change-Id: I6abdc9e37966cd818306f7af473958fd4662ccb5
    Related-Bug: #1806733

Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers