Some HA containers logging got lost with the move to podman

Bug #1872734 reported by Michele Baldessari
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Michele Baldessari

Bug Description

When podman dropped the journald log-driver we rushed to move to the supported k8s-file driver. This had the side effect of losing the stdout logs of the HA containers.

Previously we could troubleshoot haproxy startup failures simply by looking in the journal. Now, if haproxy fails to start, there are no traces in the logs at all: when a container fails it is stopped by pacemaker (and consequently removed), so no logs remain anywhere on the system.
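For context, the k8s-file driver writes each captured stdout/stderr line to a plain file in the Kubernetes log format: an RFC3339 timestamp, the stream name, a partial/full flag, and the message. A minimal sketch of parsing one such line, assuming that format:

```python
# Sketch: parse one line of podman's k8s-file log format.
# Assumed format: "<RFC3339 timestamp> <stdout|stderr> <P|F> <message>"
# (P = partial line that was split, F = full line).
def parse_k8s_file_line(line: str) -> dict:
    timestamp, stream, flag, message = line.rstrip("\n").split(" ", 3)
    return {
        "timestamp": timestamp,
        "stream": stream,        # "stdout" or "stderr"
        "partial": flag == "P",  # True when the runtime split a long line
        "message": message,
    }

entry = parse_k8s_file_line(
    "2020-04-14T14:00:01.000000000+02:00 stdout F haproxy started"
)
print(entry["stream"], entry["message"])
```

With the fix below, lines in this format end up under /var/log/containers/stdouts/ and survive the container being removed by pacemaker.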

summary: - HA containers logging got lost with the move to podman
+ Some HA containers logging got lost with the move to podman
Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
milestone: none → ussuri-rc1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (master)

Reviewed: https://review.opendev.org/719773
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=06c4aa7446073022b86c1f034a0c5406f2675ddb
Submitter: Zuul
Branch: master

commit 06c4aa7446073022b86c1f034a0c5406f2675ddb
Author: Michele Baldessari <email address hidden>
Date: Tue Apr 14 11:14:22 2020 +0200

    Log stdout of HA containers

    When podman dropped the journald log-driver we rushed to move to the supported
    k8s-file driver. This had the side effect of us losing the stdout logs of the
    HA containers.

    In fact previously we were easily able to troubleshoot haproxy startup failures
    just by looking in the journal. These days instead if haproxy fails to start we
    have no traces whatsoever in the logs, because when a container fails it gets
    stopped by pacemaker (and consequently removed) and no logs on the system are
    available any longer.

    Tested as follows:
    1) Redeploy a previously deployed overcloud that did not have the patch
    and observe that we now log the startup of HA bundles in /var/log/containers/stdouts/*bundle.log

    [root@controller-0 stdouts]# ls -l *bundle.log |grep -v -e init -e restart
    -rw-------. 1 root root 16032 Apr 14 14:13 openstack-cinder-volume.log
    -rw-------. 1 root root 19515 Apr 14 14:00 haproxy-bundle.log
    -rw-------. 1 root root 10509 Apr 14 14:03 ovn-dbs-bundle.log
    -rw-------. 1 root root 6451 Apr 14 14:00 redis-bundle.log

    2) Deploy a composable HA overcloud from scratch with the patch above
    and observe that we obtain the stdout on disk.

    Note that most HA containers log to their usual on-host files just
    fine, we are mainly missing haproxy logs and/or the kolla startup only
    of the HA containers.

    Closes-Bug: #1872734

    Change-Id: I4270b398366e90206adffe32f812632b50df615b
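The exact puppet-tripleo parameters are in the review above; as an illustrative fragment only, the kind of podman logging options the fix wires into the pacemaker-managed bundles looks like this (--log-driver and --log-opt path are standard podman run options; the path shown matches the test output below):

```shell
# Illustrative fragment, not the literal command the bundles run:
# persist container stdout to a file that outlives the container.
podman run \
  --log-driver k8s-file \
  --log-opt path=/var/log/containers/stdouts/haproxy-bundle.log \
  ...   # remaining bundle arguments elided
```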

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to puppet-tripleo (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/720657

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to puppet-tripleo (stable/train)

Reviewed: https://review.opendev.org/720657
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=7e4aca45fa8fa5e7868b8f387242eb895cf975eb
Submitter: Zuul
Branch: stable/train

commit 7e4aca45fa8fa5e7868b8f387242eb895cf975eb
Author: Michele Baldessari <email address hidden>
Date: Tue Apr 14 11:14:22 2020 +0200

    Log stdout of HA containers

    When podman dropped the journald log-driver we rushed to move to the supported
    k8s-file driver. This had the side effect of us losing the stdout logs of the
    HA containers.

    In fact previously we were easily able to troubleshoot haproxy startup failures
    just by looking in the journal. These days instead if haproxy fails to start we
    have no traces whatsoever in the logs, because when a container fails it gets
    stopped by pacemaker (and consequently removed) and no logs on the system are
    available any longer.

    Tested as follows:
    1) Redeploy a previously deployed overcloud that did not have the patch
    and observe that we now log the startup of HA bundles in /var/log/containers/stdouts/*bundle.log

    [root@controller-0 stdouts]# ls -l *bundle.log |grep -v -e init -e restart
    -rw-------. 1 root root 16032 Apr 14 14:13 openstack-cinder-volume.log
    -rw-------. 1 root root 19515 Apr 14 14:00 haproxy-bundle.log
    -rw-------. 1 root root 10509 Apr 14 14:03 ovn-dbs-bundle.log
    -rw-------. 1 root root 6451 Apr 14 14:00 redis-bundle.log

    2) Deploy a composable HA overcloud from scratch with the patch above
    and observe that we obtain the stdout on disk.

    Note that most HA containers log to their usual on-host files just
    fine, we are mainly missing haproxy logs and/or the kolla startup only
    of the HA containers.

    Closes-Bug: #1872734

    NB: Cherry-picks had some context change in
        manifests/profile/pacemaker/cinder/volume_bundle.pp
        manifests/profile/pacemaker/rabbitmq_bundle.pp
        manifests/profile/pacemaker/manila/share_bundle.pp

    Change-Id: I4270b398366e90206adffe32f812632b50df615b
    (cherry picked from commit 06c4aa7446073022b86c1f034a0c5406f2675ddb)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/puppet-tripleo 11.5.0

This issue was fixed in the openstack/puppet-tripleo 11.5.0 release.
