podman healthchecks listing processes/file descriptors too verbose

Bug #1821782 reported by Luca Miccini
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Cédric Jeanneret
Milestone: (none shown)

Bug Description

The journal output of healthcheck units such as:

- tripleo_heat_engine_healthcheck.service
- tripleo_nova_conductor_healthcheck.service
- tripleo_nova_scheduler_healthcheck.service

is too verbose. Here is an example from heat_engine:

Mar 26 13:59:50 undercloud-0.redhat.local systemd[1]: Starting heat_engine healthcheck...
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=14))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=19))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=14))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=15))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=8))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=7))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=13))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=10))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=9))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=7))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=9))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=12))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=12))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=13))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=7))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=23))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=17))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=15))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=7))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=9))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=8))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=13))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=20))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=9))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=21))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=12))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=12))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=22,fd=8))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=24,fd=8))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=25,fd=16))
Mar 26 13:59:50 undercloud-0.redhat.local podman[174626]: 192.168.24.1:5672 - users:(("heat-engine",pid=23,fd=22))

We should trim down the output or redirect it somewhere else.

Changed in tripleo:
assignee: nobody → Cédric Jeanneret (cjeanner)
importance: Undecided → High
milestone: none → stein-rc1
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/648027

Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
assignee: Cédric Jeanneret (cjeanner) → Sergii Golovatiuk (sgolovatiuk)
Changed in tripleo:
assignee: Sergii Golovatiuk (sgolovatiuk) → Cédric Jeanneret (cjeanner)
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/648027
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=5312bf19c8f820ac65514885aebdc2dc4776d72d
Submitter: Zuul
Branch: master

commit 5312bf19c8f820ac65514885aebdc2dc4776d72d
Author: Cédric Jeanneret <email address hidden>
Date: Wed Mar 27 08:58:24 2019 +0100

    Silent file descriptor checks

    In order to avoid spam in journald, we just get the exit code and let
    the checker output the error message.

    Also, correct how we retrieve process in the healthcheck_port and _listen
    functions.
    "ss" doesn't allow to match some processes, like "neutron-l3-agent". We
    therefore use the PID instead, provided by "pgrep".
    The "-d" option of pgrep allow to prepare its output for the "grep -E",
    preventing any need of a loop.

    Change-Id: I1555a9b79c954e646fe9ae35272231c581cea03e
    Closes-Bug: #1821782
    Closes-Bug: #1821856
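
The approach described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual tripleo-common healthcheck code: the function name `pids_use_port` and the exact regular expression are assumptions made for the example. It shows the two ideas from the message: `pgrep -d` joins the PIDs with a delimiter so they can be used directly as a `grep -E` alternation, and `grep -q` keeps the check silent so that only the exit code is reported.

```shell
#!/bin/bash
# Hypothetical helper (name and regex are assumptions, not the real script).
# Reads ss-style lines on stdin and succeeds silently if any line shows one
# of the given PIDs holding a socket on the given port.
#   $1: port number
#   $2: PIDs joined with "|", as printed by: pgrep -d '|' <pattern>
pids_use_port() {
    local port=$1 pids=$2
    # No PIDs means the process is not running, so the check fails.
    [ -n "$pids" ] || return 1
    # -q prints nothing; the caller only sees the exit status.
    # The trailing comma keeps pid=2 from matching pid=25.
    grep -qE ":${port}[[:space:]].*pid=(${pids}),"
}

# Illustrative wiring, assuming pgrep and ss are available in the container:
#   pids=$(pgrep -d '|' -f heat-engine)
#   ss -ntp | pids_use_port 5672 "$pids" || echo "no connection to port 5672"
```

With this shape, a successful check writes nothing to the journal, and any message on failure is left to the caller.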

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 10.6.1

This issue was fixed in the openstack/tripleo-common 10.6.1 release.

tags: added: queens-backport-potential rocky-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/713375

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/rocky)

Reviewed: https://review.opendev.org/713375
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=89d2393ea9fa977946ba34b6b9b3b279f830b64c
Submitter: Zuul
Branch: stable/rocky

commit 89d2393ea9fa977946ba34b6b9b3b279f830b64c
Author: Cédric Jeanneret <email address hidden>
Date: Wed Mar 27 08:58:24 2019 +0100

    Silent file descriptor checks

    In order to avoid spam in journald, we just get the exit code and let
    the checker output the error message.

    Also, correct how we retrieve process in the healthcheck_port and _listen
    functions.
    "ss" doesn't allow to match some processes, like "neutron-l3-agent". We
    therefore use the PID instead, provided by "pgrep".
    The "-d" option of pgrep allow to prepare its output for the "grep -E",
    preventing any need of a loop.

    Change-Id: I1555a9b79c954e646fe9ae35272231c581cea03e
    Closes-Bug: #1821782
    Closes-Bug: #1821856
    (cherry picked from commit 5312bf19c8f820ac65514885aebdc2dc4776d72d)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/713579

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/queens)

Reviewed: https://review.opendev.org/713579
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=16776e3979b8206009844e0d63ab3bb38a632d74
Submitter: Zuul
Branch: stable/queens

commit 16776e3979b8206009844e0d63ab3bb38a632d74
Author: Cédric Jeanneret <email address hidden>
Date: Wed Mar 27 08:58:24 2019 +0100

    Silent file descriptor checks

    In order to avoid spam in journald, we just get the exit code and let
    the checker output the error message.

    Also, correct how we retrieve process in the healthcheck_port and _listen
    functions.
    "ss" doesn't allow to match some processes, like "neutron-l3-agent". We
    therefore use the PID instead, provided by "pgrep".
    The "-d" option of pgrep allow to prepare its output for the "grep -E",
    preventing any need of a loop.

    Change-Id: I1555a9b79c954e646fe9ae35272231c581cea03e
    Closes-Bug: #1821782
    Closes-Bug: #1821856
    (cherry picked from commit 5312bf19c8f820ac65514885aebdc2dc4776d72d)
    (cherry picked from commit 89d2393ea9fa977946ba34b6b9b3b279f830b64c)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common rocky-eol

This issue was fixed in the openstack/tripleo-common rocky-eol release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common queens-eol

This issue was fixed in the openstack/tripleo-common queens-eol release.
