fedora-28 standalone failing at neutron-haproxy-ovnmeta service

Bug #1824977 reported by Quique Llorente
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Quique Llorente
Milestone: (none listed)

Bug Description

http://logs.openstack.org/66/652066/2/check/tripleo-ci-fedora-28-standalone/3132cc2/job-output.txt.gz
2019-04-16 09:15:08.057103 | primary | TASK [validate-services : Get failed containers for podman] ********************
2019-04-16 09:15:08.072763 | primary | Tuesday 16 April 2019 09:15:08 +0000 (0:00:00.808) 1:09:21.403 *********
2019-04-16 09:15:09.287342 | primary | changed: [undercloud]
2019-04-16 09:15:09.331782 | primary |
2019-04-16 09:15:09.332018 | primary | TASK [validate-services : Fail if we detect failed podman container] ***********
2019-04-16 09:15:09.348360 | primary | Tuesday 16 April 2019 09:15:09 +0000 (0:00:01.275) 1:09:22.678 *********
2019-04-16 09:15:09.512017 | primary | failed: [undercloud] (item=neutron-haproxy-ovnmeta-44b51029-7c6d-41e8-8d38-ce3d3e4b59e4 Exited (137) 4 minutes ago) => {
2019-04-16 09:15:09.512146 | primary | "changed": false,
2019-04-16 09:15:09.512349 | primary | "item": "neutron-haproxy-ovnmeta-44b51029-7c6d-41e8-8d38-ce3d3e4b59e4 Exited (137) 4 minutes ago"
2019-04-16 09:15:09.512389 | primary | }
2019-04-16 09:15:09.512420 | primary |
2019-04-16 09:15:09.512457 | primary | MSG:
2019-04-16 09:15:09.512487 | primary |
2019-04-16 09:15:09.512615 | primary | Failed container detected. Please check the following locations
2019-04-16 09:15:09.512703 | primary | /var/log/extras/failed_containers.log
2019-04-16 09:15:09.512778 | primary | /var/log/extras/podman

Tags: alert
tags: added: alert
tags: removed: promotion-blocker
Ronelle Landy (rlandy) wrote :

Related change: https://review.openstack.org/653136 Disable validations on f28 standalone

wes hayutin (weshayutin) wrote :
Changed in tripleo:
status: Triaged → Fix Released
Cédric Jeanneret (cjeanner) wrote :

Hiding the failure isn't a fix.

Changed in tripleo:
status: Fix Released → In Progress
Quique Llorente (quiquell) wrote :

Looks like this is happening as expected

12:04 <dalvarez> quiquell|rover: Tengu numans mistery solved :) https://github.com/openstack/neutron/blob/master/neutron/agent/linux/external_process.py#L98
12:04 <dalvarez> quiquell|rover: Tengu numans when haproxy sidecar container is no longer needed, the ovn metadata agent kills it, hence you see that 137. It's all good
12:04 <dalvarez> and expected
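
The 137 in the log fits the explanation above: by the usual convention, an exit code above 128 means the process was terminated by signal number (code - 128), and 137 - 128 = 9, which is SIGKILL. A minimal sketch of that decoding follows; the helper name `describe_exit_code` is made up for illustration and is not part of any TripleO or neutron code.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret an exit code using the 128 + signal-number convention.

    Hypothetical helper for illustration only.
    """
    if code > 128:
        signum = code - 128
        try:
            # Resolve the symbolic name where the platform defines it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

On Linux this should report that the container was killed by SIGKILL, which matches an agent forcibly stopping a sidecar it no longer needs.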

So we have two options:
1 - discard sidecars from service validation
2 - do service validation before tempest.
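
Option 1 could look roughly like the sketch below, which drops sidecar entries from the list of failed containers before the check fails the job. It is only an illustration in Python, not the validate-services role itself: the prefix tuple, the function name `drop_sidecars`, and the `nova_compute` entry in the usage example are assumptions.

```python
# Assumption: sidecars can be recognised by their container-name prefix,
# as in the failing item from the job log.
SIDECAR_PREFIXES = ("neutron-haproxy-ovnmeta-",)


def drop_sidecars(failed_items, prefixes=SIDECAR_PREFIXES):
    """Return the failed-container lines whose name is not a sidecar.

    Each item is expected to look like the log entry:
    "<container-name> Exited (<code>) <age>".
    """
    kept = []
    for item in failed_items:
        parts = item.split(maxsplit=1)
        name = parts[0] if parts else ""
        if not name.startswith(prefixes):
            kept.append(item)
    return kept


failed = [
    "neutron-haproxy-ovnmeta-44b51029-7c6d-41e8-8d38-ce3d3e4b59e4 "
    "Exited (137) 4 minutes ago",
    "nova_compute Exited (1) 2 minutes ago",  # hypothetical entry
]
print(drop_sidecars(failed))
```

The trade-off is that a sidecar that really crashed would be ignored as well, which is one reason option 2 may be preferable.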

Also, why is this not happening on centos7?

Daniel Alvarez (dalvarezs) wrote :

I believe it's happening as well on centos7 but perhaps the 'podman' output is not showing the exit code? The way sidecar containers are spawned/killed in the OVN metadata agent hasn't changed and is not dependent on the OS.
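
To compare the two platforms, one could parse the status text and see whether an exit code is present at all. The sketch below handles the format seen in the f28 log ("Exited (137) 4 minutes ago") and also a variant without the parenthesised code; that second variant is purely an assumption based on the guess above, not a confirmed output format, and `parse_failed_item` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Matches "<name> Exited (<code>) ..." and, as an assumed variant,
# "<name> Exited ..." with no exit code shown.
_STATUS_RE = re.compile(r"^(?P<name>\S+)\s+Exited(?:\s+\((?P<code>\d+)\))?")


def parse_failed_item(item: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (container name, exit code or None), or None if not 'Exited'."""
    match = _STATUS_RE.match(item)
    if match is None:
        return None
    code = match.group("code")
    return match.group("name"), (int(code) if code is not None else None)


line = (
    "neutron-haproxy-ovnmeta-44b51029-7c6d-41e8-8d38-ce3d3e4b59e4 "
    "Exited (137) 4 minutes ago"
)
print(parse_failed_item(line))
```

A `None` exit code would then mean the output did not include one, rather than that the container exited cleanly.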

Quique Llorente (quiquell) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)

Fix proposed to branch: master
Review: https://review.openstack.org/653383

Changed in tripleo:
assignee: nobody → Quique Llorente (quiquell)
Quique Llorente (quiquell) wrote :

So it looks like running tempest after validate-services makes sense, so we have reversed the order here:
https://review.openstack.org/653383. Let's also re-activate f28 and c7 and see if it's working now.

Quique Llorente (quiquell) wrote :

Activating validate_services for the f28 and c7 standalone jobs here: https://review.openstack.org/653390

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.openstack.org/653383
Committed: https://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/commit/?id=1c111804a8c91aa59f0796e7c1de0b1a21478e29
Submitter: Zuul
Branch: master

commit 1c111804a8c91aa59f0796e7c1de0b1a21478e29
Author: Quique Llorente <email address hidden>
Date: Wed Apr 17 12:37:09 2019 +0200

    Run validate-services before tempest

    Some of the actions done at tempest like create a vm instance restart
    some sidecar containers and they appear as failed containers since the
    purpose of validate_services is to check them after deploy running
    before tempest would be good enough.

    Change-Id: I750e4d2ffc139433f7ec2a9a5c4adc6467ad19c9
    Closes-Bug: #1824977

Changed in tripleo:
status: In Progress → Fix Released