tripleo-ci-centos-8-standalone job failing with container create async task

Bug #1865473 reported by Rabi Mishra
This bug affects 1 person
Affects: tripleo
Status: Incomplete
Importance: Medium
Assigned to: Rabi Mishra
Milestone: (none)

Bug Description

Noticed a few times on master:

https://6c87970bcd1576df1a31-6b2fbb83e342ba16a98bd56da886e807.ssl.cf1.rackcdn.com/710728/1/check/tripleo-ci-centos-8-standalone/f27a279/logs/undercloud/home/zuul/standalone_deploy.log

Looks like neutron_db_sync is still running

https://6c87970bcd1576df1a31-6b2fbb83e342ba16a98bd56da886e807.ssl.cf1.rackcdn.com/710728/1/check/tripleo-ci-centos-8-standalone/f27a279/logs/undercloud/var/log/containers/stdouts/neutron_db_sync.log

Seems like the task polls for the status too quickly and exhausts the 30 retries before the db sync finishes?

https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_container_manage/tasks/podman/create.yml#L31

traceback:

2020-03-02 10:40:44 | FAILED - RETRYING: Check podman create status (30 retries left).
2020-03-02 10:40:50 | FAILED - RETRYING: Check podman create status (29 retries left).
2020-03-02 10:40:55 | FAILED - RETRYING: Check podman create status (28 retries left).
2020-03-02 10:41:00 | FAILED - RETRYING: Check podman create status (27 retries left).
2020-03-02 10:41:05 | FAILED - RETRYING: Check podman create status (26 retries left).
2020-03-02 10:41:10 | FAILED - RETRYING: Check podman create status (25 retries left).
2020-03-02 10:41:16 | FAILED - RETRYING: Check podman create status (24 retries left).
2020-03-02 10:41:21 | FAILED - RETRYING: Check podman create status (23 retries left).
2020-03-02 10:41:26 | FAILED - RETRYING: Check podman create status (22 retries left).
2020-03-02 10:41:31 | FAILED - RETRYING: Check podman create status (21 retries left).
2020-03-02 10:41:36 | FAILED - RETRYING: Check podman create status (20 retries left).
2020-03-02 10:41:42 | FAILED - RETRYING: Check podman create status (19 retries left).
2020-03-02 10:41:47 | FAILED - RETRYING: Check podman create status (18 retries left).
2020-03-02 10:41:52 | FAILED - RETRYING: Check podman create status (17 retries left).
2020-03-02 10:41:57 | FAILED - RETRYING: Check podman create status (16 retries left).
2020-03-02 10:42:03 | FAILED - RETRYING: Check podman create status (15 retries left).
2020-03-02 10:42:08 | FAILED - RETRYING: Check podman create status (14 retries left).
2020-03-02 10:42:13 | FAILED - RETRYING: Check podman create status (13 retries left).
2020-03-02 10:42:18 | FAILED - RETRYING: Check podman create status (12 retries left).
2020-03-02 10:42:23 | FAILED - RETRYING: Check podman create status (11 retries left).
2020-03-02 10:42:29 | FAILED - RETRYING: Check podman create status (10 retries left).
2020-03-02 10:42:34 | FAILED - RETRYING: Check podman create status (9 retries left).
2020-03-02 10:42:39 | FAILED - RETRYING: Check podman create status (8 retries left).
2020-03-02 10:42:44 | FAILED - RETRYING: Check podman create status (7 retries left).
2020-03-02 10:42:50 | FAILED - RETRYING: Check podman create status (6 retries left).
2020-03-02 10:42:55 | FAILED - RETRYING: Check podman create status (5 retries left).
2020-03-02 10:43:00 | FAILED - RETRYING: Check podman create status (4 retries left).
2020-03-02 10:43:05 | FAILED - RETRYING: Check podman create status (3 retries left).
2020-03-02 10:43:10 | FAILED - RETRYING: Check podman create status (2 retries left).
2020-03-02 10:43:16 | FAILED - RETRYING: Check podman create status (1 retries left).
2020-03-02 10:43:21 | - upgrade
2020-03-02 10:43:21 | - heads
2020-03-02 10:43:21 | detach: false
2020-03-02 10:43:21 | environment:
2020-03-02 10:43:21 | TRIPLEO_CONFIG_HASH: 4e24a94564576aaeef9da5e79adee62b-4e24a94564576aaeef9da5e79adee62b
2020-03-02 10:43:21 | TRIPLEO_DEPLOY_IDENTIFIER: '1583144119'
2020-03-02 10:43:21 | image: 192.168.24.1:8787/tripleomaster/centos-binary-neutron-server:0481ffa881dbc1aa61d9810533d85229-updated-20200302100121
2020-03-02 10:43:21 | net: host
2020-03-02 10:43:21 | privileged: false
2020-03-02 10:43:21 | start_order: 0
2020-03-02 10:43:21 | user: root
2020-03-02 10:43:21 | volumes:
2020-03-02 10:43:21 | - /etc/hosts:/etc/hosts:ro
2020-03-02 10:43:21 | - /etc/localtime:/etc/localtime:ro
2020-03-02 10:43:21 | - /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro
2020-03-02 10:43:21 | - /etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro
2020-03-02 10:43:21 | - /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro
2020-03-02 10:43:21 | - /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro
2020-03-02 10:43:21 | - /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro
2020-03-02 10:43:21 | - /dev/log:/dev/log
2020-03-02 10:43:21 | - /etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro
2020-03-02 10:43:21 | - /etc/puppet:/etc/puppet:ro
2020-03-02 10:43:21 | - /var/log/containers/neutron:/var/log/neutron:z
2020-03-02 10:43:21 | - /var/log/containers/httpd/neutron-api:/var/log/httpd:z
2020-03-02 10:43:21 | - /var/lib/config-data/neutron/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro
2020-03-02 10:43:21 | - /var/lib/config-data/neutron/etc/neutron:/etc/neutron:ro
2020-03-02 10:43:21 | failed: false
2020-03-02 10:43:21 | finished: 0
2020-03-02 10:43:21 | results_file: /tmp/.ansible_async/880336127558.59209
2020-03-02 10:43:21 | started: 1
2020-03-02 10:43:21 | finished: 0
2020-03-02 10:43:21 | started: 1
2020-03-02 10:43:21 | failed: [standalone] (item={'started': 1, 'finished': 0, 'ansible_job_id': '880336127558.59209', 'results_file': '/tmp/.ansible_async/880336127558.59209', 'changed': True, 'failed': False, 'container_data': {'neutron_db_sync': {'command': ['/usr/bin/bootstrap_host_exec', 'neutron_api', 'neutron-db-manage', 'upgrade', 'heads'], 'detach': False, 'environment': {'TRIPLEO_DEPLOY_IDENTIFIER': '1583144119', 'TRIPLEO_CONFIG_HASH': '4e24a94564576aaeef9da5e79adee62b-4e24a94564576aaeef9da5e79adee62b'}, 'image': '192.168.24.1:8787/tripleomaster/centos-binary-neutron-server:0481ffa881dbc1aa61d9810533d85229-updated-20200302100121', 'net': 'host', 'privileged': False, 'user': 'root', 'volumes': ['/etc/hosts:/etc/hosts:ro', '/etc/localtime:/etc/localtime:ro', '/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '/dev/log:/dev/log', '/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '/etc/puppet:/etc/puppet:ro', '/var/log/containers/neutron:/var/log/neutron:z', '/var/log/containers/httpd/neutron-api:/var/log/httpd:z', '/var/lib/config-data/neutron/etc/my.cnf.d/tripleo.cnf:/etc/my.cnf.d/tripleo.cnf:ro', '/var/lib/config-data/neutron/etc/neutron:/etc/neutron:ro'], 'start_order': 0}}, 'ansible_loop_var': 'container_data'}) => changed=false
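
To put a number on how long the check above actually waited, here is a minimal Python sketch that reads the timestamps from the "FAILED - RETRYING" lines. The function name `retry_window` and the three-line `sample` are illustrative assumptions; the sample is copied from the first two and the last retry lines of the log above, not the full output.

```python
from datetime import datetime


def retry_window(lines):
    """Return (retry line count, seconds between first and last retry line)."""
    stamps = [
        # The timestamp is everything before the first " | " separator.
        datetime.strptime(line.split(" | ")[0], "%Y-%m-%d %H:%M:%S")
        for line in lines
        if "FAILED - RETRYING" in line
    ]
    if not stamps:
        return 0, 0.0
    return len(stamps), (stamps[-1] - stamps[0]).total_seconds()


# Abbreviated sample: first two and last retry lines from the log above.
sample = [
    "2020-03-02 10:40:44 | FAILED - RETRYING: Check podman create status (30 retries left).",
    "2020-03-02 10:40:50 | FAILED - RETRYING: Check podman create status (29 retries left).",
    "2020-03-02 10:43:16 | FAILED - RETRYING: Check podman create status (1 retries left).",
]

count, seconds = retry_window(sample)
print(count, seconds)
```

Between the first retry line (10:40:44) and the last (10:43:16) only 152 seconds pass, so a neutron_db_sync run that needs longer than roughly two and a half minutes would exhaust the retries at this polling rate.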

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/710750

Changed in tripleo:
milestone: none → ussuri-3
assignee: nobody → Rabi Mishra (rabi)
importance: Undecided → Medium
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/710750
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=d40a353e0474f291ef5f77d3486ea590390201de
Submitter: Zuul
Branch: master

commit d40a353e0474f291ef5f77d3486ea590390201de
Author: Rabi Mishra <email address hidden>
Date: Mon Mar 2 17:22:05 2020 +0530

    Increase number of retries for container create async task

    Seems like some db_sync tasks are taking more time and this
    results in jobs failing. Though probably not a long
    term solution, would be good to increase it to reduce the
    large number of job failures.

    Change-Id: Ifa494ffdd58772c39808bcaa3d5d37b3802af065
    Ralated-Bug: #1865473
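
Why a larger retry count helps can be sketched with a generic polling loop in Python. This is an illustration under stated assumptions, not the Ansible implementation and not the tripleo-ansible task: `poll_until_finished`, `make_slow_job`, and the values 40 and 60 are hypothetical and are not taken from the fix.

```python
import time


def poll_until_finished(check, max_checks, delay, sleep=time.sleep):
    """Call check() up to max_checks times, waiting `delay` seconds between calls.

    Returns the number of the check that succeeded, or raises TimeoutError.
    """
    for attempt in range(1, max_checks + 1):
        if check():
            return attempt
        if attempt < max_checks:
            sleep(delay)
    raise TimeoutError(f"not finished after {max_checks} checks")


def make_slow_job(ready_on_check):
    """Hypothetical job that reports finished on the given check number."""
    calls = {"n": 0}

    def check():
        calls["n"] += 1
        return calls["n"] >= ready_on_check

    return check


no_sleep = lambda seconds: None  # skip real waiting in this demo

# A job that needs 40 checks times out with a budget of 30 ...
try:
    poll_until_finished(make_slow_job(40), max_checks=30, delay=5, sleep=no_sleep)
    outcome = "finished"
except TimeoutError:
    outcome = "timed out"
print(outcome)

# ... and completes once the budget is raised (60 is an arbitrary example).
print(poll_until_finished(make_slow_job(40), max_checks=60, delay=5, sleep=no_sleep))
```

The total wait is bounded by roughly the number of checks times the delay, so raising the count widens the window without addressing why db_sync is slow, which matches the commit's note that this is probably not a long-term solution.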

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/714429

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/714429
Committed: https://git.openstack.org/cgit/openstack/tripleo-ansible/commit/?id=875cba87e576f897d641e350b9657777593b3e53
Submitter: Zuul
Branch: stable/train

commit 875cba87e576f897d641e350b9657777593b3e53
Author: Rabi Mishra <email address hidden>
Date: Mon Mar 2 17:22:05 2020 +0530

    Increase number of retries for container create async task

    Seems like some db_sync tasks are taking more time and this
    results in jobs failing. Though probably not a long
    term solution, would be good to increase it to reduce the
    large number of job failures.

    Change-Id: Ifa494ffdd58772c39808bcaa3d5d37b3802af065
    Ralated-Bug: #1865473
    (cherry picked from commit d40a353e0474f291ef5f77d3486ea590390201de)

tags: added: in-stable-train
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-3 → ussuri-rc3
wes hayutin (weshayutin)
Changed in tripleo:
milestone: ussuri-rc3 → victoria-1
Changed in tripleo:
milestone: victoria-1 → victoria-3
Changed in tripleo:
milestone: victoria-3 → wallaby-1
Changed in tripleo:
milestone: wallaby-1 → wallaby-2
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Marios Andreou (marios-b) wrote :

This is an automated action. Bug status has been set to 'Incomplete' and target milestone has been removed due to inactivity. If you disagree please re-set these values and reach out to us on freenode #tripleo

Changed in tripleo:
milestone: xena-1 → none
status: Triaged → Incomplete