Stack update on a TLS-e overcloud sometimes fails in "Extract and trust certmonger's local CA"

Bug #1925531 reported by Damien Ciabrini
This bug affects 1 person
Affects   Status         Importance   Assigned to       Milestone
tripleo   Fix Released   Medium       Damien Ciabrini

Bug Description

Observed on a local master deployment: whenever I try to do a stack
update with an 'overcloud deploy', the deployment playbook fails while
manipulating the local CA:

2021-04-22 14:08:16,820 p=706492 u=stack n=ansible | PLAY RECAP *********************************************************************
2021-04-22 14:08:16,821 p=706492 u=stack n=ansible | compute-0 : ok=266 changed=57 unreachable=0 failed=0 skipped=148 rescued=0 ignored=0
2021-04-22 14:08:16,821 p=706492 u=stack n=ansible | compute-1 : ok=261 changed=57 unreachable=0 failed=0 skipped=146 rescued=0 ignored=0
2021-04-22 14:08:16,822 p=706492 u=stack n=ansible | controller-0 : ok=213 changed=40 unreachable=0 failed=1 skipped=103 rescued=0 ignored=0
2021-04-22 14:08:16,822 p=706492 u=stack n=ansible | controller-1 : ok=211 changed=40 unreachable=0 failed=1 skipped=103 rescued=0 ignored=0
2021-04-22 14:08:16,823 p=706492 u=stack n=ansible | controller-2 : ok=209 changed=40 unreachable=0 failed=1 skipped=103 rescued=0 ignored=0
2021-04-22 14:08:16,823 p=706492 u=stack n=ansible | localhost : ok=0 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
2021-04-22 14:08:16,824 p=706492 u=stack n=ansible | undercloud : ok=720 changed=25 unreachable=0 failed=0 skipped=255 rescued=0 ignored=0
2021-04-22 14:08:16,830 p=706492 u=stack n=ansible | 2021-04-22 14:08:16.830163 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From ansible.log:

2021-04-22 14:07:00,812 p=706492 u=stack n=ansible | 2021-04-22 14:07:00.811405 | 525400c3-e4b0-a697-4e4f-00000000544b | FATAL | Extract and trust certmonger's local CA | controller-1 | error={"attempts": 5, "changed": true, "cmd": "set -e\nca_pem='/etc/pki/ca-trust/source/anchors/cm-local-ca.pem'\nopenssl pkcs12 -in /var/lib/certmonger/local/creds -out ${ca_pem} -nokeys -nodes -passin pass:''\nchmod 0644 ${ca_pem}\nupdate-ca-trust extract\ntest -e ${ca_pem} && openssl x509 -checkend 0 -noout -in ${ca_pem}\n", "delta": "0:00:00.378480", "end": "2021-04-22 14:07:00.775453", "rc": 0, "start": "2021-04-22 14:07:00.396973", "stderr": "", "stderr_lines": [], "stdout": "Certificate will not expire", "stdout_lines": ["Certificate will not expire"]}

Ansible returns a FATAL status even though the log seems to indicate
that the command's rc was 0.

This seems to be because the output of the Ansible task [1] is not
registered, so the task only passes from time to time, perhaps because
it reuses the value registered by a previous task.

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/haproxy/haproxy-public-tls-certmonger.yaml#L120
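
The failure mode described above can be sketched as an Ansible task. This is a minimal sketch, not the actual template contents: the shell script is copied from the "cmd" field of the log, and "retries: 5" matches the "attempts": 5 reported there, but the variable name "result", the "delay" value, and the exact "until" condition are assumptions made for illustration. The point it tries to show is that an "until" condition only sees the current attempt's return code if the task registers its output under the name the condition tests.

```yaml
- name: Extract and trust certmonger's local CA
  shell: |
    set -e
    ca_pem='/etc/pki/ca-trust/source/anchors/cm-local-ca.pem'
    openssl pkcs12 -in /var/lib/certmonger/local/creds -out ${ca_pem} -nokeys -nodes -passin pass:''
    chmod 0644 ${ca_pem}
    update-ca-trust extract
    test -e ${ca_pem} && openssl x509 -checkend 0 -noout -in ${ca_pem}
  # Assumed variable name. Without this line, "result" in the condition
  # below would refer to whatever an earlier task registered under that
  # name (if anything), not to this task's own outcome.
  register: result
  retries: 5        # matches "attempts": 5 in the log
  delay: 1          # assumed value, for illustration only
  until: result.rc == 0
```

With the output registered, each retry evaluates the condition against this task's own rc, so a run that ends with rc 0 should no longer be reported as FATAL after exhausting its attempts.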

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)
Changed in tripleo:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-heat-templates/+/787618
Committed: https://opendev.org/openstack/tripleo-heat-templates/commit/0419c90064b52037a16c78b3498a006da5eb0fd6
Submitter: "Zuul (22348)"
Branch: master

commit 0419c90064b52037a16c78b3498a006da5eb0fd6
Author: Damien Ciabrini <email address hidden>
Date: Thu Apr 22 18:52:48 2021 +0200

    Fix random redeploy failure during certificate extraction

    During the extraction of the local certificate, the ansible
    task uses the output of an unregistered variable, so it
    passes based on a random input.

    Change-Id: I9c08189aaa4c8d8b3e4dcde38b1b2cd4146ac8e5
    Closes-Bug: #1925531

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 14.1.0

This issue was fixed in the openstack/tripleo-heat-templates 14.1.0 release.
