[Stein to Train] Undercloud's container registry not reachable from Overcloud nodes

Bug #1863598 reported by Jose Luis Franco
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Jose Luis Franco

Bug Description

The UndercloudHostsEntries heat parameter was recently added so that the undercloud's host entries are written to the overcloud nodes, because the container registry on the undercloud is now identified by a hostname instead of an IP address: https://review.opendev.org/#/c/687347/.

The newly added code takes effect when deploying, but an upgrade also needs to reach the undercloud's registry to retrieve container images (when image pulling is configured to be local). Because the overcloud nodes cannot resolve the undercloud's hostname during the upgrade, the upgrade fails with the following error:

2020-02-14 10:19:30 | fatal: [controller-1]: FAILED! => {"changed": true, "cmd": ["podman", "pull", "undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1"], "delta": "0:00:10.121361", "end": "2020-02-14 10:19:29.522401", "msg": "non-zero return code", "rc": 125, "start": "2020-02-14 10:19:19.401040", "stderr": "Trying to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1...\n Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host\nError: error pulling image \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1\": unable to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: unable to pull image: Error initializing source docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: error pinging docker registry undercloud-0.ctlplane.redhat.local:8787: Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host", "stderr_lines": ["Trying to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1...", " Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host", "Error: error pulling image \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1\": unable to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: unable to pull image: Error initializing source docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: error pinging docker registry undercloud-0.ctlplane.redhat.local:8787: Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host"], 
"stdout": "", "stdout_lines": []}
2020-02-14 10:19:30 | fatal: [controller-2]: FAILED! => {"changed": true, "cmd": ["podman", "pull", "undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1"], "delta": "0:00:10.124989", "end": "2020-02-14 10:19:29.622836", "msg": "non-zero return code", "rc": 125, "start": "2020-02-14 10:19:19.497847", "stderr": "Trying to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1...\n Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host\nError: error pulling image \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1\": unable to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: unable to pull image: Error initializing source docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: error pinging docker registry undercloud-0.ctlplane.redhat.local:8787: Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host", "stderr_lines": ["Trying to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1...", " Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host", "Error: error pulling image \"undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1\": unable to pull undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: unable to pull image: Error initializing source docker://undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-cinder-backup:20200210.1: error pinging docker registry undercloud-0.ctlplane.redhat.local:8787: Get https://undercloud-0.ctlplane.redhat.local:8787/v2/: dial tcp: lookup undercloud-0.ctlplane.redhat.local: no such host"], 
"stdout": "", "stdout_lines": []}

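The "no such host" failure above is a name-resolution problem on the overcloud node rather than a registry problem. A minimal sketch of how one might confirm that from a node is shown below. The function name `can_resolve` and its `resolver` parameter are hypothetical and exist only for illustration; the port 8787 is taken from the log above.

```python
import socket


def can_resolve(hostname, port=8787, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    `resolver` is injectable only so the check can be exercised without
    a real lookup; by default the system resolver is used.
    """
    try:
        resolver(hostname, port)
    except socket.gaierror:
        # Raised when the name cannot be resolved ("no such host").
        return False
    return True


if __name__ == "__main__":
    name = "undercloud-0.ctlplane.redhat.local"
    print(name, "resolves" if can_resolve(name) else "does not resolve")
```

If the check reports that the name does not resolve, the node is most likely missing the undercloud entry in /etc/hosts, which is what this bug describes.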
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/708128

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (master)

Reviewed: https://review.opendev.org/708128
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=70d9d309cec524f94aa4843e57b513ff1ab7012e
Submitter: Zuul
Branch: master

commit 70d9d309cec524f94aa4843e57b513ff1ab7012e
Author: Jose Luis Franco Arza <email address hidden>
Date: Mon Feb 17 14:50:00 2020 +0100

    Remove extra whitespaces from getent.

    The UndercloudHostsEntries heat parameter gets its value from the
    output of getent hosts <undercloud_short_hostname>.ctlplane. However,
    getent returns four spaces between the IP and the
    hostname:
    (undercloud) [stack@undercloud-0 ~]$ getent hosts undercloud-0.ctlplane
    192.168.24.1    undercloud-0.ctlplane.redhat.local undercloud-0.ctlplane

    These four spaces conflict with the entries in /etc/hosts if we
    want to use lineinfile to update the content. As /etc/hosts already
    includes an entry for undercloud-0.ctlplane (with a single space only),
    the Ansible module will consider that the line isn't present and we will
    end up with two entries for undercloud-0.ctlplane.

    Also, process.communicate() returns its output as bytes, so in order
    to handle a string for the replacement this patch casts the output to
    a string.

    Change-Id: Ibb51d5970f993b13a9684173704f64b98d81aae2
    Related-Bug: #1863598
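
The normalization described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual python-tripleoclient code: the function names `normalize_hosts_entry` and `get_undercloud_hosts_entry` are hypothetical, and the sketch assumes getent prints a single line for the queried name.

```python
import subprocess


def normalize_hosts_entry(raw_output):
    """Decode getent output and collapse whitespace runs to single spaces."""
    # communicate() returns bytes under Python 3, so decode before editing.
    if isinstance(raw_output, bytes):
        raw_output = raw_output.decode("utf-8")
    # split() with no argument splits on any run of whitespace and drops
    # leading and trailing whitespace, including the final newline.
    return " ".join(raw_output.split())


def get_undercloud_hosts_entry(hostname):
    """Hypothetical helper: run `getent hosts <hostname>` and normalize it."""
    process = subprocess.Popen(
        ["getent", "hosts", hostname], stdout=subprocess.PIPE
    )
    out, _ = process.communicate()
    return normalize_hosts_entry(out)
```

With the padded output quoted in the commit message, `normalize_hosts_entry` yields a line with single spaces only, which is the form an exact-line match against /etc/hosts would expect.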

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.opendev.org/707865
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=495c5c9de8711572122687b1556eedc11c1f0d6f
Submitter: Zuul
Branch: master

commit 495c5c9de8711572122687b1556eedc11c1f0d6f
Author: Jose Luis Franco Arza <email address hidden>
Date: Fri Feb 14 16:49:10 2020 +0100

    Configure Undercloud hostname in the overcloud during upgrade.

    Due to IPv6, the undercloud's container registry had to change
    from an IP address to the full hostname [0]. This affects
    the upgrade to Train, as the overcloud nodes have no entry for the
    undercloud's host in /etc/hosts and the registry is also missing from
    registries.conf.

    This patch adds two new upgrade tasks to handle this:

    1. container-image-prepare: Ensure that the /etc/hosts file contains
    an entry for the undercloud's ctlplane hostname. This task makes
    use of the global_vars Ansible parameter undercloud_hosts_entry,
    which contains a list of the undercloud's ctlplane hostnames (calling
    getent hosts underneath).

    2. podman-baremetal-ansible: There is already an upgrade task which
    takes care of reconfiguring podman during the upgrade, so this patch
    simply sets up the right insecure container registries and passes them
    into the reconfiguring task. It also changes (step | int) to
    step|int, as the upgrade tasks require a step|int condition to decide
    which step_X playbook a task falls into; otherwise the task will appear
    in every step_X playbook.

    [0] - Iac6efde9dd283906274d95c3a239b4b882ec052e

    Depends-On: https://review.opendev.org/708128
    Closes-Bug: #1863598
    Change-Id: Ifadc797f33d759eed38c9d9274fa588b6dd19488
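
The first upgrade task in this commit ensures that /etc/hosts contains the undercloud entry. The patch itself does this with Ansible upgrade tasks; the Python below is only a sketch of the idempotent "add the line unless it is already there" logic, and the function name `ensure_hosts_entry` is hypothetical. It compares entries field by field, so a difference in whitespace alone does not produce a second entry.

```python
def ensure_hosts_entry(hosts_text, entry):
    """Return hosts_text with entry appended unless an equivalent line exists."""
    wanted = entry.split()
    for line in hosts_text.splitlines():
        # Compare field by field so differing whitespace does not matter.
        if line.split() == wanted:
            return hosts_text
    # Make sure the existing content ends with a newline before appending.
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + " ".join(wanted) + "\n"


if __name__ == "__main__":
    hosts = "127.0.0.1 localhost\n"
    entry = "192.168.24.1    undercloud-0.ctlplane.redhat.local undercloud-0.ctlplane"
    updated = ensure_hosts_entry(hosts, entry)
    # A second call leaves the content unchanged.
    print(ensure_hosts_entry(updated, entry) == updated)
```

An exact string comparison would instead treat the four-space and single-space variants as different lines, which is the duplication problem the related python-tripleoclient change avoids by normalizing the getent output first.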

summary: - [Stein to Rocky] Undercloud's container registry not reachable from
+ [Stein to Train] Undercloud's container registry not reachable from
Overcloud nodes
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to python-tripleoclient (stable/train)

Related fix proposed to branch: stable/train
Review: https://review.opendev.org/708450

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/708453

OpenStack Infra (hudson-openstack) wrote : Related fix merged to python-tripleoclient (stable/train)

Reviewed: https://review.opendev.org/708450
Committed: https://git.openstack.org/cgit/openstack/python-tripleoclient/commit/?id=5e40c707931948029d725df5469ffff108509c19
Submitter: Zuul
Branch: stable/train

commit 5e40c707931948029d725df5469ffff108509c19
Author: Jose Luis Franco Arza <email address hidden>
Date: Mon Feb 17 14:50:00 2020 +0100

    Remove extra whitespaces from getent.

    The UndercloudHostsEntries heat parameter gets its value from the
    output of getent hosts <undercloud_short_hostname>.ctlplane. However,
    getent returns four spaces between the IP and the
    hostname:
    (undercloud) [stack@undercloud-0 ~]$ getent hosts undercloud-0.ctlplane
    192.168.24.1    undercloud-0.ctlplane.redhat.local undercloud-0.ctlplane

    These four spaces conflict with the entries in /etc/hosts if we
    want to use lineinfile to update the content. As /etc/hosts already
    includes an entry for undercloud-0.ctlplane (with a single space only),
    the Ansible module will consider that the line isn't present and we will
    end up with two entries for undercloud-0.ctlplane.

    Also, process.communicate() returns its output as bytes, so in order
    to handle a string for the replacement this patch casts the output to
    a string.

    Change-Id: Ibb51d5970f993b13a9684173704f64b98d81aae2
    Related-Bug: #1863598
    (cherry picked from commit 70d9d309cec524f94aa4843e57b513ff1ab7012e)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/train)

Reviewed: https://review.opendev.org/708453
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=49fc109bb9ad7163730f7d08e7f8a5217fe01c68
Submitter: Zuul
Branch: stable/train

commit 49fc109bb9ad7163730f7d08e7f8a5217fe01c68
Author: Jose Luis Franco Arza <email address hidden>
Date: Fri Feb 14 16:49:10 2020 +0100

    Configure Undercloud hostname in the overcloud during upgrade.

    Due to IPv6, the undercloud's container registry had to change
    from an IP address to the full hostname [0]. This affects
    the upgrade to Train, as the overcloud nodes have no entry for the
    undercloud's host in /etc/hosts and the registry is also missing from
    registries.conf.

    This patch adds two new upgrade tasks to handle this:

    1. container-image-prepare: Ensure that the /etc/hosts file contains
    an entry for the undercloud's ctlplane hostname. This task makes
    use of the global_vars Ansible parameter undercloud_hosts_entry,
    which contains a list of the undercloud's ctlplane hostnames (calling
    getent hosts underneath).

    2. podman-baremetal-ansible: There is already an upgrade task which
    takes care of reconfiguring podman during the upgrade, so this patch
    simply sets up the right insecure container registries and passes them
    into the reconfiguring task. It also changes (step | int) to
    step|int, as the upgrade tasks require a step|int condition to decide
    which step_X playbook a task falls into; otherwise the task will appear
    in every step_X playbook.

    [0] - Iac6efde9dd283906274d95c3a239b4b882ec052e

    Depends-On: https://review.opendev.org/708450
    Closes-Bug: #1863598
    Change-Id: Ifadc797f33d759eed38c9d9274fa588b6dd19488
    (cherry picked from commit 495c5c9de8711572122687b1556eedc11c1f0d6f)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.2.0

This issue was fixed in the openstack/tripleo-heat-templates 12.2.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.
