tripleo-quickstart-extras standalone role fails ceph-install if initial registries.conf is in v1 format

Bug #1980869 reported by John Fulton
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: John Fulton
Milestone: (none)

Bug Description

RHOSP 17 on RHEL 8: scenario004 is failing with: Error: error getting default registries to try: error loading registries configuration "/etc/containers/registries.conf": mixing sysregistry v1/v2 is not supported

In theory this could also happen when TQE tests the upstream wallaby or main branch on CentOS 8.
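
For context: the v1 registries.conf format declares registries under [registries.*] sections, while the v2 format uses unqualified-search-registries and [[registry]] tables, and podman refuses to load a file that contains both. A minimal illustration of the mixed state (hypothetical file contents, not copied from an affected host):

~~~
# v1-style section, the el8 default at the time:
[registries.search]
registries = ['registry.access.redhat.com', 'registry.redhat.io', 'docker.io']

# v2-style entry of the kind the role appends; once both styles are
# present, every podman invocation fails with the error above:
[[registry]]
location = "registry-proxy.engineering.redhat.com"
insecure = true
~~~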

The following was reproduced with scenario001:

~~~
2022-07-06 07:27:48.382263 | fa163ee9-42d3-8f9f-bcb8-000000000067 | FATAL | Run cephadm bootstrap | standalone.localdomain | error={"changed": true, "cmd": "/usr/sbin/cephadm --image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170 bootstrap --yes-i-know --skip-firewalld --ssh-private-key /home/ceph-admin/.ssh/id_rsa --ssh-public-key /home/ceph-admin/.ssh/id_rsa.pub --ssh-user ceph-admin --allow-fqdn-hostname --output-keyring /etc/ceph/ceph.client.admin.keyring --output-config /etc/ceph/ceph.conf --fsid 392bad64-d1ec-5c35-b076-20db2755a22f --config /home/ceph-admin/assimilate_ceph.conf --single-host-defaults --skip-monitoring-stack --skip-dashboard --log-to-file --skip-mon-network --mon-ip 192.168.42.1
", "delta": "0:00:00.712983", "end": "2022-07-06 07:27:48.356543", "msg": "non-zero return code", "rc": 1, "start": "2022-07-06 07:27:47.643560", "stderr": "Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman (/bin/podman) version 3.0.1 is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 392bad64-d1ec-5c35-b076-20db2755a22f
Verifying IP 192.168.42.1 port 3300 ...
Verifying IP 192.168.42.1 port 6789 ...
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Adjusting default settings to suit single-host cluster...
Pulling container image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170...
Non-zero exit code 125 from /bin/podman pull registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170
/bin/podman: stderr Error: error getting default registries to try: error loading registries configuration "/etc/containers/registries.conf": mixing sysregistry v1/v2 is not supported
ERROR: Failed command: /bin/podman pull registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170", "stderr_lines": ["Verifying podman|docker is present...", "Verifying lvm2 is present...", "Verifying time synchronization is in place...", "Unit chronyd.service is enabled and running", "Repeating the final host check...", "podman (/bin/podman) version 3.0.1 is present", "systemctl is present", "lvcreate is present", "Unit chronyd.service is enabled and running", "Host looks OK", "Cluster fsid: 392bad64-d1ec-5c35-b076-20db2755a22f", "Verifying IP 192.168.42.1 port 3300 ...", "Verifying IP 192.168.42.1 port 6789 ...", "- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network", "Adjusting default settings to suit single-host cluster...", "Pulling container image registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170...", "Non-zero exit code 125 from /bin/podman pull registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170", "/bin/podman: stderr Error: error getting default registries to try: error loading registries configuration "/etc/containers/registries.conf": mixing sysregistry v1/v2 is not supported", "ERROR: Failed command: /bin/podman pull registry-proxy.engineering.redhat.com/rh-osbs/rhceph:5-170"], "stdout": "", "stdout_lines": []}
2022-07-06 07:27:48.383898 | fa163ee9-42d3-8f9f-bcb8-000000000067 | TIMING | tripleo_cephadm : Run cephadm bootstrap | standalone.localdomain | 0:00:14.500647 | 0.97s
~~~
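
The failure is easy to confirm outside the playbook run. A quick check for the mixed-format condition (illustrative shell commands, assuming the el8 defaults described above):

~~~
# List the section headers present in the file; seeing both a v1
# [registries.search] table and a v2 [[registry]] table means the
# file is in the unsupported mixed state:
grep -E '^\[registries\.|^\[\[registry\]\]' /etc/containers/registries.conf

# In that state any image pull fails with exit code 125, matching
# the cephadm log above:
podman pull registry.access.redhat.com/ubi8/ubi
~~~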

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)
Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/848902
Committed: https://opendev.org/openstack/tripleo-quickstart-extras/commit/7482df1efe87428b264fdea9b72ee4c7d383dbd2
Submitter: "Zuul (22348)"
Branch: master

commit 7482df1efe87428b264fdea9b72ee4c7d383dbd2
Author: John Fulton <email address hidden>
Date: Wed Jul 6 15:45:38 2022 -0400

    Remove initial v1 registry entries before installing Ceph

    I982dedb53582fbd76391165c3ca72954c129b84a introduced a task
    to update registries.conf to trust the docker_registry_host
    with the Ceph container. The update is only made in v2 format.
    The podman pull works on el9, but because el8's registries.conf
    has v1-format sections by default, the v1/v2 mix triggers bug 1980869.

    As described in Ic35155f04bf05913b9e9b8eaa22fe6c02515396c, which
    adds v2 support, tripleo "nuke(s) the existing configuration" and
    this patch adds a task to do the same. Why not just run the role
    from the same change?

    The tasks from Ic35155f04bf05913b9e9b8eaa22fe6c02515396c can be
    run by the tripleo.operator role tripleo_ceph_deploy by passing
    tripleo_ceph_deploy_skip_container_registry_config=false, but
    this option is true in order to be compatible with the changes
    in containers.yml which includes configuration for local mirrors.
    This is a scenario unique to our testing environment. It could be
    refactored but this patch gets CI working on el8 (it had originally
    only been tested on el9).

    Change-Id: I077f1a317e43ba29da20e74955940b585c201502
    Closes-Bug: #1980869
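
A minimal sketch of the kind of cleanup task the commit describes (illustrative only; the merged change is in the review linked above, and the patterns below are assumptions about el8's default registries.conf layout):

~~~
# Drop the v1-format lines before the role writes its v2 entry, so
# podman never sees a mixed-format registries.conf.
- name: Remove initial v1 registry entries before installing Ceph
  become: true
  ansible.builtin.lineinfile:
    path: /etc/containers/registries.conf
    regexp: "{{ item }}"
    state: absent
  loop:
    - '^\[registries\.(search|insecure|block)\]'
    - '^registries\s*='
~~~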

Changed in tripleo:
status: In Progress → Fix Released