multinode deploy fails on TASK [common : Ensure fluentd image is present for label check]

Bug #1901768 reported by Khawar Munir Abbasi
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
kolla-ansible | Fix Released | Undecided | Mark Goddard |

Bug Description

Hello experts,
I need your help. I am trying to install the OpenStack Kolla Train release (a development install from GitHub).
With the all-in-one inventory and no registry, everything works fine, but as soon as I enable a registry for a multinode deployment, it throws the error below.

These are the steps I followed to set up the registry:
1. vi /etc/kolla/globals.yml
    docker_registry: <my_machine_ip>:5000

2. ./kolla/tools/start-registry

3. export REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io

4. To verify, docker info shows insecure registry:
Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  10.237.223.217:5000
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

5. output of docker ps -a
CONTAINER ID   IMAGE        COMMAND                  CREATED          STATUS          PORTS                    NAMES
7f197967064e   registry:2   "/entrypoint.sh /etc…"   55 minutes ago   Up 14 minutes   0.0.0.0:4000->5000/tcp   registry

BTW, my machine is behind a proxy. I have already configured the proxy for Docker and in the host environment file.

TASK [common : Ensure fluentd image is present for label check] ******************************************************************************************************************************

The full traceback is:
  File "/tmp/ansible_kolla_docker_payload_ni0q4tct/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py", line 1024, in main
  File "/tmp/ansible_kolla_docker_payload_ni0q4tct/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py", line 904, in ensure_image
  File "/tmp/ansible_kolla_docker_payload_ni0q4tct/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py", line 570, in pull_image
  File "/usr/local/lib/python3.8/dist-packages/docker/api/image.py", line 415, in pull
    self._raise_for_status(response)
  File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 261, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
fatal: [10.237.223.217]: FAILED! => {
    "changed": true,
    "invocation": {
        "module_args": {
            "action": "ensure_image",
            "api_version": "auto",
            "auth_email": null,
            "auth_password": null,
            "auth_registry": "10.237.223.217:5000",
            "auth_username": null,
            "cap_add": [],
            "client_timeout": 120,
            "command": null,
            "detach": true,
            "dimensions": {},
            "environment": {
                "KOLLA_CONFIG_STRATEGY": "COPY_ALWAYS"
            },
            "graceful_timeout": 10,
            "image": "10.237.223.217:5000/kolla/ubuntu-source-fluentd:train",
            "labels": {},
            "name": null,
            "privileged": false,
            "remove_on_exit": true,
            "restart_policy": "unless-stopped",
            "restart_retries": 10,
            "security_opt": [],
            "state": "running",
            "tls_cacert": null,
            "tls_cert": null,
            "tls_key": null,
            "tls_verify": false,
            "tty": false,
            "volumes": null,
            "volumes_from": null
        }
    },
    "msg": "'Traceback (most recent call last):\\n File \"/usr/local/lib/python3.8/dist-packages/docker/api/client.py\", line 259, in _raise_for_status\\n response.raise_for_status()\\n File \"/usr/lib/python3/dist-packages/requests/models.py\", line 940, in raise_for_status\\n raise HTTPError(http_error_msg, response=self)\\nrequests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http+docker://localhost/v1.40/images/create?tag=train&fromImage=10.237.223.217%3A5000%2Fkolla%2Fubuntu-source-fluentd\\n\\nDuring handling of the above exception, another exception occurred:\\n\\nTraceback (most recent call last):\\n File \"/tmp/ansible_kolla_docker_payload_ni0q4tct/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\", line 1024, in main\\n File \"/tmp/ansible_kolla_docker_payload_ni0q4tct/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\", line 904, in ensure_image\\n File \"/tmp/ansible_kolla_docker_payload_ni0q4tct/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\", line 570, in pull_image\\n File \"/usr/local/lib/python3.8/dist-packages/docker/api/image.py\", line 415, in pull\\n self._raise_for_status(response)\\n File \"/usr/local/lib/python3.8/dist-packages/docker/api/client.py\", line 261, in _raise_for_status\\n raise create_api_error_from_http_exception(e)\\n File \"/usr/local/lib/python3.8/dist-packages/docker/errors.py\", line 31, in create_api_error_from_http_exception\\n raise cls(e, response=response, explanation=explanation)\\ndocker.errors.APIError: 500 Server Error: Internal Server Error (\"Get http://10.237.223.217:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)\")\\n'"

# docker version
Client: Docker Engine - Community
 Version: 19.03.13
 API version: 1.40
 Go version: go1.13.15
 Git commit: 4484c46d9d
 Built: Wed Sep 16 17:02:52 2020
 OS/Arch: linux/amd64
 Experimental: false

Server: Docker Engine - Community
 Engine:
  Version: 19.03.13
  API version: 1.40 (minimum version 1.12)
  Go version: go1.13.15
  Git commit: 4484c46d9d
  Built: Wed Sep 16 17:01:20 2020
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.3.7
  GitCommit: 8fba4e9a7d01810a393d5d25a3621dc101981175
 runc:
  Version: 1.0.0-rc10
  GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version: 0.18.0
  GitCommit: fec3683

description: updated
Revision history for this message
Mark Goddard (mgoddard) wrote :

Hi. You don't need kolla-ansible to verify that your setup is working - a docker pull of any image on Dockerhub should work.
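
A related check, sketched here as an illustration and not taken from the kolla-ansible code: the error in the report shows the Docker daemon timing out on a GET to http://10.237.223.217:5000/v2/, so probing that endpoint directly shows whether anything answers on a given address and port. The function names are hypothetical, and the sketch assumes plain HTTP because the registry is listed as insecure.

```python
import urllib.error
import urllib.request
from typing import Tuple


def registry_v2_url(registry: str, scheme: str = "http") -> str:
    """Build the Registry HTTP API v2 base URL for a host:port string."""
    return f"{scheme}://{registry}/v2/"


def probe_registry(registry: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Return (answered, detail) for a GET on the registry's /v2/ endpoint."""
    # An empty ProxyHandler makes urllib ignore http_proxy/https_proxy, so the
    # request goes straight to the registry instead of through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(registry_v2_url(registry), timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # Something answered, even if with an error status such as 401.
        return True, f"HTTP {exc.code}"
    except OSError as exc:
        # URLError, timeouts and refused connections are OSError subclasses.
        return False, str(exc)
```

Calling probe_registry("10.237.223.217:5000") and probe_registry("10.237.223.217:4000") and comparing the results would show which port, if either, the registry answers on.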

I can see a few issues with the instructions.

1. The start-registry script in kolla actually uses port 4000, not port 5000. Your docker_registry setting needs to reflect this.
2. The REGISTRY_PROXY_REMOTEURL environment variable needs to be available to the registry. You can pass this to the container via -e in the docker run command in the start-registry script.
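
The port mismatch in point 1 can be spotted mechanically by comparing the PORTS column from docker ps with the port in docker_registry. A minimal sketch with hypothetical helper names, assuming the registry listens on container port 5000 as in the docker ps output in the report:

```python
import re
from typing import Optional


def published_host_port(ports_column: str, container_port: int = 5000) -> Optional[int]:
    """Return the host port mapped to container_port in a `docker ps` PORTS value."""
    # Matches entries such as "0.0.0.0:4000->5000/tcp".
    for host_port, inner_port in re.findall(r":(\d+)->(\d+)/tcp", ports_column):
        if int(inner_port) == container_port:
            return int(host_port)
    return None


def registry_port(docker_registry: str) -> int:
    """Return the port from a host:port value such as the docker_registry setting."""
    _host, _sep, port = docker_registry.rpartition(":")
    return int(port)


def ports_match(docker_registry: str, ports_column: str) -> bool:
    """True when docker_registry points at the port the container publishes."""
    return published_host_port(ports_column) == registry_port(docker_registry)
```

With the values from the report, ports_match("10.237.223.217:5000", "0.0.0.0:4000->5000/tcp") is False, because the container publishes host port 4000 while docker_registry names port 5000.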

Finally, if you are using a proxy, the registry will need to be configured to use it. I'm not sure how this is done.

Revision history for this message
Khawar Munir Abbasi (khawar426) wrote :

Thanks for your response.

But the problem is that I can pull Docker images. I tested 'docker pull nginx' and it works fine. The error only occurs for the OpenStack images.

I followed the Kolla multinode guide to set up the registry. I noticed that port 4000 is used only in the kolla/tools/start-registry script. If I change it to port 5000, would Kolla work?

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

That script is just an example; use your favourite port.

Revision history for this message
Mark Goddard (mgoddard) wrote :

I think that our multinode guide is a bit inaccurate when it comes to using a registry mirror. In that case, you do not need to set docker_registry in globals.yml, but you do need to add the following:

docker_custom_config:
  registry-mirrors:
    - 10.237.223.217:5000
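
To illustrate why the two settings behave differently: with docker_registry set, the image name is prefixed with the registry address (as in the "image" value of the failed task), so Docker contacts that host directly. Without it, the name is unprefixed, Docker treats it as a Docker Hub image, and any configured registry mirror is consulted. The helper below is a hypothetical sketch of that naming rule, not the actual kolla-ansible template.

```python
from typing import Optional


def image_reference(namespace: str, name: str, tag: str,
                    docker_registry: Optional[str] = None) -> str:
    """Compose an image reference, prefixing the registry only when one is set."""
    prefix = f"{docker_registry}/" if docker_registry else ""
    return f"{prefix}{namespace}/{name}:{tag}"


# With docker_registry set: pulled from 10.237.223.217:5000 directly.
with_registry = image_reference(
    "kolla", "ubuntu-source-fluentd", "train", docker_registry="10.237.223.217:5000"
)

# Without docker_registry: a Docker Hub name, eligible for a registry mirror.
with_mirror = image_reference("kolla", "ubuntu-source-fluentd", "train")
```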

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (master)

Fix proposed to branch: master
Review: https://review.opendev.org/760318

Changed in kolla-ansible:
assignee: nobody → Mark Goddard (mgoddard)
status: New → In Progress
Revision history for this message
Mark Goddard (mgoddard) wrote :

Proposed an improvement to the docs

Revision history for this message
Khawar Munir Abbasi (khawar426) wrote :

Thanks Mark for your help and support.

Unfortunately, when I add the docker_custom_config you mentioned to globals.yml and run the bootstrap-servers playbook, it fails on the Docker restart because of that change. So I got it working with this workaround:

mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://192.168.1.100:4000"]
}
EOF

systemctl restart docker

Two questions:

Q1: Any idea what's wrong with docker_custom_config, and how can I validate it other than with prechecks? In this case prechecks passed.

Q2: Do the compute node(s) need certificates or similar to interact with the deployment/controller node hosting the registry mirror? In the local registry case (the legacy option for storing images), insecure-registry is used and no certificates were needed.
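
On Q1, one thing that can be checked independently of prechecks is the shape of the registry-mirrors entries in /etc/docker/daemon.json. As far as is known, the Docker daemon expects each mirror to be a URL with an http:// or https:// scheme, whereas the docker_custom_config example earlier in the thread gives a bare host:port. Whether that caused the failed restart is not confirmed anywhere in this thread, so the sketch below is only a hypothetical validation helper built on that assumption.

```python
import json
from typing import List
from urllib.parse import urlsplit


def invalid_mirrors(daemon_json_text: str) -> List[str]:
    """Return registry-mirrors entries that are not http(s) URLs with a host."""
    config = json.loads(daemon_json_text)
    bad = []
    for entry in config.get("registry-mirrors", []):
        parts = urlsplit(entry)
        # Assumption: only http/https URLs with a host part are acceptable.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            bad.append(entry)
    return bad
```

Applied to the workaround file above, which uses "https://192.168.1.100:4000", the function returns an empty list. A bare "10.237.223.217:5000" entry would be reported as invalid.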

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 12.0.0.0rc1

This issue was fixed in the openstack/kolla-ansible 12.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 9.3.2

This issue was fixed in the openstack/kolla-ansible 9.3.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 11.1.0

This issue was fixed in the openstack/kolla-ansible 11.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 10.3.0

This issue was fixed in the openstack/kolla-ansible 10.3.0 release.

Tom Fifield (fifieldt)
Changed in kolla-ansible:
status: In Progress → Fix Released