Failed to Starting kolla-toolbox container

Bug #1668059 reported by MarginHu
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
kolla | Invalid | Critical | Unassigned |

Bug Description

When I run "kolla-ansible deploy -i inventory/multinode", it fails to start the kolla-toolbox container with "Unknown status message: Digest: sha256:684d4b4ac3988df5e49640e7097e697930588d8b42fcba70e8bece4998aa1d2f"

The full original output is at http://pastebin.com/XRd4HkAQ

My kolla-ansible version: 4.0.0.0rc1
Image was built from centos binary.
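
For readers trying to interpret the message: one plausible reading is that the image pull's status stream ended on its "Digest:" line, and that the module only accepts a small set of final status strings. The Python sketch below illustrates that reading only. The function name, the strings it checks, and the return values are assumptions made for illustration; none of it is taken from the kolla_docker module.

```python
# Hypothetical sketch: illustrates how a pull that ends on a "Digest:" line
# could be reported as an unknown status. Not copied from kolla_docker.

def classify_pull(statuses):
    """Classify a list of pull status dicts, scanning from the last entry."""
    for entry in reversed(statuses):
        if "error" in entry:
            return "error: " + entry["error"]
        status = entry.get("status")
        if not status:
            continue
        if "Downloaded newer image for" in status:
            return "changed"
        if "Image is up to date for" in status:
            return "unchanged"
        # Any other trailing status (for example "Digest: sha256:...")
        # is treated as unrecognised.
        return "Unknown status message: " + status
    return "error: empty pull stream"


if __name__ == "__main__":
    truncated = [
        {"status": "Pulling from bgi/centos-binary-kolla-toolbox"},
        {"status": "Digest: sha256:684d4b4a"},  # shortened placeholder digest
    ]
    print(classify_pull(truncated))
```

If this reading is right, the question to answer is why the daemon stops reporting after the digest line, rather than what is wrong with the image itself.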

Steven Dake (sdake) wrote :

need docker version
need ansible version
need output of the command docker images
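
A small Python sketch for gathering the three items requested above in one paste. It assumes "docker" and "ansible" are on the PATH of the node where it runs; the helper names (run, collect) are made up for this example.

```python
import subprocess

# The three pieces of information requested in the comment above.
COMMANDS = [
    ["docker", "version"],
    ["ansible", "--version"],
    ["docker", "images"],
]


def run(cmd):
    """Run one command and return its combined stdout/stderr as text."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        return proc.stdout
    except OSError as exc:  # e.g. the binary is not installed
        return "could not run: {}".format(exc)


def collect(commands=COMMANDS, runner=run):
    """Return one text report with each command followed by its output."""
    sections = []
    for cmd in commands:
        sections.append("$ " + " ".join(cmd))
        sections.append(runner(cmd).rstrip())
    return "\n".join(sections)


if __name__ == "__main__":
    print(collect())
```

Running it on each affected node (kode4 and kode5 in this report) makes it easier to compare docker package versions between hosts.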

Changed in kolla:
status: New → Triaged
importance: Undecided → Critical
milestone: none → ocata-rc2
MarginHu (margin2017) wrote :

I found a workaround for the issue:
1. Remove the docker image with "docker rmi".

2. Rebuild the image with kolla-build.

3. Run the deploy command again.
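
The three steps above can be sketched as a short Python script. This is only an illustration: the comment does not say which arguments were passed to "docker rmi" or "kolla-build", so the image name is a parameter and kolla-build is shown without arguments. The function names are hypothetical.

```python
import subprocess


def workaround_commands(image, inventory="inventory/multinode"):
    """Return the three workaround steps as argument lists."""
    return [
        ["docker", "rmi", image],           # 1. remove the local image
        ["kolla-build"],                    # 2. rebuild (arguments not given in the report)
        ["kolla-ansible", "deploy", "-i", inventory],  # 3. redeploy
    ]


def run_workaround(image, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cmd in workaround_commands(image):
        print("$ " + " ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)


if __name__ == "__main__":
    # Placeholder image name; substitute the image that failed to start.
    run_workaround("example/centos-binary-cron:tag")
```

The dry-run default only prints the commands, which is a safer starting point on a node that is already misbehaving.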

But I don't know the root cause.

Moreover, after applying the workaround above, the same error appeared again on other images.

TASK [common : Starting cron container] ****************************************
fatal: [kode5]: FAILED! => {"changed": false, "failed": true, "msg": "Unknown status message: Digest: sha256:bca287c5d30b5e370b755ab041ea9e1fcf110286f7c397038d0bf2600f1a2206"}
fatal: [kode4]: FAILED! => {"changed": false, "failed": true, "msg": "Unknown status message: Digest: sha256:bca287c5d30b5e370b755ab041ea9e1fcf110286f7c397038d0bf2600f1a2206"}

MarginHu (margin2017) wrote :

Now I hit a similar issue with the kolla-toolbox image, but this time the workaround does not help.

I found a clue on kode4: /var/log/messages contains the following.
Feb 27 14:06:05 kode4 ansible-kolla_docker: Invoked with tls_key=None image=192.168.103.16:5000/bgi/centos-binary-kolla-toolbox:ocata-rc1-1 labels={} tls_verify=False pid_mode=None tls_cacert=None auth_password=None environment={'ANSIBLE_LIBRARY': '/usr/share/ansible', 'ANSIBLE_NOCOLOR': '1'} auth_registry=None volumes_from=None tls_cert=None privileged=True api_version=auto remove_on_exit=True restart_retries=10 detach=True auth_username=None name=kolla_toolbox security_opt=[] cap_add=[] restart_policy=None auth_email=None ipc_mode=None volumes=['/etc/kolla//kolla-toolbox/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '/dev/:/dev/', '/run/:/run/:shared', 'kolla_logs:/var/log/kolla/'] action=start_container common_options={'auth_email': None, 'restart_policy': 'unless-stopped', 'environment': {'KOLLA_CONFIG_STRATEGY': 'COPY_ALWAYS'}, 'auth_registry': '192.168.103.16:5000', 'restart_retries': '10', 'auth_password': None, 'auth_username': 'sam'}
Feb 27 14:06:05 kode4 dockerd-current: time="2017-02-27T14:06:05.889807416+08:00" level=info msg="{Action=auth, Username=root, LoginUID=0, PID=10667}"
Feb 27 14:06:05 kode4 dockerd-current: time="2017-02-27T14:06:05.892787982+08:00" level=info msg="Error logging in to v2 endpoint, trying next endpoint: Get https://192.168.103.16:5000/v2/: http: server gave HTTP response to HTTPS client"
Feb 27 14:06:05 kode4 dockerd-current: time="2017-02-27T14:06:05.904874073+08:00" level=info msg="{Action=create, Username=root, LoginUID=0, PID=10667}"
Feb 27 14:06:05 kode4 dockerd-current: time="2017-02-27T14:06:05.909848231+08:00" level=warning msg="Error getting v2 registry: Get https://192.168.103.16:5000/v2/: http: server gave HTTP response to HTTPS client"
Feb 27 14:06:05 kode4 dockerd-current: time="2017-02-27T14:06:05.909889123+08:00" level=error msg="Attempting next endpoint for pull after error: Get https://192.168.103.16:5000/v2/: http: server gave HTTP response to HTTPS client"
Feb 27 14:07:34 kode4 systemd-logind: Removed session 222.
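
The dockerd lines above say the daemon tried HTTPS against a registry that answered in plain HTTP. A minimal Python sketch for pulling the affected registry addresses out of such a log; the function name and the idea of scanning /var/log/messages this way are illustrative, and the pattern only matches the exact wording seen in this log.

```python
import re

# Matches the dockerd message shown in the log excerpt above.
PATTERN = re.compile(
    r"Get https://(?P<registry>[^/\s]+)/v2/: "
    r"http: server gave HTTP response to HTTPS client"
)


def plain_http_registries(log_text):
    """Return the sorted, de-duplicated registries that answered HTTPS with HTTP."""
    return sorted({m.group("registry") for m in PATTERN.finditer(log_text)})


if __name__ == "__main__":
    with open("/var/log/messages") as handle:
        for registry in plain_http_registries(handle.read()):
            print(registry)
```

For the excerpt in this comment the only address it would report is 192.168.103.16:5000.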

MarginHu (margin2017) wrote :

When I rerun the deploy command, it reports a new error.

TASK [common : Starting kolla-toolbox container] *******************************
fatal: [kode5]: FAILED! => {"changed": true, "failed": true, "msg": "'Traceback (most recent call last):\\n File \"/tmp/ansible_RjTukh/ansible_module_kolla_docker.py\", line 781, in main\\n result = bool(getattr(dw, module.params.get(\\'action\\'))())\\n File \"/tmp/ansible_RjTukh/ansible_module_kolla_docker.py\", line 608, in start_container\\n self.dc.start(container=self.params.get(\\'name\\'))\\n File \"/usr/lib/python2.7/site-packages/docker/utils/decorators.py\", line 21, in wrapped\\n return f(self, resource_id, *args, **kwargs)\\n File \"/usr/lib/python2.7/site-packages/docker/api/container.py\", line 368, in start\\n self._raise_for_status(res)\\n File \"/usr/lib/python2.7/site-packages/docker/client.py\", line 173, in _raise_for_status\\n raise errors.NotFound(e, response, explanation=explanation)\\nNotFound: 404 Client Error: Not Found (\"{\"message\":\"invalid header field value \\\\\"oci runtime error: container_linux.go:247: starting container process caused \\\\\\\\\\\\\"process_linux.go:359: container init caused \\\\\\\\\\\\\\\\\\\\\\\\\\\\\"rootfs_linux.go:54: mounting \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"/var/lib/docker/containers/e6c1ab3ea666a871f9c6416c397a520aedfd85b4a4e50c65a94ef98213048b4a/secrets\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" to rootfs \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"/var/lib/docker/btrfs/subvolumes/32263df0ddae40923b60e2a78cc4a903ead21a7367a0498a510e3fc4e9c23896\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" at \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"/var/lib/docker/btrfs/subvolumes/32263df0ddae40923b60e2a78cc4a903ead21a7367a0498a510e3fc4e9c23896/run/secrets\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\" caused \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"no such file or 
directory\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"\\\\\\\\\\\\\"\\\\\\\\n\\\\\"\"}\")\\n'"}
fatal: [kode4]: FAILED! => {"changed": true, "failed": true, "msg": "'Traceback (most recent call last):\\n File \"/tmp/ansible_sLTnpX/ansible_module_kolla_docker.py\", line 781, in main\\n result = bool(getattr(dw, module.params.get(\\'action\\'))())\\n File \"/tmp/ansible_sLTnpX/ansible_module_kolla_docker.py\", line 608, in start_container\\n self.dc.start(container=self.params.get(\\'name\\'))\\n File \"/usr/lib/python2.7/site-packages/docker/utils/decorators.py\", line 21, in wrapped\\n return f(self, resource_id, *args, **kwargs)\\n File \"/usr/lib/python2.7/site-packages/docker/api/container.py\", line 368, in start\\n self._raise_for_status(res)\\n File \"/usr/lib/python2.7/site-packages/docker/client.py\", line 173, in _raise_for_status\\n raise errors.NotFound(e, response, explanation=explanation)\\nNotFound: 404 Client Error: Not Found (\"{\"message\":\"invalid header field value \\\\\"oci runtime error: container_linux.go:247: starting container process caused \\\\\\\...
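
The nested escaping makes the traceback above hard to read. A rough Python sketch for flattening it, assuming that every run of backslashes before a quote or before the letter n is escaping noise; that holds for this message but would mangle text containing real backslash paths. The function name is made up for this example.

```python
import re


def collapse_escapes(text):
    """Strip stacked backslash escapes so the nested error text is readable."""
    # Any run of backslashes directly before a quote becomes just the quote.
    text = re.sub(r"\\+([\"'])", r"\1", text)
    # Any run of backslashes directly before 'n' becomes a real newline.
    text = re.sub(r"\\+n", "\n", text)
    return text


if __name__ == "__main__":
    sample = r'caused \\\"no such file or directory\\\"\\n'
    print(collapse_escapes(sample))
```

Applied to the message above, the innermost error reduces to a failed mount of the container's secrets directory onto run/secrets in the btrfs subvolume, ending in "no such file or directory".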


Eduardo Gonzalez (egonzalez90) wrote :

The HTTPS response issue is likely a misconfiguration of the docker daemon's insecure-registry setting. Please follow these steps to configure the insecure registry on all nodes.
https://github.com/openstack/kolla-ansible/blob/master/doc/multinode.rst#configure-docker-on-all-nodes
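
As an illustration of what such a setting can look like, the Python sketch below adds a registry to the "insecure-registries" list of a Docker daemon.json document. It assumes the daemon on the node reads /etc/docker/daemon.json; the linked guide may describe a different mechanism (for example a sysconfig file or a systemd drop-in), and the two should not be mixed. The function name is hypothetical.

```python
import json


def add_insecure_registry(config_text, registry):
    """Return daemon.json text with `registry` listed under insecure-registries."""
    config = json.loads(config_text) if config_text.strip() else {}
    registries = config.setdefault("insecure-registries", [])
    if registry not in registries:  # keep the operation idempotent
        registries.append(registry)
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Registry address taken from the log earlier in this report.
    print(add_insecure_registry("", "192.168.103.16:5000"))
```

The result would have to be written to the file on every node, and the docker daemon restarted, before rerunning the deploy.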

MarginHu (margin2017) wrote :

@Eduardo, you're right about the issue in comment #6.

But after I fixed the issue from comment #6, the original issue still occurred. Finally I found that it disappears when I use the Docker community package docker-engine-1.12.6-1.el7.centos.x86_64 instead of the Red Hat docker package docker-1.12.5-14.el7.centos.

Changed in kolla:
milestone: ocata-rc2 → 4.0.1
Changed in kolla:
milestone: 4.0.1 → 4.0.2
MarginHu (margin2017) wrote :

The issue from comment #6 has reproduced. I now use docker-1.12.6-16.el7.centos.x86_64 from the CentOS repository instead of docker-engine-1.12.6-1.el7.centos.x86_64, because I found that the system often triggers an NMI deadlock when running docker-engine-1.12.6-1.el7.centos.x86_64.

So now I must solve this issue. From searching, it looks similar to https://bugzilla.redhat.com/show_bug.cgi?id=1410118

Can you help me?

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 5.0.0.0b2

This issue was fixed in the openstack/kolla-ansible 5.0.0.0b2 development milestone.

Changed in kolla:
milestone: 4.0.2 → 4.0.3
Changed in kolla:
milestone: 4.0.3 → 4.0.4
Jeffrey Zhang (jeffrey4l) wrote :

Closing due to lack of further information.

Changed in kolla:
status: Triaged → Invalid