Overcloud deployment is failing in C7 train with "stderr": "Get https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to HTTPS client",

Bug #1909750 reported by Sandeep Yadav on 2020-12-31
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Alex Schultz
Milestone: wallaby-3

Bug Description

Description:-

Overcloud deployment is failing in C7 train with "stderr": "Get https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to HTTPS client", "stderr_lines": ["Get https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to HTTPS client"]

Affected job: periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train

Build history:-
https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train

Logs:-
https://logserver.rdoproject.org/17/26217/22/check/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/36ab673/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
~~~
2020-12-31 10:49:48 | 2020-12-31 10:47:36.277586 | fa163ebf-e325-d2c8-5203-000000003300 | FATAL | Pull 192.168.24.1:8787/tripleotrain/centos-binary-cinder-volume:aa5cdac62ccf12ead64c317e554bf24fe7b312e3_9fe37254-updated-20201231085211 image | overcloud-controller-2 | error={"changed": true, "cmd": "docker pull 192.168.24.1:8787/tripleotrain/centos-binary-cinder-volume:aa5cdac62ccf12ead64c317e554bf24fe7b312e3_9fe37254-updated-20201231085211", "delta": "0:00:00.100917", "end": "2020-12-31 10:47:36.145534", "msg": "non-zero return code", "rc": 1, "start": "2020-12-31 10:47:36.044617", "stderr": "Get https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to HTTPS client", "stderr_lines": ["Get https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to HTTPS client"], "stdout": "Trying to pull repository 192.168.24.1:8787/tripleotrain/centos-binary-cinder-volume ... ", "stdout_lines": ["Trying to pull repository 192.168.24.1:8787/tripleotrain/centos-binary-cinder-volume ... "]}
2020-12-31 10:49:48 | 2020-12-31 10:47:36.278371 | fa163ebf-e325-d2c8-5203-000000003300 | TIMING | tripleo-container-tag : Pull 192.168.24.1:8787/tripleotrain/centos-binary-cinder-volume:aa5cdac62ccf12ead64c317e554bf24fe7b312e3_9fe37254-updated-20201231085211 image | overcloud-controller-2 | 0:12:34.929103 | 0.49s
~~~

Another example:-

https://logserver.rdoproject.org/14/31414/3/check/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/7fb08f6/logs/undercloud/home/zuul/overcloud_deploy.log.gz

Additional information:-

If we restart docker, as we tried with a test patch in tht [1], the deployment passes [2] without error.

With this bug we want to find the root cause of the issue and whether there is a better way to solve it than what we tried in [1].

[1] https://review.opendev.org/c/openstack/tripleo-heat-templates/+/768231/4/deployment/deprecated/docker/docker-baremetal-ansible.yaml
[2] Testproject
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/768231

~~~
https://logserver.rdoproject.org/14/31414/3/check/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-train/c73d293/logs/undercloud/home/zuul/overcloud_deploy.log.gz
~~~

summary: Overcloud deployment is failing in C7 train with "stderr": "Get
https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to
- HTTPS client", "stderr_lines": ["Get https://192.168.24.1:8787/v1/_ping:
- http: server gave HTTP response to HTTPS client"]
+ HTTPS client",
Alex Schultz (alex-schultz) wrote :

JFYI "Get https://192.168.24.1:8787/v1/_ping: http: server gave HTTP response to HTTPS client" is not related. That's always printed for HTTP endpoints. It would only be a problem if the host is not in the insecure registry list.

yatin (yatinkarel) wrote :

@Alex Actually it is in the insecure registry list, but docker is not getting restarted after the configuration change, and thus the job fails. A test with a manual restart via https://review.opendev.org/c/openstack/tripleo-heat-templates/+/768231 passes the job in https://review.rdoproject.org/r/#/c/31414/. I don't understand why the handlers are not restarting docker after the config changes. It happens randomly on different hosts. Do you have any idea what could cause it?
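
One way to tell "registry is configured on disk" apart from "the running daemon actually loaded it" is to look at what the daemon itself reports. The following is a minimal sketch, assuming the plain-text output of `docker info` lists the active entries under an `Insecure Registries:` heading with indented lines beneath it. The function names and the sample strings are hypothetical and are not taken from the job logs.

```python
REGISTRY = "192.168.24.1:8787"  # undercloud registry address from the failing pull


def running_insecure_registries(docker_info_text):
    """Return the entries listed under 'Insecure Registries:' in `docker info` text."""
    registries = []
    header_indent = None
    for line in docker_info_text.splitlines():
        indent = len(line) - len(line.lstrip())
        if line.strip() == "Insecure Registries:":
            header_indent = indent
            continue
        if header_indent is not None:
            # Entries are indented deeper than the heading; anything else ends the section.
            if line.strip() and indent > header_indent:
                registries.append(line.strip())
            else:
                break
    return registries


def daemon_trusts_registry(docker_info_text, registry=REGISTRY):
    """True if the running daemon reports `registry` as insecure (plain HTTP allowed)."""
    return registry in running_insecure_registries(docker_info_text)


# Hypothetical samples: a daemon that was never restarted after the config
# change, and one that was restarted and picked up the new entry.
SAMPLE_STALE = """\
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
"""

SAMPLE_RESTARTED = """\
Registry: https://index.docker.io/v1/
Insecure Registries:
 192.168.24.1:8787
 127.0.0.0/8
Live Restore Enabled: false
"""

print(daemon_trusts_registry(SAMPLE_STALE))
print(daemon_trusts_registry(SAMPLE_RESTARTED))
```

On an affected overcloud node, the text passed in would be the captured output of `docker info`. If the entry is present in the config file but absent here, the daemon is most likely still running with its old configuration.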

Alex Schultz (alex-schultz) wrote :

This is fallout from the strategy change in train: handlers don't work right. We need to switch away from using handlers in the container-registry code. I'll propose a change.
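
The thread does not spell out exactly how handlers misbehave under the train strategy, so the following is only a toy model in Python, not Ansible code and not the merged fix. It assumes the simplest reading of the comments above: a handler-style restart is merely queued when the config changes, and if that queue is never run before the image pull, the daemon keeps its old configuration. All class and function names are hypothetical.

```python
REGISTRY = "192.168.24.1:8787"


class Daemon:
    """Toy docker daemon: it only sees config that was on disk at its last (re)start."""

    def __init__(self):
        self.config_on_disk = set()
        self.active_config = set()

    def write_config(self, registry):
        """Add an insecure registry to the on-disk config; return True if it changed."""
        changed = registry not in self.config_on_disk
        self.config_on_disk.add(registry)
        return changed

    def restart(self):
        # A restart reloads whatever is currently on disk.
        self.active_config = set(self.config_on_disk)

    def pull_succeeds(self, registry):
        # Plain-HTTP pulls only work if the *running* daemon trusts the registry.
        return registry in self.active_config


def deploy_with_handler(handlers_flushed):
    """Handler style: the restart is queued on change and runs only if flushed."""
    daemon = Daemon()
    pending_handlers = []
    if daemon.write_config(REGISTRY):
        pending_handlers.append(daemon.restart)  # queued, not executed yet
    if handlers_flushed:
        for handler in pending_handlers:
            handler()
    return daemon.pull_succeeds(REGISTRY)


def deploy_with_inline_restart():
    """Inline style: restart immediately, conditioned on the change result."""
    daemon = Daemon()
    changed = daemon.write_config(REGISTRY)
    if changed:
        daemon.restart()
    return daemon.pull_succeeds(REGISTRY)


print(deploy_with_handler(handlers_flushed=True))
print(deploy_with_handler(handlers_flushed=False))
print(deploy_with_inline_restart())
```

The inline variant does not depend on any deferred step running later, which is the general motivation for moving away from handlers here.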

Changed in tripleo:
assignee: nobody → Alex Schultz (alex-schultz)
Alex Schultz (alex-schultz) wrote :
Changed in tripleo:
status: Triaged → In Progress
Changed in tripleo:
milestone: wallaby-2 → wallaby-3
yatin (yatinkarel) wrote :

https://review.opendev.org/c/openstack/ansible-role-container-registry/+/770148 has merged and the job is no longer failing on this issue, so I am closing this bug.

Changed in tripleo:
status: In Progress → Fix Released

This issue was fixed in the openstack/ansible-role-container-registry 1.3.0 release.
