Overcloud nova compute docker command failing on nova_cellv2_discover_hosts

Bug #1812632 reported by Arx Cruz
This bug affects 2 people
Affects    Status          Importance    Assigned to         Milestone
tripleo    Fix Released    Critical      Martin Schuppert

Bug Description

This is affecting featureset035:

http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/433695b/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

Failure:

2019-01-21 08:59:07 | fatal: [overcloud-novacompute-0]: FAILED! => {
2019-01-21 08:59:07 | "failed_when_result": true,
2019-01-21 08:59:07 | "outputs.stdout_lines | default([]) | union(outputs.stderr_lines | default([]))": [
2019-01-21 08:59:07 | "image_exist isn't supported by docker",
2019-01-21 08:59:07 | "Error running ['docker', 'run', '--name', 'nova_cellv2_discover_hosts', '--label', 'config_id=tripleo_step5', '--label', 'container_name=nova_cellv2_discover_hosts', '--label', 'managed_by=paunch', '--label', 'config_data={\"start_order\": 0, \"command\": \"/docker-config-scripts/nova_cell_v2_discover_host.py\", \"user\": \"root\", \"volumes\": [\"/etc/hosts:/etc/hosts:ro\", \"/etc/localtime:/etc/localtime:ro\", \"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro\", \"/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro\", \"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro\", \"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro\", \"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro\", \"/dev/log:/dev/log\", \"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro\", \"/etc/puppet:/etc/puppet:ro\", \"/var/lib/config-data/nova_libvirt/etc/nova/:/etc/nova/:ro\", \"/var/log/containers/nova:/var/log/nova\", \"/var/lib/docker-config-scripts/:/docker-config-scripts/\"], \"image\": \"192.168.24.1:8787/tripleomaster/centos-binary-nova-compute:tripleo-ci-testing-updated-20190121070746\", \"detach\": false, \"net\": \"host\"}', '--net=host', '--user=root', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', '--volume=/var/lib/config-data/nova_libvirt/etc/nova/:/etc/nova/:ro', '--volume=/var/log/containers/nova:/var/log/nova', 
'--volume=/var/lib/docker-config-scripts/:/docker-config-scripts/', '192.168.24.1:8787/tripleomaster/centos-binary-nova-compute:tripleo-ci-testing-updated-20190121070746', '/docker-config-scripts/nova_cell_v2_discover_host.py']. [1]",
2019-01-21 08:59:07 | "",
2019-01-21 08:59:07 | "stdout: (cellv2) Retrying",
2019-01-21 08:59:07 | "(cellv2) Retrying",
2019-01-21 08:59:07 | "stderr: Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f93f7ff1250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:07 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7febde33a250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:07 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fad3f65b250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:07 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fd0fd4cb250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:07 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f9839493250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:07 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f356917a250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:07 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f04934ab250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:11 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (CausException occured while running the command
2019-01-21 08:59:11 | Traceback (most recent call last):
2019-01-21 08:59:11 | File "/usr/lib/python2.7/site-packages/tripleoclient/command.py", line 28, in run
2019-01-21 08:59:11 | super(Command, self).run(parsed_args)
2019-01-21 08:59:11 | File "/usr/lib/python2.7/site-packages/osc_lib/command/command.py", line 41, in run
2019-01-21 08:59:11 | return super(Command, self).run(parsed_args)
2019-01-21 08:59:11 | File "/usr/lib/python2.7/site-packages/cliff/command.py", line 184, in run
2019-01-21 08:59:11 | return_code = self.take_action(parsed_args) or 0
2019-01-21 08:59:11 | File "/usr/lib/python2.7/site-packages/tripleoclient/v1/overcloud_deploy.py", line 945, in take_action
2019-01-21 08:59:11 | verbosity=self.app_args.verbose_level)
2019-01-21 08:59:11 | File "/usr/lib/python2.7/site-packages/tripleoclient/workflows/deployment.py", line 307, in config_download
2019-01-21 08:59:11 | raise exceptions.DeploymentError("Overcloud configuration failed.")
2019-01-21 08:59:11 | DeploymentError: Overcloud configuration failed.
2019-01-21 08:59:11 | Overcloud configuration failed.
2019-01-21 08:59:11 | ed by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fe21c3d9250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:11 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f8bf8d73250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
2019-01-21 08:59:11 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f3bee6ac250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))"
2019-01-21 08:59:11 | ]
2019-01-21 08:59:11 | }

This seems to have started after the patch https://review.openstack.org/#/c/576481 was merged.

Martin Schuppert (mschuppert) wrote :

The compute node where the script runs has no access to the public network/endpoints [2].

2019-01-18 07:53:47.984 ERROR /var/log/paunch.log: 28516 ERROR paunch [ ] stderr: Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5013dfa250>: Failed to establish a new connection: [Errno 101] Network is unreachable',))

We should use the internal network when running this on the computes.

[1] http://logs.rdoproject.org/81/576481/13/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/0b0e009/logs/overcloud-novacompute-0/var/log/extra/errors.txt.gz

[2] http://logs.rdoproject.org/81/576481/13/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035/0b0e009/logs/overcloud-novacompute-0/var/log/extra/network.txt.gz
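
The "[Errno 101] Network is unreachable" in the paunch log means the compute has no route to the public address at all, which is different from a refused or timed-out connection. A minimal way to confirm that from the affected node is a plain TCP probe. This is only a sketch, assuming Python 3; `probe_endpoint` is a hypothetical helper and is not part of tripleo or paunch.

```python
import errno
import socket


def probe_endpoint(host, port, timeout=3.0):
    """Try a TCP connection and return a (reachable, reason) tuple."""
    try:
        # create_connection resolves the address (IPv4 or IPv6) and connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # ENETUNREACH is errno 101 on Linux: no route from this host.
        if exc.errno == errno.ENETUNREACH:
            return False, "network is unreachable (no route from this host)"
        return False, "connection failed: {}".format(exc)
```

Calling `probe_endpoint("2001:db8:fd00:1000::5", 13774)` on the compute would be one way to check the public endpoint from the log; the same call against the internal API address should succeed if the internal network is wired up as expected.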

Sorin Sbarnea (ssbarnea)
tags: added: alert
Sorin Sbarnea (ssbarnea)
Changed in tripleo:
assignee: nobody → Martin Schuppert (mschuppert)
Bogdan Dobrelya (bogdando) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (master)

Fix proposed to branch: master
Review: https://review.openstack.org/632097

Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/632097
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=cde4134d555ba74a1b86f66510069fc2a521286c
Submitter: Zuul
Branch: master

commit cde4134d555ba74a1b86f66510069fc2a521286c
Author: Martin Schuppert <email address hidden>
Date: Mon Jan 21 15:12:04 2019 +0100

    Service check in nova_cell_v2_discover_host.py to use internal API

    e0e885b8ca3332e0815c537a32c564cac81f7f7e moved the cellv2 discovery from
    control plane to compute services. In case the computes won't have access
    to the external API the service check will fail. This switch the service
    check to use the internal endpoint.

    Change-Id: I234db0866fb6f1adefdcf7a2b2a82412e443b7c9
    Closes-bug: 1812632
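
To illustrate what switching the service check from the external to the internal endpoint amounts to: a Keystone service catalog lists several endpoints per service, each tagged with an interface (public, internal, admin), and the client picks one of them. The sketch below is not the code from this commit. `select_endpoint`, `CATALOG`, and the internal URL are hypothetical and only show the selection step.

```python
def select_endpoint(endpoints, interface="internal"):
    """Return the URL of the first endpoint matching the given interface."""
    for endpoint in endpoints:
        if endpoint.get("interface") == interface:
            return endpoint["url"]
    raise LookupError("no {!r} endpoint in catalog".format(interface))


# Hypothetical catalog entries for the compute service. The public URL is
# the one from the log in this report; the internal URL is a placeholder.
CATALOG = [
    {"interface": "public",
     "url": "https://[2001:db8:fd00:1000::5]:13774/v2.1"},
    {"interface": "internal",
     "url": "http://internal.example:8774/v2.1"},
]

# Default to the internal interface, as a compute-side check would.
service_url = select_endpoint(CATALOG)
```

With the default of `interface="internal"`, a script running on a compute never has to reach the public address, which is the failure mode reported here.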

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/639922

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/rocky)

Reviewed: https://review.openstack.org/639922
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=eef9d3e49e09b6d68036a2a4f665a8fc80e65e05
Submitter: Zuul
Branch: stable/rocky

commit eef9d3e49e09b6d68036a2a4f665a8fc80e65e05
Author: Martin Schuppert <email address hidden>
Date: Mon Jan 21 15:12:04 2019 +0100

    Service check in nova_cell_v2_discover_host.py to use internal API

    e0e885b8ca3332e0815c537a32c564cac81f7f7e moved the cellv2 discovery from
    control plane to compute services. In case the computes won't have access
    to the external API the service check will fail. This switch the service
    check to use the internal endpoint.

    Change-Id: I234db0866fb6f1adefdcf7a2b2a82412e443b7c9
    Closes-bug: 1812632
    (cherry picked from commit cde4134d555ba74a1b86f66510069fc2a521286c)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 10.4.0

This issue was fixed in the openstack/tripleo-heat-templates 10.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 9.4.0

This issue was fixed in the openstack/tripleo-heat-templates 9.4.0 release.
