[queens] periodic-centos-ovb-3ctlr_1comp-featureset035-queens fails overcloud deploy overcloud.AllNodesDeploySteps.ComputeDeployment_Step5.0:

Bug #1820667 reported by Marios Andreou
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Marios Andreou

Bug Description

[queens promotion blocker]

The periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens job has been failing on the 15th, 16th, 17th and 18th ([1]-[4] respectively).

The trace in the overcloud deploy log looks like this:

 2019-03-15 07:24:52 | 2019-03-15 07:24:50Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step5.0]: CREATE_FAILED Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
 2019-03-15 07:25:00 | overcloud.AllNodesDeploySteps.ComputeDeployment_Step5.0:
 2019-03-15 07:25:00 | resource_type: OS::Heat::StructuredDeployment
 2019-03-15 07:25:00 | physical_resource_id: 089b7776-0d1a-4fa4-9400-b0dc2e143cb3
 2019-03-15 07:25:00 | status: CREATE_FAILED
 2019-03-15 07:25:00 | status_reason: |
 2019-03-15 07:25:00 | Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
 2019-03-15 07:25:00 | deploy_stdout: |
 2019-03-15 07:25:00 | ...
 2019-03-15 07:25:00 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fdf4ad54f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
 2019-03-15 07:25:00 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f114f308f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
 2019-03-15 07:25:00 | "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f8846ac2f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))"
 2019-03-15 07:25:00 | ]
 2019-03-15 07:25:00 | }
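
The same connection error repeats many times in the trace above, so it can help to collapse it to the distinct targets that were unreachable. A minimal sketch, assuming the log lines keep the "Unable to establish connection to <url>: " shape quoted above; the function name and regular expression are illustrative and not part of any TripleO tooling:

```python
import re
from collections import Counter

# Captures the URL in "Unable to establish connection to <url>: ...".
# The pattern is an assumption derived from the log lines in this report.
TARGET_RE = re.compile(r"Unable to establish connection to (https?://\S+?): ")


def count_unreachable_targets(lines):
    """Return a Counter mapping each unreachable URL to how often it appears."""
    counts = Counter()
    for line in lines:
        match = TARGET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of overcloud_deploy.log should show whether every failure points at the same endpoint, as it does in the excerpt here.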

Not sure what the root cause is yet, but there are two interesting errors. From mistral [5]:

  2019-03-15 06:31:11.467 31285 INFO workflow_trace [req-29eae36a-4e2d-4e3e-be32-b6c47dfe4c2c 02295a2993b64beb8bbee7d9ff3bbd86 288fc38120b3481c8348c3fb30c2586e - default default] Task 'get_stack' (305a81de-b107-4abd-9fb6-d3687850363b) [RUNNING -> ERROR, msg=Failed to run action [action_ex_id=9769805c-b0cd-4674-8c38-0f49946e28ce, action_cls='<class 'mistral.actions.action_factory.HeatAction'>', attributes='{u'client_method_name': u'stacks.get'}', params='{u'stack_id': u'overcloud'}']
   HeatAction.stacks.get failed: ERROR: The Stack (overcloud) could not be found.
  Traceback (most recent call last):

    File "/usr/lib/python2.7/site-packages/heat/common/context.py", line 409, in wrapped
      return func(self, ctx, *args, **kwargs)

    File "/usr/lib/python2.7/site-packages/heat/engine/service.py", line 489, in identify_stack
      raise exception.EntityNotFound(entity='Stack', name=stack_name)

  EntityNotFound: The Stack (overcloud) could not be found.
  ] (execution_id=2cdbfe22-a726-4e4c-a6c8-7bbb6119828a)

and ironic [6] :

  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor Traceback (most recent call last):
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor File "/usr/lib/python2.7/site-packages/ironic/drivers/modules/agent_base_vendor.py", line 616, in reboot_and_finish_deploy
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor _wait_until_powered_off(task)
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor File "/usr/lib/python2.7/site-packages/retrying.py", line 68, in wrapped_f
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor return Retrying(*dargs, **dkw).call(f, *args, **kw)
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor File "/usr/lib/python2.7/site-packages/retrying.py", line 231, in call
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor raise RetryError(attempt)
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor RetryError: RetryError[Attempts: 7, Value: power on]
  2019-03-15 06:47:59.298 26713 ERROR ironic.drivers.modules.agent_base_vendor

--------------------

[1] http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens/9f05c2f/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

[2] http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens/e0de137/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

[3] http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens/3707869/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

[4] http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens/1f2208a/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz

[5] http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens/9f05c2f/logs/undercloud/var/log/mistral/engine.log.txt.gz

[6] http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens/9f05c2f/logs/undercloud/var/log/ironic/ironic-conductor.log.txt.gz

Tags: ci
Changed in tripleo:
importance: Undecided → High
assignee: nobody → Marios Andreou (marios-b)
summary: - periodic-centos-ovb-3ctlr_1comp-featureset035-queens fails overcloud
- deploy overcloud.AllNodesDeploySteps.ComputeDeployment_Step5.0:
+ [queens] periodic-centos-ovb-3ctlr_1comp-featureset035-queens fails
+ overcloud deploy
+ overcloud.AllNodesDeploySteps.ComputeDeployment_Step5.0:
description: updated
Revision history for this message
Marios Andreou (marios-b) wrote :

wrt queens promotion - we discussed this earlier on the phone with weshay and sshnaidm|rover

Today the rest of the queens promotion criteria jobs [1] are green [2]

[1] https://github.com/rdo-infra/ci-config/blob/f964b950ce8a6d91dd7a549a2daa377ead50fafd/ci-scripts/dlrnapi_promoter/config/CentOS-7/queens.ini#L14

[2] (checking all the jobs):

jobs=(periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens
periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-queens-upload
periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens
periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset010-queens
periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-queens
periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset017-queens
periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset018-queens
periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset019-queens
periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens)

[m@192 ci-config]$ type -a job_build_statusRDO
job_build_statusRDO is a function
job_build_statusRDO ()
{
    builds_url="https://review.rdoproject.org/zuul/builds?job_name=$1";
    firefox $builds_url
}
[m@192 ci-config]$ for i in "${jobs[@]}"; do job_build_statusRDO "$i"; sleep 0.4; done
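
A non-interactive variant of the loop above, sketched in Python: it builds the same builds URL for each job without opening a browser. The base URL and the job_name query parameter are taken from the shell function; the function name builds_url is made up for this example.

```python
from urllib.parse import urlencode

# Base URL copied from the job_build_statusRDO shell function above.
BUILDS_URL = "https://review.rdoproject.org/zuul/builds"


def builds_url(job_name):
    """Return the Zuul builds page URL for a single job name."""
    return f"{BUILDS_URL}?{urlencode({'job_name': job_name})}"


if __name__ == "__main__":
    jobs = [
        "periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens",
        "periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-queens",
    ]
    for job in jobs:
        print(builds_url(job))
```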

Marios Andreou (marios-b) wrote :

    ykarel++: this might be fixed by the queens patch https://review.openstack.org/#/c/641977/1 and might have been caused by https://review.openstack.org/#/c/641968/

18:29 < ykarel|away> <ykarel> fs035 might have fix as well
18:29 < ykarel|away> <ykarel> https://review.openstack.org/#/c/641977/1, but good to rebase and see
18:29 < ykarel|away> <ykarel> ok it's already on the patch that break, so seems ^^ fixing the issue
18:29 < ykarel|away> <ykarel> already on top

Also, this is not limited to promotion jobs; it apparently affects check jobs as well.

Martin Schuppert (mschuppert) wrote :

As discussed on IRC with ykarel

The issue is that the "old" nova_cell_v2_discover_host.py was using the external endpoint for checking the services and therefore the compute could not establish the connection:

Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "stdout: (cellv2) Retrying",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "(cellv2) Retrying",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "stderr: Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fbcff416f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f229d0eaf90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fec88ce0f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fd318da6f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f672a490f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "Unable to establish connection to https://[2001:db8:fd00:1000::5]:13774/v2.1/os-services: HTTPSConnectionPool(host='2001:db8:fd00:1000::5', port=13774): Max retries exceeded with url: /v2.1/os-services (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f5526e18f90>: Failed to establish a new connection: [Errno 101] Network is unreachable',))",
Mar 15 07:24:46 overcloud-novacompute-0 os-collect-config[5504]: "Unabl...
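
To illustrate the difference between the external and internal endpoint described above, here is a minimal sketch of picking a service URL by interface from a simplified catalog. It is not the code from nova_cell_v2_discover_host.py or from the linked reviews. The catalog layout, the internal address, and the function name endpoint_url are assumptions made for illustration; only the public host and port come from the logs.

```python
# Hypothetical, simplified service catalog. The public URL reuses the host
# and port seen in the logs; the internal URL is invented for this example.
CATALOG = [
    {
        "type": "compute",
        "endpoints": [
            {"interface": "public",
             "url": "https://[2001:db8:fd00:1000::5]:13774/v2.1"},
            {"interface": "internal",
             "url": "http://[fd00:fd00:fd00:2000::10]:8774/v2.1"},
        ],
    },
]


def endpoint_url(catalog, service_type, interface):
    """Return the URL of the first endpoint matching service type and interface."""
    for service in catalog:
        if service["type"] != service_type:
            continue
        for endpoint in service["endpoints"]:
            if endpoint["interface"] == interface:
                return endpoint["url"]
    raise LookupError(f"no {interface!r} endpoint for service {service_type!r}")
```

A compute node that has no route to the external network would fail with "Network is unreachable" when given the "public" URL, while the "internal" one stays on a network the node can reach.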


Marios Andreou (marios-b) wrote :
Changed in tripleo:
status: Triaged → Fix Released
tags: removed: proomo