post failure on multiple periodic jobs with SSH Error: data could not be sent to remote host
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Won't Fix | Critical | Unassigned | | |
Bug Description
This is a tracker bug for the multiple post failures seen on the RDO Cloud periodic OVB jobs, with the following errors:
https:/
TASK [remove-
2018-11-19 01:44:08.625169 | primary | changed
2018-11-19 01:44:23.311954 | secondary | ERROR
2018-11-19 01:44:23.312239 | secondary | {
2018-11-19 01:44:23.312362 | secondary | "msg": "SSH Error: data could not be sent to remote host \"38.145.33.154\". Make sure this host can be reached over ssh",
2018-11-19 01:44:23.312442 | secondary | "unreachable": true
2018-11-19 01:44:23.312535 | secondary | }
Timestamps of the failures:
2018-11-19 01:44:23.
2018-11-19 08:37:16.
2018-11-19 06:06:53
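To line these failures up across jobs, a small script along the following lines could pull the timestamp, node, and unreachable host out of a job-output.txt file. This is only a sketch: the line layout (`timestamp | node | message`) is assumed from the excerpt above, and `find_ssh_unreachable` is a hypothetical helper name, not part of any existing tooling.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumed line shape, based on the log excerpt in this bug:
#   "<date> <time.micros> | <node> | <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \| "
    r"(?P<node>\S+) \| (?P<msg>.*)$"
)
# The host appears as \"1.2.3.4\" in the JSON-ish output; the backslash is optional here.
HOST_RE = re.compile(r'remote host \\?"(?P<host>[^"\\]+)')
MARKER = "SSH Error: data could not be sent"


def find_ssh_unreachable(
    lines: Iterable[str],
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (timestamp, node, host) for every line carrying the SSH error."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match or MARKER not in match.group("msg"):
            continue
        host = HOST_RE.search(match.group("msg"))
        hits.append(
            (match.group("ts"), match.group("node"),
             host.group("host") if host else None)
        )
    return hits


# Sample line copied from the excerpt in this bug description.
SAMPLE = (
    r'2018-11-19 01:44:23.312362 | secondary | "msg": "SSH Error: data could '
    r'not be sent to remote host \"38.145.33.154\". Make sure this host can '
    r'be reached over ssh",'
)

if __name__ == "__main__":
    for ts, node, host in find_ssh_unreachable([SAMPLE]):
        print(ts, node, host)
```

The resulting timestamps and IP addresses could then be compared against the RDO Cloud and nodepool logs for the same time window.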
As per discussion with kforde on #rhos-ops, there is no networking issue; the E2E tests were passing.
We need to investigate why it happened.
Changed in tripleo: | |
milestone: | stein-2 → stein-3 |
Changed in tripleo: | |
status: | Triaged → Incomplete |
tags: | removed: promotion-blocker |
Changed in tripleo: | |
milestone: | stein-3 → stein-rc1 |
Changed in tripleo: | |
status: | Incomplete → Won't Fix |
In 2 of the 4 cases, we are seeing an SSH error message claiming that the SSH host key had changed:
- https://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-pike-upload/f00ec1c/job-output.txt.gz#_2018-11-19_06_06_59_957778
- https://logs.rdoproject.org/69/618669/1/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053/d9dc22f/job-output.txt.gz#_2018-11-19_04_00_37_775801
It looks like the node had been deleted outside the control of nodepool. Maybe we can check the RDO Cloud logs to see which user requested the deletion of those nodes? Also, do we have any other script or bot (e.g. te-broker) that could have deleted the VM?