hubbot check jobs are failing (timing out on OC deploy)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Triaged | Critical | Unassigned |
Bug Description
ROOT CAUSE IS NOT YET DETERMINED
---
The sanity (check) job for master has been failing with overcloud (OC) deploy timeouts:
https:/
2018-05-16 17:41:26.574265 | primary | TASK [overcloud-deploy : Deploy the overcloud] *******
2018-05-16 17:41:26.606048 | primary | Wednesday 16 May 2018 17:41:26 +0000 (0:00:00.112) 1:01:59.851 *********
2018-05-16 19:07:18.664433 | primary | +(./toci_
---
failing the same way (job timeout during OC deploy)
2018-05-16 17:26:35.648672 | primary | TASK [overcloud-deploy : Deploy the overcloud] *******
2018-05-16 17:26:35.692581 | primary | Wednesday 16 May 2018 17:26:35 +0000 (0:00:00.121) 0:00:41.116 *********
2018-05-16 19:05:51.509779 | primary | +(./toci_
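To put a number on how long each deploy ran before the job was killed, the timestamps in the two excerpts can be subtracted. This is only a rough sketch: it assumes the last line of each excerpt marks the point where the job timeout fired, and the `LOG_LINES`, `timestamp` and `elapsed` names are made up for illustration.

```python
from datetime import datetime

# First and last lines of each excerpt above, copied as-is
# (the final line of each is truncated in the report).
LOG_LINES = {
    "first job": [
        "2018-05-16 17:41:26.574265 | primary | TASK [overcloud-deploy : Deploy the overcloud] *******",
        "2018-05-16 19:07:18.664433 | primary | +(./toci_",
    ],
    "second job": [
        "2018-05-16 17:26:35.648672 | primary | TASK [overcloud-deploy : Deploy the overcloud] *******",
        "2018-05-16 19:05:51.509779 | primary | +(./toci_",
    ],
}


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.ffffff' field of a job-output line."""
    return datetime.strptime(line.split(" | ", 1)[0], "%Y-%m-%d %H:%M:%S.%f")


def elapsed(lines):
    """Time between the first and last line of an excerpt."""
    return timestamp(lines[-1]) - timestamp(lines[0])


if __name__ == "__main__":
    for job, lines in LOG_LINES.items():
        print(f"{job}: deploy task ran for {elapsed(lines)} before the next logged line")
```

By this measure the deploy task sat for roughly 1h26m in the first job and 1h39m in the second.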
---
The timeouts appear to have started with the recheck of the patch listed above (hubbot tqe gate job) on May 15 at 3:05 PM.
We've been nose down chasing promotions so just logging this now.
description: updated
tags: added: alert
tags: removed: alert
notes from initial investigation follow (WIP)
-
# Disk space and RAM are both flagged as too low; it is not yet clear whether this is what normal looks like for these jobs
http://logs.openstack.org/24/567224/1/check/tripleo-ci-centos-7-containers-multinode/a7052df/job-output.txt.gz#_2018-05-16_17_28_22_140352
TASK [tripleo-validations : Display failed validations tests]
fatal: [undercloud]: FAILED! => {"changed": false, "failed": true, "msg": ["### undercloud-disk-space FAILED ###", "Task 'Verify root disk space' failed:", "Host: localhost", "Message: The available space on the root partition is 31.1 GB, but it should be at least 60 GB.", "", "Failure! The validation failed for all hosts:", "* localhost", "", "### undercloud-ram FAILED ###", "Task 'Verify the RAM requirements' failed:", "Host: localhost", "Message: The RAM on the undercloud node is 7977 MB, the minimal recommended value is 16384 MB.", "", "Failure! The validation failed for all hosts:", "* localhost"]}
2018-05-16 17:28:22.319412 | primary | ...ignoring
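The `msg` list is hard to read in its one-line form. A small sketch like the following groups each `Message:` line under the validation named in the preceding `### ... FAILED ###` header. The `MSG` list is copied from the task output above (with the stray spaces inside hyphenated names removed); the `HEADER` pattern and `summarize` helper are illustrative names, not part of tripleo-validations.

```python
import re

# "msg" list from the failed-validations task output above.
MSG = [
    "### undercloud-disk-space FAILED ###",
    "Task 'Verify root disk space' failed:",
    "Host: localhost",
    "Message: The available space on the root partition is 31.1 GB, "
    "but it should be at least 60 GB.",
    "",
    "Failure! The validation failed for all hosts:",
    "* localhost",
    "",
    "### undercloud-ram FAILED ###",
    "Task 'Verify the RAM requirements' failed:",
    "Host: localhost",
    "Message: The RAM on the undercloud node is 7977 MB, "
    "the minimal recommended value is 16384 MB.",
    "",
    "Failure! The validation failed for all hosts:",
    "* localhost",
]

# Matches section headers such as "### undercloud-ram FAILED ###".
HEADER = re.compile(r"^### (?P<name>\S+) FAILED ###$")


def summarize(msg):
    """Map each failed validation name to the list of its 'Message:' texts."""
    failures = {}
    current = None
    for line in msg:
        match = HEADER.match(line)
        if match:
            current = match.group("name")
            failures[current] = []
        elif current is not None and line.startswith("Message: "):
            failures[current].append(line[len("Message: "):])
    return failures


if __name__ == "__main__":
    for name, messages in summarize(MSG).items():
        for text in messages:
            print(f"{name}: {text}")
```

The same helper should work on the longer validation output further down, since it follows the same header and `Message:` layout, though that output is truncated in this report.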
-
additional UC validations are also failing
http://logs.openstack.org/24/567224/1/check/tripleo-ci-centos-7-containers-multinode/a7052df/job-output.txt.gz#_2018-05-16_17_40_44_752465
{
"changed": false,
"failed": true,
"msg": [
"### ceilometerdb-size FAILED ###",
"Task 'Check values' failed:",
"Host: localhost",
"Message: Value of metering_time_to_live is set to -1.",
"",
"Task 'Check values' failed:",
"Host: localhost",
"Message: Value of event_time_to_live is set to -1.",
"",
"Failure! The validation failed for all hosts:",
"* localhost",
"",
"### deployment-images FAILED ###",
"Task 'Fetch available images' failed:",
"Host: localhost",
"Message: Command `openstack image list --format value --column Name` exited with code: 1: non-zero return code",
"",
"stderr:",
" (http://192.168.24.1:5000/v2.0/tokens): The resource could not be found. (HTTP 404) (Request-ID: req-e4ff4f49-5c9b-452f-9b9d-74ff05b1040b)",
"",
"Failure! The validation failed for all hosts:",
"* localhost",
"",
"### switch-vlans FAILED ###",
"Task 'Check that switch vlans are present if used in nic-config files' failed:",
"Host: localhost",
"Message: An unhandled exception occurred while running the lookup plugin 'introspection_data'. Error was a <class 'swiftclient.exceptions.ClientException'>, original message: Container GET failed: http://192.168.24.1:8080/v1/AUTH_ed317ea6680e42398bf3db09455d6b51/ironic-inspector?format=json 404 Not Found [first 60 chars of response] <html><h1>Not Found</h1><p>The resource could not be found.<",
"",
"Failure! The validation failed for all hosts:",
"* localhost",
"",
"### undercloud-debug FAILED ###",
"Task 'Check the services for debug flag' failed:",
"Host: localhost",
"Message: The key 'debug' under the section 'DEFAULT' in file /etc/nova/nova.conf has the value: 'True'",
"",
"Task 'Check the services for debug flag' failed:"...