010-standalone is randomly failing
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Cédric Jeanneret | | |
Bug Description
010-standalone is randomly failing due to a failed healthcheck for the nova_migration_
2020-01-30 19:47:40.220619 | primary | TASK [validate-services : Print out any failed Systemd services for tripleo_*] ***
2020-01-30 19:47:40.249635 | primary | Thursday 30 January 2020 19:47:40 +0000 (0:00:00.721) 1:03:01.731 ******
2020-01-30 19:47:40.300366 | primary | ok: [undercloud] => {
2020-01-30 19:47:40.300528 | primary | "systemd_
2020-01-30 19:47:40.300742 | primary | "tripleo_
2020-01-30 19:47:40.300816 | primary | ]
2020-01-30 19:47:40.300858 | primary | }
The validate-services task was added back in April 2019[1] and had apparently never caused any issues.
The problem is probably related to this patch:
https:/
Reverts of both changes are in the works:
Master: https:/
Train: https:/
I'm also looking into this issue to understand why it fails only on that job (the validate-services check was added to a total of 4 jobs: 3 standalone and 1 compute[1]).
Changed in tripleo:
status: Confirmed → Triaged
The issue was also hit on another job, with a different healthcheck: ci-centos-7-containerized-undercloud-upgrades
tripleo-
2020-01-30 16:28:34.334536 | primary | ok: [undercloud] => { state.stdout_lines": [ ironic_inspector_dnsmasq_healthcheck.service loaded failed failed ironic_inspector_dnsmasq healthcheck"
2020-01-30 16:28:34.334841 | primary | "systemd_
2020-01-30 16:28:34.335092 | primary | "tripleo_
2020-01-30 16:28:34.335141 | primary | ]
2020-01-30 16:28:34.335177 | primary | }
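For anyone trying to reproduce this outside CI, here is a minimal sketch of the kind of check the validate-services task appears to perform: list the tripleo_* systemd units and report those in the failed state. This is not the role's actual implementation. The function names, the `tripleo_` prefix default, and the sample lines are assumptions for illustration, with the sample modelled on the "UNIT LOAD ACTIVE SUB DESCRIPTION" layout seen in the log above.

```python
import subprocess
from typing import List


def parse_failed_units(listing: str, prefix: str = "tripleo_") -> List[str]:
    """Return names of units whose ACTIVE column is 'failed'.

    Expects lines in the layout printed by
    `systemctl list-units --plain --no-legend`:
    UNIT LOAD ACTIVE SUB DESCRIPTION...
    """
    failed = []
    for line in listing.splitlines():
        fields = line.split(None, 4)  # the description may contain spaces
        if len(fields) < 4:
            continue  # blank or malformed line
        unit, _load, active, _sub = fields[:4]
        if unit.startswith(prefix) and active == "failed":
            failed.append(unit)
    return failed


def list_failed_tripleo_units() -> List[str]:
    """Query systemd on the local host (hypothetical helper, needs systemctl)."""
    result = subprocess.run(
        ["systemctl", "list-units", "--all", "--plain", "--no-legend", "tripleo_*"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_failed_units(result.stdout)


if __name__ == "__main__":
    # Illustrative sample only, not copied from a real job log.
    sample = (
        "tripleo_ironic_inspector_dnsmasq_healthcheck.service loaded failed failed "
        "ironic_inspector_dnsmasq healthcheck\n"
        "tripleo_nova_compute.service loaded active running nova_compute container\n"
    )
    print(parse_failed_units(sample))
```

Keeping the parsing separate from the `systemctl` call makes it possible to run the check against output saved from a failed job, which may help tell a genuinely broken healthcheck from one that only failed transiently.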