vexx: OVB master job running on vexxhost show some nodes failing introspection step

Bug #1885314 reported by Ronelle Landy
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
tripleo | Fix Released | Critical | Ronelle Landy |

Bug Description

One or two nodes are failing introspection in master jobs running on vexxhost.

Example logs are below:

https://logserver.rdoproject.org/79/735179/6/openstack-check/tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-vexxhost/384e111/logs/undercloud/home/zuul/overcloud_introspect.log.txt.gz

2020-06-26 15:04:32 | 2020-06-26 15:04:32.218612 | fa163e10-2375-5c60-d0ee-000000000015 | OK | Nodes that passed introspection | localhost | result={
2020-06-26 15:04:32 | "changed": false,
2020-06-26 15:04:32 | "msg": " b85731ef-c061-43bd-a098-7f8d0b2e85c0 1e88a786-eb8e-47cc-8d7f-0ff27f0570b6"
2020-06-26 15:04:32 | }
2020-06-26 15:04:32 | 2020-06-26 15:04:32.226220 | fa163e10-2375-5c60-d0ee-000000000016 | TASK | Nodes that failed introspection
2020-06-26 15:04:32 | 2020-06-26 15:04:32.227229 | fa163e10-2375-5c60-d0ee-000000000015 | TIMING | Nodes that passed introspection | 0:20:21.214 | 0.12s
2020-06-26 15:04:32 | 2020-06-26 15:04:32.284986 | fa163e10-2375-5c60-d0ee-000000000016 | FATAL | Nodes that failed introspection | localhost | error={
2020-06-26 15:04:32 | "msg": " 726f58b3-59d6-48af-baf3-33af7f12f296 0d94ec59-3049-44b8-a634-596d8f423a95"
2020-06-26 15:04:32 | }

*****************

https://logserver.rdoproject.org/23/737423/2/openstack-check/tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-vexxhost/c064767/logs/undercloud/home/zuul/overcloud_introspect.log.txt.gz

2020-06-26 14:55:32 | TASK [Nodes that passed introspection] *****************************************
2020-06-26 14:55:32 | Friday 26 June 2020 14:55:32 +0000 (0:20:19.713) 0:20:20.405 ***********
2020-06-26 14:55:32 | ok: [localhost] =>
2020-06-26 14:55:32 | msg: ' a93bf3e8-e0ff-48ab-8a81-86f794b1086a 6147b65d-3674-4979-b965-77f082d562a7 5b35041e-7dce-42db-8550-0aed31d8f3b5'
2020-06-26 14:55:32 |
2020-06-26 14:55:32 | TASK [Nodes that failed introspection] *****************************************
2020-06-26 14:55:32 | Friday 26 June 2020 14:55:32 +0000 (0:00:00.117) 0:20:20.522 ***********
2020-06-26 14:55:32 | fatal: [localhost]: FAILED! =>
2020-06-26 14:55:32 | msg: ' 32e3cb1b-2cf0-4989-b43e-2936d738b0a3'
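For triage across several job runs, it can help to pull the passed and failed node UUIDs out of overcloud_introspect.log automatically. The following is a minimal sketch, not part of the CI tooling: the function name `split_nodes` is hypothetical, and it assumes the log keeps the two layouts shown above, where a line naming the task precedes the `msg` line that lists the node UUIDs.

```python
import re

# Node UUIDs as they appear in the "msg" lines of the log excerpts.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)

PASSED = "Nodes that passed introspection"
FAILED = "Nodes that failed introspection"


def split_nodes(log_text):
    """Return (passed, failed) lists of node UUIDs found in the log text.

    Hypothetical helper: it relies on each "msg" line following a line
    that names the task it belongs to, as in both excerpts above.
    """
    buckets = {PASSED: [], FAILED: []}
    current = None
    for line in log_text.splitlines():
        # A task name switches the bucket used for later "msg" lines.
        if PASSED in line:
            current = PASSED
        elif FAILED in line:
            current = FAILED
        # Only "msg" lines are scanned, so the UUID-shaped task IDs on
        # the OK/TASK/TIMING/FATAL lines are not picked up.
        if current and "msg" in line:
            for uuid in UUID_RE.findall(line):
                if uuid not in buckets[current]:
                    buckets[current].append(uuid)
    return buckets[PASSED], buckets[FAILED]


if __name__ == "__main__":
    sample = (
        "2020-06-26 14:55:32 | TASK [Nodes that passed introspection] ***\n"
        "2020-06-26 14:55:32 | ok: [localhost] =>\n"
        "2020-06-26 14:55:32 | msg: ' a93bf3e8-e0ff-48ab-8a81-86f794b1086a'\n"
        "2020-06-26 14:55:32 | TASK [Nodes that failed introspection] ***\n"
        "2020-06-26 14:55:32 | fatal: [localhost]: FAILED! =>\n"
        "2020-06-26 14:55:32 | msg: ' 32e3cb1b-2cf0-4989-b43e-2936d738b0a3'\n"
    )
    passed, failed = split_nodes(sample)
    print("passed:", passed)
    print("failed:", failed)
```

Counting the failed UUIDs per job this way gives a quick view of whether the failures are spread across nodes or concentrated on a few.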

We may need to increase the timeouts here.

Revision history for this message
Ronelle Landy (rlandy) wrote :
Changed in tripleo:
milestone: none → victoria-1
importance: Undecided → Critical
status: New → Triaged
tags: added: promotion-blocker
Changed in tripleo:
assignee: nobody → Ronelle Landy (rlandy)
status: Triaged → In Progress
summary:
- OVB master job running on vexxhost show some nodes failing introspection step
+ vexx: OVB master job running on vexxhost show some nodes failing introspection step
Revision history for this message
Ronelle Landy (rlandy) wrote :

Introspection is looking better for the stacks that get to that point.
Putting this on hold until vexx is in better shape.

Changed in tripleo:
milestone: victoria-1 → victoria-3
Revision history for this message
Bob Fournier (bfournie) wrote :

Patch to increase default DHCP timeout - https://review.opendev.org/#/c/744939/.

Revision history for this message
Marios Andreou (marios-b) wrote :

OK, it looks like this is done now; at least, we haven't seen it in a few days.

Moving this to Fix Released. Please move it back if you see this again, or file a new issue. Thanks.

Changed in tripleo:
status: In Progress → Fix Released