comp-pipeline-network: Selected log directory '/home/zuul/validations' does not exist. Attempting to create it, No validation has been run

Bug #1936218 reported by wes hayutin
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Jiri Podivin

Bug Description

2021-07-14 13:05:13.773473 | primary | TASK [Run validations] *********************************************************
2021-07-14 13:05:13.773486 | primary | Wednesday 14 July 2021 13:05:13 +0000 (0:00:00.106) 0:32:17.863 ********
2021-07-14 13:05:15.969326 | primary | fatal: [undercloud]: FAILED! => {
2021-07-14 13:05:15.969453 | primary | "changed": true,
2021-07-14 13:05:15.969475 | primary | "cmd": "validation run --validation undercloud-neutron-sanity-check --validation-dir /usr/share/ansible/validation-playbooks --inventory tripleo-ansible-inventory.yaml --output-log validation_undercloud-neutron-sanity-check.log ",
2021-07-14 13:05:15.969486 | primary | "delta": "0:00:01.802990",
2021-07-14 13:05:15.969495 | primary | "end": "2021-07-14 13:05:15.949676",
2021-07-14 13:05:15.969505 | primary | "rc": 1,
2021-07-14 13:05:15.969513 | primary | "start": "2021-07-14 13:05:14.146686"
2021-07-14 13:05:15.969522 | primary | }
2021-07-14 13:05:15.969531 | primary |
2021-07-14 13:05:15.969540 | primary | STDERR:
2021-07-14 13:05:15.969549 | primary |
2021-07-14 13:05:15.969611 | primary | Selected log directory '/home/zuul/validations' does not exist. Attempting to create it.
2021-07-14 13:05:15.969637 | primary | No validation has been run, please check log in the Ansible working directory.
2021-07-14 13:05:15.969647 | primary |
2021-07-14 13:05:15.969656 | primary |
2021-07-14 13:05:15.969665 | primary | MSG:
2021-07-14 13:05:15.969695 | primary |
2021-07-14 13:05:15.969737 | primary | non-zero return code

https://logserver.rdoproject.org/openstack-component-network/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-standalone-network-master-validation/9bd678a/job-output.txt

https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-standalone-network-master-validation

Revision history for this message
wes hayutin (weshayutin) wrote :
summary: - Selected log directory '/home/zuul/validations' does not exist.
- Attempting to create it, No validation has been run
+ comp-pipeline-network: Selected log directory '/home/zuul/validations'
+ does not exist. Attempting to create it, No validation has been run
Jiri Podivin (jpodivin)
Changed in tripleo:
assignee: nobody → Jiri Podivin (jpodivin)
Revision history for this message
Jiri Podivin (jpodivin) wrote :

Possible fix, although we will only know once it gets into the build:

https://review.opendev.org/c/openstack/validations-common/+/797618

Changed in tripleo:
status: Triaged → In Progress
Revision history for this message
Jiri Podivin (jpodivin) wrote :

After installing the versions of the VL and VC from the build on DF5, I ran the validation from the 'periodic-tripleo-ci-centos-8-standalone-network-master-validation' job, with the following result:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(undercloud) [stack@undercloud-0 ~]$ openstack tripleo validator run --validation undercloud-neutron-sanity-check
Running Validations without Overcloud settings.
Selected log directory '/home/stack/validations' does not exist. Attempting to create it.
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| UUID | Validations | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+
| 0056ec9a-d4c4-492e-823e-7b8cb138cda6 | undercloud-neutron-sanity-check | FAILED | undercloud | undercloud | | 0:00:05.134 |
+--------------------------------------+---------------------------------+--------+------------+----------------+-------------------+-------------+

One or more validations have failed.
(undercloud) [stack@undercloud-0 ~]$ ls validations/
0056ec9a-d4c4-492e-823e-7b8cb138cda6_undercloud-neutron-sanity-check_2021-07-15T07:49:45.592238Z.json artifacts

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Revision history for this message
Jiri Podivin (jpodivin) wrote :

When run through tripleo, the build works and 'undercloud-neutron-sanity-check' runs. But the same validation fails in the standalone when invoked with the `validation run` command.

However, other validations have no problem running in the standalone when supplied with the same args. For example:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
validation run --validation check-cpu --validation-dir /usr/share/ansible/validation-playbooks --inventory tripleo-ansible-inventory.yaml --output-log validation_undercloud-neutron-sanity-check.log
+--------------------------------------+-------------+--------+------------+----------------+-------------------+-------------+
| UUID | Validations | Status | Host_Group | Status_by_Host | Unreachable_Hosts | Duration |
+--------------------------------------+-------------+--------+------------+----------------+-------------------+-------------+
| a59cbc77-234b-4d7c-ad49-79c8c326196c | check-cpu | FAILED | localhost | localhost | | 0:00:00.895 |
+--------------------------------------+-------------+--------+------------+----------------+-------------------+-------------+
One or more validations have failed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This makes me think that the issue might lie in the validation itself, rather than the framework.

Revision history for this message
Jiri Podivin (jpodivin) wrote :

The contents of ansible.log from the DF5 seem to support my suspicion.

The first several lines represent failures recorded during the run of the failing command with the undercloud-neutron-sanity-check[0] validation.

The following lines were produced by the same command with the 'check-cpu'[1] validation.

[0] https://docs.openstack.org/tripleo-validations/latest/validations-pre-introspection-details.html#undercloud-neutron-sanity-check
[1] https://docs.openstack.org/validations-common/latest/roles/role-check_cpu.html

Revision history for this message
David Peacock (davidjpeacock) wrote :

Having followed along with Jiri's train of thought I'm inclined to agree that this points to the validation itself.

The error message is slightly misleading. The initial part, a warning that the validations dir does not exist, is informational rather than worrying (and perhaps we should rethink this), but it is not the key part of the message.

The key part is the final bit: "No validation has been run". This happens because the conditions the validation needs in order to run are not met.

In particular, the logs suggest that a hostname is not found, perhaps due to an inventory issue, so there was nothing available to run against.
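
To illustrate the distinction between the informational part and the key part of the message, here is a minimal Python sketch that sorts the stderr lines quoted in the bug description into two groups. The function name `split_stderr` and the two marker tuples are hypothetical and exist only for this example; they are not part of the validation framework, and the matching is based solely on the two messages seen in this report.

```python
# Hypothetical helper for illustration only; not part of validations-libs.
# The markers below are taken from the two stderr lines quoted in this bug.
INFORMATIONAL_PREFIXES = ("Selected log directory",)
FAILURE_MARKERS = ("No validation has been run",)


def split_stderr(stderr):
    """Return (informational, failures) lists of non-empty stderr lines.

    Lines matching neither group are ignored in this sketch.
    """
    informational, failures = [], []
    for raw_line in stderr.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if any(marker in line for marker in FAILURE_MARKERS):
            failures.append(line)
        elif line.startswith(INFORMATIONAL_PREFIXES):
            informational.append(line)
    return informational, failures


sample = (
    "Selected log directory '/home/zuul/validations' does not exist. "
    "Attempting to create it.\n"
    "No validation has been run, please check log in the Ansible "
    "working directory.\n"
)
info, failures = split_stderr(sample)
print("informational:", info)
print("failures:", failures)
```

Only the second group points at the real problem; the log directory message would appear on any first run.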

Revision history for this message
Jiri Podivin (jpodivin) wrote :

The periodic-tripleo-ci-centos-8-standalone-tripleo-master-validation failures follow the same pattern and probably share their cause with those of periodic-tripleo-ci-centos-8-standalone-network-master-validation.

Log from periodic-tripleo-ci-centos-8-standalone-tripleo-master-validation:

https://review.rdoproject.org/zuul/build/38c8c926529f4283953795bfa4abaf9a/log/logs/undercloud/home/zuul/ansible.log.txt.gz

Revision history for this message
Jiri Podivin (jpodivin) wrote :

periodic-tripleo-ci-centos-8-standalone-tripleo-master-validation fails with the same errors.

https://review.rdoproject.org/zuul/build/974c650d02ca42bf930fe23eda7b97b0/log/logs/undercloud/home/zuul/ansible.log.txt.gz

Revision history for this message
David Peacock (davidjpeacock) wrote :

OK, so it turns out that the inventory is no longer in the right place, and the tool for generating and viewing it no longer works; this is the actual cause of these failures.

When using tripleo-ansible-inventory on master:

(undercloud) [centos@undercloud-0 ~]$ tripleo-ansible-inventory
Traceback (most recent call last):
  File "/usr/bin/tripleo-ansible-inventory", line 239, in <module>
    main()
  File "/usr/bin/tripleo-ansible-inventory", line 178, in main
    utils.get_swift_client(auth_variables))
  File "/usr/lib/python3.6/site-packages/tripleo_validations/utils.py", line 91, in list_plan_and_stack
    stacks = [s.stack_name for s in hclient.stacks.list()]
  File "/usr/lib/python3.6/site-packages/tripleo_validations/utils.py", line 91, in <listcomp>
    stacks = [s.stack_name for s in hclient.stacks.list()]
  File "/usr/lib/python3.6/site-packages/heatclient/v1/stacks.py", line 136, in paginate
    stacks = self._list(url, 'stacks')
  File "/usr/lib/python3.6/site-packages/heatclient/common/base.py", line 114, in _list
    body = self.client.get(url).json()
  File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 395, in get
    return self.request(url, 'GET', **kwargs)
  File "/usr/lib/python3.6/site-packages/heatclient/common/http.py", line 320, in request
    **kwargs)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 554, in request
    resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py", line 257, in request
    return self.session.request(url, method, **kwargs)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 812, in request
    **endpoint_filter)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/session.py", line 1243, in get_endpoint
    return auth.get_endpoint(self, **kwargs)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 380, in get_endpoint
    allow_version_hack=allow_version_hack, **kwargs)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py", line 279, in get_endpoint_data
    service_name=service_name)
  File "/usr/lib/python3.6/site-packages/keystoneauth1/access/service_catalog.py", line 462, in endpoint_data_for
    raise exceptions.EndpointNotFound(msg)
keystoneauth1.exceptions.catalog.EndpointNotFound: public endpoint for orchestration service not found
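
Since the root cause was an inventory that was not where the job expected it, a pre-flight check along the following lines would make that failure explicit instead of surfacing as "No validation has been run". This is only a sketch in Python: the function name `build_validation_cmd` is hypothetical, and the flags are limited to those visible in the failing command from the bug description. It builds the argument list but does not execute anything.

```python
from pathlib import Path


def build_validation_cmd(
    validation,
    inventory,
    validation_dir="/usr/share/ansible/validation-playbooks",
    output_log=None,
):
    """Build the `validation run` argument list, or fail early.

    Hypothetical helper: raises FileNotFoundError if the inventory file
    is missing, so the real cause is reported before the CLI is invoked.
    """
    inventory_path = Path(inventory).expanduser()
    if not inventory_path.is_file():
        raise FileNotFoundError(f"inventory not found: {inventory_path}")

    cmd = [
        "validation", "run",
        "--validation", validation,
        "--validation-dir", validation_dir,
        "--inventory", str(inventory_path),
    ]
    if output_log:
        cmd += ["--output-log", output_log]
    return cmd


try:
    print(build_validation_cmd(
        "undercloud-neutron-sanity-check",
        "tripleo-ansible-inventory.yaml",
        output_log="validation_undercloud-neutron-sanity-check.log",
    ))
except FileNotFoundError as exc:
    print(f"skipping run: {exc}")
```

The returned list could be passed to `subprocess.run` by the caller; where the inventory should actually live is decided by the deployment, which is what the merged fix addresses by defining the path explicitly.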

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to validations-common (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-quickstart-extras (master)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on validations-common (master)

Change abandoned by "Jiri Podivin <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/validations-common/+/801023
Reason: Abandoned in favor of I3354d7007d17a58c7c84d9de7326673c3d703c9d

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-quickstart-extras (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/801026
Committed: https://opendev.org/openstack/tripleo-quickstart-extras/commit/85d28189ab8d26593f7023cf5f9e628856a56cb0
Submitter: "Zuul (22348)"
Branch: master

commit 85d28189ab8d26593f7023cf5f9e628856a56cb0
Author: Jiri Podivin <email address hidden>
Date: Fri Jul 16 10:24:07 2021 +0200

    Defining tripleo inventory path

    Closes-bug: #1936218

    Signed-off-by: Jiri Podivin <email address hidden>
    Change-Id: I3354d7007d17a58c7c84d9de7326673c3d703c9d

Changed in tripleo:
status: In Progress → Fix Released