scenario.test_volume_encryption.VolumeEncryptionTest.test_encrypted_cinder_volumes_luks fails in master with unexpected response code

Bug #1931516 reported by wes hayutin
This bug affects 1 person
Affects: tripleo
Status: Won't Fix
Importance: Critical
Assigned to: Unassigned

Bug Description

traceback-9: {{{
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/lib/services/image/v2/images_client.py", line 90, in delete_image
    resp, _ = self.delete(url)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 330, in delete
    return self.request('DELETE', url, extra_headers, headers, body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 703, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 884, in _error_checker
    resp=resp)
tempest.lib.exceptions.UnexpectedResponseCode: Unexpected response code received
Details: 503
}}}

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/test_utils.py", line 87, in call_and_ignore_notfound_exc
    return func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/lib/services/compute/floating_ips_client.py", line 79, in delete_floating_ip
    resp, body = self.delete(url)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 330, in delete
    return self.request('DELETE', url, extra_headers, headers, body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/services/compute/base_compute_client.py", line 48, in request
    method, url, extra_headers, headers, body, chunked)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 703, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py", line 884, in _error_checker
    resp=resp)
tempest.lib.exceptions.UnexpectedResponseCode: Unexpected response code received
Details: 503

https://28d88c1ccddaa6cbd25e-d527ccbde8227159a08a8d34e1ed01e7.ssl.cf5.rackcdn.com/795311/2/gate/tripleo-ci-centos-8-scenario002-standalone/47af14e/logs/undercloud/var/log/tempest/stestr_results.html

Also recorded in 16.2:
https://bugzilla.redhat.com/show_bug.cgi?id=1967996
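
To re-run just this failure outside the full job, a minimal sketch, assuming tempest and the barbican tempest plugin are installed and tempest.conf is already configured on the standalone node (as in the CI job):

  # Re-run only the failing test against the existing tempest workspace
  tempest run --regex 'barbican_tempest_plugin.tests.scenario.test_volume_encryption.VolumeEncryptionTest.test_encrypted_cinder_volumes_luks'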

Revision history for this message
wes hayutin (weshayutin) wrote :

ECONNREFUSED
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova Traceback (most recent call last):
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/bin/nova-scheduler", line 10, in <module>
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova sys.exit(main())
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/cmd/scheduler.py", line 48, in main
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova binary='nova-scheduler', topic=rpcapi.RPC_TOPIC)
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/service.py", line 256, in create
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova periodic_interval_max=periodic_interval_max)
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/service.py", line 116, in __init__
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova self.manager = manager_class(host=self.host, *args, **kwargs)
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/scheduler/manager.py", line 58, in __init__
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova self.placement_client = report.SchedulerReportClient()
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/scheduler/client/report.py", line 187, in __init__
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova self._client = self._create_client()
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/scheduler/client/report.py", line 230, in _create_client
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova client = self._adapter or utils.get_sdk_adapter('placement')
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/nova/utils.py", line 989, in get_sdk_adapter
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova return getattr(conn, service_type)
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/openstack/service_description.py", line 87, in __get__
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova proxy = self._make_proxy(instance)
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova File "/usr/lib/python3.6/site-packages/openstack/service_description.py", line 271, in _make_proxy
2021-06-09 23:46:27.202 ERROR /var/log/containers/nova/nova-scheduler.log: 8 ERROR nova ...


Revision history for this message
wes hayutin (weshayutin) wrote :

2021-06-10 00:04:54.603 ERROR /var/log/containers/cinder/cinder-api.log: 15 ERROR oslo.messaging._drivers.impl_rabbit [-] [115d88bd-f594-4377-9e5c-5cca8dd53841] AMQP server on standalone.ctlplane.localdomain:5672 is unreachable: [Errno 111] Connection refused. Trying again in 10 seconds.: ConnectionRefusedError: [Errno 111] Connection refused
2021-06-10 00:10:24.931 ERROR /var/log/containers/cinder/cinder-api.log: 15 ERROR cinder.db.sqlalchemy.api [req-45d17f73-6cfd-41c5-9628-6476d62e9d26 b8a0ae812a9d4e7986a6c04db93b9db4 6e93b6f0c0f54edaa39ee84f0432c905 - default default] VolumeType 976c13fd-9ce2-4130-8459-f3dd82ce0618 deletion failed, VolumeType in use.
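
The connection-refused errors point at the broker itself rather than cinder. A minimal triage sketch, assuming a TripleO standalone node; the rabbitmq-bundle-podman-0 name matches the pacemaker resource quoted in the next comment and may differ on other deployments:

  # Is the pacemaker-managed rabbitmq bundle healthy?
  sudo pcs status
  # Ask the broker directly from inside its container
  sudo podman exec rabbitmq-bundle-podman-0 rabbitmqctl status
  # Is anything listening on the AMQP port (5672) at all?
  sudo ss -tlnp | grep 5672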

Revision history for this message
John Eckersberg (jeckersb) wrote :

This just looks like the system is overloaded.

The RabbitMQ monitor fails, and the OVN DB monitor times out at the same time:

Jun 10 00:04:17 standalone.localdomain pacemaker-execd[73696]: warning: rabbitmq-bundle-podman-0_monitor_60000[318225] timed out after 20000ms
Jun 10 00:04:18 standalone.localdomain pacemaker-execd[73696]: warning: ovn-dbs-bundle-podman-0_monitor_60000[318303] timed out after 20000ms

Also from dstat:

https://28d88c1ccddaa6cbd25e-d527ccbde8227159a08a8d34e1ed01e7.ssl.cf5.rackcdn.com/795311/2/gate/tripleo-ci-centos-8-scenario002-standalone/47af14e/logs/undercloud/var/log/extra/dstat.html

There is a huge iowait spike at the same time, with a corresponding spike in load avg.

I see a bunch of stuff related to gnocchi in the journal starting here:

Jun 10 00:03:54 standalone.localdomain haproxy[89611]: 192.168.24.3:57274 [10/Jun/2021:00:03:54.603] gnocchi gnocchi/standalone.ctlplane.localdomain 0/0/0/21/21 202 129 - - ---- 87/2/1/2/0 0/0 "POST /v1/batch/resources/metrics/measures?create_metrics=True HTTP/1.1"

This corresponds to the beginning of the I/O and load spike, so it is possibly related.
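
One way to check that correlation is to pull the journal window around the spike; a sketch, with the timestamps taken from the haproxy line above:

  # Journal window around the io/load spike
  sudo journalctl --since '2021-06-10 00:03:00' --until '2021-06-10 00:06:00' | grep -iE 'gnocchi|pacemaker|haproxy'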

Changed in tripleo:
milestone: xena-1 → xena-2
Revision history for this message
Marios Andreou (marios-b) wrote :

Trying to catch up on the status here, and I'm not sure what to do with this bug.

Going on the description, I tried to come up with a query to gauge the failure frequency. I tried a few different queries, but this one [1] gave me 10 hits in 15 days:

build_status: "FAILURE" AND build_name: "tripleo-ci-centos-8-scenario002.*" AND message: "barbican_tempest_plugin.tests.scenario.test_volume_encryption"

From the query I found some more examples of this test failing, like [2][3][4], but I cannot be sure they are all failing for the same reason.

[1] https://review.rdoproject.org/analytics/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15d,to:now))&_a=(columns:!(_source),filters:!(),index:logstash,interval:auto,query:(language:kuery,query:'build_status:%20%22FAILURE%22%20%20%20AND%20build_name:%20%22tripleo-ci-centos-8-scenario002.*%22%20AND%20message:%20%22barbican_tempest_plugin.tests.scenario.test_volume_encryption%22'),sort:!())

[2] https://logserver.rdoproject.org/openstack-component-common/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario002-standalone-common-ussuri/5604b0b/logs/undercloud/var/log/tempest/stestr_results.html.gz

[3] https://logserver.rdoproject.org/openstack-periodic-integration-stable2/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario002-standalone-victoria/0f347ad/logs/undercloud/var/log/tempest/stestr_results.html.gz

[4] https://logserver.rdoproject.org/openstack-component-cloudops/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario002-standalone-cloudops-wallaby/a36f962/logs/undercloud/var/log/tempest/stestr_results.html.gz

Revision history for this message
wes hayutin (weshayutin) wrote :

I'm going to review this test's pass/fail rates and performance.
This is a candidate for a permanent skip due to infra constraints.

Revision history for this message
wes hayutin (weshayutin) wrote :

This bug will serve as documentation for the permanent skip.

Revision history for this message
Marios Andreou (marios-b) wrote :

Added a skip for victoria/ussuri/wallaby based on the links in comment #4: https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/798296

Changed in tripleo:
milestone: xena-2 → xena-3
Revision history for this message
Alan Pevec (apevec) wrote :

- test: barbican_tempest_plugin.tests.scenario.test_volume_encryption.VolumeEncryptionTest.test_encrypted_cinder_volumes_luks
is still in https://opendev.org/openstack/openstack-tempest-skiplist/src/branch/master/roles/validate-tempest/vars/tempest_skip.yml and it does not seem realistic that it will ever be fixed to work in the CI environment.
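
For reference, a skiplist entry in that file has roughly this shape (a sketch; the field names are recalled from the skiplist repo and may not match the current schema exactly, and the reason text is illustrative):

  known_failures:
    - test: 'barbican_tempest_plugin.tests.scenario.test_volume_encryption.VolumeEncryptionTest.test_encrypted_cinder_volumes_luks'
      deployment:
        - 'overcloud'
      releases:
        - name: 'master'
          reason: 'Intermittent 503s when the node is under heavy load'  # illustrative
          lp: 'https://bugs.launchpad.net/tripleo/+bug/1931516'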

Changed in tripleo:
status: Triaged → Won't Fix