tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops failing on periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train

Bug #1959328 reported by Rafael Castillo
Affects: tripleo | Status: Fix Released | Importance: Critical | Assigned to: Bogdan Dobrelya

Bug Description

Tempest error[1]:
```
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 90, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 431, in test_network_basic_ops
    self._setup_network_and_servers()
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 119, in _setup_network_and_servers
    server = self._create_server(self.network, port_id)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 171, in _create_server
    security_groups=security_groups)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 323, in create_server
    image_id=image_id, **kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 266, in create_test_server
    server['id'])
  File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/local/lib/python3.6/site-packages/six.py", line 719, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 237, in create_test_server
    clients.servers_client, server['id'], wait_until)
  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 76, in wait_for_server_status
    server_id=server_id)
tempest.exceptions.BuildErrorException: Server b4341a17-3e57-42e8-ab61-7eb55b8a1c38 failed to build and is in ERROR status
Details: {'code': 500, 'created': '2022-01-27T16:21:46Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance b4341a17-3e57-42e8-ab61-7eb55b8a1c38.'}
```

Affected jobs: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train%09

[1] https://logserver.rdoproject.org/openstack-periodic-integration-stable4/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/b0732ee/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz

Marios Andreou (marios-b) wrote :
Ronelle Landy (rlandy)
Changed in tripleo:
milestone: xena-3 → yoga-1
Amol Kahat (amolkahat) wrote :

I tested this with the skiplist revert [1] and still see the same error in the test run [2][3].

[1] https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/827954
[2] https://review.rdoproject.org/r/c/testproject/+/37423
[3] https://logserver.rdoproject.org/23/37423/6/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/2a6a049/logs/undercloud/var/log/tempest/stestr_results.html.gz

```

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 90, in wrapper
    return f(*func_args, **func_kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 431, in test_network_basic_ops
    self._setup_network_and_servers()
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 119, in _setup_network_and_servers
    server = self._create_server(self.network, port_id)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 171, in _create_server
    security_groups=security_groups)
  File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 323, in create_server
    image_id=image_id, **kwargs)
  File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 266, in create_test_server
    server['id'])
  File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/usr/local/lib/python3.6/site-packages/six.py", line 719, in reraise
    raise value
  File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 237, in create_test_server
    clients.servers_client, server['id'], wait_until)
  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 76, in wait_for_server_status
    server_id=server_id)
tempest.exceptions.BuildErrorException: Server 841e1814-0144-4fae-b0cc-c4aa35aee1af failed to build and is in ERROR status
Details: {'code': 500, 'created': '2022-02-07T08:29:29Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 841e1814-0144-4fae-b0cc-c4aa35aee1af.'}

```

Ananya Banerjee (frenzyfriday) wrote :

Testproject https://review.rdoproject.org/r/c/testproject/+/37423/ with the skiplist revert https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/827954 passed.
Looks like we are not hitting this anymore.

Ananya Banerjee (frenzyfriday) wrote :

It might be a compute issue instead.

libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-02-07T08:29:26.596518Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.

https://logserver.rdoproject.org/23/37423/6/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/2a6a049/logs/overcloud-novacompute-0/var/log/containers/nova/nova-compute.log.txt.gz

Ananya Banerjee (frenzyfriday) wrote :

I tracked the following:

Tempest failed test: tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops

tempest run log: tempest.exceptions.BuildErrorException: Server 7222b8b5-1e07-4589-b9f7-eab775f0f046 failed to build and is in ERROR status
    Details: {'code': 500, 'created': '2022-02-21T14:56:14Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 7222b8b5-1e07-4589-b9f7-eab775f0f046.'}

Searching with the server uuid in nova-conductor.log.txt.gz on overcloud node:
error: process exited while connecting to monitor: 2022-02-21T14:56:11.787865Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.

https://logserver.rdoproject.org/23/37423/9/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/71efe0d/logs/overcloud-controller-1/var/log/containers/nova/nova-conductor.log.txt.gz
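
The lookup described above (take the server UUID from the tempest error, then search the nova logs for it) could be scripted roughly as follows. This is only a sketch: the function names `failed_server_ids` and `grep_gzip_log` are hypothetical helpers, not part of tempest or nova, and it assumes the logs were downloaded locally as `.txt.gz` files.

```python
import gzip
import re

# Matches the server UUID in a tempest BuildErrorException message.
UUID_RE = re.compile(
    r"Server ([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r" failed to build"
)


def failed_server_ids(tempest_output):
    """Return the UUIDs of servers that tempest reported as failed to build."""
    return UUID_RE.findall(tempest_output)


def grep_gzip_log(path, needle):
    """Yield the lines of a gzip-compressed log file that contain `needle`."""
    with gzip.open(path, "rt", errors="replace") as handle:
        for line in handle:
            if needle in line:
                yield line.rstrip("\n")
```

Running `grep_gzip_log` over a local copy of nova-conductor.log.txt.gz or nova-compute.log.txt.gz with each UUID returned by `failed_server_ids` should surface the same qemu-kvm lines quoted in this comment.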

Dariusz Smigiel (smigiel-dariusz) wrote :

The issue is still present. As Ananya mentioned, the VM fails to start because of a certificate problem.
nova couldn't start the VM [1]:
Error launching a defined domain with XML: <domain type='qemu'>

due to libvirt [2] failing to load the certificate [3].
Installed OpenSSL versions [4]:
openssl.x86_64 1:1.1.1k-5.el8_5 @baseos
openssl-libs.x86_64 1:1.1.1k-5.el8_5 @baseos
openssl-perl.x86_64 1:1.1.1k-5.el8_5 @baseos
openssl-pkcs11.x86_64 0.4.10-2.el8 @anaconda

2022-03-02 17:47:24.265+0000: 39515: info : hostname: overcloud-novacompute-0.ooo.test
2022-03-02 17:47:24.265+0000: 39515: error : qemuMonitorIORead:460 : Unable to read from monitor: Connection reset by peer
2022-03-02 17:47:24.267+0000: 39515: error : qemuProcessReportLogError:2051 : internal error: qemu unexpectedly closed the monitor: 2022-03-02T17:47:24.248974Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-03-02 17:47:24.468+0000: 29925: error : qemuProcessReportLogError:2051 : internal error: process exited while connecting to monitor: 2022-03-02T17:47:24.248974Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.

[1]: https://logserver.rdoproject.org/59/39959/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/ce05229/logs/overcloud-novacompute-0/var/log/containers/nova/nova-compute.log.txt.gz
[2]: https://logserver.rdoproject.org/59/39959/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/ce05229/logs/overcloud-novacompute-0/var/log/containers/libvirt/libvirtd.log.txt.gz
[3]: https://logserver.rdoproject.org/59/39959/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/ce05229/logs/overcloud-novacompute-0/etc/pki/libvirt-vnc/
[4]: https://logserver.rdoproject.org/59/39959/2/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/ce05229/logs/overcloud-novacompute-0/var/log/extra/package-list-installed.txt.gz
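
A quick sanity check of the two files named in the qemu-kvm error could look like the sketch below. It is an assumption-laden diagnostic, not the fix: `check_pem_file` is a hypothetical helper, it only tests that the file exists, is readable by the user running the script, and contains PEM markers, and it says nothing about whether the certificate and key are valid or match. To be meaningful it would have to run as the same user, and in the same container, as the qemu process.

```python
import os


def check_pem_file(path):
    """Return a list of problems found with a PEM file (empty if none found)."""
    if not os.path.isfile(path):
        return [f"{path}: missing or not a regular file"]
    # os.access checks the calling user's permissions, so run this as the
    # user that qemu runs as; root is generally reported as able to read.
    if not os.access(path, os.R_OK):
        return [f"{path}: not readable by the current user"]
    with open(path, "r", errors="replace") as handle:
        text = handle.read()
    problems = []
    if "-----BEGIN " not in text or "-----END " not in text:
        problems.append(f"{path}: no PEM BEGIN/END markers found")
    return problems


if __name__ == "__main__":
    # Paths taken from the qemu-kvm error message in the logs above.
    for pem in ("/etc/pki/libvirt-vnc/server-cert.pem",
                "/etc/pki/libvirt-vnc/server-key.pem"):
        for problem in check_pem_file(pem):
            print(problem)
```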

Bogdan Dobrelya (bogdando) wrote :

The puppet patch https://review.opendev.org/c/openstack/puppet-tripleo/+/839957 didn't address the issue

https://logserver.rdoproject.org/30/42430/1/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train/89c784c/logs/overcloud-novacompute-0/var/log/containers/nova/nova-compute.log.txt.gz
2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-05-02 14:51:59.059 7 ERROR nova.virt.libvirt.driver [req-78c829c8-0205-410d-b872-f54e2e9b6201 13f22fe84af445d087f24a18652af2ba fcd549f68dab4059b5943adf280a1ac9 - default default] [instance: 4e8018f1-d2cd-4683-a716-4b1e5b8bce13] Failed to start libvirt guest: libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-05-02 14:51:59.381 7 ERROR nova.compute.manager [req-78c829c8-0205-410d-b872-f54e2e9b6201 13f22fe84af445d087f24a18652af2ba fcd549f68dab4059b5943adf280a1ac9 - default default] [instance: 4e8018f1-d2cd-4683-a716-4b1e5b8bce13] Instance failed to spawn: libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-05-02 14:51:59.381 7 ERROR nova.compute.manager [instance: 4e8018f1-d2cd-4683-a716-4b1e5b8bce13] libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-05-02 14:51:59.816 7 ERROR nova.compute.manager [req-78c829c8-0205-410d-b872-f54e2e9b6201 13f22fe84af445d087f24a18652af2ba fcd549f68dab4059b5943adf280a1ac9 - default default] [instance: 4e8018f1-d2cd-4683-a716-4b1e5b8bce13] Failed to build and run instance: libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-05-02 14:51:59.816 7 ERROR nova.compute.manager [instance: 4e8018f1-d2cd-4683-a716-4b1e5b8bce13] libvirt.libvirtError: internal error: process exited while connecting to monitor: 2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pem' & key '/etc/pki/libvirt-vnc/server-key.pem': Error while reading file.
2022-05-02 14:51:59.817 7 DEBUG nova.compute.manager [req-78c829c8-0205-410d-b872-f54e2e9b6201 13f22fe84af445d087f24a18652af2ba fcd549f68dab4059b5943adf280a1ac9 - default default] [instance: 4e8018f1-d2cd-4683-a716-4b1e5b8bce13] Build of instance 4e8018f1-d2cd-4683-a716-4b1e5b8bce13 was re-scheduled: internal error: process exited while connecting to monitor: 2022-05-02T14:51:58.713864Z qemu-kvm: Cannot load certificate '/etc/pki/libvirt-vnc/server-cert.pe...
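
Since the same qemu-kvm message repeats on nearly every line, a small script can reduce a nova-compute log to one entry per instance. The sketch below assumes the line layout seen in the excerpt above (an `[instance: <uuid>]` tag and a trailing `qemu-kvm: ...` message); `qemu_errors_by_instance` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Patterns based on the nova-compute log lines quoted above.
INSTANCE_RE = re.compile(r"\[instance: ([0-9a-f-]{36})\]")
QEMU_RE = re.compile(r"qemu-kvm: (.+)$")


def qemu_errors_by_instance(lines):
    """Map each instance UUID to the sorted, distinct qemu-kvm messages seen."""
    errors = defaultdict(set)
    for line in lines:
        instance = INSTANCE_RE.search(line)
        qemu = QEMU_RE.search(line)
        # Only keep lines that carry both an instance tag and a qemu message.
        if instance and qemu:
            errors[instance.group(1)].add(qemu.group(1).strip())
    return {uuid: sorted(messages) for uuid, messages in errors.items()}
```

Feeding it the lines of the log linked in this comment should collapse the repeated ERROR entries into a single certificate-loading message for each affected instance.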


Bogdan Dobrelya (bogdando) wrote :
Changed in tripleo:
assignee: nobody → Bogdan Dobrelya (bogdando)
status: Triaged → In Progress
Sandeep Yadav (sandeepyadav93) wrote :

The puppet-tripleo patch https://review.opendev.org/c/openstack/puppet-tripleo/+/839957/ has merged.

periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-train is green with the skiplist revert in testproject [1].

We have merged the tempest skiplist revert too [2].

[1] https://review.rdoproject.org/r/c/testproject/+/39959
[2] https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/831259

Changed in tripleo:
status: In Progress → Fix Released