Comment 11 for bug 1950916

Revision history for this message
Artom Lifshitz (notartom) wrote : Re: master and wallaby check/gate/periodic tests are failing tempest - connectivity issues

Using zuul@38.102.83.147 as an investigation environment, using the test_snapshot_pattern failed test with the following traceback:

    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 70, in wrapper
        return f(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_snapshot_pattern.py", line 63, in test_snapshot_pattern
        server=server)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 1068, in create_timestamp
        username=username)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 726, in get_remote_client
        linux_client.validate_authentication()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 31, in wrapper
        return function(self, *args, **kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/utils/linux/remote_client.py", line 114, in validate_authentication
        self.ssh_client.test_connection_auth()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 216, in test_connection_auth
        connection = self._get_ssh_connection()
      File "/usr/lib/python3.6/site-packages/tempest/lib/common/ssh.py", line 128, in _get_ssh_connection
        password=self.password)
    tempest.lib.exceptions.SSHTimeout: Connection to the 192.168.24.152 via SSH timed out.
    User: cirros, Password: None

That test's boot request was:

    2021-11-15 12:02:30,178 315743 INFO [tempest.lib.common.rest_client] Request (TestSnapshotPattern:test_snapshot_pattern): 202 POST http://192.168.24.3:8774/v2.1/servers 2.413s

With request ID req-301843dd-1ad1-4548-a5d4-d1f10f57b5bb:

        Response - Headers: {'date': 'Mon, 15 Nov 2021 12:02:27 GMT', 'server': 'Apache', 'content-length': '400', 'location': 'http://192.168.24.3:8774/v2.1/servers/e0021d5d-daa0-40bb-9be6-f2d203dc222e', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'vary': 'OpenStack-API-Version,X-OpenStack-Nova-API-Version', 'x-openstack-request-id': 'req-301843dd-1ad1-4548-a5d4-d1f10f57b5bb', 'x-compute-request-id': 'req-301843dd-1ad1-4548-a5d4-d1f10f57b5bb', 'connection': 'close', 'content-type': 'application/json', 'status': '202', 'content-location': 'http://192.168.24.3:8774/v2.1/servers'}

The instance ID is e0021d5d-daa0-40bb-9be6-f2d203dc222e:

2021-11-15 12:02:31.607 7 INFO nova.compute.claims [req-301843dd-1ad1-4548-a5d4-d1f10f57b5bb 091e701229054f2793ed6f17dc4cf675 8ee1e30c6e7f4fa0a346af8667b600db - default default] [instance: e0021d5d-daa0-40bb-9be6-f2d203dc222e] Claim successful on node standalone.localdomain

As far as nova-compute knows, it built just fine:

2021-11-15 12:02:45.126 7 INFO nova.compute.manager [req-301843dd-1ad1-4548-a5d4-d1f10f57b5bb 091e701229054f2793ed6f17dc4cf675 8ee1e30c6e7f4fa0a346af8667b600db - default default] [instance:
e0021d5d-daa0-40bb-9be6-f2d203dc222e] Took 13.59 seconds to build instance.

Nothing suspicious in the instance qemu log at qemu/instance-00000001.log either:

[...]
char device redirected to /dev/pts/0 (label charserial0)
2021-11-15T12:02:44.362617Z qemu-kvm: -device cirrus-vga,id=video0,bus=pcie.0,addr=0x1: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2021-11-15T12:08:08.225959Z qemu-kvm: terminating on signal 15 from pid 205746 (/usr/sbin/virtqemud)
2021-11-15 12:08:08.629+0000: shutting down, reason=destroyed

Interestingly, there is no console.log file for our instance. There are however, console.log files for 2 other instances:

/var/lib/nova/instances/9f27d8ce-045d-4bc8-a0fd-735b8952b9c9/console.log
/var/lib/nova/instances/11f894e3-a732-47d9-b5a3-f4762417a958/console.log

Neither of which appear in the tempest_run.log file, presumably because those tests passed?

Using 9f27d8ce-045d-4bc8-a0fd-735b8952b9c9 as a "passing control" of sorts, the only apparent difference in its qemu log file qemu/instance-0000000d.log is the lack of desctruction, it ends with:

char device redirected to /dev/pts/7 (label charserial0)
2021-11-15T14:03:39.005096Z qemu-kvm: -device cirrus-vga,id=video0,bus=pcie.0,addr=0x1: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead

Its virtlogd entires looks identical to the ones for the failing instances:

Good instance:

021-11-15 13:32:35.057+0000: 197248: debug : virRotatingFileWriterEntryNew:104 : Opening /var/log/libvirt/qemu/instance-0000000d.log mode=0600
2021-11-15 13:32:35.057+0000: 197248: debug : virThreadJobClear:122 : Thread 197248 finished job virLogManagerProtocolDispatchDomainOpenLogFile with ret=1
2021-11-15 13:32:35.087+0000: 197248: debug : virThreadJobSet:97 : Thread 197248 is now running job virLogManagerProtocolDispatchDomainOpenLogFile
2021-11-15 13:32:35.087+0000: 197248: debug : virLogManagerProtocolDispatchDomainOpenLogFileHelper:76 : server=0x55d60a698880 client=0x55d60a6a4260 msg=0x55d60a6a8690 rerr=0x7ffe699dad90 arg
s=0x55d60a6aff80 ret=0x55d60a6a8a10
2021-11-15 13:32:35.087+0000: 197248: debug : virRotatingFileWriterEntryNew:104 : Opening /var/lib/nova/instances/9f27d8ce-045d-4bc8-a0fd-735b8952b9c9/console.log mode=0600
2021-11-15 13:32:35.087+0000: 197248: debug : virThreadJobClear:122 : Thread 197248 finished job virLogManagerProtocolDispatchDomainOpenLogFile with ret=1
2021-11-15 13:32:35.087+0000: 197248: debug : virThreadJobSet:97 : Thread 197248 is now running job virLogManagerProtocolDispatchDomainGetLogFilePosition
2021-11-15 13:32:35.087+0000: 197248: debug : virLogManagerProtocolDispatchDomainGetLogFilePositionHelper:49 : server=0x55d60a698880 client=0x55d60a6a4260 msg=0x55d60a6a8690 rerr=0x7ffe699dad90 args=0x55d60a6a8170 ret=0x55d60a6aaec0
2021-11-15 13:32:35.087+0000: 197248: debug : virThreadJobClear:122 : Thread 197248 finished job virLogManagerProtocolDispatchDomainGetLogFilePosition with ret=0

Failing instance:

2021-11-15 12:02:43.503+0000: 197248: debug : virRotatingFileWriterEntryNew:104 : Opening /var/log/libvirt/qemu/instance-00000001.log mode=0600
2021-11-15 12:02:43.503+0000: 197248: debug : virThreadJobClear:122 : Thread 197248 finished job virLogManagerProtocolDispatchDomainOpenLogFile with ret=1
2021-11-15 12:02:43.586+0000: 197248: debug : virThreadJobSet:97 : Thread 197248 is now running job virLogManagerProtocolDispatchDomainOpenLogFile
2021-11-15 12:02:43.586+0000: 197248: debug : virLogManagerProtocolDispatchDomainOpenLogFileHelper:76 : server=0x55d60a698880 client=0x55d60a6a4040 msg=0x55d60a699cb0 rerr=0x7ffe699dad90 arg
s=0x55d60a6a8130 ret=0x55d60a6a8170
2021-11-15 12:02:43.586+0000: 197248: debug : virRotatingFileWriterEntryNew:104 : Opening /var/lib/nova/instances/e0021d5d-daa0-40bb-9be6-f2d203dc222e/console.log mode=0600
2021-11-15 12:02:43.586+0000: 197248: debug : virThreadJobClear:122 : Thread 197248 finished job virLogManagerProtocolDispatchDomainOpenLogFile with ret=1
2021-11-15 12:02:43.594+0000: 197248: debug : virThreadJobSet:97 : Thread 197248 is now running job virLogManagerProtocolDispatchDomainGetLogFilePosition
2021-11-15 12:02:43.594+0000: 197248: debug : virLogManagerProtocolDispatchDomainGetLogFilePositionHelper:49 : server=0x55d60a698880 client=0x55d60a6a4040 msg=0x55d60a699cb0 rerr=0x7ffe699da
d90 args=0x55d60a6a8b60 ret=0x55d60a6a8b80
2021-11-15 12:02:43.594+0000: 197248: debug : virThreadJobClear:122 : Thread 197248 finished job virLogManagerProtocolDispatchDomainGetLogFilePosition with ret=0
2021-11-15 12:02:44.880+0000: 197248: debug : virLogDaemonClientFree:314 : priv=0x55d60a699d80 client=205746