Comment 0 for bug 1950310

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

There is various kernel panics are visible in the guest in the nova-live-migration job. It was so far visible on stable/victoria only.

Example run: https://zuul.opendev.org/t/openstack/build/67c89daf17e3475cb1d632f87beeb60d/logs

Nova stack trace:

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall [-] Dynamic interval looping call 'oslo_service.loopingcall.RetryDecorator.__call__.<locals>._func' failed: nova.exception.DeviceDetachFailed: Device detach failed for vdb: Unable to detach the device from the live config.

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall Traceback (most recent call last):

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/usr/local/lib/python3.8/dist-packages/oslo_service/loopingcall.py", line 150, in _run_loop

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall result = func(*self.args, **self.kw)

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/usr/local/lib/python3.8/dist-packages/oslo_service/loopingcall.py", line 428, in _func

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall return self._sleep_time

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 220, in __exit__

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall self.force_reraise()

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall six.reraise(self.type_, self.value, self.tb)

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/usr/local/lib/python3.8/dist-packages/six.py", line 703, in reraise

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall raise value

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/usr/local/lib/python3.8/dist-packages/oslo_service/loopingcall.py", line 407, in _func

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall result = f(*args, **kwargs)

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 483, in _do_wait_and_retry_detach

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall raise exception.DeviceDetachFailed(

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall nova.exception.DeviceDetachFailed: Device detach failed for vdb: Unable to detach the device from the live config.

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: ERROR oslo.service.loopingcall

Nov 08 16:43:33.983139 ubuntu-focal-rax-iad-0027262323 nova-compute[52781]: WARNING nova.virt.block_device [None req-9e67da96-3f0f-4e77-8bc3-0682b443b86c tempest-LiveAutoBlockMigrationV225Test-2108607333 tempest-LiveAutoBlockMigrationV225Test-2108607333-project] [instance: 06c70e38-8f0e-43a9-b783-cda99fa6b5e0] Guest refused to detach volume 8806a6e3-b560-4f72-8f1b-8cef2c257fa1: nova.exception.DeviceDetachFailed: Device detach failed for vdb: Unable to detach the device from the live config.

https://zuul.opendev.org/t/openstack/build/67c89daf17e3475cb1d632f87beeb60d/log/compute1/logs/screen-n-cpu.txt#10396-10491

Guest kernel panic:

2021-11-08 16:41:35,132 99547 DEBUG [tempest.common.waiters] Console output for 06c70e38-8f0e-43a9-b783-cda99fa6b5e0
body=
[ 15.293919] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 15.298512] BUG: unable to handle page fault for address: ffff91bdc256c400
[ 15.299353] #PF: supervisor instruction fetch in kernel mode
[ 15.299943] #PF: error_code(0x0011) - permissions violation
[ 15.300902] PGD 5e01067 P4D 5e01067 PUD 5e02067 PMD 80000000024001e3
[ 15.302056] Oops: 0011 [#1] SMP NOPTI
[ 15.302770] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.3.0-26-generic #28~18.04.1-Ubuntu
[ 15.303549] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 15.305100] RIP: 0010:0xffff91bdc256c400
[ 15.305758] Code: 6f 72 69 74 79 20 66 6f 72 20 25 64 20 28 25 73 29 00 25 64 20 28 25 73 29 20 6f 6c 64 20 70 72 69 6f 72 69 74 79 20 25 64 2c <20> 6e 65 77 20 70 72 69 6f 72 69 74 79 20 25 64 0a 00 70 72 6f 63
[ 15.307322] RSP: 0018:ffffa59240053de8 EFLAGS: 00000292
[ 15.307835] RAX: ffff91bdc256c400 RBX: ffff91bdc762b4c0 RCX: ffff91bdc3239900
[ 15.308457] RDX: 0000000000000400 RSI: ffffa59240053df8 RDI: ffff91bdc67e6d88
[ 15.309109] RBP: ffffa59240053e48 R08: 0000000000000000 R09: 0000000000000001
[ 15.309725] R10: ffffa59240053bc0 R11: 000000000430d5a0 R12: ffff91bdc762b510
[ 15.310344] R13: ffffa59240053df8 R14: 000000000000000a R15: 0000000000000202
[ 15.311176] FS: 0000000000000000(0000) GS:ffff91bdc7600000(0000) knlGS:0000000000000000
[ 15.311892] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.312403] CR2: ffff91bdc256c400 CR3: 0000000002f6c000 CR4: 00000000000006f0
[ 15.313336] Call Trace:
[ 15.314947] ? rcu_core+0x2b5/0x450
[ 15.315635] rcu_core_si+0xe/0x10
[ 15.316138] __do_softirq+0xe4/0x2da
[ 15.316573] run_ksoftirqd+0x2b/0x40
[ 15.316952] smpboot_thread_fn+0xfc/0x170
[ 15.317326] kthread+0x121/0x140
[ 15.317648] ? sort_range+0x30/0x30
[ 15.317980] ? kthread_park+0xb0/0xb0
[ 15.318337] ret_from_fork+0x35/0x40
[ 15.318819] Modules linked in: ip_tables x_tables nls_utf8 nls_iso8859_1 nls_ascii isofs hid_generic usbhid hid virtio_rng virtio_gpu drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_scsi virtio_net net_failover failover virtio_input virtio_blk qemu_fw_cfg 9pnet_virtio 9pnet pcnet32 8139cp mii ne2k_pci 8390 e1000
[ 15.322811] CR2: ffff91bdc256c400
[ 15.324234] ---[ end trace 73d738baa971ca73 ]---
[ 15.324797] RIP: 0010:0xffff91bdc256c400
[ 15.325176] Code: 6f 72 69 74 79 20 66 6f 72 20 25 64 20 28 25 73 29 00 25 64 20 28 25 73 29 20 6f 6c 64 20 70 72 69 6f 72 69 74 79 20 25 64 2c <20> 6e 65 77 20 70 72 69 6f 72 69 74 79 20 25 64 0a 00 70 72 6f 63
[ 15.326679] RSP: 0018:ffffa59240053de8 EFLAGS: 00000292
[ 15.327142] RAX: ffff91bdc256c400 RBX: ffff91bdc762b4c0 RCX: ffff91bdc3239900
[ 15.327742] RDX: 0000000000000400 RSI: ffffa59240053df8 RDI: ffff91bdc67e6d88
[ 15.328342] RBP: ffffa59240053e48 R08: 0000000000000000 R09: 0000000000000001
[ 15.328964] R10: ffffa59240053bc0 R11: 000000000430d5a0 R12: ffff91bdc762b510
[ 15.329563] R13: ffffa59240053df8 R14: 000000000000000a R15: 0000000000000202
[ 15.330167] FS: 0000000000000000(0000) GS:ffff91bdc7600000(0000) knlGS:0000000000000000
[ 15.330854] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.331348] CR2: ffff91bdc256c400 CR3: 0000000002f6c000 CR4: 00000000000006f0
[ 15.332107] Kernel panic - not syncing: Fatal exception in interrupt
[ 15.333470] Kernel Offset: 0x34800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 15.334628] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Additional hits: https://paste.opendev.org/show/810868/