Unchecked MSR access error - overcloud deploy "timed out waiting for ping module test
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Fix Released | Critical | Unassigned | | |
Bug Description
FATAL | Wait for connection to become available | 192.168.24.30 | error={"changed": false, "elapsed": 2402, "msg": "timed out waiting for ping module test success: Data could not be sent to remote host \"192.168.24.30\". Make sure this host can be reached over ssh: Warning: Permanently added '192.168.24.30' (ECDSA) to the list of known hosts.\
One of the OVB baremetal nodes hits an "unchecked MSR access error" while booting, and no network interfaces are discovered.
Possibly related bug: https:/
-- Console logs for failing VMs from two separate runs --
https:/
https:/
Other VMs are OK, such as:
There is a call trace on the OVB node console:
[[0;32m OK [[ 6.271140] unchecked MSR access error: RDMSR from 0xda0 at rIP: 0xffffffff8ac69e23 (native_
0m] Started Moni[ 6.272285] Call Trace:
toring of LVM2 m[ 6.272733] kvm_arch_
irrors,…sing d[ 6.273323] ? __kmalloc_
meventd or progr[ 6.273887] ? alloc_cpumask_
ess polling.
[ 6.274499] kvm_init+0x98/0x2b0 [kvm]
[ 6.274988] ? svm_hardware_
[ 6.275554] do_one_
[ 6.275970] ? do_init_
[ 6.276395] ? kmem_cache_
[ 6.276906] do_init_
[ 6.277308] load_module+
[ 6.277738] ? __do_sys_
[ 6.278221] __do_sys_
[ 6.278691] do_syscall_
[ 6.279099] entry_SYSCALL_
[ 6.279625] RIP: 0033:0x7fbb1026080e
[ 6.280043] Code: 48 8b 0d 7d 16 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4a 16 2c 00 f7 d8 64 89 01 48
[ 6.282049] RSP: 002b:00007fff43
[ 6.282863] RAX: ffffffffffffffda RBX: 00005584333bb890 RCX: 00007fbb1026080e
[ 6.283619] RDX: 00007fbb10dce86d RSI: 00000000000395e8 RDI: 00005584334005c0
[ 6.284407] RBP: 00007fbb10dce86d R08: 000055843331e01a R09: 0000000000000003
[ 6.285172] R10: 000055843331e010 R11: 0000000000000246 R12: 00005584334005c0
[ 6.285987] R13: 000055843337fbc0 R14: 0000000000020000 R15: 0000000000000000
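When triaging several console logs, it can help to pull out just the MSR error lines, since they are interleaved with systemd output as in the trace above. The following is a minimal sketch, not part of the original report: the function name `find_msr_errors` and the regular expression are assumptions based only on the two message shapes quoted in this bug (the `RDMSR from` and `WRMSR to` variants).

```python
import re

# Matches the kernel's "unchecked MSR access error" message in the two forms
# quoted in this report. The pattern is an assumption derived from those
# samples, not from a kernel specification.
MSR_ERROR_RE = re.compile(
    r"unchecked MSR access error: (?P<op>RDMSR from|WRMSR to) "
    r"(?P<msr>0x[0-9a-fA-F]+)"
    r"(?: \(tried to write (?P<value>0x[0-9a-fA-F]+)\))?"
    r" at rIP: (?P<rip>0x[0-9a-fA-F]+)"
)


def find_msr_errors(console_text):
    """Return one dict per MSR access error found in the console text."""
    hits = []
    for match in MSR_ERROR_RE.finditer(console_text):
        hits.append({
            "op": match.group("op").split()[0],  # "RDMSR" or "WRMSR"
            "msr": match.group("msr"),
            "value": match.group("value"),       # None for reads
            "rip": match.group("rip"),
        })
    return hits


if __name__ == "__main__":
    sample = (
        "[ 6.271140] unchecked MSR access error: RDMSR from 0xda0 "
        "at rIP: 0xffffffff8ac69e23 (native_\n"
        "unchecked MSR access error: WRMSR to 0xda0 (tried to write "
        "0x0000000000000000) at rIP: 0xffffffffac069f84 (native_\n"
    )
    for hit in find_msr_errors(sample):
        print(hit)
```

Because the pattern uses `finditer` on the whole text, it still matches when the kernel message shares a line with systemd status output.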
We noticed this error in the integration promotion lines of various branches. Marking it as promotion-blocker because it is blocking promotions.
Master:-
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-master/e04d671/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
~~~
2021-05-27 03:31:26 | 2021-05-27 03:31:26.426217 | fa163e35-a09e-960f-fa12-00000000003d | FATAL | Wait for connection to become available | 192.168.24.26 | error={"changed": false, "elapsed": 2401, "msg": "timed out waiting for ping module test success: Data could not be sent to remote host \"192.168.24.26\". Make sure this host can be reached over ssh: Warning: Permanently added '192.168.24.26' (ECDSA) to the list of known hosts.\r\nheat-admin@192.168.24.26: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).\r\n"}
~~~
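To compare failures across the Master, Wallaby and Ussuri jobs, the `FATAL` line can be split into its task name, host, and JSON error payload. This is a rough sketch under stated assumptions: the helper `parse_fatal_line` and its regular expression are hypothetical and model only the `FATAL | <task> | <host> | error={...}` shape shown in the excerpt above, not any documented log format.

```python
import json
import re

# Assumed shape, taken from the log excerpt in this report:
#   ... | FATAL | <task name> | <host> | error={<JSON object>}
FATAL_RE = re.compile(
    r"FATAL \| (?P<task>[^|]+?) \| (?P<host>[^|\s]+) \| error=(?P<payload>\{.*\})"
)


def parse_fatal_line(line):
    """Return (task, host, error_dict) for a FATAL line, or None if no match.

    error_dict is None when the payload is not valid JSON, which happens
    when the log line has been truncated or wrapped.
    """
    match = FATAL_RE.search(line)
    if match is None:
        return None
    try:
        error = json.loads(match.group("payload"))
    except json.JSONDecodeError:
        error = None
    return match.group("task"), match.group("host"), error


if __name__ == "__main__":
    line = (
        '2021-05-27 03:31:26.426217 | FATAL | Wait for connection to become '
        'available | 192.168.24.26 | error={"changed": false, "elapsed": 2401, '
        '"msg": "timed out waiting for ping module test success"}'
    )
    task, host, error = parse_fatal_line(line)
    print(host, task, error["elapsed"])
```

The `elapsed` field is the number the two excerpts in this report differ on (2401 and 2402), so it is a convenient value to extract when checking whether every failing job hit the same connection timeout.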
https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-master/e04d671/logs/baremetal_1-console.log
~~~
unchecked MSR access error: WRMSR to 0xda0 (tried to write 0x0000000000000000) at rIP: 0xffffffffac069f84 (native_write_msr+0x4/0x20)
~~~
Wallaby:-
https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-wallaby/00e43b7/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz
Ussuri:-
https://logserver.rdoproject.org/openstack-periodic-integration-stable3/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-ussuri/4f53691/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz