Some updates: it turned out to be garbage ifcfg-ens3 network-scripts on the overcloud nodes. To confirm this, for testing we cleaned them up before running NetworkConfig (https://review.opendev.org/#/c/711755/), the deployment moved forward, and we saw a green run for CentOS8. Test results can be seen at https://review.rdoproject.org/r/#/c/25332/.
Green runs:-
https://logserver.rdoproject.org/32/25332/53/check/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset001-master/820ce3c/
https://logserver.rdoproject.org/32/25332/53/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/d02fa66/
Fix for the ens3 issue:-
ens3 can be cleaned up in the base image used to prepare overcloud-full or in some disk image element.
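As a rough illustration of that cleanup step (a minimal sketch; the function name and paths are my assumptions, not the actual disk image element):

```python
# Minimal sketch (not the actual disk image element): remove a leftover
# ifcfg-ens3 from an unpacked image tree before NetworkConfig runs.
import os

def clean_stale_ifcfg(image_root, iface="ens3"):
    """Delete a stale ifcfg file for the given interface, if present."""
    path = os.path.join(image_root,
                        "etc/sysconfig/network-scripts", "ifcfg-" + iface)
    if os.path.exists(path):
        os.remove(path)
        return True  # a stale file was found and removed
    return False
```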
Though we have seen two green runs for C8 OVB, there were a couple of random issues noticed across multiple runs:-
1) Interface activation fails randomly on a compute node:-
Mar 07 17:13:27 overcloud-novacompute-0 network[10026]: Bringing up interface eth4: Error: Connection activation failed: IP configuration could not be reserved (no available address, timeout, etc.)
Mar 07 17:13:27 overcloud-novacompute-0 network[10026]: Hint: use 'journalctl -xe NM_CONNECTION=84d43311-57c8-8986-f205-9c78cd6ef5d2 + NM_DEVICE=eth4' to get more details.
Mar 07 17:13:27 overcloud-novacompute-0 network[10026]: [FAILED]
Logs:- https://logserver.rdoproject.org/32/25332/52/check/periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-master/7d36dbc/logs/overcloud-novacompute-0/var/log/journal.txt.gz
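For triaging this across runs, something like the following could grep the failed activations out of a journal dump (a hypothetical helper; the message format is taken from the excerpt above):

```python
# Hypothetical triage helper: find interfaces whose activation failed,
# based on the "Bringing up interface ..." message format quoted above.
import re

FAILED_RE = re.compile(
    r"Bringing up interface (\S+): Error: Connection activation failed")

def failed_interfaces(journal_text):
    """Return the interface names that failed to activate."""
    return FAILED_RE.findall(journal_text)
```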
2) Baremetal nodes go to deploy_failed state randomly:-
Logs:- https://logserver.rdoproject.org/32/25332/53/check/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset001-master-test1/2edc8af/logs/undercloud/var/log/extra/baremetal_list.txt.gz
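A quick way to spot the affected nodes in that baremetal_list output could look like this (a sketch; it assumes the usual pipe-separated "openstack baremetal node list" table):

```python
# Sketch: pull the rows stuck in deploy_failed out of a pipe-separated
# "openstack baremetal node list" table (table format is an assumption).
def deploy_failed_rows(table_text):
    """Return table rows whose provisioning state is deploy_failed."""
    return [line.strip() for line in table_text.splitlines()
            if "deploy_failed" in line]
```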
3) Tempest failures:-
Logs:- https://logserver.rdoproject.org/32/25332/53/check/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset001-master-test2/0c36de4/logs/undercloud/var/log/tempest/stestr_results.html.gz
4) Baremetal node provisioning failed due to the qemu scientific notation bug:- https://bugs.launchpad.net/oslo.utils/+bug/1864529
This is already fixed in oslo.utils, but the required patch is not yet available in current-tripleo. I didn't get why it failed randomly with the same overcloud-full image.
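The failure mode, as I read the bug report, is roughly that a size reported in scientific notation (e.g. qemu-img printing "1.6e+10") breaks a plain integer parse, and the oslo.utils fix tolerates that form. A minimal illustration (my own sketch, not the actual oslo.utils code):

```python
# Sketch of the failure mode: int() rejects scientific notation, so a
# tolerant parse has to go through float() first. Not the oslo.utils fix,
# just an illustration of the bug class.
def parse_size(text):
    """Parse a byte count that may be in scientific notation."""
    try:
        return int(text)
    except ValueError:
        return int(float(text))  # accepts e.g. "1.6e+10"

print(parse_size("1.6e+10"))  # 16000000000
```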
5) The overcloud deployment just gets stuck:-
For this the logs didn't get collected, but I noticed the same in another undercloud job, where the VM console log was collected and it turned out to be a kernel panic: http://paste.openstack.org/show/790477/, so the same could be true here. In the kernel panic logs we saw "atop Tainted"; I don't know whether the atop binary from el7 on CentOS8 can cause that, but that is being fixed at https://review.opendev.org/711894.
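Once console logs do get collected for these stuck runs, a simple scan for panic markers could confirm or rule this out (a hypothetical helper; the marker strings come from the paste above):

```python
# Hypothetical helper: scan console-log text for kernel panic markers
# ("Kernel panic", "Tainted", as seen in the pasted trace).
PANIC_MARKERS = ("Kernel panic", "Tainted")

def find_panic_lines(console_text):
    """Return console log lines that look like part of a kernel panic."""
    return [line for line in console_text.splitlines()
            if any(marker in line for marker in PANIC_MARKERS)]
```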
I think all these random issues need to be diagnosed and fixed separately, as they need different areas of expertise.