Comment 6 for bug 1922283

Revision history for this message
Stephen Finucane (stephenfinucane) wrote :

I took a look at this this morning. Looking at the CI logs [1], we see:

    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif [req-424c6e06-e7b0-4f3e-9154-999ae266f161 cf6941ba4a9f4e0faafbec5b94d5301d 9e6fe54a7822461a9896649072c253c9 - default default] Failed to plug vif VIFBridge(active=False,address=fa:16:3e:d5:18:32,bridge_name='qbr58ee300d-6b',has_traffic_filtering=True,id=58ee300d-6bc2-42f7-b22b-3bd120109b16,network=Network(5bd33c19-f1b4-4a86-8246-85ac446c3bd8),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap58ee300d-6b'): pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif Traceback (most recent call last):
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/os_vif/__init__.py", line 77, in plug
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif plugin.plug(vif, instance_info)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/vif_plug_ovs/ovs.py", line 299, in plug
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif self._plug_bridge(vif, instance_info)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/vif_plug_ovs/ovs.py", line 218, in _plug_bridge
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif linux_net.create_veth_pair(v1_name, v2_name, mtu)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 247, in _wrap
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif return self.channel.remote_call(name, args, kwargs)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 224, in remote_call
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif raise exc_type(*result[2])
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')

Looking at 'secure.txt' [2], we see that the request for elevated permissions was accepted and there doesn't appear to be any issues, so it's probably not that:

    Apr 1 10:07:45 standalone sudo[168779]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/nova-rootwrap /etc/nova/rootwrap.conf privsep-helper --config-file /usr/share/nova/nova-dist.conf --config-file /etc/nova/nova.conf --privsep_context vif_plug_ovs.privsep.vif_plug --privsep_sock_path /tmp/tmp1ckrh_yo/privsep.sock
    Apr 1 10:07:45 standalone sudo[168779]: pam_unix(sudo:session): session opened for user root by (uid=0)
    Apr 1 10:07:46 standalone sudo[168779]: pam_unix(sudo:session): session closed for user root

Looking at 'journal.txt' [3], we don't see any logs for the offending tap interface, 'tap58ee300d-6b', but we *do* see some entries for the offending bridge interface, 'qbr58ee300d-6b':

    Apr 01 10:07:46 standalone.localdomain NetworkManager[1055]: <info> [1617271666.9819] manager: (qbr58ee300d-6b): new Bridge device (/org/freedesktop/NetworkManager/Devices/17)
    Apr 01 10:07:46 standalone.localdomain systemd-udevd[168834]: Using default interface naming scheme 'rhel-8.0'.
    Apr 01 10:07:46 standalone.localdomain systemd-udevd[168834]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
    Apr 01 10:07:46 standalone.localdomain systemd-udevd[168834]: Could not generate persistent MAC address for qbr58ee300d-6b: No such file or directory

A quick search brings up what appears to be a related systemd issue [4]. The fix for this issue was included in systemd v242 [5]. This machine has systemd 239-44.el8 [6]:

    systemd.x86_64 239-44.el8 @baseos
    systemd-container.x86_64 239-44.el8 @quickstart-centos-base
    systemd-libs.x86_64 239-44.el8 @System
    systemd-pam.x86_64 239-44.el8 @baseos
    systemd-udev.x86_64 239-44.el8 @baseos

So my money is on this being a systemd bug that we need fixes for CentOS 8/Stream.

[1] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/df5f2e4/logs/undercloud/var/log/extra/errors.txt.txt.gz
[2] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/df5f2e4/logs/undercloud/var/log/secure.txt.gz
[3] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/df5f2e4/logs/undercloud/var/log/extra/journal.txt.gz
[4] https://github.com/systemd/systemd/issues/3374#issuecomment-283665247
[5] https://github.com/systemd/systemd/commit/6d36464065601f79a352367cf099be8907d8f9aa
[6] https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/df5f2e4/logs/undercloud/var/log/extra/package-list-installed.txt.gz