master: Operation not permitted errors in scenario007

Bug #1922283 reported by wes hayutin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
tripleo
Fix Released
Critical
Unassigned

Bug Description

two similar stack traces in the same job..

 2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif Traceback (most recent call last):
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/os_vif/__init__.py", line 77, in plug
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif plugin.plug(vif, instance_info)
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/vif_plug_ovs/ovs.py", line 299, in plug
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif self._plug_bridge(vif, instance_info)
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/vif_plug_ovs/ovs.py", line 218, in _plug_bridge
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif linux_net.create_veth_pair(v1_name, v2_name, mtu)
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 247, in _wrap
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif return self.channel.remote_call(name, args, kwargs)
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 224, in remote_call
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif raise exc_type(*result[2])
2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')

2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py", line 155, in launch
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest return self._domain.createWithFlags(flags)
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 190, in doit
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest result = proxy_call(self._autowrap, f, *args, **kwargs)
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 148, in proxy_call
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest rv = execute(f, *args, **kwargs)
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 129, in execute
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest six.reraise(c, e, tb)
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib/python3.6/site-packages/six.py", line 703, in reraise
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest raise value
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib/python3.6/site-packages/eventlet/tpool.py", line 83, in tworker
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest rv = meth(*args, **kwargs)
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest File "/usr/lib64/python3.6/site-packages/libvirt.py", line 1270, in createWithFlags
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest libvirt.libvirtError: Unable to create tap device tap1feb1b7a-f7: Operation not permitted
2021-04-01 17:57:06.283 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR nova.virt.libvirt.guest

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/df5f2e4/logs/undercloud/var/log/extra/errors.txt.txt.gz

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/b4f955e/logs/undercloud/var/log/extra/errors.txt.txt.gz

wes hayutin (weshayutin)
summary: - master: libvirt.libvirtError: Unable to create tap device
- tap1feb1b7a-f7: Operation not permitted
+ master: Operation not permitted errors in scenario007
Changed in tripleo:
importance: Undecided → Critical
Revision history for this message
chandan kumar (chkumar246) wrote :

https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario007-standalone-master/df5f2e4/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz

{0} tempest.scenario.test_network_basic_ops.TestNetworkBasicOps.test_network_basic_ops [39.196997s] ... FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/tempest/common/utils/__init__.py", line 70, in wrapper
        return f(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 432, in test_network_basic_ops
        self._setup_network_and_servers()
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 120, in _setup_network_and_servers
        server = self._create_server(self.network, port_id)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/test_network_basic_ops.py", line 172, in _create_server
        security_groups=security_groups)
      File "/usr/lib/python3.6/site-packages/tempest/scenario/manager.py", line 317, in create_server
        image_id=image_id, **kwargs)
      File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 266, in create_test_server
        server['id'])
      File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 227, in __exit__
        self.force_reraise()
      File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
        raise self.value
      File "/usr/lib/python3.6/site-packages/tempest/common/compute.py", line 237, in create_test_server
        clients.servers_client, server['id'], wait_until)
      File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 76, in wait_for_server_status
        server_id=server_id)
    tempest.exceptions.BuildErrorException: Server 1b6eabe9-d8ef-4ef0-992d-0294e8822f93 failed to build and is in ERROR status
    Details: {'code': 500, 'created': '2021-04-01T10:07:49Z', 'message': 'Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 1b6eabe9-d8ef-4ef0-992d-0294e8822f93.'}

Revision history for this message
chandan kumar (chkumar246) wrote :

We cannot skip this test https://review.rdoproject.org/r/gitweb?p=rdo-jobs.git;a=blob;f=zuul.d/standalone-jobs.yaml#l324 as only single test run there, after skipping, it will fail the job, I think the issues needs to be fixed.

Revision history for this message
yatin (yatinkarel) wrote :

Just to add it's not specific to master, it's being seen across multiple releases, some context from IRC in past:-
Mar 25 17:05:08 <ykarel> chkumar|ruck, zbr|rover is random failure in scenario007 job already known?
Mar 25 17:05:18 <ykarel> tempest fails
Mar 25 17:05:40 <ykarel> https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario007-standalone-master
Mar 25 17:05:58 <ykarel> seeing similiar failure in other releases too Ussuri+
Mar 25 17:06:21 <ykarel> not seen in train
Mar 25 17:06:51 <ykarel> https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario007-standalone-train
Mar 25 17:20:52 <chkumar|ruck> ykarel: nope
Mar 25 17:20:58 <chkumar|ruck> ykarel: let me keep an eye there
Mar 25 17:21:19 <ykarel> chkumar|ruck, ack then have a bug seen multiple failure across releases
Mar 25 17:23:23 <chkumar|ruck> ykarel: for ussuri line I saw random failure, on rerun it vanished
Mar 25 17:24:02 <ykarel> chkumar|ruck, on master above last 4 failures are similar
Mar 25 17:24:09 <ykarel> seen similar cases in ussuri and victoria

Revision history for this message
chandan kumar (chkumar246) wrote :

By looking at master line https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario007-standalone-master it seems to be passing but we need to investigate why it comes always.

Revision history for this message
Stephen Finucane (stephenfinucane) wrote :
Download full text (5.8 KiB)

I took a look at this this morning. Looking at the CI logs [1], we see:

    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif [req-424c6e06-e7b0-4f3e-9154-999ae266f161 cf6941ba4a9f4e0faafbec5b94d5301d 9e6fe54a7822461a9896649072c253c9 - default default] Failed to plug vif VIFBridge(active=False,address=fa:16:3e:d5:18:32,bridge_name='qbr58ee300d-6b',has_traffic_filtering=True,id=58ee300d-6bc2-42f7-b22b-3bd120109b16,network=Network(5bd33c19-f1b4-4a86-8246-85ac446c3bd8),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tap58ee300d-6b'): pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif Traceback (most recent call last):
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/os_vif/__init__.py", line 77, in plug
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif plugin.plug(vif, instance_info)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/vif_plug_ovs/ovs.py", line 299, in plug
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif self._plug_bridge(vif, instance_info)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/vif_plug_ovs/ovs.py", line 218, in _plug_bridge
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif linux_net.create_veth_pair(v1_name, v2_name, mtu)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 247, in _wrap
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif return self.channel.remote_call(name, args, kwargs)
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif File "/usr/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 224, in remote_call
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif raise exc_type(*result[2])
    2021-04-01 10:07:47.034 ERROR /var/log/containers/nova/nova-compute.log: 7 ERROR os_vif pyroute2.netlink.exceptions.NetlinkError: (1, 'Operation not permitted')

Looking at 'secure.txt' [2], we see that the request for elevated permissions was accepted and there doesn't appear to be any issues, so it's probably not that:

    Apr 1 10:07:45 standalone sudo[168779]: nova : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/nova-rootwrap /etc/nova/rootwrap.conf privsep-helper --config-file /usr/share/nova/nova-dist.conf --config-file /etc/nova/nova.conf --privsep_context vif_plug_ovs.privsep.vif_plug --privsep_sock_path /tmp/tmp1ckrh_yo/privsep.sock
    Apr 1 10:07:45 standalone sudo[168779]: pam_unix(sudo:session): session opened for user root by (uid=0)
    Apr 1 10:07:46 standalone sudo[168779...

Read more...

Changed in tripleo:
milestone: wallaby-rc1 → xena-1
Revision history for this message
wes hayutin (weshayutin) wrote :
Changed in tripleo:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.