RHOSP13 - R5.0.2-315 - DPDK - VM launch is failing in DPDK setup

Bug #1800345 reported by alok kumar
Affects: Juniper Openstack (status tracked in Trunk)

  Series | Status        | Importance | Assigned to
  R5.0   | Fix Released  | Critical   | alexey-mr
  Trunk  | Fix Committed | Critical   | alexey-mr

Bug Description

VM launch in a DPDK RHOSP13 setup fails with the traceback and permission error below in the nova log:

2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [req-98c9a8f8-e1c4-4cc6-b56a-2892ae975c95 7e651a59737d427fa1b8afcda4c9f844 6766140617b7491c9d70dce5b7605413 - default default] [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] Instance failed to spawn: libvirtError: internal error: process exited while connecting to monitor: 2018-10-28T11:37:52.902261Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/vrouter/uvh_vif_tapd8da8e1b-03,server: Failed to bind socket to /var/run/vrouter/uvh_vif_tapd8da8e1b-03: Permission denied

2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] Traceback (most recent call last):
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2236, in _build_resources
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] yield resources
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2016, in _build_and_run_instance
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] block_device_info=block_device_info)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3100, in spawn
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] destroy_disks_on_failure=True)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5590, in _create_domain_and_network
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] destroy_disks_on_failure)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] self.force_reraise()
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] six.reraise(self.type_, self.value, self.tb)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5559, in _create_domain_and_network
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] post_xml_callback=post_xml_callback)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5494, in _create_domain
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] guest.launch(pause=pause)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] self._encoded_xml, errors='ignore')
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] self.force_reraise()
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] six.reraise(self.type_, self.value, self.tb)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] return self._domain.createWithFlags(flags)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] result = proxy_call(self._autowrap, f, *args, **kwargs)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] rv = execute(f, *args, **kwargs)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] six.reraise(c, e, tb)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] rv = meth(*args, **kwargs)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1099, in createWithFlags
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2018-10-28 11:37:53.425 1 ERROR nova.compute.manager [instance: a4343082-ac51-4f15-b017-f692fcdd5d31] libvirtError: internal error: process exited while connecting to monitor: 2018-10-28T11:37:52.902261Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/vrouter/uvh_vif_tapd8da8e1b-03,server: Failed to bind socket to /var/run/vrouter/uvh_vif_tapd8da8e1b-03: Permission denied
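
For anyone triaging a similar failure, the "Permission denied" on the vhost-user socket path is worth cross-checking against SELinux before digging into file ownership; a minimal diagnostic sketch, assuming the audit tooling from the audit package is available (paths taken from the log above):

# Check the SELinux label on the vhost-user socket directory
ls -ldZ /var/run/vrouter

# Look for recent AVC denials recorded against qemu-kvm
ausearch -m avc -c qemu-kvm --start recent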

setup info:
undercloud hypervisor: 10.204.217.133
undercloud: 192.168.122.68

(undercloud) [stack@queensa ~]$ openstack server list
+--------------------------------------+--------------------------------+--------+------------------------+----------------+---------------------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+--------------------------------+--------+------------------------+----------------+---------------------+
| 4bc39e01-5c16-414b-85e7-69dd56a4fc60 | overcloud-contrailcontroller-2 | ACTIVE | ctlplane=192.168.24.20 | overcloud-full | contrail-controller |
| 15c99320-c511-4261-99a4-ffd644841546 | overcloud-contraildpdk-0 | ACTIVE | ctlplane=192.168.24.16 | overcloud-full | compute-dpdk |
| 1ec0b1f6-7229-4b9b-8f96-7e22be50b0af | overcloud-contraildpdk-1 | ACTIVE | ctlplane=192.168.24.19 | overcloud-full | compute-dpdk |
| 6ae004ab-e2c2-4995-a2d9-44e7d33f9aec | overcloud-contrailcontroller-1 | ACTIVE | ctlplane=192.168.24.15 | overcloud-full | contrail-controller |
| 93c019ba-ef94-4db8-9bad-098bd4ab912f | overcloud-controller-0 | ACTIVE | ctlplane=192.168.24.13 | overcloud-full | control |
| 57e51180-8b1c-4091-9a42-bedf14a2a57d | overcloud-controller-1 | ACTIVE | ctlplane=192.168.24.21 | overcloud-full | control |
| c22a45b0-8556-473f-ae28-cd26e72f2f19 | overcloud-contraildpdk-2 | ACTIVE | ctlplane=192.168.24.10 | overcloud-full | compute-dpdk |
| c57932e9-2a90-409f-82df-f59df00704a9 | overcloud-controller-2 | ACTIVE | ctlplane=192.168.24.8 | overcloud-full | control |
| 9818d44a-5d03-40d5-84fc-83a367616991 | overcloud-contrailcontroller-0 | ACTIVE | ctlplane=192.168.24.14 | overcloud-full | contrail-controller |
+--------------------------------------+--------------------------------+--------+------------------------+----------------+---------------------+

Revision history for this message
Jeya ganesh babu J (jjeya) wrote :

This seems to be an SELinux policy issue related to the qemu-kvm binary: creation of the socket file is denied for qemu-kvm.
audit/audit.log:type=AVC msg=audit(1540698565.280:4203): avc: denied { create } for pid=272685 comm="qemu-kvm" name="uvh_vif_tap8683f851-40" scontext=system_u:system_r:svirt_t:s0:c724,c837 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file
audit/audit.log:type=AVC msg=audit(1540698592.575:4277): avc: denied { create } for pid=273291 comm="qemu-kvm" name="uvh_vif_tapc7609d88-0e" scontext=system_u:system_r:svirt_t:s0:c139,c849 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file
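
If it helps to confirm that these denials fully account for the failure, audit2allow (from policycoreutils) can translate the AVC records into the policy rules they imply; a hedged sketch, assuming the default audit log location:

# Print the policy rules the recorded denials would require (read-only)
grep qemu-kvm /var/log/audit/audit.log | audit2allow

# Optionally generate a loadable module skeleton from the same denials
# (the module name contrail_dpdk_avc is arbitrary)
grep qemu-kvm /var/log/audit/audit.log | audit2allow -M contrail_dpdk_avc

The printed rules should resemble the allow rules in the workaround module posted later in this thread.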

Revision history for this message
Vinod Nair (vinodnair) wrote :

With SELinux set to permissive, VMs are launched successfully.

Revision history for this message
alok kumar (kalok) wrote :

We were not required to change the SELinux settings on earlier builds.

The current default setting is 'Enforcing'; after changing it to permissive, VM launch works fine.
[root@overcloud-contraildpdk-1 heat-admin]# getenforce
Enforcing

Jeya, could this be a provisioning issue?
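
For reference, the enforcing-to-permissive switch described above can be made at runtime for a quick test; a small sketch (the change does not survive a reboot unless /etc/selinux/config is also edited):

setenforce 0   # switch to permissive at runtime
getenforce     # should now report: Permissive
# retry the VM launch, then restore enforcing:
setenforce 1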

alok kumar (kalok)
tags: added: releasenote
removed: sanityblocker
alok kumar (kalok)
tags: added: sanityblocker
Revision history for this message
Vinod Nair (vinodnair) wrote :

SELinux in enforcing mode was working until build 311, I think, so this is a regression.

Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : [Review update] stable/queens

Review in progress for https://review.opencontrail.org/47370
Submitter: alexey-mr (<email address hidden>)

Revision history for this message
alok kumar (kalok) wrote :

Alexey suggested the steps below as a workaround; with these in place, VM launch works fine even when SELinux is set to 'Enforcing'.

Replace the contrail_dpdk.te file in /tmp with the following content:
[root@overcloud-contraildpdk-0 tmp]# cat contrail_dpdk.te
module contrail_dpdk 1.0;

require {
        type container_var_run_t;
        type svirt_t;
        type var_run_t;
        class sock_file { create unlink };
        class dir { add_name remove_name write };
}

#============= svirt_t ==============
allow svirt_t container_var_run_t:dir { add_name remove_name write };
allow svirt_t container_var_run_t:sock_file { create unlink };
allow svirt_t var_run_t:sock_file { create unlink };

Then execute the following commands:

/bin/checkmodule -M -m -o /tmp/contrail_dpdk.mod /tmp/contrail_dpdk.te
/bin/semodule_package -o /tmp/contrail_dpdk.pp -m /tmp/contrail_dpdk.mod
/sbin/semodule -i /tmp/contrail_dpdk.pp
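
After loading the module, a quick way to confirm it took effect without relaxing enforcement (module name from the steps above):

/sbin/semodule -l | grep contrail_dpdk   # the module should now be listed
getenforce                               # should still report: Enforcing

Then retry the VM launch.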

alok kumar (kalok)
tags: added: provisioning
tags: removed: releasenote
Revision history for this message
OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/47370
Committed: http://github.com/Juniper/contrail-tripleo-heat-templates/commit/8f0c42b05bd81b140bffcef5fde8da4b3c6fa038
Submitter: Zuul v3 CI (<email address hidden>)
Branch: stable/queens

commit 8f0c42b05bd81b140bffcef5fde8da4b3c6fa038
Author: alexey-mr <email address hidden>
Date: Tue Oct 30 21:39:53 2018 +0300

Added selinux rule for /var/run:/var/run

Change-Id: I9afa3d6e1a90749024fe45aa41794925934e0a0d
Closes-Bug: #1800345

Revision history for this message
alok kumar (kalok) wrote :

Verified with the fix on newer 5.0.2 builds; VM launch now works fine.
Verified on builds 5.0.2-330 and 5.0.2-349.
