Don't apply multi-queue to SRIOV ports

Bug #1641814 reported by Zhenyu Zheng
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Medium      Zhenyu Zheng
Newton                    Fix Committed  Medium      Matt Riedemann

Bug Description

The multi-queue feature was added by this blueprint:
https://blueprints.launchpad.net/nova/+spec/libvirt-virtio-net-multiqueue
It is controlled through image metadata: hw_vif_multiqueue_enabled=true|false
When it is set to true, the related XML config is generated here:
http://git.openstack.org/cgit/openstack/nova/tree/nova/virt/libvirt/vif.py#n130
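
As a rough illustration of the mechanism (a minimal sketch, not nova's actual code): the function name multiqueue_settings and its arguments are hypothetical. The "vhost" driver name and a queue count equal to the vCPU count are taken from the log in this report (vcpus=4, queues="4").

```python
def multiqueue_settings(image_properties, vcpus):
    """Return (driver_name, queues) for a port, or (None, None).

    Hypothetical helper: the real logic lives in nova/virt/libvirt/vif.py
    and may differ in detail.
    """
    raw = str(image_properties.get("hw_vif_multiqueue_enabled", "false"))
    if raw.strip().lower() != "true":
        # Property missing or false: no <driver> element is needed.
        return None, None
    # Assumption drawn from the log: one queue per vCPU, vhost backend.
    return "vhost", vcpus


print(multiqueue_settings({"hw_vif_multiqueue_enabled": "true"}, 4))
print(multiqueue_settings({}, 4))
```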

When a user launches an instance with one SRIOV port and several normal ports
from an image that has the multi-queue feature enabled, the spawn fails with an
ERROR because the multi-queue driver name is also written into the SRIOV
interface:

2016-11-15T06:15:41.621+08:00 localhost nova-compute DEBUG [pid:17224] [MainThread] [tid:115210352] [vif.py:745 plug] [req-52fba1f0-008e-43dc-bc02-16ea378a41bd] vif_type=hw_veb instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone='az1.dc1',cell_name=None,cleaned=False,config_drive='',created_at=2016-11-14T22:15:35Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description='fzx_sriov',display_name='fzx_sriov',ec2_ids=EC2Ids,ephemeral_gb=0,ephemeral_key_uuid=None,fault=<?>,flavor=Flavor(11),host='68D366CE-0AF4-8E11-8567-000000821800',hostname='fzx-sriov',id=137,image_ref='5d4b082a-3b47-4505-93b7-f242ca59e940',info_cache=InstanceInfoCache,instance_type_id=11,kernel_id='',key_data=None,key_name=None,launch_index=0,launched_at=None,launched_on='68D366CE-0AF4-8E11-8567-000000821800',locked=False,locked_by=None,memory_mb=1024,metadata={},migration_context=<?>,new_flavor=None,node='68D366CE-0AF4-8E11-8567-000000821800',numa_topology=None,old_flavor=None,os_type=None,pci_devices=PciDeviceList,pci_requests=InstancePCIRequests,power_state=0,progress=0,project_id='0a00fefb88cd407a9b768678fad26f5c',ramdisk_id='',reservation_id='r-0050r6iu',root_device_name='/dev/vda',root_gb=1,security_groups=SecurityGroupList,services=<?>,shutdown_terminate=False,system_metadata={booted_volume='False',image_base_image_ref='5d4b082a-3b47-4505-93b7-f242ca59e940',image_container_format='bare',image_disk_format='qcow2',image_hw_vif_multiqueue_enabled='true',image_min_disk='1',image_min_ram='0',network_allocated='True'},tags=<?>,task_state='spawning',terminated_at=None,updated_at=2016-11-14T22:15:40Z,user_data=None,user_id='20ad13b54d7a4950a30ccb3697eea438',uuid=d8a1c18a-6e20-41da-9115-c3f3e6c6b836,vcpu_model=VirtCPUModel,vcpus=4,vm_mode=None,vm_state='building') vif=VIF({'profile': {u'pci_slot': u'0000:02:10.6', u'physical_network': u'sriov_phynet', u'pci_vendor_info': 
u'8086:10ed'}, 'ovs_interfaceid': None, 'preserve_on_delete': True, 'network': Network({'bridge': None, 'subnets': [Subnet({'ips': [FixedIP({'meta': {}, 'version': 4, 'type': 'fixed', 'floating_ips': [], 'address': u'10.38.0.3'})], 'version': 4, 'meta': {'dhcp_server': u'10.38.0.2'}, 'dns': [], 'routes': [], 'cidr': u'10.38.0.0/16', 'gateway': IP({'meta': {}, 'version': 4, 'type': 'gateway', 'address': u'10.38.0.1'})})], 'meta': {'injected': False, 'tenant_id': u'0a00fefb88cd407a9b768678fad26f5c', 'physical_network': u'sriov_phynet', 'mtu': 1500}, 'id': u'28c3c2a5-12b5-45ae-910e-64426f6228a1', 'label': u'sriov-net'}), 'devname': u'tap63a1f051-00', 'vnic_type': u'direct', 'qbh_params': None, 'meta': {'pci_slotnum': 3}, 'details': {u'port_filter': False, u'vlan': u'63'}, 'address': u'fa:16:3e:91:f3:4f', 'active': False, 'type': u'hw_veb', 'id': u'63a1f051-0025-4ff4-a050-eef37a67f245', 'qbg_params': None}) plug /usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py:745
2016-11-15T06:15:41.721+08:00 localhost nova-compute ERROR [pid:17224] [MainThread] [tid:115210352] [guest.py:127 create] [req-52fba1f0-008e-43dc-bc02-16ea378a41bd] Error defining a domain with XML: <domain type="kvm">
  <uuid>d8a1c18a-6e20-41da-9115-c3f3e6c6b836</uuid>
  <name>instance-00000089</name>
  <memory>1048576</memory>
  <maxMemory slots="64" unit="KiB">4398046511104</maxMemory>
  <vcpu current="4" cpuset="0-3,6-11,14-15">255</vcpu>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="13.0.1-0.20161112210228.6f14d8b"/>
      <nova:name>fzx_sriov</nova:name>
      <nova:creationTime>2016-11-14 22:15:41</nova:creationTime>
      <nova:flavor name="fzx.1024.1.4">
        <nova:memory>1024</nova:memory>
        <nova:disk>1</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>4</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="20ad13b54d7a4950a30ccb3697eea438">nova</nova:user>
        <nova:project uuid="0a00fefb88cd407a9b768678fad26f5c">service</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="5d4b082a-3b47-4505-93b7-f242ca59e940"/>
      <nova:extend_properties>
        <nova:realtime>False</nova:realtime>
      </nova:extend_properties>
    </nova:instance>
  </metadata>
  <sysinfo type="smbios">
    <system>
      <entry name="manufacturer">OpenStack Foundation</entry>
      <entry name="product">OpenStack Nova</entry>
      <entry name="version">13.0.1-0.20161112210228.6f14d8b</entry>
      <entry name="serial">02b471f9-21bb-4e31-a835-72846f082ac0</entry>
      <entry name="uuid">d8a1c18a-6e20-41da-9115-c3f3e6c6b836</entry>
      <entry name="family">Virtual Machine</entry>
    </system>
  </sysinfo>
  <os>
    <type>hvm</type>
    <smbios mode="sysinfo"/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cputune>
    <shares>4096</shares>
  </cputune>
  <clock offset="variable" basis="utc"/>
  <cpu mode="host-passthrough" match="exact">
    <topology sockets="255" cores="1" threads="1"/>
    <numa>
      <cell id="0" cpus="0-254" memory="1048576"/>
    </numa>
  </cpu>
  <devices>
    <disk type="file" device="disk">
      <driver name="qemu" type="raw" cache="none" io="threads"/>
      <source file="/opt/HUAWEI/image/instances/d8a1c18a-6e20-41da-9115-c3f3e6c6b836/disk"/>
      <target bus="virtio" dev="vda"/>
      <boot order="1"/>
    </disk>
    <interface type="hostdev" managed="yes">
      <mac address="fa:16:3e:91:f3:4f"/>
      <driver name="vhost" queues="4"/>
      <source>
        <address type="pci" domain="0x0000" bus="0x02" slot="0x10" function="0x6"/>
      </source>
      <vlan>
        <tag id="63"/>
      </vlan>
      <address bus="0x00" domain="0x0000" function="0x0" slot="0x3" type="pci"/>
    </interface>
    <serial type="file">
      <source path="/var/log/libvirt/qemu/instance-00000089.seriallog"/>
    </serial>
    <input type="tablet" bus="usb"/>
    <video>
      <model type="cirrus"/>
    </video>
    <memballoon model="virtio">
      <stats period="10"/>
    </memballoon>
    <graphics type="vnc" autoport="yes" keymap="en-us" listen="172.28.0.3" passwd="t2I870T4"/>
    <channel type="unix">
      <source mode="bind" path="/var/run/libvirt/qemu/instance-00000089.extend"/>
      <target name="org.qemu.guest_agent.1" type="virtio"/>
      <address bus="0" controller="0" port="1" type="virtio-serial"/>
    </channel>
    <channel type="unix">
      <source mode="bind" path="/var/run/libvirt/qemu/instance-00000089.agent"/>
      <target name="org.qemu.guest_agent.0" type="virtio"/>
      <address bus="0" controller="0" port="2" type="virtio-serial"/>
    </channel>
    <channel type="unix">
      <source mode="bind" path="/var/run/libvirt/qemu/instance-00000089.hostd"/>
      <target name="org.qemu.guest_agent.2" type="virtio"/>
      <address bus="0" controller="0" port="3" type="virtio-serial"/>
    </channel>
    <channel type="unix">
      <source mode="bind" path="/var/run/libvirt/qemu/instance-00000089.upgraded"/>
      <target name="org.qemu.guest_agent.3" type="virtio"/>
      <address bus="0" controller="0" port="4" type="virtio-serial"/>
    </channel>
  </devices>
  <qemu:commandline xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">
    <qemu:arg value="-device"/>
    <qemu:arg value="pv_channel,bus=pci.0,addr=0x1f"/>
    <qemu:arg value="-chardev"/>
    <qemu:arg value="file,id=seabios,path=/var/log/libvirt/qemu/instance-00000089.seabios,mux=off"/>
    <qemu:arg value="-device"/>
    <qemu:arg value="isa-debugcon,iobase=0x402,chardev=seabios"/>
    <qemu:arg value="-consolelog"/>
    <qemu:arg value="path=/var/log/libvirt/qemu/instance-00000089.consolelog"/>
  </qemu:commandline>
</domain>
2016-11-15T06:15:41.721+08:00 localhost nova-compute ERROR [pid:17224] [MainThread] [tid:115210352] [manager.py:542 _build_resources] [req-52fba1f0-008e-43dc-bc02-16ea378a41bd] [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] Instance failed to spawn
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] Traceback (most recent call last):
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/huawei/compute/manager.py", line 536, in _build_resources
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] yield resources
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2065, in _build_and_run_instance
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] block_device_info=block_device_info)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/huawei/virt/libvirt/driver.py", line 2416, in spawn
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] block_device_info)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2770, in spawn
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] block_device_info=block_device_info)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4910, in _create_domain_and_network
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] xml, pause=pause, power_on=power_on)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4836, in _create_domain
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] guest = libvirt_guest.Guest.create(xml, self._host)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 127, in create
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] encodeutils.safe_decode(xml))
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] self.force_reraise()
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] six.reraise(self.type_, self.value, self.tb)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 123, in create
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] domain = host.write_instance_config(xml)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 981, in write_instance_config
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] return self.get_connection().defineXML(xml)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] rv = execute(f, *args, **kwargs)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] six.reraise(c, e, tb)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] rv = meth(*args, **kwargs)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3751, in defineXML
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836] libvirtError: unsupported configuration: Unknown PCI device <driver name='vhost'/> has been specified
2016-11-15 06:15:41.624 17224 ERROR nova.huawei.compute.manager [instance: d8a1c18a-6e20-41da-9115-c3f3e6c6b836]
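
The offending element can be spotted mechanically in the domain XML logged above: the hostdev interface carries a <driver name="vhost" queues="4"/> child, which libvirt rejects. A minimal sketch using only the Python standard library; the helper name hostdev_interfaces_with_driver is hypothetical, and SAMPLE is a trimmed copy of the interface from the log.

```python
import xml.etree.ElementTree as ET


def hostdev_interfaces_with_driver(domain_xml):
    """List (mac, driver attributes) for hostdev interfaces that have a <driver>."""
    root = ET.fromstring(domain_xml)
    found = []
    for iface in root.findall("./devices/interface[@type='hostdev']"):
        driver = iface.find("driver")
        if driver is None:
            continue
        mac = iface.find("mac")
        address = mac.get("address") if mac is not None else None
        found.append((address, dict(driver.attrib)))
    return found


# Trimmed from the domain XML in this report.
SAMPLE = """<domain type="kvm">
  <devices>
    <interface type="hostdev" managed="yes">
      <mac address="fa:16:3e:91:f3:4f"/>
      <driver name="vhost" queues="4"/>
    </interface>
  </devices>
</domain>"""

for address, attrs in hostdev_interfaces_with_driver(SAMPLE):
    print(address, attrs)
```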

Changed in nova:
assignee: nobody → Zhenyu Zheng (zhengzhenyu)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/397545

Changed in nova:
status: New → In Progress
Matt Riedemann (mriedem) wrote :

nova.huawei.compute.manager huh :)

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/397545
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fb3b2d9eaf8317cec62e52a99f17946610b8cdca
Submitter: Jenkins
Branch: master

commit fb3b2d9eaf8317cec62e52a99f17946610b8cdca
Author: Kevin_Zheng <email address hidden>
Date: Tue Nov 15 14:13:58 2016 +0800

    Don't apply multi-queue to SRIOV ports

    The multi-queue feature was added:
    https://blueprints.launchpad.net/nova/+spec/libvirt-virtio-net-multiqueue
    and it is controlled using image metadata:
    hw_vif_mutliqueue_enabled=true|false

    when users want to launch an instance with
    a SRIOV port and several normal ports
    with multi-queue feature, ERROR can take
    place due to wrong driver name for SRIOV
    interface. Details could be found in bug
    description.

    Change-Id: I52c51ff17f43133154f6ea8aa4107f1673f82dde
    Closes-bug: #1641814
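
A sketch of the kind of guard the commit title describes, assuming the decision can be made from the port's vnic_type. This is not the merged patch: SRIOV_VNIC_TYPES and driver_for_vif are hypothetical names, and only the "direct" vnic_type is taken from the log in the bug description.

```python
# Hypothetical set: only "direct" appears in this report's log; the real
# list of SR-IOV vnic types in nova may be longer.
SRIOV_VNIC_TYPES = frozenset(["direct"])


def driver_for_vif(vnic_type, multiqueue_enabled, vcpus):
    """Return <driver> attributes for one port, or None to omit the element."""
    if vnic_type in SRIOV_VNIC_TYPES:
        # SR-IOV ports are passed through as PCI devices, so the
        # multi-queue driver settings must not be applied to them.
        return None
    if not multiqueue_enabled:
        return None
    return {"name": "vhost", "queues": str(vcpus)}


# One SR-IOV port plus two normal ports, as in the failing scenario.
for vnic_type in ["direct", "normal", "normal"]:
    print(vnic_type, driver_for_vif(vnic_type, True, 4))
```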

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/416299

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/newton)

Reviewed: https://review.openstack.org/416299
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a6a67d821446afc71ac34d19e2f17d7d0e03d178
Submitter: Jenkins
Branch: stable/newton

commit a6a67d821446afc71ac34d19e2f17d7d0e03d178
Author: Kevin_Zheng <email address hidden>
Date: Tue Nov 15 14:13:58 2016 +0800

    Don't apply multi-queue to SRIOV ports

    The multi-queue feature was added:
    https://blueprints.launchpad.net/nova/+spec/libvirt-virtio-net-multiqueue
    and it is controlled using image metadata:
    hw_vif_mutliqueue_enabled=true|false

    when users want to launch an instance with
    a SRIOV port and several normal ports
    with multi-queue feature, ERROR can take
    place due to wrong driver name for SRIOV
    interface. Details could be found in bug
    description.

    Conflicts:
            nova/tests/unit/virt/libvirt/test_vif.py
            nova/virt/libvirt/vif.py

    NOTE(mriedem): The test conflict is due to
    77dee9505860be11c2b14c35d403fe2c49a0bcfd not being in
    Newton. The vif.py conflict is due to
    e5e4dfcfdb918b57dcbd3e3cfb171e3b70e3c701 not being in
    Newton.

    Change-Id: I52c51ff17f43133154f6ea8aa4107f1673f82dde
    Closes-bug: #1641814
    (cherry picked from commit fb3b2d9eaf8317cec62e52a99f17946610b8cdca)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.0.0b3

This issue was fixed in the openstack/nova 15.0.0.0b3 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.4

This issue was fixed in the openstack/nova 14.0.4 release.
