VM with single GPU flavor gets two GPUs assigned

Bug #1895316 reported by Vlad Sorokin
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Description
===========
Some VMs that request a single PCI device (GPU) are provisioned with two GPUs actually attached.

Steps to reproduce
==================
Deploy a GPU-assigned VM with the following Heat template:
$ openstack stack template show vmaas-p1220-kvm1
description: Template to create a single VM for a self service project in CCCC OpenStack
heat_template_version: rocky
outputs:
  instance_ip:
    description: The IP address of the deployed instance
    value:
      get_attr:
      - server_1
      - first_address
  instance_name:
    description: Name of the instance
    value:
      get_attr:
      - server_1
      - name
parameters:
  flavor:
    default: vmaas.p9.2xxlarge.v100-32.1
    description: Type of instance (flavor) to be used
    label: Flavor
    type: string
  image:
    default: rhel7.6alt-ppc64le
    description: Image to be used for compute instance
    label: Image name or ID
    type: string
  instance_boot_disk_name:
    default: p1220-kvm1-boot
    description: Name of instance boot volume
    label: Instance disk name
    type: string
  instance_boot_disk_size:
    default: '200'
    description: Size of instance boot volume
    label: Instance disk size
    type: string
  instance_ip:
    default: AAA.BB.CC.DD
    description: IP address of compute instance
    label: Instance IP address
    type: string
  instance_name:
    default: p1220-kvm1
    description: Name of compute instance
    label: Instance name
    type: string
  key:
    default: ''
    description: Name of existing ssh key-pair to be used for compute instance
    label: Key name
    type: string
  project_vlan:
    default: '1220'
    description: Project VLAN to attach instance to
    label: Network name or ID
    type: string
resources:
  cloud_config_part1:
    properties:
      cloud_config:
        write_files:
        - content: ==cloud_config_data_here===
          encoding: b64
          owner: root:root
          path: /cloud-config.sh
          permissions: '0700'
    type: OS::Heat::CloudConfig
  cloud_config_part2:
    ==cloud_config_data_here===
    type: OS::Heat::CloudConfig
  cloud_config_run:
    properties:
      parts:
      - config:
          get_resource: cloud_config_part1
      - config:
          get_resource: cloud_config_part2
    type: OS::Heat::MultipartMime
  server_1:
    depends_on:
    - cloud_config_run
    - volume_1
    properties:
      block_device_mapping_v2:
      - boot_index: 0
        delete_on_termination: true
        volume_id:
          get_resource: volume_1
      config_drive: true
      flavor:
        get_param: flavor
      key_name:
        get_param: key
      metadata:
        Flavor:
          get_param: flavor
        Image:
          get_param: image
        Project: XXXXXXXX
        ProjectDescription: ''
        Reservation: YYYYYYYY
        Submitter: Portal/ZZZZZZZ
      name:
        get_param: instance_name
      networks:
      - fixed_ip:
          get_param: instance_ip
        network:
          list_join:
          - ''
          - - v
            - get_param: project_vlan
      user_data:
        get_resource: cloud_config_run
      user_data_format: RAW
    type: OS::Nova::Server
  volume_1:
    properties:
      image:
        get_param: image
      metadata:
        Project: XXXXXXXX
        Reservation: YYYYYYYY
      name:
        get_param: instance_boot_disk_name
      size:
        get_param: instance_boot_disk_size
    type: OS::Cinder::Volume

And the following flavor:
$ openstack flavor show 02542b5c-3bab-43df-9dcd-59f2867f344b
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| access_project_ids | 4fcaf92a3fa148b4bc15d1a170bb67a0 |
| disk | 0 |
| id | 02542b5c-3bab-43df-9dcd-59f2867f344b |
| name | vmaas.p9.2xxlarge.v100-32.1 |
| os-flavor-access:is_public | False |
| properties | aggregate_instance_extra_specs:cpu='p9', aggregate_instance_extra_specs:env='vmaas', aggregate_instance_extra_specs:gpu='v100-32', hw:cpu_cores='8', hw:cpu_sockets='1', hw:cpu_threads='4', pci_passthrough:alias='v100-32:1' |
| ram | 65536 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 32 |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
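For reference, the flavor's pci_passthrough:alias value has the form name:count, so 'v100-32:1' asks for one device from the v100-32 alias. A minimal sketch for reading the requested count out of such a value; the helper name parse_pci_alias is hypothetical and is not part of nova:

```python
def parse_pci_alias(spec):
    """Split a pci_passthrough:alias value such as 'v100-32:1' into
    {alias_name: requested_count}. Several aliases may be comma-separated."""
    requested = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last ':' so alias names containing '-' stay intact.
        name, _, count = item.rpartition(":")
        requested[name] = requested.get(name, 0) + int(count)
    return requested


if __name__ == "__main__":
    # Value copied from the flavor properties above.
    print(parse_pci_alias("v100-32:1"))
```

With the flavor shown here, the requested count for v100-32 is 1, which is the number of GPUs the guest is expected to see.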

Expected result
===============
A VM with a single PCI device (GPU) assigned.

Actual result
=============
Got a VM with two GPUs attached.
p1220-kvm1 ~]$ lspci
0000:00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device
0000:00:02.0 SCSI storage controller: Red Hat, Inc. Virtio SCSI
0000:00:03.0 USB controller: Red Hat, Inc. QEMU XHCI Host Controller (rev 01)
0000:00:04.0 SCSI storage controller: Red Hat, Inc. Virtio block device
0000:00:05.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon
0000:00:06.0 VGA compatible controller: Device 1234:1111 (rev 02)
0001:00:01.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
0002:00:01.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
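
The same check can be scripted inside the guest. A minimal sketch that counts NVIDIA 3D controllers in lspci output; the function name count_nvidia_gpus is hypothetical, and the text is passed in as a string rather than collected by running lspci:

```python
def count_nvidia_gpus(lspci_output):
    """Count lspci lines that describe an NVIDIA 3D controller."""
    return sum(
        1
        for line in lspci_output.splitlines()
        if "3D controller" in line and "NVIDIA" in line
    )


# Lines copied from the lspci output in this report.
SAMPLE_LSPCI = """\
0000:00:06.0 VGA compatible controller: Device 1234:1111 (rev 02)
0001:00:01.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
0002:00:01.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 32GB] (rev a1)
"""

if __name__ == "__main__":
    print(count_nvidia_gpus(SAMPLE_LSPCI))
```

A result greater than the count requested by the flavor indicates the problem described in this bug.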

$ nvidia-smi
Fri Sep 11 11:28:21 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-SXM2... On | 00000001:00:01.0 Off | 0 |
| N/A 26C P0 38W / 300W | 0MiB / 32510MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-SXM2... On | 00000002:00:01.0 Off | 0 |
| N/A 28C P0 39W / 300W | 0MiB / 32510MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+

# virsh dumpxml instance-00003268
<domain type='kvm' id='24'>
  <name>instance-00003268</name>
  <uuid>a49ba344-8b50-4014-baf3-24f8252f212e</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="18.2.1"/>
      <nova:name>p1220-kvm1</nova:name>
      <nova:creationTime>2020-09-11 08:28:22</nova:creationTime>
      <nova:flavor name="vmaas.p9.2xxlarge.v100-32.1">
        <nova:memory>65536</nova:memory>
        <nova:disk>0</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>32</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="cc738b41714c47d5917a0a4a152ad133">vmaas</nova:user>
        <nova:project uuid="4fcaf92a3fa148b4bc15d1a170bb67a0">vmaas</nova:project>
      </nova:owner>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>67108864</memory>
  <currentMemory unit='KiB'>67108864</currentMemory>
  <vcpu placement='static'>32</vcpu>
  <cputune>
    <shares>32768</shares>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='ppc64le' machine='pseries-bionic'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='8' threads='4'/>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='network' device='cdrom'>
      <driver name='qemu' type='raw' cache='none' discard='unmap'/>
      <auth username='nova-compute'>
        <secret type='ceph' uuid='514c9fca-8cbe-11e2-9c52-3bc8c7819472'/>
      </auth>
      <source protocol='rbd' name='nova/a49ba344-8b50-4014-baf3-24f8252f212e_disk.config'>
        <host name='10.0.0.11' port='6789'/>
        <host name='10.0.0.12' port='6789'/>
        <host name='10.0.0.13' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
      <readonly/>
      <alias name='scsi0-0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none' discard='unmap'/>
      <auth username='cinder-ceph'>
        <secret type='ceph' uuid='046a66b2-bf3d-4be9-a8c1-1334c3fbc3d7'/>
      </auth>
      <source protocol='rbd' name='cinder-ceph/volume-b7584535-d482-4bd8-bd0f-c08435708f23'>
        <host name='10.0.0.11' port='6789'/>
        <host name='10.0.0.12' port='6789'/>
        <host name='10.0.0.13' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <serial>b7584535-d482-4bd8-bd0f-c08435708f23</serial>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
    <controller type='usb' index='0' model='qemu-xhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='0'/>
      <alias name='pci.0'/>
    </controller>
    <controller type='pci' index='1' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='1'/>
      <alias name='pci.1'/>
    </controller>
    <controller type='pci' index='2' model='pci-root'>
      <model name='spapr-pci-host-bridge'/>
      <target index='2'/>
      <alias name='pci.2'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:36:5a:d5'/>
      <source bridge='br-int'/>
      <virtualport type='openvswitch'>
        <parameters interfaceid='bb1b7bfb-bbe9-4560-8dd0-ae1688cdb16c'/>
      </virtualport>
      <target dev='tapbb1b7bfb-bb'/>
      <model type='virtio'/>
      <mtu size='9000'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <log file='/var/lib/nova/instances/a49ba344-8b50-4014-baf3-24f8252f212e/console.log' append='off'/>
      <target type='spapr-vio-serial' port='0'>
        <model name='spapr-vty'/>
      </target>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30000000'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <log file='/var/lib/nova/instances/a49ba344-8b50-4014-baf3-24f8252f212e/console.log' append='off'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
      <address type='spapr-vio' reg='0x30000000'/>
    </console>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
      <address type='usb' bus='0' port='1'/>
    </input>
    <input type='keyboard' bus='usb'>
      <alias name='input1'/>
      <address type='usb' bus='0' port='2'/>
    </input>
    <input type='mouse' bus='usb'>
      <alias name='input2'/>
      <address type='usb' bus='0' port='3'/>
    </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0' keymap='en-us'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <video>
      <model type='vga' vram='16384' heads='1' primary='yes'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </video>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0004' bus='0x04' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0004' bus='0x05' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x01' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <stats period='10'/>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
    <panic model='pseries'/>
  </devices>
  <seclabel type='dynamic' model='apparmor' relabel='yes'>
    <label>libvirt-a49ba344-8b50-4014-baf3-24f8252f212e</label>
    <imagelabel>libvirt-a49ba344-8b50-4014-baf3-24f8252f212e</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>+64055:+116</label>
    <imagelabel>+64055:+116</imagelabel>
  </seclabel>
</domain>
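
The two hostdev elements above are the relevant part of the domain XML. A minimal sketch, using only the Python standard library, that extracts the host PCI addresses of all passthrough devices from a dumpxml result; the function name host_pci_addresses is hypothetical, and the sample XML is a trimmed copy of the dump above:

```python
import xml.etree.ElementTree as ET


def host_pci_addresses(domain_xml):
    """Return host PCI addresses (dddd:bb:ss.f) of all <hostdev type='pci'>
    devices found in a libvirt domain XML string."""
    root = ET.fromstring(domain_xml)
    addresses = []
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        # Attribute values look like '0x0004'; int(..., 16) accepts the prefix.
        domain = int(addr.get("domain"), 16)
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        function = int(addr.get("function"), 16)
        addresses.append("%04x:%02x:%02x.%x" % (domain, bus, slot, function))
    return addresses


# Trimmed copy of the domain XML from this report.
SAMPLE_XML = """\
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0004' bus='0x04' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0004' bus='0x05' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(host_pci_addresses(SAMPLE_XML))
```

The resulting addresses use the same format as the address column of the nova pci_devices table, so the two can be compared directly.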

Environment
===========
1. Canonical OpenStack Rocky on ppc64le
dpkg -l | grep nova
ii nova-api-os-compute 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - OpenStack Compute API frontend
ii nova-common 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - common files
ii nova-conductor 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - conductor service
ii nova-consoleauth 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - Console Authenticator
ii nova-novncproxy 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - NoVNC proxy
ii nova-placement-api 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - placement API frontend
ii nova-scheduler 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - virtual machine scheduler
ii python-novaclient 2:11.0.0-0ubuntu1~cloud0 all client library for OpenStack Compute API - Python 2.7
ii python3-nova 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute Python 3 libraries

2. Which hypervisor did you use?
   QEMU-KVM ppc64le version 2.11
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic

# dpkg -l | egrep "libvirt|kvm|nova"
ii libvirt-clients 4.0.0-1ubuntu8.13 ppc64el Programs for the libvirt library
ii libvirt-daemon 4.0.0-1ubuntu8.13 ppc64el Virtualization daemon
ii libvirt-daemon-driver-storage-rbd 4.0.0-1ubuntu8.13 ppc64el Virtualization daemon RBD storage driver
ii libvirt-daemon-system 4.0.0-1ubuntu8.13 ppc64el Libvirt daemon configuration files
ii libvirt0:ppc64el 4.0.0-1ubuntu8.13 ppc64el library for interfacing with different virtualization systems
ii nova-api-metadata 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - metadata API frontend
ii nova-common 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - common files
ii nova-compute 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - compute node base
ii nova-compute-kvm 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute - compute node libvirt support
ii python3-libvirt 4.0.0-1 ppc64el libvirt Python 3 bindings
ii python3-nova 2:18.2.1-0ubuntu1~cloud4 all OpenStack Compute Python 3 libraries
ii python3-novaclient 2:11.0.0-0ubuntu1~cloud0 all client library for OpenStack Compute API - 3.x
ii qemu-kvm 1:2.11+dfsg-1ubuntu7.17 ppc64el QEMU Full virtualization on x86 hardware

Logs & Configs
==============
Logs and configs attached

Tags: libvirt pci
Vlad Sorokin (vvsorokin) wrote :
Vlad Sorokin (vvsorokin)
description: updated
tags: added: libvirt pci
Sylvain Bauza (sylvain-bauza) wrote :

This is weird: the pci_devices SQL query shows only 0004:05:00.0 as allocated for GPU passthrough, but for some reason the guest itself is using both that PCI address and 0004:04:00.0.

Sylvain Bauza (sylvain-bauza) wrote :

Could you please try a hard reboot of the instance, so that the instance XML is regenerated and we can see whether this fixes your problem?

Changed in nova:
status: New → Incomplete
Sylvain Bauza (sylvain-bauza) wrote :

Marked the bug as Incomplete; please set it back to New when you reply so we can see it, thanks.

Vlad Sorokin (vvsorokin) wrote :

Sylvain, thanks for the quick response.
A hard reboot does remove the unwanted GPU. However, I just got another newly provisioned VM with the extra "illegal" GPU, and had to hard reboot it to fix that.

Changed in nova:
status: Incomplete → New
Vlad Sorokin (vvsorokin) wrote :

More info. I have a VM with a single PCI GPU assigned to it as requested:
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0035' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
    </hostdev>

But that GPU is shown as 'available' in the Nova pci_devices table:
# mysql -u root -e "select * from nova.pci_devices where compute_node_id=211"
+---------------------+---------------------+------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+--------------------------------------+
| created_at | updated_at | deleted_at | deleted | id | compute_node_id | address | product_id | vendor_id | dev_type | dev_id | label | status | extra_info | instance_uuid | request_id | numa_node | parent_addr | uuid |
+---------------------+---------------------+------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+--------------------------------------+
| 2020-09-17 15:08:15 | 2020-09-25 20:46:17 | NULL | 0 | 754 | 211 | 0004:04:00.0 | 1db5 | 10de | type-PCI | pci_0004_04_00_0 | label_10de_1db5 | available | {} | NULL | NULL | 0 | NULL | 47c54215-3c9a-4cf3-80fc-cd6f5323e8cd |
| 2020-09-17 15:08:15 | 2020-09-25 20:46:17 | NULL | 0 | 757 | 211 | 0004:05:00.0 | 1db5 | 10de | type-PCI | pci_0004_05_00_0 | label_10de_1db5 | available | {} | NULL | NULL | 0 | NULL | f43cc300-7865-4c8a-b778-cae582797347 |
| 2020-09-17 15:08:15 | 2020-09-25 20:46:17 | NULL | 0 | 760 | 211 | 0035:03:00.0 | 1db5 | 10de | type-PCI | pci_0035_03_00_0 | label_10de_1db5 | available | {} | NULL | NULL | 8 | NULL | f626ccf6-5214-45a9-a102-16e3a25c6ab0 |
| 2020-09-17 15:08:15 | 2020-09-25 20:46:17 | NULL | 0 | 763 | 211 | 0035:04:00.0 | 1db5 | 10de | type-PCI | pci_0035_04_00_0 | label_10de_1db5 | available | {} | NULL | NULL | 8 | NULL | ac84c6e3-5964-427d-80b5-a344c9cdb714 |
+---------------------+---------------------+------------+---------+-----+-----------------+--------------+------------+-----------+----------+------------------+-----------------+-----------+------------+---------------+------------+-----------+-------------+--------------------------------------+
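
The mismatch between the domain XML and the pci_devices table can be checked mechanically. A minimal sketch, assuming the rows and the attached addresses have already been collected by hand (it does not query the database or libvirt); the function name find_untracked_attachments is hypothetical:

```python
def find_untracked_attachments(pci_rows, attached_addresses):
    """Return host PCI addresses that are attached to a guest but are not
    marked 'allocated' in the given pci_devices rows."""
    status_by_address = {row["address"]: row["status"] for row in pci_rows}
    return sorted(
        addr
        for addr in attached_addresses
        if status_by_address.get(addr) != "allocated"
    )


# address/status pairs copied from the pci_devices output above.
PCI_ROWS = [
    {"address": "0004:04:00.0", "status": "available"},
    {"address": "0004:05:00.0", "status": "available"},
    {"address": "0035:03:00.0", "status": "available"},
    {"address": "0035:04:00.0", "status": "available"},
]

# Host address taken from the <hostdev> source element above.
ATTACHED = {"0035:03:00.0"}

if __name__ == "__main__":
    print(find_untracked_attachments(PCI_ROWS, ATTACHED))
```

Any address returned here is in use by a guest while the PCI tracker still considers it free, so it could be handed out again to another instance.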

nova-compute log entry:
2020-09-25 20:55:21.827 82160 INFO nova.compute.resource_tracker [req-270e5e57-0f7e-49cc-8560-f046ec09fd1c - - - - -] Final resource view: name=os-kvm18 phys_ram=1047177MB used_ram=16384MB phys_disk=369596GB used_disk=0GB total_vcpus=160 used_vcp...


Sylvain Bauza (sylvain-bauza) wrote :

Looks like the PCI tracker somehow messed up...

Balazs Gibizer (balazs-gibizer) wrote :

Could this be related to https://bugs.launchpad.net/nova/+bug/1901170 ? @Vlad: could you check whether a re-schedule happened during the instance creation?

Balazs Gibizer (balazs-gibizer) wrote :

I'm setting this to Incomplete. It is very similar to https://bugs.launchpad.net/nova/+bug/1901170 but does not have enough information to mark it as a duplicate.

@Vlad if you can check the logs for a re-schedule and see that one happened in the cases where the duplication occurred, then please let us know and I will mark this as a duplicate. If no re-schedule happened, then we need to look further.
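
One way to do the requested log check is a simple text filter. A minimal sketch, assuming that re-schedule events show up in the nova logs with wording such as "re-scheduled" or "rescheduling" on a line that also contains the instance UUID; that wording is an assumption and should be checked against the actual logs. The function name and the sample lines are hypothetical and are not real nova output:

```python
import re

# Assumption: re-schedule messages contain "re-schedul..." or "reschedul...".
RESCHEDULE_RE = re.compile(r"re-?schedul", re.IGNORECASE)


def reschedule_lines(log_lines, instance_uuid):
    """Return log lines that mention both the instance UUID and a re-schedule."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if instance_uuid in line and RESCHEDULE_RE.search(line)
    ]


if __name__ == "__main__":
    uuid = "a49ba344-8b50-4014-baf3-24f8252f212e"
    # Invented placeholder lines, for illustration only.
    sample = [
        "placeholder: instance %s was re-scheduled\n" % uuid,
        "placeholder: instance %s spawned\n" % uuid,
        "placeholder: rescheduling some other instance\n",
    ]
    for hit in reschedule_lines(sample, uuid):
        print(hit)
```

In practice the input would be the lines of the nova-compute and nova-conductor log files covering the time of the instance creation.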

Changed in nova:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired