Failed to create vGPU because of IOError

Bug #1837681 reported by Eric Xie
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

Description
===========
I used a 'Tesla V100' to create a VM with a vGPU.
Creating the VM failed with an error.

Steps to reproduce
==================
* Create a flavor with resources:VGPU='1'
* Create a VM with the CLI: `openstack server create --image 27dc8e63-6d28-4f80-a6f4-e5a855a02e46 --flavor 224e1385-7de4-4c0b-931d-a7431d329f78 --network net-1 ins-vgpu-t`
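
The report does not show how the flavor in the first step was created. A minimal sketch of one way to do it with the openstack CLI follows. The flavor name and the vCPU, RAM, and disk sizes are placeholders chosen for illustration, not values taken from this report.

```shell
# Sketch only: the name and sizes below are assumptions, not from the report.
FLAVOR_NAME="vgpu-flavor"

create_vgpu_flavor() {
    # $1: name of the flavor to create
    # Create a plain flavor first (placeholder sizes).
    openstack flavor create --vcpus 2 --ram 4096 --disk 20 "$1"
    # Request one VGPU resource from placement for instances of this flavor.
    openstack flavor set "$1" --property "resources:VGPU=1"
}

# Usage (requires admin credentials to be loaded in the environment):
#   create_vgpu_flavor "$FLAVOR_NAME"
```

The `resources:VGPU=1` extra spec is what asks the scheduler for a vGPU, so the second command is the part that matters for reproducing this bug.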

Expected result
===============
The VM is created successfully.

Actual result
=============
The VM ended up in ERROR.

Environment
===========
1. Exact version of OpenStack you are running. See the following
  # apt list --installed | grep nova

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

nova-common/xenial,xenial,now 2:17.0.7-6~u16.01 all [installed]
nova-compute/xenial,xenial,now 2:17.0.7-6~u16.01 all [installed,automatic]
nova-compute-kvm/xenial,xenial,now 2:17.0.7-6~u16.01 all [installed]
python-nova/xenial,xenial,now 2:17.0.7-6~u16.01 all [installed,automatic]
python-novaclient/xenial,xenial,now 2:9.1.1-1~u16.04 all [installed]

2. Which hypervisor did you use?
    Libvirt + KVM

Logs & Configs
==============
2019-07-22 08:12:18,500.500 21346 ERROR nova.virt.libvirt.driver [req-4053b3df-ae7d-4378-b3c4-1c26e8482e24 4c31323efa7e4abf824399b63a687ff8 187e1165ec2a40e9a72efab673e940d9 - default default] [instance: c9737cde-af6c-40b5-b719-2190428a0a03] Failed to start libvirt guest: libvirtError: internal error: qemu unexpectedly closed the monitor: 2019-07-22T00:12:18.186786Z qemu-system-x86_64: -device vfio-pci,id=hostdev0,sysfsdev=/sys/bus/mdev/devices/78c27f7b-e2ed-4fe8-afcf-84c6107620b9,bus=pci.0,addr=0x7: vfio error: 78c27f7b-e2ed-4fe8-afcf-84c6107620b9: error getting device from group 0: Input/output error

Tags: libvirt vgpu
Matt Riedemann (mriedem) added tags: libvirt vgpu
Sylvain Bauza (sylvain-bauza) wrote :

While I understand your problem, it's not related to a Nova bug.
See https://gridforums.nvidia.com/default/topic/9541/general-discussion/trouble-assigning-vgpu/
In general, it's because the NVIDIA driver doesn't support ECC (yet), so you need to disable ECC the same way as described there: "nvidia-smi -e 0"

Closing the bug here since, as I said, it's not related to Nova.
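
The workaround from the comment above could be applied on the compute host roughly as sketched below. This assumes nvidia-smi is installed there and is run as root. The function names are made up for this sketch, and the reboot note reflects how nvidia-smi normally handles ECC mode changes rather than anything stated in this report.

```shell
# Print index, name, current and pending ECC mode for each GPU as CSV.
show_ecc_mode() {
    nvidia-smi --query-gpu=index,name,ecc.mode.current,ecc.mode.pending \
        --format=csv,noheader
}

# Read that CSV on stdin; succeed if any GPU currently has ECC enabled.
ecc_enabled() {
    awk -F', ' '$3 == "Enabled" { found = 1 } END { exit !found }'
}

# Disable ECC on all GPUs (the command quoted in the comment above).
disable_ecc() {
    nvidia-smi -e 0
}

# Only report when nvidia-smi is actually available on this host.
if command -v nvidia-smi >/dev/null 2>&1; then
    if show_ecc_mode | ecc_enabled; then
        echo "ECC is enabled on at least one GPU."
        echo "Run disable_ecc as root, then reboot for the change to apply."
    fi
fi
```

After the reboot, running `show_ecc_mode` again should show the current mode as Disabled before retrying the server create.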

Changed in nova:
status: New → Invalid