rescued instance doesn't have attached vGPUs

Bug #1762688 reported by Sylvain Bauza on 2018-04-10
Affects                   Importance  Assigned to
OpenStack Compute (nova)  High        Sylvain Bauza
Queens                    Medium      Sylvain Bauza

Bug Description

With the libvirt driver, rescuing an instance releases the mediated devices attached to it for its virtual GPUs. When the instance is unrescued, the driver reuses the existing guest XML and so tries to attach the same mediated devices again. This is a race condition: in the meantime, the related vGPUs may have been attached to a different instance.

We should attach the mediated devices to the rescued instance too.
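
The race can be illustrated with a toy model. This is only a sketch, not nova code: `MdevPool`, `rescue_cycle`, and the instance and device names are hypothetical, and the pool is assumed to hold a single mediated device so that the conflict is deterministic.

```python
# Toy model only: MdevPool and rescue_cycle are hypothetical, not nova code.
class MdevPool:
    """Tracks which instance, if any, currently holds each mediated device."""

    def __init__(self, uuids):
        self.owner = {u: None for u in uuids}

    def claim_free(self, instance):
        """Give `instance` the first unowned mdev, or raise if none is free."""
        for uuid, owner in self.owner.items():
            if owner is None:
                self.owner[uuid] = instance
                return uuid
        raise RuntimeError("no free mdev")

    def claim(self, uuid, instance):
        """Reattach one specific mdev, as unrescue does from the saved guest XML."""
        if self.owner[uuid] not in (None, instance):
            raise RuntimeError(f"mdev {uuid} already used by {self.owner[uuid]}")
        self.owner[uuid] = instance

    def release(self, uuid):
        self.owner[uuid] = None


def rescue_cycle(keep_mdev_in_rescue):
    """Run boot -> rescue -> (competing boot) -> unrescue and report the outcome."""
    pool = MdevPool(["mdev-1"])
    saved = pool.claim_free("inst-a")   # inst-a boots with a vGPU
    if not keep_mdev_in_rescue:
        pool.release(saved)             # reported behaviour: rescue guest has no mdev
    try:
        pool.claim_free("inst-b")       # another instance asks for a vGPU meanwhile
    except RuntimeError:
        pass                            # nothing free: inst-a still holds the mdev
    try:
        pool.claim(saved, "inst-a")     # unrescue reuses the original guest XML
        return "unrescue ok"
    except RuntimeError:
        return "unrescue failed"
```

In this model, `rescue_cycle(False)` loses the device to the second instance, while `rescue_cycle(True)` keeps it attached throughout the rescue, which is the behaviour proposed above.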

Changed in nova:
importance: Low → High

Fix proposed to branch: master
Review: https://review.openstack.org/577424

Changed in nova:
status: Confirmed → In Progress
Changed in nova:
assignee: Sylvain Bauza (sylvain-bauza) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem) on 2018-06-28
Changed in nova:
assignee: Matt Riedemann (mriedem) → Sylvain Bauza (sylvain-bauza)

Reviewed: https://review.openstack.org/577424
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1c59397e09de5506bccba513ef31ffb8585fcdc3
Submitter: Zuul
Branch: master

commit 1c59397e09de5506bccba513ef31ffb8585fcdc3
Author: Sylvain Bauza <email address hidden>
Date: Fri Jun 22 14:53:57 2018 +0200

    libvirt: Fix the rescue race for vGPU instances

    When rescuing an instance having a vGPU, we were not using the vGPU.
    There would then be a race condition during the rescue where the vGPU
    could be passed to another instance.
    Instead, we should just make sure the vGPU would also be in the rescued
    instance.

    Change-Id: I7150e15694bb149ae67da37b5e43b6ea7507fe82
    Closes-bug: #1762688
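
As a rough illustration of what "the vGPU would also be in the rescued instance" means at the libvirt level, the sketch below copies the `<hostdev type='mdev'>` elements from an original domain XML into a rescue domain XML. It is not the code from the commit: `copy_mdevs`, `mdev_uuids`, the sample XML documents, and the UUID are assumptions made for the example, and nova builds its guest XML through its own configuration objects rather than by editing XML strings like this.

```python
import xml.etree.ElementTree as ET

# XPath for mediated-device hostdev entries in a libvirt domain document.
MDEV_PATH = "./devices/hostdev[@type='mdev']"


def copy_mdevs(original_xml, rescue_xml):
    """Return rescue_xml with the mdev hostdevs of original_xml appended."""
    original = ET.fromstring(original_xml)
    rescue = ET.fromstring(rescue_xml)
    devices = rescue.find("devices")
    if devices is None:
        devices = ET.SubElement(rescue, "devices")
    for hostdev in original.findall(MDEV_PATH):
        devices.append(hostdev)
    return ET.tostring(rescue, encoding="unicode")


def mdev_uuids(domain_xml):
    """List the mdev UUIDs referenced by a domain XML document."""
    root = ET.fromstring(domain_xml)
    return [a.get("uuid") for a in root.findall(MDEV_PATH + "/source/address")]


# Hypothetical sample documents, trimmed to the parts that matter here.
ORIGINAL = (
    "<domain type='kvm'><name>inst-a</name><devices>"
    "<hostdev mode='subsystem' type='mdev' model='vfio-pci'>"
    "<source><address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/></source>"
    "</hostdev></devices></domain>"
)
RESCUE = "<domain type='kvm'><name>inst-a</name><devices/></domain>"

if __name__ == "__main__":
    print(mdev_uuids(copy_mdevs(ORIGINAL, RESCUE)))
```

Because the rescue guest then references the same mediated device UUID, the device stays in use for the whole rescue and cannot be handed to another instance before unrescue.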

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/579503
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3163c9391ec2569f77e583bccf6163c309e45482
Submitter: Zuul
Branch: stable/queens

commit 3163c9391ec2569f77e583bccf6163c309e45482
Author: Sylvain Bauza <email address hidden>
Date: Fri Jun 22 14:53:57 2018 +0200

    libvirt: Fix the rescue race for vGPU instances

    When rescuing an instance having a vGPU, we were not using the vGPU.
    There would then be a race condition during the rescue where the vGPU
    could be passed to another instance.
    Instead, we should just make sure the vGPU would also be in the rescued
    instance.

    Change-Id: I7150e15694bb149ae67da37b5e43b6ea7507fe82
    Closes-bug: #1762688
    (cherry picked from commit 1c59397e09de5506bccba513ef31ffb8585fcdc3)

This issue was fixed in the openstack/nova 18.0.0.0b3 development milestone.

This issue was fixed in the openstack/nova 17.0.6 release.
