"Delete ARQs for an instance when the instance is deleted" only deletes bound ARQs

Bug #1872730 reported by sean mooney
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  High        Brin Zhang
Ussuri                    In Progress   High        sean mooney

Bug Description

During development of the Cyborg integration with nova, the patch
"Delete ARQs for an instance when the instance is deleted"
(Icb95890d8f16cad1f7dc18487a48def2f7c9aec2) failed to do so in some cases,
as noted in https://review.opendev.org/#/c/673735/46/nova/conductor/manager.py@1632

If the ARQs are successfully created in the conductor and the binding then fails, those
ARQs are leaked because they never entered the bound state. As a result, if the instance is deleted,
the ARQs that were created for the instance but never bound are not deleted when the instance's bound ARQs are cleaned up.

This bug tracks addressing that edge case.
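
The edge case can be illustrated with a small in-memory model. This is only a sketch: `Arq`, `FakeArqStore`, and `demo_leak` are hypothetical names, not nova or Cyborg code, and the model assumes an ARQ is associated with an instance only once it has been bound.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from uuid import uuid4


@dataclass
class Arq:
    uuid: str
    state: str = "Initial"               # becomes "Bound" after a successful bind
    instance_uuid: Optional[str] = None  # assumption: only set at bind time


class FakeArqStore:
    """Hypothetical in-memory stand-in for the ARQ database."""

    def __init__(self) -> None:
        self.arqs: Dict[str, Arq] = {}

    def create(self) -> Arq:
        arq = Arq(uuid=str(uuid4()))
        self.arqs[arq.uuid] = arq
        return arq

    def bind(self, arq_uuid: str, instance_uuid: str) -> None:
        arq = self.arqs[arq_uuid]
        arq.state = "Bound"
        arq.instance_uuid = instance_uuid

    def delete_by_instance(self, instance_uuid: str) -> None:
        # Only ARQs associated with the instance (the bound ones) match.
        self.arqs = {u: a for u, a in self.arqs.items()
                     if a.instance_uuid != instance_uuid}


def demo_leak() -> List[str]:
    store = FakeArqStore()
    bound, unbound = store.create(), store.create()
    store.bind(bound.uuid, "instance-1")
    # Binding of `unbound` is assumed to have failed, so it is never bound.
    store.delete_by_instance("instance-1")
    # States of the ARQs that survived instance deletion.
    return [a.state for a in store.arqs.values()]
```

In this model the unbound ARQ survives the per-instance cleanup, which is the leak described above.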

Brin Zhang (zhangbailin) wrote :

> If the ARQs are successfully created in the conductor and the binding then fails, those
> ARQs are leaked because they never entered the bound state. As a result, if the instance is deleted,
> the ARQs that were created for the instance but never bound are not deleted when the instance's
> bound ARQs are cleaned up.

Right, those ARQs will be leaked in the Cyborg DB and cannot be requested by another instance, even though they were never bound to any instance.

Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
status: Triaged → In Progress
Changed in nova:
assignee: sean mooney (sean-k-mooney) → Brin Zhang (zhangbailin)
Changed in nova:
assignee: Brin Zhang (zhangbailin) → Wenping Song (wenping1)
Changed in nova:
assignee: Wenping Song (wenping1) → Brin Zhang (zhangbailin)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/716186
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d94ea23d3d64ecd3f2539a337c066487b938fcad
Submitter: Zuul
Branch: master

commit d94ea23d3d64ecd3f2539a337c066487b938fcad
Author: Sundar Nadathur <email address hidden>
Date: Mon Mar 30 19:24:30 2020 -0700

    Delete ARQs by UUID if Cyborg ARQ bind fails.

    During the review of the cyborg series it was noted that
    in some cases ARQs could be leaked during binding.
    See https://review.opendev.org/#/c/673735/46/nova/conductor/manager.py@1632

    This change adds a delete_arqs_by_uuid function that can delete
    unbound ARQs by instance uuid.

    This change modifies build_instances and schedule_and_build_instances
    to handle the AcceleratorRequestBindingFailed exception raised when
    binding fails and clean up the instance ARQs.

    Co-Authored-By: Wenping Song <email address hidden>

    Closes-Bug: #1872730
    Change-Id: I86c2f00e2368fe02211175e7328b2cd9c0ebf41b
    Blueprint: nova-cyborg-interaction
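
The cleanup pattern the commit message describes can be sketched roughly as follows. The names `AcceleratorRequestBindingFailed` and `delete_arqs_by_uuid` come from the commit message; everything else (`FakeCyborgClient`, `create_arqs`, `bind_arqs`, `build_instance`, and all signatures) is a hypothetical stand-in and not the actual nova or Cyborg API.

```python
from typing import Dict, List


class AcceleratorRequestBindingFailed(Exception):
    """Stand-in for the exception named in the commit message."""


class FakeCyborgClient:
    """Hypothetical client; not the real nova or Cyborg API."""

    def __init__(self, fail_bind: bool) -> None:
        self.fail_bind = fail_bind
        self.arqs: Dict[str, str] = {}  # ARQ UUID -> state

    def create_arqs(self, count: int) -> List[str]:
        uuids = [f"arq-{i}" for i in range(count)]
        for u in uuids:
            self.arqs[u] = "Initial"
        return uuids

    def bind_arqs(self, arq_uuids: List[str]) -> None:
        if self.fail_bind:
            raise AcceleratorRequestBindingFailed("ARQ binding failed")
        for u in arq_uuids:
            self.arqs[u] = "Bound"

    def delete_arqs_by_uuid(self, arq_uuids: List[str]) -> None:
        # Deletes by ARQ UUID, so it also reaches ARQs that were never bound.
        for u in arq_uuids:
            self.arqs.pop(u, None)


def build_instance(client: FakeCyborgClient, arq_count: int) -> bool:
    """Create and bind ARQs; on bind failure, delete them by UUID."""
    arq_uuids = client.create_arqs(arq_count)
    try:
        client.bind_arqs(arq_uuids)
    except AcceleratorRequestBindingFailed:
        client.delete_arqs_by_uuid(arq_uuids)
        return False
    return True
```

Because the caller still holds the UUIDs of the ARQs it just created, it can delete them directly even though they never reached the bound state.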

Changed in nova:
status: In Progress → Fix Released