Failed (but retryable) device detaches are logged as ERROR

Bug #1972023 reported by Mohammed Naser
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

At the moment, if a device detach times out (using libvirt), nova logs an ERROR message:

https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2570-L2573

However, this is not a failure: the detach is retried a few more times, depending on configuration, and if every attempt fails, that full failure is reported separately:

https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L2504

In high-load environments where this timeout might be hit, each timed-out attempt triggers an "ERROR" message that can look problematic to the operator, even though the follow-up attempt succeeds and nothing needs attention. The timeout message should be logged as a WARNING instead. The operator only needs to intervene when the final ERROR is logged, which indicates that detaching the device failed completely.
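The proposed behaviour can be sketched roughly as follows. This is a minimal illustration, not nova's actual code: `detach_with_retry`, `DetachTimeout`, `DetachFailed`, and the `max_attempts` parameter are hypothetical names chosen for the example. A timed-out attempt that will be retried is logged as a WARNING, and an ERROR is logged only once all attempts are exhausted.

```python
import logging

LOG = logging.getLogger("detach_sketch")


class DetachTimeout(Exception):
    """Hypothetical: raised when a single detach attempt times out."""


class DetachFailed(Exception):
    """Hypothetical: raised when every detach attempt has timed out."""


def detach_with_retry(detach_once, device, max_attempts=3):
    """Call detach_once(device) until it succeeds or attempts run out.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            detach_once(device)
            return attempt
        except DetachTimeout:
            # Retryable: the operator has nothing to do yet, so WARNING.
            LOG.warning("Timed out detaching %s (attempt %d of %d)",
                        device, attempt, max_attempts)
    # Every attempt timed out: this is the case that needs attention.
    LOG.error("Failed to detach %s after %d attempts", device, max_attempts)
    raise DetachFailed(device)
```

With this shape, a detach that succeeds on its third attempt leaves two WARNING lines in the log and no ERROR, so an operator filtering on ERROR sees only detaches that failed completely.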

Changed in nova:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/840985
Committed: https://opendev.org/openstack/nova/commit/7c87c2f5f744a86d4d854e47848b903ab2674795
Submitter: "Zuul (22348)"
Branch: master

commit 7c87c2f5f744a86d4d854e47848b903ab2674795
Author: Mohammed Naser <email address hidden>
Date: Fri May 6 16:27:11 2022 -0400

    Switch libvirt event timeout message to warning

    At the moment, if libvirt times out in detaching a device, it
    reports this as an ERROR even if the process will be retried
    and eventually succeed.

    We should just log a warning since there's nothing to do, and
    if the process fails after all the retries, it will log an ERROR
    anyways.

    Closes-Bug: #1972023
    Change-Id: Idda12db5758706a97b7841571b9ecd3dc6e6905e

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 26.0.0.0rc1

This issue was fixed in the openstack/nova 26.0.0.0rc1 release candidate.
