TypeError: object of type 'object' has no len() from resources_from_request_spec when cells are down

Bug #1857139 reported by Matt Riedemann
This bug affects 1 person
Affects                    Status         Importance   Assigned to      Milestone
OpenStack Compute (nova)   Fix Released   Low          Matt Riedemann
Train                      Confirmed      Low          Unassigned

Bug Description

Seen here:

https://zuul.opendev.org/t/openstack/build/c187e207bc1c48a0a7fa49ef9798b696/log/logs/screen-n-sch.txt.gz#2529

cell1 is down, so the call to scatter_gather_cells in get_compute_nodes_by_host_or_node yields a result for that cell, but the result is not a ComputeNodeList; it is the did_not_respond_sentinel object:

https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/scheduler/host_manager.py#L705

https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/context.py#L454

which results in an error here:

https://github.com/openstack/nova/blob/02019d2660dfce3facdd64ecdb2bd60ba4a91c6d/nova/scheduler/utils.py#L612
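
The failure mode can be reproduced outside nova. The snippet below is only a minimal sketch: did_not_respond_sentinel here is a local stand-in assumed to be a plain object() like the one referenced in nova/context.py, and count_nodes is a hypothetical helper standing in for the len() call in the scheduler utils.

```python
# Local stand-in (assumption) for the sentinel a down cell yields;
# it is a bare object(), which defines no __len__.
did_not_respond_sentinel = object()


def count_nodes(nodes):
    """Hypothetical caller that expects a list-like ComputeNodeList."""
    return len(nodes)


try:
    count_nodes(did_not_respond_sentinel)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # object of type 'object' has no len()
```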

The HostManager.get_compute_nodes_by_host_or_node method should filter out failure and timeout results from the scatter_gather_cells results. We'll get a NoValidHost either way, but that is better than a traceback with the TypeError in it.
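
A rough sketch of the proposed filtering follows. It is not the merged patch: the sentinel, is_cell_failure, and merge_cell_results are hypothetical stand-ins written for illustration, and plain lists of strings stand in for ComputeNodeList objects. It assumes a failed cell shows up either as the sentinel object or as an exception instance.

```python
# Stand-in sentinel (assumption): marks a cell that did not respond in time.
did_not_respond_sentinel = object()


def is_cell_failure(result):
    """Return True if a per-cell result is a timeout sentinel or an exception."""
    return result is did_not_respond_sentinel or isinstance(result, Exception)


def merge_cell_results(results_by_cell):
    """Collect nodes from healthy cells only, skipping failed cells."""
    nodes = []
    for cell_uuid, result in results_by_cell.items():
        if is_cell_failure(result):
            # A real implementation would likely log a warning with cell_uuid.
            continue
        nodes.extend(result)
    return nodes


# Hypothetical scatter-gather output: one cell timed out, one raised.
results = {
    "cell0": [],
    "cell1": did_not_respond_sentinel,
    "cell2": ["node-a", "node-b"],
    "cell3": TimeoutError("cell3 raised"),
}
merged = merge_cell_results(results)
print(merged)
```

With every cell down the merged list is simply empty, so len() works and the scheduler can fall through to NoValidHost instead of raising the TypeError.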

Tags: scheduler
Matt Riedemann (mriedem) wrote :
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/700186

Changed in nova:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/700752

Changed in nova:
assignee: Matt Riedemann (mriedem) → Choi-Sung-Hoon (knu-cse)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Choi-Sung-Hoon (<email address hidden>) on branch: master
Review: https://review.opendev.org/700752

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/700753

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Choi-Sung-Hoon (<email address hidden>) on branch: master
Review: https://review.opendev.org/700753
Reason: Following Brin Zhang's comment, I abandon this change.

Changed in nova:
assignee: Choi-Sung-Hoon (knu-cse) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Choi-Sung-Hoon (knu-cse)
Changed in nova:
assignee: Choi-Sung-Hoon (knu-cse) → Matt Riedemann (mriedem)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/700186
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0d9622f581e830e7b7bc9763aaa09ba02e99b8bb
Submitter: Zuul
Branch: master

commit 0d9622f581e830e7b7bc9763aaa09ba02e99b8bb
Author: Matt Riedemann <email address hidden>
Date: Fri Dec 20 10:03:23 2019 -0500

    Handle cell failures in get_compute_nodes_by_host_or_node

    get_compute_nodes_by_host_or_node uses the scatter_gather_cells
    function but was not handling the case that a failure result
    was returned, which could be the called function raising some
    exception or the cell timing out. This causes issues when the
    caller of get_compute_nodes_by_host_or_node expects to get a
    ComputeNodeList back and can do something like len(nodes) on it
    which fails when the result is not iterable.

    To be clear, if a cell is down there are going to be problems
    which likely result in a NoValidHost error during scheduling, but
    this avoids an ugly TypeError traceback in the scheduler logs.

    Change-Id: Ia54b5adf0a125ae1f9b86887a07dd1d79821dd54
    Closes-Bug: #1857139

Changed in nova:
status: In Progress → Fix Released