Nova should list instances even if it can't connect to a cell DB

Bug #1726301 reported by Belmiro Moreira
This bug affects 2 people
Affects                   Status        Importance  Assigned to        Milestone
OpenStack Compute (nova)  Fix Released  Medium      Surya Seetharaman
Pike                      Won't Fix     Medium      Unassigned
Queens                    Fix Released  Medium      Surya Seetharaman

Bug Description

Description
===========
One of the goals of cells is to allow nova to scale and to have cells act as failure domains.
However, if a cell DB goes down, nova doesn't list any instances, even if the project doesn't have any instances in the affected cell. This affects all users.

The behavior I would expect is for nova to show what is available from the nova_api DB when a cell DB is not available (UUIDs; can we also look into the request_spec?).

Steps to reproduce
==================
Have at least 2 child cells.
Stop the DB in one of them.

"nova list" fails with "ERROR (ClientException): Unexpected API Error."
No further information is given to the user.

Expected result
===============
List the project instances.
For the instances in the affected cell, list the information available in the nova_api DB.
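
The expected result can be illustrated with a minimal sketch. It assumes hypothetical in-memory stand-ins for the nova_api instance mappings and the per-cell databases; `CellDown`, `list_project_instances`, and the record fields are invented for illustration and are not nova's real objects or APIs.

```python
# Hypothetical stand-ins: these are NOT nova's real objects or APIs.
class CellDown(Exception):
    """Raised by a stand-in cell DB when it cannot be reached."""


def list_project_instances(mappings, cell_dbs):
    """Return one record per instance, degrading gracefully for down cells.

    mappings: list of (instance_uuid, cell_name) pairs, standing in for
              what the API DB knows about each instance.
    cell_dbs: dict of cell_name -> callable(uuid) that returns a full
              record or raises CellDown when the cell DB is unreachable.
    """
    results = []
    for uuid, cell in mappings:
        try:
            results.append(cell_dbs[cell](uuid))
        except CellDown:
            # Fall back to the little that is known without the cell DB.
            results.append({"uuid": uuid, "cell": cell, "status": "UNKNOWN"})
    return results


def _cell1(uuid):
    # A healthy cell returns the full record.
    return {"uuid": uuid, "cell": "cell1", "status": "ACTIVE"}


def _cell2(uuid):
    # A cell whose DB has been stopped.
    raise CellDown("cell2 DB unreachable")


instances = list_project_instances(
    [("uuid-a", "cell1"), ("uuid-b", "cell2")],
    {"cell1": _cell1, "cell2": _cell2},
)
```

In this sketch the instance in the down cell still appears in the listing, but only with the fields that do not require the cell DB.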

Actual result
=============
$ nova list
fails without showing the project's instances.

Environment
===========
nova master (commit: 8d21d711000fff80eb367692b157d09b6532923f)

description: updated
Changed in nova:
assignee: nobody → Belmiro Moreira (moreira-belmiro-email-lists)
Changed in nova:
assignee: Belmiro Moreira (moreira-belmiro-email-lists) → nobody
assignee: nobody → Belmiro Moreira (moreira-belmiro-email-lists)
Changed in nova:
assignee: Belmiro Moreira (moreira-belmiro-email-lists) → Surya Seetharaman (tssurya)
Sylvain Bauza (sylvain-bauza) wrote :

We certainly need to add more fail-safe mechanisms for when a cell goes down.

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/567785

Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/575734

Changed in nova:
assignee: Surya Seetharaman (tssurya) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Surya Seetharaman (tssurya)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/575734
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ee461b5bf6c4aa1b66be458776f478f718ef809b
Submitter: Zuul
Branch: master

commit ee461b5bf6c4aa1b66be458776f478f718ef809b
Author: Surya Seetharaman <email address hidden>
Date: Fri Jun 15 15:25:13 2018 +0200

    Make nova list and migration-list ignore down cells

    This patch makes InstanceLister and MigrationLister ignore down
    cells and list the instances/records from the up cell as opposed to
    giving 500 to the users as is the current situation.

    Change-Id: I308b494ab07f6936bef94f4c9da45e9473e3534d
    Partial-Bug: #1726301
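
The behavior described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the patch: `DID_NOT_RESPOND`, `query_cell`, and `list_records` are hypothetical names, and each cell is modeled as a plain callable that either returns its records or raises.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical sentinel marking a cell that failed to respond.
DID_NOT_RESPOND = object()


def query_cell(cell_name, fetch):
    """Run one cell's query, converting any failure into the sentinel."""
    try:
        return fetch()
    except Exception as exc:
        LOG.warning("Cell %s is not responding: %s", cell_name, exc)
        return DID_NOT_RESPOND


def list_records(cells):
    """Merge records from all cells that answered.

    cells: dict of cell_name -> zero-argument callable returning a list.
    """
    records = []
    for name, fetch in cells.items():
        result = query_cell(name, fetch)
        if result is DID_NOT_RESPOND:
            # Skip the down cell instead of failing the whole listing.
            continue
        records.extend(result)
    return records
```

The point of the sentinel is that a failure in one cell is recorded and logged but never propagates, so the caller still receives the records from the cells that are up.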

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/578152

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.openstack.org/578152
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0626dd0f5bbafdfd38b85a074513894d3dc724af
Submitter: Zuul
Branch: stable/queens

commit 0626dd0f5bbafdfd38b85a074513894d3dc724af
Author: Surya Seetharaman <email address hidden>
Date: Fri Jun 15 15:25:13 2018 +0200

    Make nova list and migration-list ignore down cells

    This patch makes InstanceLister and MigrationLister ignore down
    cells and list the instances/records from the up cell as opposed to
    giving 500 to the users as is the current situation.

    Conflicts:
        nova/compute/multi_cell_list.py
        nova/tests/unit/compute/test_instance_list.py
        because I442621e2b4acd63d2cfc8a66ab5b32b64ebcaea0 is missing in Queens.

    Change-Id: I308b494ab07f6936bef94f4c9da45e9473e3534d
    Partial-Bug: #1726301
    (cherry picked from commit ee461b5bf6c4aa1b66be458776f478f718ef809b)

tags: added: in-stable-queens
Matt Riedemann (mriedem) wrote :

We can't backport the fix to pike because the cross-cell listing framework is not in pike and would be a big backport.

https://review.openstack.org/#/q/topic:instance-list+(status:open+OR+status:merged)

Changed in nova:
assignee: Surya Seetharaman (tssurya) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Surya Seetharaman (tssurya)
Changed in nova:
assignee: Surya Seetharaman (tssurya) → Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Surya Seetharaman (tssurya)
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/592428
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=21c5f3e2e5eee162e9781f733f2eac7ebd94655f
Submitter: Zuul
Branch: master

commit 21c5f3e2e5eee162e9781f733f2eac7ebd94655f
Author: Surya Seetharaman <email address hidden>
Date: Thu Aug 16 15:51:35 2018 +0200

    Making instance/migration listing skipping down cells configurable

    Presently if a cell is down, the instances in that cell are
    skipped from results. Sometimes this may not be desirable for
    operators as it may confuse the users who saw more instances in
    their previous listing than now. This patch adds a new api config
    option called list_records_by_skipping_down_cells which can be set to
    False (True by default) if the operator desires to just return an
    API error altogether if the user has any instance in the down cell
    instead of skipping. This is essentially a configurable revert of
    change I308b494ab07f6936bef94f4c9da45e9473e3534d for bug 1726301 so
    that operators can opt into the 500 response behaviour during listing.

    Change-Id: Id749761c58d4e1bc001b745d49b6ff0f3732e133
    Related-Bug: #1726301
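
The effect of such a flag can be sketched as below. Only the option name `list_records_by_skipping_down_cells` and its default of True come from the commit message; `ListingUnavailable`, `list_with_policy`, and the `skip_down_cells` parameter are hypothetical. The sketch is also simplified: it raises whenever any queried cell is down, whereas the commit message describes returning an error only if the user has an instance in the down cell.

```python
class ListingUnavailable(Exception):
    """Hypothetical stand-in for the API error (HTTP 500) sent to the user."""


def list_with_policy(cell_results, skip_down_cells=True):
    """Combine per-cell results according to the operator's choice.

    cell_results: dict of cell_name -> list of records, or None when the
                  cell did not respond.
    skip_down_cells: mirrors the meaning of
                  list_records_by_skipping_down_cells (True by default).
    """
    down = sorted(name for name, recs in cell_results.items() if recs is None)
    if down and not skip_down_cells:
        # Operator opted out of partial listings: fail the whole request.
        raise ListingUnavailable("cells not responding: " + ", ".join(down))
    return [
        record
        for recs in cell_results.values()
        if recs is not None
        for record in recs
    ]
```

With the default, a listing over `{"cell1": ["a"], "cell2": None}` returns only the records from `cell1`; with `skip_down_cells=False` the same input raises instead of returning a partial list.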

Matt Riedemann (mriedem)
Changed in nova:
status: In Progress → Fix Released