nova's server group API returns deleted instances as members

Bug #1682845 reported by Amrith Kumar
This bug affects 1 person
Affects: OpenStack Compute (nova)
  Status: Fix Released
  Importance: High
  Assigned to: Matt Riedemann

Affects: OpenStack DBaaS (Trove)
  Status: Invalid
  Importance: High
  Assigned to: Amrith Kumar

Bug Description

trove gate fails with error in wait_for_delete_non_affinity_master

This is a hard failure that has started occurring in the last couple of days.

AssertionError: Found left-over server group: <ServerGroup: 34e6dcaa-58f9-4aa8-a959-9d6dc8a6be6d>

Error first seen 2017-04-04.

Amrith Kumar (amrith)
description: updated
Amrith Kumar (amrith) wrote :

The behavior that is causing this problem is illustrated from the command line (see illustration.txt). The issue appears to be that Nova is (now) returning deleted instances in a server group, something that it likely didn't do in the past.

description: updated
summary: - trove gate fails with error in wait_for_delete_non_affinity_master
+ nova's server group API now deleted instances as members
summary: - nova's server group API now deleted instances as members
+ nova's server group API returns deleted instances as members
Matt Riedemann (mriedem) wrote :

My guess would be something related to this recent change:

https://review.openstack.org/#/c/443293/

I haven't debugged this in detail though.

Matt Riedemann (mriedem) wrote :

I think the regression is probably because we don't actually delete instance_mappings records when we delete an instance, which is another known bug:

https://bugs.launchpad.net/nova/+bug/1679941

So when we delete an instance, it no longer shows up in the main cell database query, but its instance mapping isn't deleted. Since server groups now use instance mappings to decide which members exist, we still return those deleted instances as group members.
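
To make the mechanism above concrete, here is a minimal toy model of it. This is not nova code: `ToyDeployment` and all of its attributes and methods are hypothetical names invented for illustration. It assumes only what the comment describes, namely that deleting an instance marks it deleted in the cell database while leaving its instance mapping in place.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeployment:
    """Hypothetical stand-in for a cell database plus instance mappings."""

    # uuid -> True if the instance is marked deleted in the cell database
    cell_instances: dict = field(default_factory=dict)
    # uuids that still have an instance mapping record
    instance_mappings: set = field(default_factory=set)
    # uuids recorded as members of one server group
    group_members: list = field(default_factory=list)

    def boot(self, uuid):
        self.cell_instances[uuid] = False
        self.instance_mappings.add(uuid)
        self.group_members.append(uuid)

    def delete(self, uuid):
        # Only the cell record is marked deleted; the mapping is left
        # behind, mirroring the behavior described in bug 1679941.
        self.cell_instances[uuid] = True

    def members_via_mappings(self):
        # Membership filtered by "does a mapping exist?"
        return [u for u in self.group_members if u in self.instance_mappings]

    def members_via_cell(self):
        # Membership filtered by "is the instance not deleted in the cell?"
        return [u for u in self.group_members
                if not self.cell_instances.get(u, True)]


deployment = ToyDeployment()
deployment.boot("a")
deployment.boot("b")
deployment.delete("a")
via_mappings = deployment.members_via_mappings()
via_cell = deployment.members_via_cell()
```

In this sketch, filtering by mappings still reports the deleted instance `"a"` as a member, while filtering by the cell record does not. That difference is the one the regression hinges on.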

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/457097

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: New → In Progress
Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → High
Changed in trove:
status: Confirmed → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/457097
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3e463d4f4383976927dad0092c2bff9b88ad22fe
Submitter: Jenkins
Branch: master

commit 3e463d4f4383976927dad0092c2bff9b88ad22fe
Author: Matt Riedemann <email address hidden>
Date: Sun Apr 16 21:55:27 2017 +0000

    Revert "Make server_groups determine deleted-ness from InstanceMappingList"

    This reverts commit 0af57a9d4be86afe75c986de667cbd9750017f64.

    As reported in bug 1679941, we don't delete instance_mappings records when
    we delete instances from the cells, so we can't really rely on the
    InstanceMappingList at this point to tell if an instance exists or not, since
    the instance might have been deleted but the mapping hasn't.

    Change-Id: Id75868ab9bef5136930d6bc33e197473b2c19977
    Closes-Bug: #1682845

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b2

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.
