"Unexpected API Error" when use "openstack usage show" command

Bug #1837995 reported by GUO Larry
This bug affects 1 person
Affects                    Status        Importance  Assigned to    Milestone
OpenStack Compute (nova)   In Progress   Undecided   melanie witt
  Train                    New           Undecided   Unassigned
  Ussuri                   Fix Released  Undecided   Unassigned
  Victoria                 Fix Released  Undecided   Unassigned

Bug Description

Description
===========
For a non-admin project that has an instance launched, try to query the usage information on the GUI by clicking Overview, or on the CLI with: openstack usage show

You will get "Error: Unable to retrieve usage information." on the GUI, and the following ERROR in the CLI output:

$ openstack usage show
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.OrphanedObjectError'> (HTTP 500) (Request-ID: req-cbea9542-ecce-42fd-b660-fc5f996ea3c3)

Steps to reproduce
==================
Execute "openstack usage show" command
Or click Project - Compute - Overview button on the GUI.

Expected result
===============
No error is reported and the usage information is shown.

Actual result
=============
Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.OrphanedObjectError'> (HTTP 500) (Request-ID: req-cbea9542-ecce-42fd-b660-fc5f996ea3c3)

Environment
===========
1. Exact version of OpenStack you are running.
    OpenStack Stein on CentOS 7

$ rpm -qa | grep nova
openstack-nova-api-19.0.1-1.el7.noarch
puppet-nova-14.4.0-1.el7.noarch
python2-nova-19.0.1-1.el7.noarch
openstack-nova-conductor-19.0.1-1.el7.noarch
openstack-nova-novncproxy-19.0.1-1.el7.noarch
openstack-nova-migration-19.0.1-1.el7.noarch
openstack-nova-common-19.0.1-1.el7.noarch
openstack-nova-scheduler-19.0.1-1.el7.noarch
openstack-nova-console-19.0.1-1.el7.noarch
python2-novaclient-13.0.1-1.el7.noarch
openstack-nova-placement-api-19.0.1-1.el7.noarch
openstack-nova-compute-19.0.1-1.el7.noarch

2. Which hypervisor did you use?
   Libvirt + KVM
   $ rpm -qa | grep kvm
qemu-kvm-ev-2.12.0-18.el7_6.5.1.x86_64
libvirt-daemon-kvm-4.5.0-10.el7_6.12.x86_64
qemu-kvm-common-ev-2.12.0-18.el7_6.5.1.x86_64
$ rpm -qa | grep libvirt
libvirt-gconfig-1.0.0-1.el7.x86_64
libvirt-daemon-driver-nwfilter-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-interface-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-config-nwfilter-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-mpath-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-core-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-secret-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-lxc-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-rbd-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-kvm-4.5.0-10.el7_6.12.x86_64
libvirt-bash-completion-4.5.0-10.el7_6.12.x86_64
libvirt-4.5.0-10.el7_6.12.x86_64
libvirt-glib-1.0.0-1.el7.x86_64
libvirt-daemon-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-qemu-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-config-network-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-disk-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-4.5.0-10.el7_6.12.x86_64
libvirt-python-4.5.0-1.el7.x86_64
libvirt-libs-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-scsi-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-network-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-nodedev-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-logical-4.5.0-10.el7_6.12.x86_64
libvirt-daemon-driver-storage-iscsi-4.5.0-10.el7_6.12.x86_64
libvirt-client-4.5.0-10.el7_6.12.x86_64
libvirt-gobject-1.0.0-1.el7.x86_64

Logs & Configs
==============

nova-api.log

2019-07-26 16:12:53.967 8673 INFO nova.osapi_compute.wsgi.server [req-69d7df76-7dd9-4d42-8eeb-347ef1c9d0a5 f887cc44f21043dca85438d74a47d68d 0d47cfd5b9c94a5790fa4472e576cba6 - default default] c5f::e2 "GET /v2.1/0d47cfd5b9c94a5790fa4472e576cba6/servers/detail?all_tenants=True&changes-since=2019-07-26T08%3A07%3A55.280119%2B00%3A00 HTTP/1.1" status: 200 len: 413 time: 0.0639658
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi [req-cbea9542-ecce-42fd-b660-fc5f996ea3c3 1e45ea9a7d5647a6a938c2ac027822f2 85dd8936d21b46a8878ed59678c7ad9a - default default] Unexpected exception in API method: OrphanedObjectError: Cannot call obj_load_attr on orphaned Instance object
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/wsgi.py", line 671, in wrapped
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/simple_tenant_usage.py", line 291, in show
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi return self._show(req, id)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/simple_tenant_usage.py", line 354, in _show
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi detailed=True, limit=limit, marker=marker)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/simple_tenant_usage.py", line 155, in _tenant_usages_for_period
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi flavor = self._get_flavor(context, instance, flavors)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/simple_tenant_usage.py", line 89, in _get_flavor
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi return instance.get_flavor()
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 1191, in get_flavor
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi return getattr(self, attr)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 67, in getter
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi self.obj_load_attr(name)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 1114, in obj_load_attr
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi self._obj_load_attr(attrname)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 1158, in _obj_load_attr
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi self._load_flavor()
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 967, in _load_flavor
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi self.flavor = instance.flavor
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 67, in getter
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi self.obj_load_attr(name)
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 1101, in obj_load_attr
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi objtype=self.obj_name())
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi OrphanedObjectError: Cannot call obj_load_attr on orphaned Instance object
2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi
2019-07-26 16:12:57.212 8682 INFO nova.api.openstack.wsgi [req-cbea9542-ecce-42fd-b660-fc5f996ea3c3 1e45ea9a7d5647a6a938c2ac027822f2 85dd8936d21b46a8878ed59678c7ad9a - default default] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.OrphanedObjectError'>
2019-07-26 16:12:57.213 8682 INFO nova.osapi_compute.wsgi.server [req-cbea9542-ecce-42fd-b660-fc5f996ea3c3 1e45ea9a7d5647a6a938c2ac027822f2 85dd8936d21b46a8878ed59678c7ad9a - default default] c5f::e2 "GET /v2.1/85dd8936d21b46a8878ed59678c7ad9a/os-simple-tenant-usage/85dd8936d21b46a8878ed59678c7ad9a?start=2019-06-28T08:12:56.417129&end=2019-07-27T08:12:56.417129 HTTP/1.1" status: 500 len: 641 time: 0.3035989
2019-07-26 16:13:00.458 8708 INFO nova.osapi_compute.wsgi.server [req-986dcfad-604b-429d-b9a9-11c8d1ad0036 f887cc44f21043dca85438d74a47d68d 0d47cfd5b9c94a5790fa4472e576cba6 - default default] c5f::e2 "GET /v2.1/0d47cfd5b9c94a5790fa4472e576cba6/servers/detail?all_tenants=True&changes-since=2019-07-26T08%3A08%3A02.099543%2B00%3A00 HTTP/1.1" status: 200 len: 413 time: 0.2606220
2019-07-26 16:13:05.857 8692 INFO nova.osapi_compute.wsgi.server [req-45304a2b-9bc3-4765-aeaf-5be12accbb11 f887cc44f21043dca85438d74a47d68d 0d47cfd5b9c94a5790fa4472e576cba6 - default default] c5f::fc "GET /v2.1/0d47cfd5b9c94a5790fa4472e576cba6/servers/detail?all_tenants=True&changes-since=2019-07-26T08%3A08%3A08.053015%2B00%3A00 HTTP/1.1" status: 200 len: 413 time: 0.0643589

Revision history for this message
Matt Riedemann (mriedem) wrote :

Looks like this is the request:

2019-07-26 16:12:57.213 8682 INFO nova.osapi_compute.wsgi.server [req-cbea9542-ecce-42fd-b660-fc5f996ea3c3 1e45ea9a7d5647a6a938c2ac027822f2 85dd8936d21b46a8878ed59678c7ad9a - default default] c5f::e2 "GET /v2.1/85dd8936d21b46a8878ed59678c7ad9a/os-simple-tenant-usage/85dd8936d21b46a8878ed59678c7ad9a?start=2019-06-28T08:12:56.417129&end=2019-07-27T08:12:56.417129 HTTP/1.1" status: 500 len: 641 time: 0.3035989

I then see this in the stacktrace:

2019-07-26 16:12:57.211 8682 ERROR nova.api.openstack.wsgi File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/simple_tenant_usage.py", line 291, in show

which should be this API:

https://docs.openstack.org/api-ref/compute/?expanded=#show-usage-statistics-for-tenant

It looks like the error comes from trying to lazy-load instance.flavor:

https://github.com/openstack/nova/blob/25e222c68a75f321015f655e9ad6a25365c86b9d/nova/api/openstack/compute/simple_tenant_usage.py#L155

It looks like the issue is probably in this method that gets the instances from all cells:

https://github.com/openstack/nova/blob/25e222c68a75f321015f655e9ad6a25365c86b9d/nova/api/openstack/compute/simple_tenant_usage.py#L109

It's not storing the cell-targeted context on the instance objects, so trying to lazy-load the flavor data from the cell fails. But why would it try to lazy-load the flavor at all, since we join the flavor when we pull the instance from the DB:

https://github.com/openstack/nova/blob/25e222c68a75f321015f655e9ad6a25365c86b9d/nova/api/openstack/compute/simple_tenant_usage.py#L119

So that doesn't make much sense. I would have to try to recreate this in a devstack environment.
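The failure mode described above can be sketched in plain Python. This is a hypothetical, minimal stand-in for the oslo.versionedobjects lazy-load machinery, not Nova's actual classes; all names here are illustrative:

```python
class OrphanedObjectError(Exception):
    """Raised when a lazy-load is attempted with no DB context."""


class Instance:
    def __init__(self, context=None, flavor=None):
        self._context = context  # cell-targeted DB context, or None
        self._flavor = flavor    # set when 'instance_extra' was joined

    def _load_flavor_from_db(self):
        # Stand-in for the real load through the cell-targeted context.
        return self._context["flavor"]

    @property
    def flavor(self):
        if self._flavor is None:
            # Lazy-load path: with no context there is no way to reach
            # the cell database, so the object counts as orphaned.
            if self._context is None:
                raise OrphanedObjectError(
                    "Cannot call obj_load_attr on orphaned Instance object")
            self._flavor = self._load_flavor_from_db()
        return self._flavor


# An instance pulled with expected_attrs=['flavor'] has the attribute
# pre-populated and never lazy-loads; one whose join came back empty and
# whose context was never stored fails exactly as in the traceback above.
populated = Instance(flavor="m1.tiny")
print(populated.flavor)
```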

Revision history for this message
GUO Larry (larrygu) wrote :

Hi Matt Riedemann,

Thank you for looking into this. I tried to dig more today, but found everything now working well for this command, whether the user is admin or not and whether the project is empty or not.

I didn't change anything between the time I reported the issue and now; just a weekend passed.

The timestamp from installing OpenStack via packstack (RDO) is 20190723-185805-_OlLOQ. You can see today's log:
2019-07-29 10:47:50.535 8692 INFO nova.osapi_compute.wsgi.server [req-d192ffe7-0085-4911-bb44-82fce08c1ea4 1e45ea9a7d5647a6a938c2ac027822f2 85dd8936d21b46a8878ed59678c7ad9a - default default] c5f::e2 "GET /v2.1/85dd8936d21b46a8878ed59678c7ad9a/os-simple-tenant-usage/85dd8936d21b46a8878ed59678c7ad9a?start=2019-07-01T02:47:49.717906&end=2019-07-30T02:47:49.717906 HTTP/1.1" status: 200 len: 1049 time: 0.2921960

The only difference in the API request is the start time: the failing one was in June, the working one is now in July. (Just a guess, since my lab was installed in July.)

I have had no luck triggering it again, even after creating a new project to try. But I can install a new OpenStack in the upcoming August to do a cross-month query again.

Revision history for this message
sean mooney (sean-k-mooney) wrote :

I have tried to reproduce this with a devstack deployment I left running over the weekend, and it seems to work as intended with no error.

I'm going to unstack and restack to see if this only happens in a short interval after a cloud is first deployed, but so far I cannot reproduce this.

Revision history for this message
sean mooney (sean-k-mooney) wrote :

On a fresh install I see the same behavior: it just works fine with no error.
This seems to be specific to your actual deployment rather than a generic bug.

I wonder if this could be related to a down cell when you have deleted a flavor.

I have not looked at the code, just read Matt's comment, but if the flavor was deleted after
the instance was booted then maybe it won't show up in the join, requiring us to lazy-load
the embedded copy. If the cell DB connection drops at the same time when we lazy-load it,
then perhaps it triggers this error. That is just a guess, but I doubt this is something that happens
frequently, or other people would have reported it.

Revision history for this message
Matt Riedemann (mriedem) wrote :

If the cell was down, we'd fail to pull the instance from it. If the flavor is deleted, it's deleted from the API DB, not the cell DB, and the instance has a copy of the flavor in the instance_extra table anyway.

Changed in nova:
status: New → Incomplete
Revision history for this message
Matt Riedemann (mriedem) wrote :

Marking as Incomplete since we don't have a reproducer.

Revision history for this message
Matt Riedemann (mriedem) wrote :

I was also unable to recreate this on a master devstack I created today. I created three servers as the demo tenant; the first failed and went to ERROR because I used a network the tenant didn't have access to. I deleted another of the successfully built servers. The openstack usage show output was OK.

stack@devstack:~/devstack$ source openrc admin admin
stack@devstack:~/devstack$ openstack server list --deleted --all-projects
+--------------------------------------+----------------------+---------+----------+--------------------------+---------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+----------------------+---------+----------+--------------------------+---------+
| 91d4e14d-9192-490c-9656-582fa7d492c2 | test-vm2-private-net | DELETED | | cirros-0.4.0-x86_64-disk | m1.tiny |
+--------------------------------------+----------------------+---------+----------+--------------------------+---------+
stack@devstack:~/devstack$ source openrc demo demo
WARNING: setting legacy OS_TENANT_NAME to support cli tools.
stack@devstack:~/devstack$ openstack server list
+--------------------------------------+----------------------+--------+--------------------------------------------------------+--------------------------+---------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+----------------------+--------+--------------------------------------------------------+--------------------------+---------+
| a6cec8f5-ccae-4b90-b2c2-c7d919b802f1 | test-vm3-private-net | ACTIVE | private=fdb8:bc01:8364:0:f816:3eff:fe4a:ce3c, 10.0.0.4 | cirros-0.4.0-x86_64-disk | m1.tiny |
| 4ca7c725-c6df-4860-a5e7-58964a109cc1 | test-vm1 | ERROR | | cirros-0.4.0-x86_64-disk | m1.tiny |
+--------------------------------------+----------------------+--------+--------------------------------------------------------+--------------------------+---------+
stack@devstack:~/devstack$ openstack usage show
Usage from 2019-07-02 to 2019-07-31 on project d091b3b4126942768751c69e2ae510ac:
+---------------+-------+
| Field | Value |
+---------------+-------+
| CPU Hours | 0.05 |
| Disk GB-Hours | 0.05 |
| RAM MB-Hours | 26.92 |
| Servers | 2 |
+---------------+-------+
stack@devstack:~/devstack$

Nova commit:

commit 8ad437d1422292432e375f3f5c20fc2781726085 (HEAD -> master, origin/master, origin/HEAD)
Merge: e9e75e9f16 6be668e519
Author: Zuul <email address hidden>
Date: Tue Jul 30 12:31:32 2019 +0000

    Merge "Stop sending bad values from libosinfo to libvirt"

Revision history for this message
GUO Larry (larrygu) wrote :

Thank you all for taking care of this. I can't reproduce it by installing a new small OpenStack. You may close it as not reproducible.
I'll update it if I encounter it again.

Chris Dent (cdent)
Changed in nova:
status: Incomplete → Invalid
Revision history for this message
Mohammed Naser (mnaser) wrote :

I've actually just been able to replicate this in an environment. This was due to the fact that instance_extra actually had 2 records for the same instance_uuid, one fully empty and the other populated.

When checking the database log, it seems that the system was actually failing to create instance_info_caches because of a conflict in the database, complaining that a record with the same instance_uuid already existed (...somehow?)

I assume that on the first run, the code creates an instance_extra record that's empty (but then fails to create instance_info_cache for ????) and on the second run it actually creates instance_extra once again (and the latter row starts getting populated).

We would fail harder on this if we had a unique constraint on instance_uuid in instance_extra, which we don't seem to have.
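The point about a missing unique constraint can be shown with a toy schema (a hypothetical, simplified stand-in using sqlite3, not Nova's real DDL): with UNIQUE on instance_uuid, the second insert for the same instance fails loudly instead of silently leaving two rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instance_extra (
        id INTEGER PRIMARY KEY,
        instance_uuid TEXT UNIQUE,  -- the constraint Nova's schema lacks
        flavor TEXT
    )""")

# First run: an empty record is created for the instance.
conn.execute("INSERT INTO instance_extra (instance_uuid, flavor) "
             "VALUES (?, ?)", ("abc-123", None))

# Second run: the duplicate insert for the same instance_uuid is
# rejected by the constraint instead of creating a second row.
try:
    conn.execute("INSERT INTO instance_extra (instance_uuid, flavor) "
                 "VALUES (?, ?)", ("abc-123", "m1.tiny"))
except sqlite3.IntegrityError as exc:
    print("duplicate rejected:", exc)
```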

Revision history for this message
melanie witt (melwitt) wrote :

FWIW I am able to reproduce a state where the error is raised:

Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'nova.exception.OrphanedObjectError'> (HTTP 500) (Request-ID: req-75f3d4f1-9cb3-4d46-a0da-54f63764a0bb)

by using the following steps to get into a state where the database has been only partially archived:

http://paste.openstack.org/show/801782

To get an error for 'openstack usage', the deleted [but not purged] instance has to have been ACTIVE at some point in the past; do a full archive before starting the --max_rows 1 loop to speed up hitting the bug.

If an instance's 'instance_extra' record is archived (moved to the shadow table) while the 'instances' record remains, OrphanedObjectError will be hit for various API requests.
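The partially-archived state described above can be modeled with a toy schema (an assumed, heavily simplified sketch using sqlite3, not Nova's real tables): the child 'instance_extra' row is moved to its shadow table while the 'instances' row is left behind, so a usage-style join still finds the instance but comes back with no flavor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (uuid TEXT, deleted INTEGER);
    CREATE TABLE instance_extra (instance_uuid TEXT, flavor TEXT);
    CREATE TABLE shadow_instance_extra (instance_uuid TEXT, flavor TEXT);
    INSERT INTO instances VALUES ('abc-123', 1);
    INSERT INTO instance_extra VALUES ('abc-123', 'm1.tiny');
""")

# "Archive" only the child row -- what a --max_rows-limited or
# interrupted archive run can leave behind.
conn.execute("INSERT INTO shadow_instance_extra "
             "SELECT * FROM instance_extra WHERE instance_uuid = 'abc-123'")
conn.execute("DELETE FROM instance_extra WHERE instance_uuid = 'abc-123'")

# A usage-style query still sees the instance, but the joined flavor is
# gone -- the condition that triggers the lazy-load / OrphanedObjectError.
row = conn.execute("""
    SELECT i.uuid, e.flavor
    FROM instances i
    LEFT JOIN instance_extra e ON e.instance_uuid = i.uuid
""").fetchone()
print(row)
```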

Revision history for this message
melanie witt (melwitt) wrote :
Changed in nova:
assignee: nobody → melanie witt (melwitt)
status: Invalid → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.0.0.0rc1

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/nova/+/785068
Committed: https://opendev.org/openstack/nova/commit/21241b38dd4bc2c74a69aed489ada5c611395920
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 21241b38dd4bc2c74a69aed489ada5c611395920
Author: melanie witt <email address hidden>
Date: Fri Feb 12 21:40:16 2021 +0000

    Add functional test for bug 1837995

    This adds a functional test to reproduce the bug where when calling an
    API that will examine a deleted instance record (examples: GET /servers/detail
    and GET /os-simple-tenant-usage), if that instance's related records
    have only been partially archived (example: 'instance_extra' record was
    archived but 'instances' record has not been archived yet) then the API
    call fails with the following error:

      Unexpected API Error ... <class 'nova.exception.OrphanedObjectError'>
      (HTTP 500)

    Related-Bug: #1837995

    Change-Id: Ia225ee86536248d2540de0d7a1cd012333e708a2
    (cherry picked from commit 3d4e53f832792cc49001f1588164e28b84c16d7a)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/victoria)
Download full text (4.2 KiB)

Reviewed: https://review.opendev.org/c/openstack/nova/+/785069
Committed: https://opendev.org/openstack/nova/commit/7b4f4796478941eafa9c0997f7ef03293c442d94
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 7b4f4796478941eafa9c0997f7ef03293c442d94
Author: melanie witt <email address hidden>
Date: Wed Jan 27 22:49:19 2021 +0000

    Dynamically archive FK related records in archive_deleted_rows

    Currently, it is possible to "partially archive" the database by
    running 'nova-manage db archive_deleted_rows' with --max_rows or by
    interrupting the archive process in any way. When this happens, it is
    possible to have archived a record with a foreign key relationship to a
    parent record (example: 'instance_extra' table record is archived while
    the 'instances' table record remains).

    When an instance's records become "split" in this way, any API request
    that can (1) access the deleted instance and (2) tries to access data
    that should be in a child table (example: the embedded flavor for an
    instance) will fail with an OrphanedObjectError and HTTP 500 to the
    user. Examples of APIs that are affected by this are the tenant usage
    APIs and listing of deleted instances as admin.

    In the tenant usage example, the API looks at deleted instances to
    calculate usage over a time period. It pulls deleted and non-deleted
    instances and does instance.get_flavor() to calculate their usage. The
    flavor data is expected to be present because
    expecteds_attrs=['flavor'] is used to do a join with the
    'instance_extra' table and populate the instance object's flavor data.
    When get_flavor() is called, it tries to access the instance.flavor
    attribute (which hasn't been populated because the 'instance_extra'
    record is gone). That triggers a lazy-load of the flavor which loads
    the instance from the database again with expected_attrs=['flavor']
    again which doesn't populate instance.flavor (again) because the
    'instance_extra' record is gone. Then the Instance._load_flavor code
    intentionally orphans the instance record to avoid triggering
    lazy-loads while it attempts to populate instance.flavor,
    instance.new_flavor, and instance.old_flavor. Finally, another
    lazy-load is triggered (because instance.flavor is still not populated)
    and fails with OrphanedObjectError.

    One way to solve this problem is to make it impossible for
    archive_deleted_records to orphan records that are related by foreign
    key relationships. The approach is to process parent tables first
    (opposite of today where we process child tables first) and find all of
    the tables that refer to it by foreign keys, create and collect
    insert/delete statements for those child records, and then put them all
    together in a single database transaction to archive all related
    records "atomically". The idea is that if anything were to interrupt
    the transaction (errors or other) it would roll back and keep all the
    related records together. Either all or archived or none are archived.

    This changes the logic of the per table archive to discover tables ...

Read more...
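The fix's core idea, archiving a parent row and its foreign-key-related child rows in one database transaction, can be sketched as follows (a simplified, hypothetical schema using sqlite3; the real implementation discovers related tables dynamically via their FK metadata):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (uuid TEXT);
    CREATE TABLE instance_extra (instance_uuid TEXT);
    CREATE TABLE shadow_instances (uuid TEXT);
    CREATE TABLE shadow_instance_extra (instance_uuid TEXT);
    INSERT INTO instances VALUES ('abc-123');
    INSERT INTO instance_extra VALUES ('abc-123');
""")


def archive_instance(conn, uuid, fail=False):
    """Move one instance and its child rows atomically."""
    try:
        with conn:  # one transaction: commit on success, rollback on error
            conn.execute("INSERT INTO shadow_instance_extra "
                         "SELECT * FROM instance_extra "
                         "WHERE instance_uuid = ?", (uuid,))
            conn.execute("DELETE FROM instance_extra "
                         "WHERE instance_uuid = ?", (uuid,))
            if fail:  # simulate an interruption mid-archive
                raise RuntimeError("interrupted")
            conn.execute("INSERT INTO shadow_instances "
                         "SELECT * FROM instances WHERE uuid = ?", (uuid,))
            conn.execute("DELETE FROM instances WHERE uuid = ?", (uuid,))
    except RuntimeError:
        pass  # the rollback has already undone the partial archive


archive_instance(conn, 'abc-123', fail=True)  # rolled back: nothing split
archive_instance(conn, 'abc-123')             # archived atomically
```

Either all related records are archived or none are; an interruption can no longer leave the 'instances' row without its 'instance_extra' row.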

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 22.2.1

This issue was fixed in the openstack/nova 22.2.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/nova/+/785071
Committed: https://opendev.org/openstack/nova/commit/37cff3c74d54bf0d19abe9502e5979b81e24f918
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 37cff3c74d54bf0d19abe9502e5979b81e24f918
Author: melanie witt <email address hidden>
Date: Fri Feb 12 21:40:16 2021 +0000

    Add functional test for bug 1837995

    This adds a functional test to reproduce the bug where when calling an
    API that will examine a deleted instance record (examples: GET /servers/detail
    and GET /os-simple-tenant-usage), if that instance's related records
    have only been partially archived (example: 'instance_extra' record was
    archived but 'instances' record has not been archived yet) then the API
    call fails with the following error:

      Unexpected API Error ... <class 'nova.exception.OrphanedObjectError'>
      (HTTP 500)

    Related-Bug: #1837995

    Change-Id: Ia225ee86536248d2540de0d7a1cd012333e708a2
    (cherry picked from commit 3d4e53f832792cc49001f1588164e28b84c16d7a)
    (cherry picked from commit 21241b38dd4bc2c74a69aed489ada5c611395920)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ussuri)
Download full text (4.1 KiB)

Reviewed: https://review.opendev.org/c/openstack/nova/+/785072
Committed: https://opendev.org/openstack/nova/commit/8a858eca55544f25427c3762f14523b1507261ee
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 8a858eca55544f25427c3762f14523b1507261ee
Author: melanie witt <email address hidden>
Date: Wed Jan 27 22:49:19 2021 +0000

    Dynamically archive FK related records in archive_deleted_rows

    Currently, it is possible to "partially archive" the database by
    running 'nova-manage db archive_deleted_rows' with --max_rows or by
    interrupting the archive process in any way. When this happens, it is
    possible to have archived a record with a foreign key relationship to a
    parent record (example: 'instance_extra' table record is archived while
    the 'instances' table record remains).

    When an instance's records become "split" in this way, any API request
    that can (1) access the deleted instance and (2) tries to access data
    that should be in a child table (example: the embedded flavor for an
    instance) will fail with an OrphanedObjectError and HTTP 500 to the
    user. Examples of APIs that are affected by this are the tenant usage
    APIs and listing of deleted instances as admin.

    In the tenant usage example, the API looks at deleted instances to
    calculate usage over a time period. It pulls deleted and non-deleted
    instances and does instance.get_flavor() to calculate their usage. The
    flavor data is expected to be present because
    expecteds_attrs=['flavor'] is used to do a join with the
    'instance_extra' table and populate the instance object's flavor data.
    When get_flavor() is called, it tries to access the instance.flavor
    attribute (which hasn't been populated because the 'instance_extra'
    record is gone). That triggers a lazy-load of the flavor which loads
    the instance from the database again with expected_attrs=['flavor']
    again which doesn't populate instance.flavor (again) because the
    'instance_extra' record is gone. Then the Instance._load_flavor code
    intentionally orphans the instance record to avoid triggering
    lazy-loads while it attempts to populate instance.flavor,
    instance.new_flavor, and instance.old_flavor. Finally, another
    lazy-load is triggered (because instance.flavor is still not populated)
    and fails with OrphanedObjectError.

    One way to solve this problem is to make it impossible for
    archive_deleted_records to orphan records that are related by foreign
    key relationships. The approach is to process parent tables first
    (opposite of today where we process child tables first) and find all of
    the tables that refer to it by foreign keys, create and collect
    insert/delete statements for those child records, and then put them all
    together in a single database transaction to archive all related
    records "atomically". The idea is that if anything were to interrupt
    the transaction (errors or other) it would roll back and keep all the
    related records together. Either all or archived or none are archived.

    This changes the logic of the per table archive to discover tables th...

Read more...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 21.2.3

This issue was fixed in the openstack/nova 21.2.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/train)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/785074
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/785075
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.
