Dynamically archive FK related records in archive_deleted_rows
Currently, it is possible to "partially archive" the database by
running 'nova-manage db archive_deleted_rows' with --max_rows or by
interrupting the archive process in any way. When this happens, it is
possible to have archived a record with a foreign key relationship to a
parent record (example: 'instance_extra' table record is archived while
the 'instances' table record remains).
When an instance's records become "split" in this way, any API request
that can (1) access the deleted instance and (2) tries to access data
that should be in a child table (example: the embedded flavor for an
instance) will fail with an OrphanedObjectError and HTTP 500 to the
user. Examples of APIs that are affected by this are the tenant usage
APIs and listing of deleted instances as admin.
In the tenant usage example, the API looks at deleted instances to
calculate usage over a time period. It pulls deleted and non-deleted
instances and does instance.get_flavor() to calculate their usage. The
flavor data is expected to be present because
expected_attrs=['flavor'] is used to do a join with the
'instance_extra' table and populate the instance object's flavor data.
When get_flavor() is called, it tries to access the instance.flavor
attribute (which hasn't been populated because the 'instance_extra'
record is gone). That triggers a lazy-load of the flavor, which loads
the instance from the database again with expected_attrs=['flavor'],
which again fails to populate instance.flavor because the
'instance_extra' record is gone. Then the Instance._load_flavor code
intentionally orphans the instance record to avoid triggering
lazy-loads while it attempts to populate instance.flavor,
instance.new_flavor, and instance.old_flavor. Finally, another
lazy-load is triggered (because instance.flavor is still not populated)
and fails with OrphanedObjectError.
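The failure chain above can be sketched with a toy model. This is not
nova's actual object code: Instance, get_by_uuid, and the dict-backed
"database" here are simplified stand-ins invented for illustration of
the lazy-load loop.

```python
class OrphanedObjectError(Exception):
    """Raised when a lazy-load is attempted on an orphaned object."""

# Fake DB: the 'instances' row survived archiving, but the
# 'instance_extra' child row (which holds the flavor) was archived away.
DB = {
    'instances': {'uuid-1': {'deleted': True}},
    'instance_extra': {},  # orphaned: no row for 'uuid-1'
}

class Instance:
    def __init__(self, uuid):
        self.uuid = uuid
        self._orphaned = False
        self._flavor = None

    @classmethod
    def get_by_uuid(cls, uuid, expected_attrs=()):
        inst = cls(uuid)
        if 'flavor' in expected_attrs:
            # The join with instance_extra finds nothing, so the
            # flavor stays unset -- exactly the "split" case.
            extra = DB['instance_extra'].get(uuid)
            if extra is not None:
                inst._flavor = extra['flavor']
        return inst

    def get_flavor(self):
        if self._flavor is None:
            self._lazy_load_flavor()
        return self._flavor

    def _lazy_load_flavor(self):
        if self._orphaned:
            raise OrphanedObjectError('flavor lazy-load on orphan')
        # Reload with expected_attrs=['flavor']; still no extra row.
        fresh = Instance.get_by_uuid(self.uuid,
                                     expected_attrs=['flavor'])
        # Orphan the object while copying flavor data, to avoid
        # recursive lazy-loads during the copy.
        self._orphaned = True
        self._flavor = fresh._flavor  # still None
        if self._flavor is None:
            # The next lazy-load now fails: the object is orphaned.
            self._lazy_load_flavor()
```

Calling get_flavor() on such an instance raises OrphanedObjectError,
which surfaces to the API user as an HTTP 500.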
One way to solve this problem is to make it impossible for
archive_deleted_rows to orphan records that are related by foreign
key relationships. The approach is to process parent tables first
(the opposite of today, where we process child tables first), find
all of the tables that refer to the parent table by foreign keys,
create and collect insert/delete statements for those child records,
and then put them all together in a single database transaction to
archive all related records "atomically". The idea is that if anything
were to interrupt the transaction (an error or otherwise) it would
roll back and keep all the related records together. Either all are
archived or none are archived.
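A minimal sketch of the single-transaction idea, using Python's
sqlite3 and a tiny invented schema (real nova uses SQLAlchemy against
its own shadow tables; the table and column names here are stand-ins):

```python
import sqlite3

# Stand-in schema: a parent table, one FK child, and shadow tables.
conn = sqlite3.connect(':memory:')
conn.execute('PRAGMA foreign_keys = ON')
conn.executescript("""
CREATE TABLE instances (id INTEGER PRIMARY KEY);
CREATE TABLE instance_extra (
    id INTEGER PRIMARY KEY,
    instance_id INTEGER REFERENCES instances(id));
CREATE TABLE shadow_instances (id INTEGER PRIMARY KEY);
CREATE TABLE shadow_instance_extra (
    id INTEGER PRIMARY KEY,
    instance_id INTEGER);
INSERT INTO instances VALUES (1);
INSERT INTO instance_extra VALUES (10, 1);
""")

def archive_instance(conn, instance_id):
    """Move one parent row and its FK-dependent child rows to the
    shadow tables in a single transaction."""
    with conn:  # one transaction: an error rolls the whole batch back
        # Child rows first (copy to shadow, then delete) so the FK
        # constraint is never violated mid-archive.
        n_children = conn.execute(
            'INSERT INTO shadow_instance_extra '
            'SELECT * FROM instance_extra WHERE instance_id = ?',
            (instance_id,)).rowcount
        conn.execute(
            'DELETE FROM instance_extra WHERE instance_id = ?',
            (instance_id,))
        # Parent row last, inside the same transaction.
        conn.execute(
            'INSERT INTO shadow_instances '
            'SELECT * FROM instances WHERE id = ?', (instance_id,))
        conn.execute(
            'DELETE FROM instances WHERE id = ?', (instance_id,))
        return n_children + 1
```

Calling archive_instance(conn, 1) moves both rows together; if any
statement failed, the context manager would roll the transaction back
and neither the parent nor the child row would move.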
This changes the logic of the per-table archive to discover tables that
refer to the table by foreign keys and generate insert/delete query
statements to execute in the same database transaction as the table
archive itself. The extra records archived along with the table are
added to the rows_archived result. The existing code for "archiving
records if instance is deleted" also has to be removed along with this
because the new logic does the same thing dynamically and makes it
obsolete. Finally, some assertions in the unit tests need to be changed
or removed because they were assuming certain types of archiving
failures due to foreign key constraint violations that can no longer
occur with the new dynamic logic for archiving child records.
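The "discover tables that refer to the table by foreign keys" step can
be illustrated with SQLite's introspection PRAGMA (nova does the
equivalent through SQLAlchemy table metadata; the schema below is an
invented stand-in):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE instances (uuid TEXT PRIMARY KEY);
CREATE TABLE instance_extra (
    id INTEGER PRIMARY KEY,
    instance_uuid TEXT REFERENCES instances(uuid));
CREATE TABLE instance_faults (
    id INTEGER PRIMARY KEY,
    instance_uuid TEXT REFERENCES instances(uuid));
CREATE TABLE flavors (id INTEGER PRIMARY KEY);
""")

def tables_referring_to(conn, parent):
    """Return {child_table: fk_column} for every table that has a
    foreign key pointing at `parent`."""
    refs = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # foreign_key_list rows: (id, seq, ref_table, from_col,
        # to_col, on_update, on_delete, match)
        for fk in conn.execute(f'PRAGMA foreign_key_list({table})'):
            if fk[2] == parent:
                refs[table] = fk[3]
    return refs
```

With this, archiving 'instances' would know to archive matching
'instance_extra' and 'instance_faults' rows in the same transaction,
with no per-table special casing.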
Closes-Bug: #1837995
Conflicts:
    nova/db/sqlalchemy/api.py
NOTE(melwitt): The conflict is because change
I23bb9e539d08f5c6202909054c2dd49b6c7a7a0e (Remove six.text_type (1/2))
is not in Victoria.
Change-Id: Ie653e5ec69d16ae469f1f8171fee85aea754edff
(cherry picked from commit becb94ae643ab4863daa564783646921b4a2b372)
Reviewed: https://review.opendev.org/c/openstack/nova/+/785069
Committed: https://opendev.org/openstack/nova/commit/7b4f4796478941eafa9c0997f7ef03293c442d94
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 7b4f4796478941eafa9c0997f7ef03293c442d94
Author: melanie witt <email address hidden>
Date: Wed Jan 27 22:49:19 2021 +0000