Metadata API cross joining instance_metadata and instance_system_metadata

Bug #1799298 reported by Sergio de Carvalho
32
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
melanie witt
Ocata
Triaged
Medium
Unassigned
Pike
In Progress
Medium
Unassigned
Queens
Fix Released
Medium
s10
Rocky
Fix Released
Medium
s10
Stein
Fix Released
Undecided
s10
Train
Fix Released
Undecided
s10
Ussuri
Fix Released
Undecided
s10
Victoria
Fix Released
Undecided
s10
OpenStack Security Advisory
Won't Fix
Undecided
Unassigned

Bug Description

Description
===========

While troubleshooting a production issue we identified that the Nova metadata API is fetching a lot more raw data from the database than seems necessary. The problem appears to be caused by the SQL query used to fetch instance data, which joins the "instance" table with, among others, two metadata tables: "instance_metadata" and "instance_system_metadata". Below is a simplified version of this query which was captured by adding extra logging (the full query is listed at the end of this bug report):

SELECT ...
  FROM (SELECT ...
          FROM `instances`
         WHERE `instances` . `deleted` = ?
           AND `instances` . `uuid` = ?
         LIMIT ?) AS `anon_1`
  LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
    ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` . `instance_uuid`
  LEFT OUTER JOIN (`security_group_instance_association` AS `security_group_instance_association_1`
                   INNER JOIN `security_groups` AS `security_groups_1`
                   ON `security_groups_1` . `id` = `security_group_instance_association_1` . `security_group_id`
                   AND `security_group_instance_association_1` . `deleted` = ?
                   AND `security_groups_1` . `deleted` = ? )
    ON `security_group_instance_association_1` . `instance_uuid` = `anon_1` . `instances_uuid`
   AND `anon_1` . `instances_deleted` = ?
  LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
    ON `security_group_rules_1` . `parent_group_id` = `security_groups_1` . `id`
   AND `security_group_rules_1` . `deleted` = ?
  LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
    ON `instance_info_caches_1` . `instance_uuid` = `anon_1` . `instances_uuid`
  LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
    ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
  LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
    ON `instance_metadata_1` . `instance_uuid` = `anon_1` . `instances_uuid`
   AND `instance_metadata_1` . `deleted` = ?

The instance table has a 1-to-many relationship to both "instance_metadata" and "instance_system_metadata" tables, so the query is effectively producing a cross join of both metadata tables.

Steps to reproduce
==================

To illustrate the impact of this query, add 2 properties to a running instance and verify that it has 2 records in "instance_metadata", as well as other records in "instance_system_metadata" such as base image properties:

> select instance_uuid,`key`,value from instance_metadata where instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+--------------------------------------+-----------+--------+
| instance_uuid | key | value |
+--------------------------------------+-----------+--------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value |
+--------------------------------------+-----------+--------+
2 rows in set (0.61 sec)

> select instance_uuid,`key`,valusystem_metadata where instance_uuid = 'a6cf4a6a-effe-4438-9b7f-d61b23117b9b';
+------------------------+--------------------------------------+
| key | value |
+------------------------+--------------------------------------+
| image_disk_format | qcow2 |
| image_min_ram | 0 |
| image_min_disk | 20 |
| image_base_image_ref | 39cd564f-6a29-43e2-815b-62097968486a |
| image_container_format | bare |
+------------------------+--------------------------------------+
5 rows in set (0.00 sec)

For this particular instance, the generated query used by the metadata API will fetch 10 records from the database:

+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
| anon_1_instances_uuid | instance_metadata_1_key | instance_metadata_1_value | instance_system_metadata_1_key | instance_system_metadata_1_value |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_disk_format | qcow2 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_disk_format | qcow2 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_min_ram | 0 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_min_ram | 0 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_min_disk | 20 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_min_disk | 20 |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_base_image_ref | 39cd564f-6a29-43e2-815b-62097968486a |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_base_image_ref | 39cd564f-6a29-43e2-815b-62097968486a |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property1 | value1 | image_container_format | bare |
| a6cf4a6a-effe-4438-9b7f-d61b23117b9b | property2 | value | image_container_format | bare |
+--------------------------------------+-------------------------+---------------------------+--------------------------------+--------------------------------------+
10 rows in set (0.00 sec)

Of course this is only a problem when instances have a lot of metadata records. An instance with 50 rows in "instance_metadata" and 50 rows in "instance_system_metadata" will fetch 50 x 50 = 2,500 rows from the database. It's not difficult to see how this can escalate quickly. This can be a particularly significant problem in a HA scenario with multiple API nodes pulling data from multiple database nodes.

This issue is affecting clusters running OpenStack Mitaka. I verified that this is not an issue on clusters running Icehouse because, in Icehouse, instance data is pulled as needed, executing separate queries for each table. However, I as far as I could see, this issue could be affecting every release since Mitaka.

Full SQL
========

The generated SQL query below was captured by adding extra logging to _build_instance_get in

https://github.com/openstack/nova/blob/mitaka-eol/nova/db/sqlalchemy/api.py#L2005

SELECT `anon_1` . `instances_created_at` AS `anon_1_instances_created_at`,
       `anon_1` . `instances_updated_at` AS `anon_1_instances_updated_at`,
       `anon_1` . `instances_deleted_at` AS `anon_1_instances_deleted_at`,
       `anon_1` . `instances_deleted` AS `anon_1_instances_deleted`,
       `anon_1` . `instances_id` AS `anon_1_instances_id`,
       `anon_1` . `instances_user_id` AS `anon_1_instances_user_id`,
       `anon_1` . `instances_project_id` AS `anon_1_instances_project_id`,
       `anon_1` . `instances_image_ref` AS `anon_1_instances_image_ref`,
       `anon_1` . `instances_kernel_id` AS `anon_1_instances_kernel_id`,
       `anon_1` . `instances_ramdisk_id` AS `anon_1_instances_ramdisk_id`,
       `anon_1` . `instances_hostname` AS `anon_1_instances_hostname`,
       `anon_1` . `instances_launch_index` AS `anon_1_instances_launch_index`,
       `anon_1` . `instances_key_name` AS `anon_1_instances_key_name`,
       `anon_1` . `instances_key_data` AS `anon_1_instances_key_data`,
       `anon_1` . `instances_power_state` AS `anon_1_instances_power_state`,
       `anon_1` . `instances_vm_state` AS `anon_1_instances_vm_state`,
       `anon_1` . `instances_task_state` AS `anon_1_instances_task_state`,
       `anon_1` . `instances_memory_mb` AS `anon_1_instances_memory_mb`,
       `anon_1` . `instances_vcpus` AS `anon_1_instances_vcpus`,
       `anon_1` . `instances_root_gb` AS `anon_1_instances_root_gb`,
       `anon_1` . `instances_ephemeral_gb` AS `anon_1_instances_ephemeral_gb`,
       `anon_1` . `instances_ephemeral_key_uuid` AS `anon_1_instances_ephemeral_key_uuid`,
       `anon_1` . `instances_host` AS `anon_1_instances_host`,
       `anon_1` . `instances_node` AS `anon_1_instances_node`,
       `anon_1` . `instances_instance_type_id` AS `anon_1_instances_instance_type_id`,
       `anon_1` . `instances_user_data` AS `anon_1_instances_user_data`,
       `anon_1` . `instances_reservation_id` AS `anon_1_instances_reservation_id`,
       `anon_1` . `instances_launched_at` AS `anon_1_instances_launched_at`,
       `anon_1` . `instances_terminated_at` AS `anon_1_instances_terminated_at`,
       `anon_1` . `instances_availability_zone` AS `anon_1_instances_availability_zone`,
       `anon_1` . `instances_display_name` AS `anon_1_instances_display_name`,
       `anon_1` . `instances_display_description` AS `anon_1_instances_display_description`,
       `anon_1` . `instances_launched_on` AS `anon_1_instances_launched_on`,
       `anon_1` . `instances_locked` AS `anon_1_instances_locked`,
       `anon_1` . `instances_locked_by` AS `anon_1_instances_locked_by`,
       `anon_1` . `instances_os_type` AS `anon_1_instances_os_type`,
       `anon_1` . `instances_architecture` AS `anon_1_instances_architecture`,
       `anon_1` . `instances_vm_mode` AS `anon_1_instances_vm_mode`,
       `anon_1` . `instances_uuid` AS `anon_1_instances_uuid`,
       `anon_1` . `instances_root_device_name` AS `anon_1_instances_root_device_name`,
       `anon_1` . `instances_default_ephemeral_device` AS `anon_1_instances_default_ephemeral_device`,
       `anon_1` . `instances_default_swap_device` AS `anon_1_instances_default_swap_device`,
       `anon_1` . `instances_config_drive` AS `anon_1_instances_config_drive`,
       `anon_1` . `instances_access_ip_v4` AS `anon_1_instances_access_ip_v4`,
       `anon_1` . `instances_access_ip_v6` AS `anon_1_instances_access_ip_v6`,
       `anon_1` . `instances_auto_disk_config` AS `anon_1_instances_auto_disk_config`,
       `anon_1` . `instances_progress` AS `anon_1_instances_progress`,
       `anon_1` . `instances_shutdown_terminate` AS `anon_1_instances_shutdown_terminate`,
       `anon_1` . `instances_disable_terminate` AS `anon_1_instances_disable_terminate`,
       `anon_1` . `instances_cell_name` AS `anon_1_instances_cell_name`,
       `anon_1` . `instances_internal_id` AS `anon_1_instances_internal_id`,
       `anon_1` . `instances_cleaned` AS `anon_1_instances_cleaned`,
       `instance_system_metadata_1` . `created_at` AS `instance_system_metadata_1_created_at`,
       `instance_system_metadata_1` . `updated_at` AS `instance_system_metadata_1_updated_at`,
       `instance_system_metadata_1` . `deleted_at` AS `instance_system_metadata_1_deleted_at`,
       `instance_system_metadata_1` . `deleted` AS `instance_system_metadata_1_deleted`,
       `instance_system_metadata_1` . `id` AS `instance_system_metadata_1_id`,
       `instance_system_metadata_1` . `key` AS `instance_system_metadata_1_key`,
       `instance_system_metadata_1` . `value` AS `instance_system_metadata_1_value`,
       `instance_system_metadata_1` . `instance_uuid` AS `instance_system_metadata_1_instance_uuid`,
       `security_groups_1` . `created_at` AS `security_groups_1_created_at`,
       `security_groups_1` . `updated_at` AS `security_groups_1_updated_at`,
       `security_groups_1` . `deleted_at` AS `security_groups_1_deleted_at`,
       `security_groups_1` . `deleted` AS `security_groups_1_deleted`,
       `security_groups_1` . `id` AS `security_groups_1_id`,
       `security_groups_1` . `name` AS `security_groups_1_name`,
       `security_groups_1` . `description` AS `security_groups_1_description`,
       `security_groups_1` . `user_id` AS `security_groups_1_user_id`,
       `security_groups_1` . `project_id` AS `security_groups_1_project_id`,
       `security_group_rules_1` . `created_at` AS `security_group_rules_1_created_at`,
       `security_group_rules_1` . `updated_at` AS `security_group_rules_1_updated_at`,
       `security_group_rules_1` . `deleted_at` AS `security_group_rules_1_deleted_at`,
       `security_group_rules_1` . `deleted` AS `security_group_rules_1_deleted`,
       `security_group_rules_1` . `id` AS `security_group_rules_1_id`,
       `security_group_rules_1` . `parent_group_id` AS `security_group_rules_1_parent_group_id`,
       `security_group_rules_1` . `protocol` AS `security_group_rules_1_protocol`,
       `security_group_rules_1` . `from_port` AS `security_group_rules_1_from_port`,
       `security_group_rules_1` . `to_port` AS `security_group_rules_1_to_port`,
       `security_group_rules_1` . `cidr` AS `security_group_rules_1_cidr`,
       `security_group_rules_1` . `group_id` AS `security_group_rules_1_group_id`,
       `instance_info_caches_1` . `created_at` AS `instance_info_caches_1_created_at`,
       `instance_info_caches_1` . `updated_at` AS `instance_info_caches_1_updated_at`,
       `instance_info_caches_1` . `deleted_at` AS `instance_info_caches_1_deleted_at`,
       `instance_info_caches_1` . `deleted` AS `instance_info_caches_1_deleted`,
       `instance_info_caches_1` . `id` AS `instance_info_caches_1_id`,
       `instance_info_caches_1` . `network_info` AS `instance_info_caches_1_network_info`,
       `instance_info_caches_1` . `instance_uuid` AS `instance_info_caches_1_instance_uuid`,
       `instance_extra_1` . `flavor` AS `instance_extra_1_flavor`,
       `instance_extra_1` . `created_at` AS `instance_extra_1_created_at`,
       `instance_extra_1` . `updated_at` AS `instance_extra_1_updated_at`,
       `instance_extra_1` . `deleted_at` AS `instance_extra_1_deleted_at`,
       `instance_extra_1` . `deleted` AS `instance_extra_1_deleted`,
       `instance_extra_1` . `id` AS `instance_extra_1_id`,
       `instance_extra_1` . `instance_uuid` AS `instance_extra_1_instance_uuid`,
       `instance_metadata_1` . `created_at` AS `instance_metadata_1_created_at`,
       `instance_metadata_1` . `updated_at` AS `instance_metadata_1_updated_at`,
       `instance_metadata_1` . `deleted_at` AS `instance_metadata_1_deleted_at`,
       `instance_metadata_1` . `deleted` AS `instance_metadata_1_deleted`,
       `instance_metadata_1` . `id` AS `instance_metadata_1_id`,
       `instance_metadata_1` . `key` AS `instance_metadata_1_key`,
       `instance_metadata_1` . `value` AS `instance_metadata_1_value`,
       `instance_metadata_1` . `instance_uuid` AS `instance_metadata_1_instance_uuid`
  FROM (SELECT `instances` . `created_at` AS `instances_created_at`,
               `instances` . `updated_at` AS `instances_updated_at`,
               `instances` . `deleted_at` AS `instances_deleted_at`,
               `instances` . `deleted` AS `instances_deleted`,
               `instances` . `id` AS `instances_id`,
               `instances` . `user_id` AS `instances_user_id`,
               `instances` . `project_id` AS `instances_project_id`,
               `instances` . `image_ref` AS `instances_image_ref`,
               `instances` . `kernel_id` AS `instances_kernel_id`,
               `instances` . `ramdisk_id` AS `instances_ramdisk_id`,
               `instances` . `hostname` AS `instances_hostname`,
               `instances` . `launch_index` AS `instances_launch_index`,
               `instances` . `key_name` AS `instances_key_name`,
               `instances` . `key_data` AS `instances_key_data`,
               `instances` . `power_state` AS `instances_power_state`,
               `instances` . `vm_state` AS `instances_vm_state`,
               `instances` . `task_state` AS `instances_task_state`,
               `instances` . `memory_mb` AS `instances_memory_mb`,
               `instances` . `vcpus` AS `instances_vcpus`,
               `instances` . `root_gb` AS `instances_root_gb`,
               `instances` . `ephemeral_gb` AS `instances_ephemeral_gb`,
               `instances` . `ephemeral_key_uuid` AS `instances_ephemeral_key_uuid`,
               `instances` . `host` AS `instances_host`,
               `instances` . `node` AS `instances_node`,
               `instances` . `instance_type_id` AS `instances_instance_type_id`,
               `instances` . `user_data` AS `instances_user_data`,
               `instances` . `reservation_id` AS `instances_reservation_id`,
               `instances` . `launched_at` AS `instances_launched_at`,
               `instances` . `terminated_at` AS `instances_terminated_at`,
               `instances` . `availability_zone` AS `instances_availability_zone`,
               `instances` . `display_name` AS `instances_display_name`,
               `instances` . `display_description` AS `instances_display_description`,
               `instances` . `launched_on` AS `instances_launched_on`,
               `instances` . `locked` AS `instances_locked`,
               `instances` . `locked_by` AS `instances_locked_by`,
               `instances` . `os_type` AS `instances_os_type`,
               `instances` . `architecture` AS `instances_architecture`,
               `instances` . `vm_mode` AS `instances_vm_mode`,
               `instances` . `uuid` AS `instances_uuid`,
               `instances` . `root_device_name` AS `instances_root_device_name`,
               `instances` . `default_ephemeral_device` AS `instances_default_ephemeral_device`,
               `instances` . `default_swap_device` AS `instances_default_swap_device`,
               `instances` . `config_drive` AS `instances_config_drive`,
               `instances` . `access_ip_v4` AS `instances_access_ip_v4`,
               `instances` . `access_ip_v6` AS `instances_access_ip_v6`,
               `instances` . `auto_disk_config` AS `instances_auto_disk_config`,
               `instances` . `progress` AS `instances_progress`,
               `instances` . `shutdown_terminate` AS `instances_shutdown_terminate`,
               `instances` . `disable_terminate` AS `instances_disable_terminate`,
               `instances` . `cell_name` AS `instances_cell_name`,
               `instances` . `internal_id` AS `instances_internal_id`,
               `instances` . `cleaned` AS `instances_cleaned`
          FROM `instances`
         WHERE `instances` . `deleted` = ?
           AND `instances` . `uuid` = ?
         LIMIT ?) AS `anon_1`
  LEFT OUTER JOIN `instance_system_metadata` AS `instance_system_metadata_1`
    ON `anon_1` . `instances_uuid` = `instance_system_metadata_1` . `instance_uuid`
  LEFT OUTER JOIN (`security_group_instance_association` AS `security_group_instance_association_1`
                   INNER JOIN `security_groups` AS `security_groups_1`
                   ON `security_groups_1` . `id` = `security_group_instance_association_1` . `security_group_id`
                   AND `security_group_instance_association_1` . `deleted` = ?
                   AND `security_groups_1` . `deleted` = ? )
    ON `security_group_instance_association_1` . `instance_uuid` = `anon_1` . `instances_uuid`
   AND `anon_1` . `instances_deleted` = ?
  LEFT OUTER JOIN `security_group_rules` AS `security_group_rules_1`
    ON `security_group_rules_1` . `parent_group_id` = `security_groups_1` . `id`
   AND `security_group_rules_1` . `deleted` = ?
  LEFT OUTER JOIN `instance_info_caches` AS `instance_info_caches_1`
    ON `instance_info_caches_1` . `instance_uuid` = `anon_1` . `instances_uuid`
  LEFT OUTER JOIN `instance_extra` AS `instance_extra_1`
    ON `instance_extra_1` . `instance_uuid` = `anon_1` . `instances_uuid`
  LEFT OUTER JOIN `instance_metadata` AS `instance_metadata_1`
    ON `instance_metadata_1` . `instance_uuid` = `anon_1` . `instances_uuid`
   AND `instance_metadata_1` . `deleted` = ?

Environment
===========

1. Exact version of OpenStack you are running.

OpenStack Mitaka

openstack-nova-api.noarch 1:13.1.4-1.el7

Code suggests this could be affecting every release since Mitaka.

2. Which hypervisor did you use?

qemu-kvm.x86_64 10:1.5.3-141.el7_4.6
qemu-kvm-common.x86_64 10:1.5.3-141.el7_4.6

Reference
=========

Discussion on the OpenStack dev mailing list:

http://lists.openstack.org/pipermail/openstack-dev/2018-October/135945.html

Matt Riedemann (mriedem)
tags: added: api db metadata performance
Changed in nova:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Matt Riedemann (mriedem) wrote :

I think we can make a couple of simple enhancements by:

1. only join on security_groups if using nova-network

2. only join on system_metadata if using dynamic vendordata providers

Backporting the system_metadata change would be tricky though since we did use system_metadata to perform an online data migration for at least the instance.flavor information. Having said that, we removed that system_metadata <> flavor migration code in Rocky:

https://review.openstack.org/#/c/508357/

And as that commit message says, you couldn't upgrade to Liberty if you hadn't migrated the flavor data:

https://review.openstack.org/#/c/174480/

So maybe it's OK to at least backport this metadata-api join improvement to at least Pike.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/624778

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/624778
Reason: The join in security_groups should be trivial since the table should be empty in neutron cases, and when getting base metadata we also return the password which is stored in instance.system_metadata, so pre-joining instance_system_metadata is actually something we should be doing when we pull the instance from the cell database - it's used more than just vendordata v2 requests, which makes this change essentially not worthwhile.

We either need to optimize how we do the system_metadata joins in the sqlalchemy API code, or we need to reconsider how we store the instance password (move it out of system_metadata).

Matt Riedemann (mriedem)
Changed in nova:
status: In Progress → Confirmed
assignee: Matt Riedemann (mriedem) → nobody
Revision history for this message
Summer Long (slong-g) wrote :

Hi, a Red Hat customer is now seeing this issue with DoS as result. Reproduce by:
1.add 128 properties to an image
2.add 128 metadata
3.Max out the user_data

Result: 1Gb of data transfer is then allowed, creating huge load and instability on the system.

Because of the DoS, RH sees this as a security issue/CVE (CWE-400).

Revision history for this message
melanie witt (melwitt) wrote :

I have changed the information type on this bug to Public Security so it will show up on the Vulnerability Management Team's radar.

information type: Public → Public Security
Revision history for this message
Jeremy Stanley (fungi) wrote :

I'm inclined to classify this as a security hardening opportunity, since there are likely countless ways a server instance can hammer the metadata API and connecting networks, and the offending system can be readily identified and disabled by operators. The security impact also obviously only impacts environments which expose the metadata service, so is not a risk for those providing only configdrive.

Changed in ossa:
status: New → Incomplete
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/758928

Changed in nova:
assignee: nobody → melanie witt (melwitt)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/758928
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e728fe668a6de886455f2dbaf655c8a151462c8c
Submitter: Zuul
Branch: master

commit e728fe668a6de886455f2dbaf655c8a151462c8c
Author: melanie witt <email address hidden>
Date: Tue Oct 20 21:46:13 2020 +0000

    Use subqueryload() instead of joinedload() for (system_)metadata

    Currently, when we "get" a single instance from the database and we
    load metadata and system_metadata, we do so using a joinedload() which
    does JOINs with the respective tables. Because of the one-to-many
    relationship between an instance and (system_)metadata records, doing
    the database query this way can result in a large number of additional
    rows being returned unnecessarily and cause a large data transfer.

    This is similar to the problem addressed by change
    I0610fb16ccce2ee95c318589c8abcc30613a3fe9 which added separate queries
    for (system_)metadata when we "get" multiple instances. We don't,
    however, reuse the same code for this change because
    _instances_fill_metadata converts the instance database object to a
    dict, and some callers of _instance_get_by_uuid need to be able to
    access an instance database object attached to the session (example:
    instance_update_and_get_original).

    By using subqueryload() [1], we can perform the additional queries for
    (system_)metadata to solve the problem with a similar approach.

    Closes-Bug: #1799298

    [1] https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#subquery-eager-loading

    Change-Id: I5c071f70f669966e9807b38e99077c1cae5b4606

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
Jeremy Stanley (fungi) wrote :

As there have been no objections to my proposal to treat this as security hardening, I'm switching our advisory task to won't fix status indicating we won't publish any formal security advisory about it. This decision can of course be reversed if people feel differently about the relative severity and practicality of exploits for it.

Changed in ossa:
status: Incomplete → Won't Fix
information type: Public Security → Public
tags: added: security
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/761809

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/761810

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/761811

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/761812

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/761813

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/761814

Revision history for this message
melanie witt (melwitt) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 22.1.0

This issue was fixed in the openstack/nova 22.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 21.2.0

This issue was fixed in the openstack/nova 21.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.0.0.0rc1

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/c/openstack/nova/+/761813
Committed: https://opendev.org/openstack/nova/commit/68f80d201310dcaf688d8437a46c8d272ead317d
Submitter: "Zuul (22348)"
Branch: stable/rocky

commit 68f80d201310dcaf688d8437a46c8d272ead317d
Author: melanie witt <email address hidden>
Date: Tue Oct 20 21:46:13 2020 +0000

    Use subqueryload() instead of joinedload() for (system_)metadata

    Currently, when we "get" a single instance from the database and we
    load metadata and system_metadata, we do so using a joinedload() which
    does JOINs with the respective tables. Because of the one-to-many
    relationship between an instance and (system_)metadata records, doing
    the database query this way can result in a large number of additional
    rows being returned unnecessarily and cause a large data transfer.

    This is similar to the problem addressed by change
    I0610fb16ccce2ee95c318589c8abcc30613a3fe9 which added separate queries
    for (system_)metadata when we "get" multiple instances. We don't,
    however, reuse the same code for this change because
    _instances_fill_metadata converts the instance database object to a
    dict, and some callers of _instance_get_by_uuid need to be able to
    access an instance database object attached to the session (example:
    instance_update_and_get_original).

    By using subqueryload() [1], we can perform the additional queries for
    (system_)metadata to solve the problem with a similar approach.

    Closes-Bug: #1799298

    [1] https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#subquery-eager-loading

    Change-Id: I5c071f70f669966e9807b38e99077c1cae5b4606
    (cherry picked from commit e728fe668a6de886455f2dbaf655c8a151462c8c)
    (cherry picked from commit 63d2e62c3a223f883ca810f4c66a2a236cf3d483)
    (cherry picked from commit e7a45e0335e4cf44fec7f7b8d2505f5b95445cf9)
    (cherry picked from commit 4350074029ffbc03ab238c442ec86fab6560e365)
    (cherry picked from commit ad7e4fb8f4ea6c458af00ec7aa0b321dc37c097c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.6.1

This issue was fixed in the openstack/nova 20.6.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.opendev.org/c/openstack/nova/+/761814
Committed: https://opendev.org/openstack/nova/commit/c1de4fb13e4c7d220ff2e42f24f06b4bbe53521f
Submitter: "Zuul (22348)"
Branch: stable/queens

commit c1de4fb13e4c7d220ff2e42f24f06b4bbe53521f
Author: melanie witt <email address hidden>
Date: Tue Oct 20 21:46:13 2020 +0000

    Use subqueryload() instead of joinedload() for (system_)metadata

    Currently, when we "get" a single instance from the database and we
    load metadata and system_metadata, we do so using a joinedload() which
    does JOINs with the respective tables. Because of the one-to-many
    relationship between an instance and (system_)metadata records, doing
    the database query this way can result in a large number of additional
    rows being returned unnecessarily and cause a large data transfer.

    This is similar to the problem addressed by change
    I0610fb16ccce2ee95c318589c8abcc30613a3fe9 which added separate queries
    for (system_)metadata when we "get" multiple instances. We don't,
    however, reuse the same code for this change because
    _instances_fill_metadata converts the instance database object to a
    dict, and some callers of _instance_get_by_uuid need to be able to
    access an instance database object attached to the session (example:
    instance_update_and_get_original).

    By using subqueryload() [1], we can perform the additional queries for
    (system_)metadata to solve the problem with a similar approach.

    Closes-Bug: #1799298

    [1] https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html#subquery-eager-loading

    Change-Id: I5c071f70f669966e9807b38e99077c1cae5b4606
    (cherry picked from commit e728fe668a6de886455f2dbaf655c8a151462c8c)
    (cherry picked from commit 63d2e62c3a223f883ca810f4c66a2a236cf3d483)
    (cherry picked from commit e7a45e0335e4cf44fec7f7b8d2505f5b95445cf9)
    (cherry picked from commit 4350074029ffbc03ab238c442ec86fab6560e365)
    (cherry picked from commit ad7e4fb8f4ea6c458af00ec7aa0b321dc37c097c)
    (cherry picked from commit 68f80d201310dcaf688d8437a46c8d272ead317d)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/c/openstack/nova/+/799533

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/pike)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/pike
Review: https://review.opendev.org/c/openstack/nova/+/799533
Reason: stable/pike has transitioned to End of Life for nova, open patches need to be abandoned in order to be able to delete the branch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova queens-eol

This issue was fixed in the openstack/nova queens-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova rocky-eol

This issue was fixed in the openstack/nova rocky-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova stein-eol

This issue was fixed in the openstack/nova stein-eol release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.