create_subnet takes O(N) time

Bug #1575247 reported by Ryan Moats
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Expired
High
Unassigned

Bug Description

Could this be part of the problem?

When creating a subnet, the network query appears to translate to:

     67 Query SELECT networks.tenant_id AS networks_tenant_id, networks.id AS networks_id, networks.name AS networks_name, networks.status AS networks_status, networks.admin_state_up AS networks_admin_state_up, networks.mtu AS networks_mtu, networks.vlan_transparent AS networks_vlan_transparent, networks.availability_zone_hints AS networks_availability_zone_hints, networks.standard_attr_id AS networks_standard_attr_id, subnetpoolprefixes_1.cidr AS subnetpoolprefixes_1_cidr, subnetpoolprefixes_1.subnetpool_id AS subnetpoolprefixes_1_subnetpool_id, standardattributes_1.created_at AS standardattributes_1_created_at, standardattributes_1.updated_at AS standardattributes_1_updated_at, standardattributes_1.id AS standardattributes_1_id, standardattributes_1.resource_type AS standardattributes_1_resource_type, standardattributes_1.description AS standardattributes_1_description, tags_1.standard_attr_id AS tags_1_standard_attr_id, tags_1.tag AS tags_1_tag, subnetpools_1.tenant_id AS subnetpools_1_tenant_id, subnetpools_1.id AS subnetpools_1_id, subnetpools_1.name AS subnetpools_1_name, subnetpools_1.ip_version AS subnetpools_1_ip_version, subnetpools_1.default_prefixlen AS subnetpools_1_default_prefixlen, subnetpools_1.min_prefixlen AS subnetpools_1_min_prefixlen, subnetpools_1.max_prefixlen AS subnetpools_1_max_prefixlen, subnetpools_1.shared AS subnetpools_1_shared, subnetpools_1.is_default AS subnetpools_1_is_default, subnetpools_1.default_quota AS subnetpools_1_default_quota, subnetpools_1.hash AS subnetpools_1_hash, subnetpools_1.address_scope_id AS subnetpools_1_address_scope_id, subnetpools_1.standard_attr_id AS subnetpools_1_standard_attr_id, ipallocationpools_1.id AS ipallocationpools_1_id, ipallocationpools_1.subnet_id AS ipallocationpools_1_subnet_id, ipallocationpools_1.first_ip AS ipallocationpools_1_first_ip, ipallocationpools_1.last_ip AS ipallocationpools_1_last_ip, dnsnameservers_1.address AS dnsnameservers_1_address, dnsnameservers_1.subnet_id AS dnsnameservers_1_subnet_id, dnsnameservers_1.`order` AS dnsnameservers_1_order, subnetroutes_1.destination AS subnetroutes_1_destination, subnetroutes_1.nexthop AS subnetroutes_1_nexthop, subnetroutes_1.subnet_id AS subnetroutes_1_subnet_id, networkrbacs_1.tenant_id AS networkrbacs_1_tenant_id, networkrbacs_1.id AS networkrbacs_1_id, networkrbacs_1.target_tenant AS networkrbacs_1_target_tenant, networkrbacs_1.action AS networkrbacs_1_action, networkrbacs_1.object_id AS networkrbacs_1_object_id, standardattributes_2.created_at AS standardattributes_2_created_at, standardattributes_2.updated_at AS standardattributes_2_updated_at, standardattributes_2.id AS standardattributes_2_id, standardattributes_2.resource_type AS standardattributes_2_resource_type, standardattributes_2.description AS standardattributes_2_description, tags_2.standard_attr_id AS tags_2_standard_attr_id, tags_2.tag AS tags_2_tag, subnets_1.tenant_id AS subnets_1_tenant_id, subnets_1.id AS subnets_1_id, subnets_1.name AS subnets_1_name, subnets_1.network_id AS subnets_1_network_id, subnets_1.subnetpool_id AS subnets_1_subnetpool_id, subnets_1.ip_version AS subnets_1_ip_version, subnets_1.cidr AS subnets_1_cidr, subnets_1.gateway_ip AS subnets_1_gateway_ip, subnets_1.enable_dhcp AS subnets_1_enable_dhcp, subnets_1.ipv6_ra_mode AS subnets_1_ipv6_ra_mode, subnets_1.ipv6_address_mode AS subnets_1_ipv6_address_mode, subnets_1.standard_attr_id AS subnets_1_standard_attr_id, networkrbacs_2.tenant_id AS networkrbacs_2_tenant_id, networkrbacs_2.id AS networkrbacs_2_id, networkrbacs_2.target_tenant AS networkrbacs_2_target_tenant, networkrbacs_2.action AS networkrbacs_2_action, networkrbacs_2.object_id AS networkrbacs_2_object_id, agents_1.id AS agents_1_id, agents_1.agent_type AS agents_1_agent_type, agents_1.`binary` AS agents_1_binary, agents_1.topic AS agents_1_topic, agents_1.host AS agents_1_host, agents_1.availability_zone AS agents_1_availability_zone, agents_1.admin_state_up AS agents_1_admin_state_up, agents_1.created_at AS agents_1_created_at, agents_1.started_at AS agents_1_started_at, agents_1.heartbeat_timestamp AS agents_1_heartbeat_timestamp, agents_1.description AS agents_1_description, agents_1.configurations AS agents_1_configurations, agents_1.resource_versions AS agents_1_resource_versions, agents_1.`load` AS agents_1_load, standardattributes_3.created_at AS standardattributes_3_created_at, standardattributes_3.updated_at AS standardattributes_3_updated_at, standardattributes_3.id AS standardattributes_3_id, standardattributes_3.resource_type AS standardattributes_3_resource_type, standardattributes_3.description AS standardattributes_3_description, tags_3.standard_attr_id AS tags_3_standard_attr_id, tags_3.tag AS tags_3_tag, externalnetworks_1.network_id AS externalnetworks_1_network_id, externalnetworks_1.is_default AS externalnetworks_1_is_default, networksecuritybindings_1.network_id AS networksecuritybindings_1_network_id, networksecuritybindings_1.port_security_enabled AS networksecuritybindings_1_port_security_enabled, qos_network_policy_bindings_1.policy_id AS qos_network_policy_bindings_1_policy_id, qos_network_policy_bindings_1.network_id AS qos_network_policy_bindings_1_network_id
FROM networks LEFT OUTER JOIN externalnetworks ON networks.id = externalnetworks.network_id LEFT OUTER JOIN subnets AS subnets_1 ON networks.id = subnets_1.network_id LEFT OUTER JOIN subnetpools AS subnetpools_1 ON subnets_1.subnetpool_id = subnetpools_1.id LEFT OUTER JOIN subnetpoolprefixes AS subnetpoolprefixes_1 ON subnetpools_1.id = subnetpoolprefixes_1.subnetpool_id LEFT OUTER JOIN standardattributes AS standardattributes_1 ON standardattributes_1.id = subnetpools_1.standard_attr_id LEFT OUTER JOIN tags AS tags_1 ON standardattributes_1.id = tags_1.standard_attr_id LEFT OUTER JOIN ipallocationpools AS ipallocationpools_1 ON subnets_1.id = ipallocationpools_1.subnet_id LEFT OUTER JOIN dnsnameservers AS dnsnameservers_1 ON subnets_1.id = dnsnameservers_1.subnet_id LEFT OUTER JOIN subnetroutes AS subnetroutes_1 ON subnets_1.id = subnetroutes_1.subnet_id LEFT OUTER JOIN networkrbacs AS networkrbacs_1 ON subnets_1.network_id = networkrbacs_1.object_id LEFT OUTER JOIN standardattributes AS standardattributes_2 ON standardattributes_2.id = subnets_1.standard_attr_id LEFT OUTER JOIN tags AS tags_2 ON standardattributes_2.id = tags_2.standard_attr_id LEFT OUTER JOIN networkrbacs AS networkrbacs_2 ON networks.id = networkrbacs_2.object_id LEFT OUTER JOIN (networkdhcpagentbindings AS networkdhcpagentbindings_1 INNER JOIN agents AS agents_1 ON agents_1.id = networkdhcpagentbindings_1.dhcp_agent_id) ON networks.id = networkdhcpagentbindings_1.network_id LEFT OUTER JOIN standardattributes AS standardattributes_3 ON standardattributes_3.id = networks.standard_attr_id LEFT OUTER JOIN tags AS tags_3 ON standardattributes_3.id = tags_3.standard_attr_id LEFT OUTER JOIN externalnetworks AS externalnetworks_1 ON networks.id = externalnetworks_1.network_id LEFT OUTER JOIN networksecuritybindings AS networksecuritybindings_1 ON networks.id = networksecuritybindings_1.network_id LEFT OUTER JOIN qos_network_policy_bindings AS qos_network_policy_bindings_1 ON networks.id = qos_network_policy_bindings_1.network_id
WHERE networks.id = 'b51ecade-9f8f-4f38-b045-a84e96a4e13a' ORDER BY dnsnameservers_1.`order`

Is this complexity really necessary?

Tags: db
Henry Gessau (gessau)
Changed in neutron:
status: New → Confirmed
Revision history for this message
Ryan Moats (rmoats) wrote :

Note: this statement was generated during a rally job run of create_and_list_subnets

Doug Wiegley (dougwig)
Changed in neutron:
importance: Undecided → High
Revision history for this message
Kevin Benton (kevinbenton) wrote :

What do you want to do? These are all related items (availability zones, external network status, etc).

Changed in neutron:
status: Confirmed → Incomplete
Revision history for this message
Ryan Tidwell (ryan-tidwell) wrote :

Just took a look at the SQL rendered by sqlalchemy at _get_by_id() in neutron/db/common_db_mixin.py, can confirm that I see this anytime you call this method to get a network by ID. Could this be related to the backrefs in the model? I wonder if sqlalchemy is following all the backrefs in the model which causes these joins to occur so the model object can be constructed.

Revision history for this message
Ryan Moats (rmoats) wrote :

@kevinbenton: right now, this query is showing linear execution time under scale. Therefore, I'm looking to see if we can't simplify it to something that executes in more O(1) time.

Revision history for this message
Kevin Benton (kevinbenton) wrote :

Linear with respect to what? Number of subnets on a single network, number of subnets total, number of IP addresses?

Revision history for this message
Ryan Moats (rmoats) wrote :

The rally test in question (create_and_list_subnets) creates two subnets per network. The linearity is in the total number of subnets and networks.

Revision history for this message
Ryan Tidwell (ryan-tidwell) wrote :

@Ryan: This is what I would expect _get_by_id() to render under in SQL. create_subnet() works operates on more fields than just id. It also works on subnets and external. That's just what I see from a quick glance at the code. I'm not sure what I would change here, sqlalchemy is rendering this SQL so it can construct a model object that is in fact used in legitimate ways during the creation of a subnet. Are we actually seeing this become a problem at scale? There doesn't seem to be much to go on here, I'd like to see some more concrete numbers that indicate create_subnet() doesn't scale as implemented.

Revision history for this message
Carl Baldwin (carl-baldwin) wrote :

Could we express this bug in terms of factors that really matter from an external point of view?

I don't think that the complexity of the query is what we should state as the bug. I think we need to rephrase the bug in a way that tells how the performance of the code is not meeting some reasonable expectation. First, start with what use case is being run. What is the observed performance? What was expected? What is acceptable?

The complexity of the query might the underlying problem but I'd like to see the process that led to that conclusion. Yesterday, I approved the revert of an optimization that was added because someone looked under the hood, thought some code would perform better if written another way, and proposed a change that didn't affect performance in any significant way but introduced a regression. I want to make sure that we're focused on real problems and not fixing things that look bad to us but don't make any real difference in the big picture.

Revision history for this message
Ryan Moats (rmoats) wrote :

@ryan-tidwell: graph is https://imagebin.ca/v/2f6m4SpZC5R6
vertical axis is execution time in seconds, horizontal axis is number of subnets (1-256).

Henry Gessau (gessau)
summary: - network query when creating subnet looks too complex
+ create_subnet takes O(N) time
description: updated
Revision history for this message
Kevin Benton (kevinbenton) wrote :

Is that graph showing the the number of subnets for the "subnets_per_network" parameter?

Creating subnets on a single network is always going to express this behavior just because of the fact that they need to be checked for overlapping CIDRs.

Revision history for this message
Ryan Moats (rmoats) wrote :

@kevinbenton: no... that is showing the total number of subnets - subnets_per_network is set to 2, so it is 128 networks for the 256 ports. I'm running a trial where the parameter is set to 1, so that each network has only a single subnet.

Revision history for this message
Ryan Moats (rmoats) wrote :

https://imagebin.ca/v/2f78G63grnsm holds the time with subnets_per_network = 1, so it's not overlapping CIDR checks.

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.