performance degradation in agent<->server port wiring process
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Kevin Benton |
Bug Description
The server<->agent communication process for wiring a port is taking a significant amount of time. See the following analysis:
http://
Changed in neutron: | |
assignee: | nobody → Kevin Benton (kevinbenton) |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #1 |
Changed in neutron: | |
status: | New → In Progress |
OpenStack Infra (hudson-openstack) wrote : | #2 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
importance: | Undecided → High |
OpenStack Infra (hudson-openstack) wrote : | #3 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #4 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #5 |
Fix proposed to branch: master
Review: https:/
Kevin Benton (kevinbenton) wrote : | #6 |
I have a string of patches, increasing in complexity, that will help address this. We will have to decide for each one whether it is safe enough to back-port.
OpenStack Infra (hudson-openstack) wrote : | #7 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata) | #8 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #9 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 840e04b6f122f02
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 20:49:45 2017 -0800
Skip native DHCP notifications on status change
On profiling the get_devices_details communications between
the agent and the server, a significant amount of time
(60% in my dev env) is being spent in the AFTER_UPDATE events
for the port updates resulting from the port status changes.
One of the major offenders is the native DHCP agent notifier.
On each port update it ends up retrieving the network for the
port, the DHCP agents for the network, and the segments.
This patch addresses this particular issue by adding logic to
skip a DHCP notification if the only thing that changed on the
port was the status. The DHCP agent doesn't do anything based on
the status field so there is no need to update it when this is
the only change.
Change-Id: I948132924ec502
Partial-Bug: #1665215
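The check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the merged code: the function names are hypothetical, ports are modelled as plain dicts, and treating `updated_at` and `revision_number` as bookkeeping fields that move together with `status` is an assumption.

```python
# Fields assumed to change as a side effect of a status update (assumption).
STATUS_ONLY_FIELDS = ("status", "updated_at", "revision_number")


def only_status_changed(original_port, updated_port):
    """Return True if no field outside STATUS_ONLY_FIELDS differs."""
    for key in set(original_port) | set(updated_port):
        if key in STATUS_ONLY_FIELDS:
            continue
        if original_port.get(key) != updated_port.get(key):
            return False
    return True


def should_notify_dhcp(original_port, updated_port):
    """Hypothetical gate: skip the DHCP notification for status-only updates."""
    return not only_status_changed(original_port, updated_port)
```

With a gate like this, the network, DHCP agent, and segment lookups mentioned above are never reached for a pure status transition.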
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton) | #10 |
Fix proposed to branch: stable/newton
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton) | #11 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/newton
commit 3d272b220139279
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 20:49:45 2017 -0800
Skip native DHCP notifications on status change
On profiling the get_devices_details communications between
the agent and the server, a significant amount of time
(60% in my dev env) is being spent in the AFTER_UPDATE events
for the port updates resulting from the port status changes.
One of the major offenders is the native DHCP agent notifier.
On each port update it ends up retrieving the network for the
port, the DHCP agents for the network, and the segments.
This patch addresses this particular issue by adding logic to
skip a DHCP notification if the only thing that changed on the
port was the status. The DHCP agent doesn't do anything based on
the status field so there is no need to update it when this is
the only change.
Change-Id: I948132924ec502
Partial-Bug: #1665215
(cherry picked from commit 840e04b6f122f02
tags: | added: in-stable-newton |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata) | #12 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit 958388fff0ae69f
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 20:49:45 2017 -0800
Skip native DHCP notifications on status change
On profiling the get_devices_details communications between
the agent and the server, a significant amount of time
(60% in my dev env) is being spent in the AFTER_UPDATE events
for the port updates resulting from the port status changes.
One of the major offenders is the native DHCP agent notifier.
On each port update it ends up retrieving the network for the
port, the DHCP agents for the network, and the segments.
This patch addresses this particular issue by adding logic to
skip a DHCP notification if the only thing that changed on the
port was the status. The DHCP agent doesn't do anything based on
the status field so there is no need to update it when this is
the only change.
Change-Id: I948132924ec502
Partial-Bug: #1665215
(cherry picked from commit 840e04b6f122f02
tags: | added: in-stable-ocata |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #13 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 4134f882cc695a4
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:13:18 2017 -0800
Make ML2 OVO push notification asynchronous
The OVO push notification logic was blocking the AFTER_UPDATE
event for all core resources. This isn't problematic for individual
HTTP API calls. However, the agent currently updates the status of
every port it wires two times, including on agent restarts.
In order to drastically reduce the impact of this notifier, this
patch moves it into a spawned eventlet coroutine so it doesn't
block the main AFTER_UPDATE events. A semaphore is used to prevent
multiple coroutines from trying to hit the database at once in the
background.
This doesn't change the semantics of the notifier since its goal
is always to send the latest events.
Partial-Bug: #1665215
Change-Id: Ic259bad6f65f4b
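The pattern in this commit message (return from the event handler immediately, do the database read and push in the background, and let a semaphore serialize the background readers) might look something like the sketch below. Note that it swaps eventlet coroutines for standard-library threads so it runs without extra dependencies; the class name and the `fetch_latest` and `push` callables are hypothetical.

```python
import threading


class AsyncPusher:
    """Collects changed resource IDs and pushes their latest state in the background."""

    def __init__(self, fetch_latest, push):
        self._fetch_latest = fetch_latest  # hypothetical: list of ids -> list of objects
        self._push = push                  # hypothetical: sends objects to the agents
        self._pending = set()
        self._pending_lock = threading.Lock()
        # Only one background worker may read from the database at a time.
        self._db_semaphore = threading.Semaphore(1)

    def handle_event(self, resource_id):
        """Called from the event handler; returns without touching the database."""
        with self._pending_lock:
            self._pending.add(resource_id)
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()
        return worker

    def _drain(self):
        with self._db_semaphore:
            # Take everything queued so far; later events spawn their own worker.
            with self._pending_lock:
                ids, self._pending = self._pending, set()
            if not ids:
                return  # an earlier worker already sent these
            self._push(self._fetch_latest(sorted(ids)))
```

Because each worker sends whatever the latest state is at the time it runs, coalescing several events for the same resource into one push does not lose information, which matches the "always send the latest events" goal stated above.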
OpenStack Infra (hudson-openstack) wrote : | #14 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 2c01ad9c564dcc2
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:36:12 2017 -0800
Avoid segment DB lookup in _expand_segment on port
The port context already has a network attached to it that contains
all of the segments for that network. In all cases the segments the
port wants to look up should be present here so we can just iterate
over them without a separate DB query.
Partial-Bug: #1665215
Change-Id: I9014c4e401134f
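As a rough illustration of the idea, the lookup reduces to a scan over segments that are already in memory. The function name and the dict layout below are assumptions made for the sketch, not the actual `_expand_segment` code.

```python
def find_segment(network_segments, segment_id):
    """Return the segment with the given id from an already-loaded list, or None."""
    for segment in network_segments:
        if segment["id"] == segment_id:
            return segment
    return None  # the caller decides how to handle a missing segment
```

A network typically carries only a handful of segments, so a linear scan is cheap compared with a database round trip.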
OpenStack Infra (hudson-openstack) wrote : | #15 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 60edb4c9519543a
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:40:07 2017 -0800
Allow no network to be passed into PortContext
This allows a PortContext to be constructed without a network
passed in for cases like update_port_status that don't examine
bindings, segments, or any other network properties.
In the event that a mechanism driver is loaded that does reference
the 'network' property for these occasions, the context will look
up the network then in a late binding fashion.
This will improve the performance of the update_port_status call
from the provisioning blocks callback, which doesn't provide a
cached network to update_port_status.
Partial-Bug: #1665215
Change-Id: I498791614fd456
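The late-binding lookup described here can be sketched with a lazily evaluated property. The class below is a hypothetical stand-in rather than the real PortContext, and `load_network` is an assumed callable that maps a network ID to a network.

```python
class LazyPortContext:
    """Port context that only loads its network when something asks for it."""

    def __init__(self, port, network=None, load_network=None):
        self._port = port
        self._network = network
        self._load_network = load_network  # hypothetical: network_id -> network

    @property
    def network(self):
        # Look the network up on first access, then reuse the cached result.
        if self._network is None:
            self._network = self._load_network(self._port["network_id"])
        return self._network
```

Callers such as a status update that never read `network` pay nothing for it, while a driver that does read it still gets a valid object.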
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata) | #16 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #17 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton) | #18 |
Fix proposed to branch: stable/newton
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata) | #19 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton) | #20 |
Fix proposed to branch: stable/newton
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton) | #21 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/newton
commit ad6983141d7f6ac
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:36:12 2017 -0800
Avoid segment DB lookup in _expand_segment on port
The port context already has a network attached to it that contains
all of the segments for that network. In all cases the segments the
port wants to look up should be present here so we can just iterate
over them without a separate DB query.
Conflicts:
neutron/
Partial-Bug: #1665215
Change-Id: I9014c4e401134f
(cherry picked from commit 2c01ad9c564dcc2
OpenStack Infra (hudson-openstack) wrote : | #22 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/newton
commit 7502c26d385072c
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:40:07 2017 -0800
Allow no network to be passed into PortContext
This allows a PortContext to be constructed without a network
passed in for cases like update_port_status that don't examine
bindings, segments, or any other network properties.
In the event that a mechanism driver is loaded that does reference
the 'network' property for these occasions, the context will look
up the network then in a late binding fashion.
This will improve the performance of the update_port_status call
from the provisioning blocks callback, which doesn't provide a
cached network to update_port_status.
Conflicts:
neutron/
Partial-Bug: #1665215
Change-Id: I498791614fd456
(cherry picked from commit 60edb4c9519543a
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata) | #23 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit bd7902f965848b3
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:40:07 2017 -0800
Allow no network to be passed into PortContext
This allows a PortContext to be constructed without a network
passed in for cases like update_port_status that don't examine
bindings, segments, or any other network properties.
In the event that a mechanism driver is loaded that does reference
the 'network' property for these occasions, the context will look
up the network then in a late binding fashion.
This will improve the performance of the update_port_status call
from the provisioning blocks callback, which doesn't provide a
cached network to update_port_status.
Partial-Bug: #1665215
Change-Id: I498791614fd456
(cherry picked from commit 60edb4c9519543a
OpenStack Infra (hudson-openstack) wrote : | #24 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit f465aa84238cb60
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:13:18 2017 -0800
Make ML2 OVO push notification asynchronous
The OVO push notification logic was blocking the AFTER_UPDATE
event for all core resources. This isn't problematic for individual
HTTP API calls. However, the agent currently updates the status of
every port it wires two times, including on agent restarts.
In order to drastically reduce the impact of this notifier, this
patch moves it into a spawned eventlet coroutine so it doesn't
block the main AFTER_UPDATE events. A semaphore is used to prevent
multiple coroutines from trying to hit the database at once in the
background.
This doesn't change the semantics of the notifier since its goal
is always to send the latest events.
Partial-Bug: #1665215
Change-Id: Ic259bad6f65f4b
(cherry picked from commit 4134f882cc695a4
OpenStack Infra (hudson-openstack) wrote : | #25 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit 1960f384aacbdb5
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 21:36:12 2017 -0800
Avoid segment DB lookup in _expand_segment on port
The port context already has a network attached to it that contains
all of the segments for that network. In all cases the segments the
port wants to look up should be present here so we can just iterate
over them without a separate DB query.
Partial-Bug: #1665215
Change-Id: I9014c4e401134f
(cherry picked from commit 2c01ad9c564dcc2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #26 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #27 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #28 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 604e598a7d43b8b
Author: Kevin Benton <email address hidden>
Date: Thu Apr 6 05:01:30 2017 -0700
Allow offloading lookups in driver contexts
This allows segments looked up ahead of time to be passed
into NetworkContext objects and NetworkContext objects to
be passed into PortContext objects. This allows us to avoid
doing segments lookups for every PortContext construction
when handling a bunch of ports (e.g. in RPC handler).
Change-Id: Ib4c43a7894fe12
Partial-Bug: #1665215
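One way to picture the offloading is a helper that receives segments fetched ahead of time, builds one network context per network, and shares it across all ports on that network. Everything below (the function name, the dict-based contexts, the `segments_by_network` argument) is a hypothetical simplification of the real driver context classes.

```python
def build_port_contexts(ports, segments_by_network):
    """Build one network context per network and share it across its ports."""
    network_contexts = {}
    port_contexts = []
    for port in ports:
        net_id = port["network_id"]
        if net_id not in network_contexts:
            # Segments were looked up ahead of time, so no query happens here.
            network_contexts[net_id] = {
                "id": net_id,
                "segments": segments_by_network.get(net_id, []),
            }
        port_contexts.append({"port": port, "network": network_contexts[net_id]})
    return port_contexts
```

For a batch of ports on a few networks, the segment lookup cost then scales with the number of networks rather than the number of ports.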
OpenStack Infra (hudson-openstack) wrote : | #29 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 323eb7f2e146ecc
Author: Kevin Benton <email address hidden>
Date: Thu Apr 6 05:42:25 2017 -0700
Add some bulk lookup methods to ML2 for RPC handling
This adds three methods to make working with bulk port
DB lookups easier in ML2:
* partial_
full port IDs. This will allow us to eliminate many LIKE
queries and do just one for all ports on an RPC call.
* get_port_
a map to port DB objects. This allows us to get access to
sqla objects for a bunch of ports without a custom
session.query call.
* get_network_
a bulk construction of NetworkContext objects and returns
them as a map of network_id to NetworkContext to avoid
expensive net lookups when constructing lots of PortContext
objects.
Partial-Bug: #1665215
Change-Id: I330eefbf429bd6
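The first of these methods, resolving many partial IDs with a single LIKE query, could be approximated as shown below. The sketch uses the standard-library `sqlite3` module in place of Neutron's SQLAlchemy session, and the function name, the `ports` table, and the "ambiguous or missing maps to None" behaviour are all assumptions.

```python
import sqlite3


def resolve_partial_ids(conn, partial_ids):
    """Map each partial port ID to a full ID using one query for the whole batch."""
    if not partial_ids:
        return {}
    # One OR-ed LIKE clause per prefix, all sent in a single statement.
    where = " OR ".join("id LIKE ?" for _ in partial_ids)
    params = [partial + "%" for partial in partial_ids]
    rows = conn.execute(f"SELECT id FROM ports WHERE {where}", params).fetchall()
    full_ids = [row[0] for row in rows]
    result = {}
    for partial in partial_ids:
        matches = [full for full in full_ids if full.startswith(partial)]
        # Ambiguous or missing prefixes map to None in this sketch.
        result[partial] = matches[0] if len(matches) == 1 else None
    return result
```

The matching of rows back to prefixes happens in Python, which keeps the database work to one round trip per RPC call.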
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #30 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
assignee: | Kevin Benton (kevinbenton) → Brian Haley (brian-haley) |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #31 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 529da4e583aefe7
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 22:27:53 2017 -0800
Bulk up port context retrieval
With the switch to subquery relationships, individual get_port calls
can get expensive with large numbers of ports
(100ms per port in my dev environment). This patch bulks up the
retrieval of the port contexts so one set of queries covers all
of the devices in an RPC call.
Partial-Bug: #1665215
Change-Id: I63757e143b23c2
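The difference between per-port and bulk retrieval comes down to issuing one `IN` query for the whole batch instead of one query per device. A minimal sketch, again using `sqlite3` as a stand-in for the real database layer and a hypothetical `ports` table with only `id` and `status` columns:

```python
import sqlite3


def get_ports_by_id(conn, port_ids):
    """Fetch all requested ports with one IN query; returns {port_id: status}."""
    if not port_ids:
        return {}
    placeholders = ",".join("?" for _ in port_ids)
    rows = conn.execute(
        f"SELECT id, status FROM ports WHERE id IN ({placeholders})",
        list(port_ids),
    )
    return {port_id: status for port_id, status in rows}
```

IDs that do not exist are simply absent from the returned map, so the caller can report them as unknown devices without a second query.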
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master) | #32 |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
assignee: | Brian Haley (brian-haley) → Kevin Benton (kevinbenton) |
Changed in neutron: | |
assignee: | Kevin Benton (kevinbenton) → Lujin Luo (luo-lujin) |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata) | #33 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton) | #34 |
Fix proposed to branch: stable/newton
Review: https:/
Changed in neutron: | |
assignee: | Lujin Luo (luo-lujin) → Kevin Benton (kevinbenton) |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #35 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit a04332ca4a3182d
Author: Kevin Benton <email address hidden>
Date: Thu Apr 27 11:56:56 2017 -0700
Separate port status update from getting details
This is a small refactor to separate updating the port status
from the method retrieving device details. This is in preparation
for patch I99c2b77b35e6ea
updating ports' status in bulk.
Change-Id: Ifa78f6911cfbbd
Partial-Bug: #1665215
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata) | #36 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #37 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata) | #38 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit a2ae48c2cee71d8
Author: Kevin Benton <email address hidden>
Date: Thu Apr 6 05:01:30 2017 -0700
Allow offloading lookups in driver contexts
This allows segments looked up ahead of time to be passed
into NetworkContext objects and NetworkContext objects to
be passed into PortContext objects. This allows us to avoid
doing segments lookups for every PortContext construction
when handling a bunch of ports (e.g. in RPC handler).
Conflicts:
neutron/
Change-Id: Ib4c43a7894fe12
Partial-Bug: #1665215
(cherry picked from commit 604e598a7d43b8b
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #39 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 9f55d77016083ff
Author: Kevin Benton <email address hidden>
Date: Mon May 15 19:08:05 2017 -0700
Notify L2pop driver from update_
This patch calls update_port_up and update_port_down inside
of the l2pop driver directly from update_device_up and
update_
driver to set up forwarding entries completely independently of
the port status update events.
This will allow L2pop to function without requiring the ports to
transition from ACTIVE-
device details.
This will unblock the push notifications work and will additionally
enable us to remove the update to BUILD status as part of a performance
improvement backport for bug #1665215.
Partial-Bug: #1665215
Partially-
Change-Id: Icd4cd4e3f735e8
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata) | #40 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit ce0e4b25d5cbaba
Author: Kevin Benton <email address hidden>
Date: Thu Apr 6 05:42:25 2017 -0700
Add some bulk lookup methods to ML2 for RPC handling
This adds three methods to make working with bulk port
DB lookups easier in ML2:
* partial_
full port IDs. This will allow us to eliminate many LIKE
queries and do just one for all ports on an RPC call.
* get_port_
a map to port DB objects. This allows us to get access to
sqla objects for a bunch of ports without a custom
session.query call.
* get_network_
a bulk construction of NetworkContext objects and returns
them as a map of network_id to NetworkContext to avoid
expensive net lookups when constructing lots of PortContext
objects.
Conflicts:
neutron/
neutron/
Partial-Bug: #1665215
Change-Id: I330eefbf429bd6
(cherry picked from commit 323eb7f2e146ecc
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master) | #41 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: master
commit 1be00e8239db3db
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 23:05:42 2017 -0800
Bulk up port status updating in ML2 RPC
This eliminates the last of the bottlenecks in
get_
updates use a bulk data retrieval as well.
The last remaining thing that will impact performance is the status
update back to ACTIVE on removal of the provisioning blocks. However,
that will require a much larger refactor since it is callback driven
at the individual port level.
Elimination of the L2pop driver will ultimately solve this completely
since we won't need to cycle the port status anymore on every single
agent restart.
Closes-Bug: #1665215
Change-Id: I99c2b77b35e6ea
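Combined with the earlier refactor that separated the status update from detail retrieval, the bulk flow can be sketched as: build the details for every device first, then issue a single status update for the ports that actually need one. The helper names, the dict-shaped ports, and the `bulk_update` callable are hypothetical.

```python
def plan_status_updates(ports, new_status):
    """Return the IDs of ports whose status actually needs to change."""
    return [port["id"] for port in ports if port["status"] != new_status]


def handle_devices(ports, new_status, bulk_update):
    """Collect details for all ports, then update statuses in one call."""
    details = [
        {"port_id": port["id"], "network_id": port["network_id"]}
        for port in ports
    ]
    to_update = plan_status_updates(ports, new_status)
    if to_update:
        bulk_update(to_update, new_status)  # one call for the whole batch
    return details
```

Ports already in the target status are filtered out, so an agent restart that touches many unchanged ports triggers at most one update call.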
Changed in neutron: | |
status: | In Progress → Fix Released |
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton) | #42 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/newton
commit bd7055daf23f03f
Author: Kevin Benton <email address hidden>
Date: Thu Apr 6 05:01:30 2017 -0700
Allow offloading lookups in driver contexts
This allows segments looked up ahead of time to be passed
into NetworkContext objects and NetworkContext objects to
be passed into PortContext objects. This allows us to avoid
doing segments lookups for every PortContext construction
when handling a bunch of ports (e.g. in RPC handler).
Conflicts:
neutron/
Change-Id: Ib4c43a7894fe12
Partial-Bug: #1665215
(cherry picked from commit 604e598a7d43b8b
(cherry picked from commit a2ae48c2cee71d8
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata) | #43 |
Reviewed: https:/
Committed: https:/
Submitter: Jenkins
Branch: stable/ocata
commit 2a32ae927147ebb
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 22:27:53 2017 -0800
Bulk up port context retrieval
With the switch to subquery relationships, individual get_port calls
can get expensive with large numbers of ports
(100ms per port in my dev environment). This patch bulks up the
retrieval of the port contexts so one set of queries covers all
of the devices in an RPC call.
Partial-Bug: #1665215
Change-Id: I63757e143b23c2
(cherry picked from commit 529da4e583aefe7
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b2 | #44 |
This issue was fixed in the openstack/neutron 11.0.0.0b2 development milestone.
tags: | added: neutron-proactive-backport-potential |
tags: | removed: neutron-proactive-backport-potential |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata) | #45 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #46 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : | #47 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata) | #48 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ocata
commit 912b25f3d89b2f5
Author: Kevin Benton <email address hidden>
Date: Thu Apr 27 11:56:56 2017 -0700
Separate port status update from getting details
This is a small refactor to separate updating the port status
from the method retrieving device details. This is in preparation
for patch I99c2b77b35e6ea
updating ports' status in bulk.
Change-Id: Ifa78f6911cfbbd
Partial-Bug: #1665215
(cherry picked from commit a04332ca4a3182d
OpenStack Infra (hudson-openstack) wrote : | #49 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ocata
commit e374e0e5e57d26c
Author: Kevin Benton <email address hidden>
Date: Wed Feb 15 23:05:42 2017 -0800
Bulk up port status updating in ML2 RPC
This eliminates the last of the bottlenecks in
get_
updates use a bulk data retrieval as well.
The last remaining thing that will impact performance is the status
update back to ACTIVE on removal of the provisioning blocks. However,
that will require a much larger refactor since it is callback driven
at the individual port level.
Elimination of the L2pop driver will ultimately solve this completely
since we won't need to cycle the port status anymore on every single
agent restart.
Closes-Bug: #1665215
Change-Id: I99c2b77b35e6ea
(cherry picked from commit 1be00e8239db3db
OpenStack Infra (hudson-openstack) wrote : | #50 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ocata
commit 7b960e57ce25259
Author: Kevin Benton <email address hidden>
Date: Mon May 15 19:08:05 2017 -0700
Notify L2pop driver from update_
This patch calls update_port_up and update_port_down inside
of the l2pop driver directly from update_device_up and
update_
driver to set up forwarding entries completely independently of
the port status update events.
This will allow L2pop to function without requiring the ports to
transition from ACTIVE-
device details.
This will unblock the push notifications work and will additionally
enable us to remove the update to BUILD status as part of a performance
improvement backport for bug #1665215.
Partial-Bug: #1665215
Partially-
Change-Id: Icd4cd4e3f735e8
(cherry picked from commit 9f55d77016083ff
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron ocata-eol | #51 |
This issue was fixed in the openstack/neutron ocata-eol release.
Fix proposed to branch: master
Review: https://review.openstack.org/434677