Performance issue when creating lots of ports

Bug #2009055 reported by Felix Huettner
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Felix Huettner

Bug Description

Creating ~250 ports on a single subnet currently takes around 140 seconds.

This duration increased significantly with the fix for https://bugs.launchpad.net/neutron/+bug/1865891, which takes an exclusive lock on a subnet whenever a port is created or updated in it.

This was observed in devstack with OVN as the backend.

Reproducer:

1. Create a network and a subnet
2. Run
```
for i in `seq 1 250`; do curl -g http://10.1.0.129:9696/networking/v2.0/ports -H "Content-Type: application/json" -H "X-Auth-Token: sometoken" -X POST -d '{"port": {"admin_state_up": true, "name": "test", "network_id": "somenetwork"}}' & done
time wait
```
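
For anyone who prefers a scripted reproducer, the curl loop above can be sketched in Python with only the standard library. This is a minimal sketch, not part of the original report: the function names `build_request`, `send` and `create_ports` are made up for illustration, and the endpoint, token and network ID are the same placeholders used in the curl command.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Placeholder values copied from the curl command; replace before running.
ENDPOINT = "http://10.1.0.129:9696/networking/v2.0/ports"
TOKEN = "sometoken"
NETWORK_ID = "somenetwork"


def build_request(endpoint=ENDPOINT, token=TOKEN, network_id=NETWORK_ID):
    """Build the same POST request that the curl command sends."""
    body = {"port": {"admin_state_up": True, "name": "test",
                     "network_id": network_id}}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


def send(request):
    """Send one request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status


def create_ports(count=250, sender=send):
    """Fire `count` port-create requests in parallel.

    Returns (list of results from `sender`, elapsed seconds).
    """
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=count) as pool:
        statuses = list(pool.map(lambda _: sender(build_request()), range(count)))
    return statuses, time.monotonic() - start
```

Calling `create_ports()` against a real endpoint returns the status codes and the wall-clock time, which corresponds to the `time wait` measurement. Note that `urlopen` raises on HTTP error responses, so a failed request aborts the run instead of being counted.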

We see this causing issues on heavily used subnets (e.g. the ones hosting our public IPs): requests take a long time and then fail because of our 20-second database max_statement_time.
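
To get a feel for why the timeout bites, here is a rough back-of-the-envelope model using the numbers from this report (250 requests, about 140 seconds, 20-second limit). It assumes the requests are processed strictly one after another behind the exclusive lock and that each takes the same time; both are simplifying assumptions, and the function name is made up for illustration.

```python
def requests_exceeding_timeout(total_requests, total_seconds, timeout_seconds):
    """Estimate how many serialized requests finish after the timeout.

    Assumes strictly serial processing with a uniform per-request time
    (total_seconds / total_requests). A rough model, not a measurement.
    """
    per_request = total_seconds / total_requests
    # Request k (1-based) completes at roughly k * per_request seconds.
    within_timeout = min(total_requests, int(timeout_seconds / per_request))
    return per_request, total_requests - within_timeout


per_request, too_slow = requests_exceeding_timeout(250, 140, 20)
print(f"{per_request:.2f}s per request, {too_slow} of 250 exceed 20s")
# prints "0.56s per request, 215 of 250 exceed 20s"
```

Under these assumptions only the first 35 or so requests would complete within 20 seconds, which is consistent with requests on busy subnets being killed by the statement timeout.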

Changed in neutron:
status: New → In Progress
Changed in neutron:
assignee: nobody → Felix Huettner (felix.huettner)
tags: added: loadimpact
Changed in neutron:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by "Slawek Kaplonski <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/875938
Reason: This review is > 4 weeks without comment, and failed Zuul jobs the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Felix Huettner <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/875938
Reason: will be replaced

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/881943

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/881943
Committed: https://opendev.org/openstack/neutron/commit/c0af5b3b5ea89d3147adf1054625f29d5b01b309
Submitter: "Zuul (22348)"
Branch: master

commit c0af5b3b5ea89d3147adf1054625f29d5b01b309
Author: Felix Huettner <email address hidden>
Date: Wed Mar 1 16:14:18 2023 +0100

    Reduce lock contention on subnets

    In [1] a lock was introduced with the goal of preventing subnets from
    being deleted while ports are being created in them in parallel.
    This was achieved by acquiring an exclusive lock on the row of the
    subnet in the Subnet table when adding/modifying a port or deleting
    the subnet.

    However, as this was an exclusive lock, it also prevented concurrent port
    modifications on the same subnet from happening. This can cause
    performance issues in environments with large shared subnets (e.g. a
    large external subnet).

    To reduce the lock contention for this case we split the lock in two
    parts:

    * For normal port operations we will acquire a shared lock on the
      row of the subnet. This allows multiple such operations to happen in
      parallel.
    * For deleting a subnet we will acquire an exclusive lock on the row of
      the subnet. This lock cannot be acquired while any shared lock is
      currently held on the row.

    With this we maintain the same locking level as before, but reduce the
    amount of lock contention happening and thereby improve throughput.

    The performance improvement can be measured using rally test [2]
    (improving from 21 to 18 seconds).
    Alternatively it can be tested using 250 parallel curl calls to create a
    port in the same network. This improves from 113s to 42s.

    [1]: https://review.opendev.org/c/openstack/neutron/+/713045
    [2]: https://github.com/openstack/rally-openstack/blob/master/samples/tasks/scenarios/neutron/create-and-delete-ports.json

    Closes-Bug: #2009055
    Change-Id: I31b1a9c2f986f59fee0da265acebbd88d2f8e4f8
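
The shared/exclusive split described in the commit message above can be illustrated with a small in-process model. This is only a toy sketch of the lock semantics, not the neutron implementation: the class and method names are made up, and it uses non-blocking "try" calls where a real database would make the caller wait for the row lock.

```python
import threading


class SharedExclusiveLock:
    """Toy in-process model of a row lock with shared and exclusive modes."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._shared_holders = 0
        self._exclusive_held = False

    def try_acquire_shared(self):
        """Port create/update: succeeds unless an exclusive lock is held."""
        with self._mutex:
            if self._exclusive_held:
                return False
            self._shared_holders += 1
            return True

    def release_shared(self):
        with self._mutex:
            self._shared_holders -= 1

    def try_acquire_exclusive(self):
        """Subnet delete: succeeds only if nobody holds the lock in any mode."""
        with self._mutex:
            if self._exclusive_held or self._shared_holders:
                return False
            self._exclusive_held = True
            return True

    def release_exclusive(self):
        with self._mutex:
            self._exclusive_held = False


subnet_row = SharedExclusiveLock()
print(subnet_row.try_acquire_shared())     # True: first port operation
print(subnet_row.try_acquire_shared())     # True: second port operation in parallel
print(subnet_row.try_acquire_exclusive())  # False: delete while ports are in flight
subnet_row.release_shared()
subnet_row.release_shared()
print(subnet_row.try_acquire_exclusive())  # True: delete once ports are done
print(subnet_row.try_acquire_shared())     # False: port operation during the delete
```

The point of the model is that port operations no longer exclude each other, while a subnet delete and a port operation still cannot overlap, which is the protection the original lock was introduced for.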

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 23.0.0.0b2

This issue was fixed in the openstack/neutron 23.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/neutron/+/893082

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/zed)

Fix proposed to branch: stable/zed
Review: https://review.opendev.org/c/openstack/neutron/+/893084

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/893082
Committed: https://opendev.org/openstack/neutron/commit/c89d028a955721c8132ca23c44e84a59d5fd99ce
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit c89d028a955721c8132ca23c44e84a59d5fd99ce
Author: Felix Huettner <email address hidden>
Date: Wed Mar 1 16:14:18 2023 +0100

    Reduce lock contention on subnets

    HINT: This isn't a clean backport, as we keep the subnet in-use field.
    We can't backport the db update that would remove the field.

    Closes-Bug: #2009055
    Change-Id: I31b1a9c2f986f59fee0da265acebbd88d2f8e4f8
    (cherry picked from commit c0af5b3b5ea89d3147adf1054625f29d5b01b309)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/889238
Committed: https://opendev.org/openstack/neutron/commit/921cde14c15611d1ad4153881c3eae4f56e09a4f
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 921cde14c15611d1ad4153881c3eae4f56e09a4f
Author: Felix Huettner <email address hidden>
Date: Wed Mar 1 16:14:18 2023 +0100

    Reduce lock contention on subnets

    HINT: This isn't a clean backport, as we keep the subnet in-use field.
    We can't backport the db update that would remove the field.

    Closes-Bug: #2009055
    Change-Id: I31b1a9c2f986f59fee0da265acebbd88d2f8e4f8
    (cherry picked from commit c0af5b3b5ea89d3147adf1054625f29d5b01b309)

tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/zed)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/893084
Committed: https://opendev.org/openstack/neutron/commit/d25c129ec2d541986b5dcd12bc2bd48d51b87111
Submitter: "Zuul (22348)"
Branch: stable/zed

commit d25c129ec2d541986b5dcd12bc2bd48d51b87111
Author: Felix Huettner <email address hidden>
Date: Wed Mar 1 16:14:18 2023 +0100

    Reduce lock contention on subnets

    HINT: This isn't a clean backport, as we keep the subnet in-use field.
    We can't backport the db update that would remove the field.

    Closes-Bug: #2009055
    Change-Id: I31b1a9c2f986f59fee0da265acebbd88d2f8e4f8
    (cherry picked from commit c0af5b3b5ea89d3147adf1054625f29d5b01b309)

tags: added: in-stable-zed
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 22.1.0

This issue was fixed in the openstack/neutron 22.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.5.0

This issue was fixed in the openstack/neutron 20.5.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 21.2.0

This issue was fixed in the openstack/neutron 21.2.0 release.
