Rally test NeutronNetworks.create_and_update_subnets fails

Bug #1920923 reported by Slawek Kaplonski
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Rodolfo Alonso
Milestone: (none listed)

Bug Description

Recently, the test NeutronNetworks.create_and_update_subnets in the neutron-rally-task job has been failing quite often.
Examples:

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_97b/681671/13/check/neutron-rally-task/97b75cd/results/report.html#/NeutronNetworks.create_and_update_subnets/overview

https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_716/780548/3/check/neutron-rally-task/7162e07/results/report.html#/NeutronNetworks.create_and_update_subnets/failures

https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_156/318542/6/check/neutron-rally-task/1567769/results/report.html#/NeutronNetworks.create_and_update_subnets/failures

https://a4882fb7db4c401136c2-acad0afc1440c186988309ce1e0a4290.ssl.cf5.rackcdn.com/780916/3/check/neutron-rally-task/390be8d/results/report.html#/NeutronNetworks.create_and_update_subnets/failures

https://89a28d92de3b4c8c3017-1438e56e418f3d4087dd94ee6330f7d7.ssl.cf5.rackcdn.com/780916/3/check/neutron-rally-task/c316a5c/results/report.html#/NeutronNetworks.create_and_update_subnets/failures

https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e3f/781227/1/check/neutron-rally-task/e3fd553/results/report.html#/NeutronNetworks.create_and_update_subnets/failures

https://72c824f20fd751937cae-512ca6f82afe45a8d7ced45e416cc067.ssl.cf2.rackcdn.com/781566/1/check/neutron-rally-task/824bae8/results/report.html#/NeutronNetworks.create_and_update_subnets/output

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

As shown in [1][2] (the clearest occurrences of this error I could find), the problem happens during segment allocation ("allocate_partially_specified_segment").

If "network_segment_range" is not enabled (it is not enabled in the Rally tests), this method retrieves a subset of the non-allocated segments and tries to allocate the first one returned.

This code has two problems:
- If we use only one segment, why do we retrieve so many? This is a possible optimization.
- A race condition between several requests. This is the main problem, and it explains the failures seen in the Rally CI. If several queries are executed at the same time, all of them return the same segment list and compete to allocate the first one.
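
The race in the second point can be reproduced with a small, self-contained sketch. This is not the Neutron code: the table `segment_allocations`, its columns, and the helpers `make_db`, `read_free_segments`, `try_allocate` and `simulate_race` are hypothetical names chosen for illustration, and SQLite stands in for the real database. Concurrency is modelled by letting every request read its candidate list before any request writes.

```python
import sqlite3


def make_db(num_segments=10):
    """Create a toy allocation table; every segment starts unallocated."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segment_allocations ("
        "segmentation_id INTEGER PRIMARY KEY, "
        "allocated INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO segment_allocations (segmentation_id) VALUES (?)",
        [(i,) for i in range(1, num_segments + 1)],
    )
    return conn


def read_free_segments(conn, limit=5):
    """Return the first `limit` unallocated IDs, in a fixed order."""
    rows = conn.execute(
        "SELECT segmentation_id FROM segment_allocations "
        "WHERE allocated = 0 ORDER BY segmentation_id LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


def try_allocate(conn, segmentation_id):
    """Compare-and-swap: succeeds only if the row is still unallocated."""
    cursor = conn.execute(
        "UPDATE segment_allocations SET allocated = 1 "
        "WHERE segmentation_id = ? AND allocated = 0",
        (segmentation_id,),
    )
    return cursor.rowcount == 1


def simulate_race(num_requests=4):
    """All requests read first, then each tries to take its first candidate."""
    conn = make_db()
    views = [read_free_segments(conn) for _ in range(num_requests)]
    outcomes = [try_allocate(conn, view[0]) for view in views]
    return views, outcomes
```

In this model every request sees the same list, so only the first allocation attempt succeeds and the remaining requests have to retry.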

When a network is created without defining the segmentation ID (this is why "allocate_partially_specified_segment" is called), the segmentation ID assigned to this network can be any of the free ones. Nothing guarantees that the network will be given the next free segmentation ID. If a specific segmentation ID is required, the user should provide it in the request.

My proposal is to randomize this query and return a single non-allocated segment.
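
A minimal sketch of this proposal, under the same caveats: the table `segment_allocations` and the helpers `make_db`, `pick_random_free_segment`, `try_allocate` and `allocate_concurrently` are hypothetical, SQLite's `ORDER BY RANDOM() LIMIT 1` stands in for whatever randomized query the real patch uses, and concurrency is modelled by having every request pick before any request writes.

```python
import sqlite3


def make_db(num_segments=100):
    """Create a toy allocation table; every segment starts unallocated."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE segment_allocations ("
        "segmentation_id INTEGER PRIMARY KEY, "
        "allocated INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO segment_allocations (segmentation_id) VALUES (?)",
        [(i,) for i in range(1, num_segments + 1)],
    )
    return conn


def pick_random_free_segment(conn):
    """Return one unallocated ID chosen at random, or None if none is left."""
    row = conn.execute(
        "SELECT segmentation_id FROM segment_allocations "
        "WHERE allocated = 0 ORDER BY RANDOM() LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def try_allocate(conn, segmentation_id):
    """Compare-and-swap: succeeds only if the row is still unallocated."""
    cursor = conn.execute(
        "UPDATE segment_allocations SET allocated = 1 "
        "WHERE segmentation_id = ? AND allocated = 0",
        (segmentation_id,),
    )
    return cursor.rowcount == 1


def allocate_concurrently(num_requests=4, num_segments=100):
    """Every request picks before any request writes; count failed writes."""
    conn = make_db(num_segments)
    picks = [pick_random_free_segment(conn) for _ in range(num_requests)]
    conflicts = sum(not try_allocate(conn, pick) for pick in picks)
    return picks, conflicts
```

Two requests can still draw the same ID, so the retry path remains necessary, but a conflict now needs an actual random collision instead of happening on every concurrent request.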

Regards.

[1] https://89a28d92de3b4c8c3017-1438e56e418f3d4087dd94ee6330f7d7.ssl.cf5.rackcdn.com/780916/3/check/neutron-rally-task/c316a5c/results/report.html#/NeutronNetworks.create_and_update_subnets/failures
[2] http://paste.openstack.org/show/803835/

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :
Changed in neutron:
status: Confirmed → In Progress
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Should we backport this patch to stable branches?

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.0.0.0rc2

This issue was fixed in the openstack/neutron 18.0.0.0rc2 release candidate.

tags: added: neutron-proactive-backport-potential
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

The patch was backported to W (Wallaby). Unless we see problems with the Rally jobs in older versions, I wouldn't backport this patch any further.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/804999

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/805000

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/neutron/+/808160

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/805000
Committed: https://opendev.org/openstack/neutron/commit/ab56a5cd652f57890723d4df5ba6ed22845070fa
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit ab56a5cd652f57890723d4df5ba6ed22845070fa
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Mar 23 18:57:31 2021 +0000

    Randomize segmentation ID assignation

    If plugin "network_segment_range" is not enabled and a new segment
    is required, if no segmentation ID is provided in the request, the
    segmentation ID assigned is randomly retrieved from the non
    allocated segmentation IDs.

    The goal is to improve the concurrent network (and segment) creation.
    If several segments are created in parallel, this random query
    will return a different segmentation ID to each one, avoiding the
    database retry request.

    Closes-Bug: #1920923

    Conflicts:
        neutron/common/utils.py
        neutron/tests/unit/plugins/ml2/drivers/test_type_vlan.py

    Change-Id: Id3f71611a00e69c4f22340ca4d05d95e4373cf69
    (cherry picked from commit 6eaa6d83d7c7f07fd4bf04879c91582de504eff4)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.0.0.0rc1

This issue was fixed in the openstack/neutron 19.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/804999
Committed: https://opendev.org/openstack/neutron/commit/16a2fe772234c70b661add0e9cf3f84bbff67840
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 16a2fe772234c70b661add0e9cf3f84bbff67840
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Mar 23 18:57:31 2021 +0000

    Randomize segmentation ID assignation

    If plugin "network_segment_range" is not enabled and a new segment
    is required, if no segmentation ID is provided in the request, the
    segmentation ID assigned is randomly retrieved from the non
    allocated segmentation IDs.

    The goal is to improve the concurrent network (and segment) creation.
    If several segments are created in parallel, this random query
    will return a different segmentation ID to each one, avoiding the
    database retry request.

    Closes-Bug: #1920923

    Conflicts:
        neutron/common/utils.py
        neutron/plugins/ml2/drivers/helpers.py
        neutron/tests/unit/plugins/ml2/drivers/test_type_vlan.py

    Change-Id: Id3f71611a00e69c4f22340ca4d05d95e4373cf69
    (cherry picked from commit 6eaa6d83d7c7f07fd4bf04879c91582de504eff4)
    (cherry picked from commit ab56a5cd652f57890723d4df5ba6ed22845070fa)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/808160
Committed: https://opendev.org/openstack/neutron/commit/be1a0daab0d60411f4cc5a0ce92030cc07bfcbdf
Submitter: "Zuul (22348)"
Branch: stable/train

commit be1a0daab0d60411f4cc5a0ce92030cc07bfcbdf
Author: Rodolfo Alonso Hernandez <email address hidden>
Date: Tue Mar 23 18:57:31 2021 +0000

    Randomize segmentation ID assignation

    If plugin "network_segment_range" is not enabled and a new segment
    is required, if no segmentation ID is provided in the request, the
    segmentation ID assigned is randomly retrieved from the non
    allocated segmentation IDs.

    The goal is to improve the concurrent network (and segment) creation.
    If several segments are created in parallel, this random query
    will return a different segmentation ID to each one, avoiding the
    database retry request.

    Closes-Bug: #1920923

    Conflicts:
        neutron/common/utils.py
        neutron/plugins/ml2/drivers/helpers.py
        neutron/tests/functional/objects/plugins/ml2/test_base.py
        neutron/tests/unit/plugins/ml2/drivers/test_type_vlan.py

    Change-Id: Id3f71611a00e69c4f22340ca4d05d95e4373cf69
    (cherry picked from commit 6eaa6d83d7c7f07fd4bf04879c91582de504eff4)
    (cherry picked from commit ab56a5cd652f57890723d4df5ba6ed22845070fa)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 16.4.2

This issue was fixed in the openstack/neutron 16.4.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.3.0

This issue was fixed in the openstack/neutron 17.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron train-eol

This issue was fixed in the openstack/neutron train-eol release.
