Race during adding and updating same port in L3 agent's info can generate wrong radvd config file

Bug #1813279 reported by Slawek Kaplonski
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Slawek Kaplonski

Bug Description

There is possibility that because of some race in processing adding/updating internal ports info in RouterInfo class, same port, with 2 different revisions and different subnets configured will be added to RouterInfo.internal_ports twice in RouterInfo._process_internal_ports method (https://github.com/openstack/neutron/blob/master/neutron/agent/l3/router_info.py#L544).
If ports have got IPv6 gateway configured and radvd daemon should be started for such router, this may lead to generate radvd config file with duplicate interfaces, like:

interface qr-29c030a8-26
{
   AdvSendAdvert on;
   MinRtrAdvInterval 30;
   MaxRtrAdvInterval 100;
   AdvLinkMTU 1500;
   AdvOtherConfigFlag on;

   prefix 2003:0:0:1::/64
   {
        AdvOnLink on;
        AdvAutonomous on;
   };

   prefix 2003::/64
   {
        AdvOnLink on;
        AdvAutonomous on;
   };
};interface qr-29c030a8-26
{
   AdvSendAdvert on;
   MinRtrAdvInterval 30;
   MaxRtrAdvInterval 100;
   AdvLinkMTU 1500;

   AdvOtherConfigFlag on;

   prefix 2003::/64
   {
        AdvOnLink on;
        AdvAutonomous on;
   };
};

In some cases this may lead to crash radvd daemon. See also https://bugzilla.redhat.com/show_bug.cgi?id=1630167 for more details.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/633236

Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I think I found why it happens like that.
In RouterInfo._process_internal_ports() method, when updated_ports are found, there is loop which iterates through current ports (stored in internal_ports list) and then updating element from self.internal_ports which may be different port. It's in https://github.com/openstack/neutron/blob/master/neutron/agent/l3/router_info.py#L586

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/633618

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: master
Review: https://review.openstack.org/633236
Reason: Better approach to fix this issue is proposed in https://review.openstack.org/#/c/633618/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/633618
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=21cddc47b446fdfb5535b347c1d7825a04e02c62
Submitter: Zuul
Branch: master

commit 21cddc47b446fdfb5535b347c1d7825a04e02c62
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jan 28 23:53:04 2019 +0100

    Fix update of ports cache in router_info class

    RouterInfo class has got internal_ports cache which is updated
    in _process_internal_ports() method.
    There was an issue in this updates logic because it was
    iterating through enumerate local variable "internal_ports"
    which represents current router ports and if such current port
    was found in updated_ports list it was storred in
    RouterInfo().internal_ports variable under same index as was
    found in "internal_ports" local variable.
    This sometimes leads to an issue because same port can be
    stored under different index in internal_ports and
    RouterInfo().internal_ports lists thus wrong port in
    RouterInfo().internal_ports was overwritten.

    Such issue leads to problem with generating radvd config file
    because in ports cache list there was duplicate info about same port
    so radvd config file contained duplicate interface definitions too.

    This should be properly fixed by changing RouterInfo.internal_ports
    to be a dict instead of list of ports but such patch would be much
    bigger and (possibly) harded to backport to stable branches.

    Change-Id: I2e38457942518c8a3e07e606091bb6720317b77e
    Closes-Bug: #1813279

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/636023

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/636024

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/636025

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.openstack.org/636024
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=48749c2788396a1dadd57232d18abb7bd131b826
Submitter: Zuul
Branch: stable/queens

commit 48749c2788396a1dadd57232d18abb7bd131b826
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jan 28 23:53:04 2019 +0100

    Fix update of ports cache in router_info class

    RouterInfo class has got internal_ports cache which is updated
    in _process_internal_ports() method.
    There was an issue in this updates logic because it was
    iterating through enumerate local variable "internal_ports"
    which represents current router ports and if such current port
    was found in updated_ports list it was storred in
    RouterInfo().internal_ports variable under same index as was
    found in "internal_ports" local variable.
    This sometimes leads to an issue because same port can be
    stored under different index in internal_ports and
    RouterInfo().internal_ports lists thus wrong port in
    RouterInfo().internal_ports was overwritten.

    Such issue leads to problem with generating radvd config file
    because in ports cache list there was duplicate info about same port
    so radvd config file contained duplicate interface definitions too.

    This should be properly fixed by changing RouterInfo.internal_ports
    to be a dict instead of list of ports but such patch would be much
    bigger and (possibly) harded to backport to stable branches.

    Change-Id: I2e38457942518c8a3e07e606091bb6720317b77e
    Closes-Bug: #1813279
    (cherry picked from commit 21cddc47b446fdfb5535b347c1d7825a04e02c62)

tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/rocky)

Reviewed: https://review.openstack.org/636023
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=679e8ee6cc93f15851fd276044f5682ccc247dc1
Submitter: Zuul
Branch: stable/rocky

commit 679e8ee6cc93f15851fd276044f5682ccc247dc1
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jan 28 23:53:04 2019 +0100

    Fix update of ports cache in router_info class

    RouterInfo class has got internal_ports cache which is updated
    in _process_internal_ports() method.
    There was an issue in this updates logic because it was
    iterating through enumerate local variable "internal_ports"
    which represents current router ports and if such current port
    was found in updated_ports list it was storred in
    RouterInfo().internal_ports variable under same index as was
    found in "internal_ports" local variable.
    This sometimes leads to an issue because same port can be
    stored under different index in internal_ports and
    RouterInfo().internal_ports lists thus wrong port in
    RouterInfo().internal_ports was overwritten.

    Such issue leads to problem with generating radvd config file
    because in ports cache list there was duplicate info about same port
    so radvd config file contained duplicate interface definitions too.

    This should be properly fixed by changing RouterInfo.internal_ports
    to be a dict instead of list of ports but such patch would be much
    bigger and (possibly) harded to backport to stable branches.

    Change-Id: I2e38457942518c8a3e07e606091bb6720317b77e
    Closes-Bug: #1813279
    (cherry picked from commit 21cddc47b446fdfb5535b347c1d7825a04e02c62)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/pike)

Reviewed: https://review.openstack.org/636025
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=10c981512a6cfed389ccc1523d21298a8a75929a
Submitter: Zuul
Branch: stable/pike

commit 10c981512a6cfed389ccc1523d21298a8a75929a
Author: Slawek Kaplonski <email address hidden>
Date: Mon Jan 28 23:53:04 2019 +0100

    Fix update of ports cache in router_info class

    RouterInfo class has got internal_ports cache which is updated
    in _process_internal_ports() method.
    There was an issue in this updates logic because it was
    iterating through enumerate local variable "internal_ports"
    which represents current router ports and if such current port
    was found in updated_ports list it was storred in
    RouterInfo().internal_ports variable under same index as was
    found in "internal_ports" local variable.
    This sometimes leads to an issue because same port can be
    stored under different index in internal_ports and
    RouterInfo().internal_ports lists thus wrong port in
    RouterInfo().internal_ports was overwritten.

    Such issue leads to problem with generating radvd config file
    because in ports cache list there was duplicate info about same port
    so radvd config file contained duplicate interface definitions too.

    This should be properly fixed by changing RouterInfo.internal_ports
    to be a dict instead of list of ports but such patch would be much
    bigger and (possibly) harded to backport to stable branches.

    Change-Id: I2e38457942518c8a3e07e606091bb6720317b77e
    Closes-Bug: #1813279
    (cherry picked from commit 21cddc47b446fdfb5535b347c1d7825a04e02c62)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 14.0.0.0b2

This issue was fixed in the openstack/neutron 14.0.0.0b2 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.7

This issue was fixed in the openstack/neutron 11.0.7 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 13.0.3

This issue was fixed in the openstack/neutron 13.0.3 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 12.0.6

This issue was fixed in the openstack/neutron 12.0.6 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.