Race condition in l2pop drops tunnels

Bug #1372438 reported by Ihar Hrachyshka
20
This bug affects 3 people
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Vivekanandan Narasimhan
Icehouse
Fix Released
Undecided
Unassigned
Juno
Fix Released
Undecided
Unassigned

Bug Description

The issue was originally raised by a Red Hat performance engineer (Joe Talerico) here: https://bugzilla.redhat.com/show_bug.cgi?id=1136969 (see starting from comment 4).

Joe created a Fedora instance in his OS cloud based on RHEL7-OSP5 (Icehouse), where he installed Rally client to run benchmarks against that cloud itself. He assigned a floating IP to that instance to be able to access API endpoints from inside the Rally machine. Then he ran a scenario which basically started up 100+ new instances in parallel, tried to access each of them via ssh, and once it succeeded, clean up each created instance (with its ports). Once in a while, his Rally instance lost connection to outside world. This was because VXLAN tunnel to the compute node hosting the Rally machine was dropped on networker node where DHCP, L3, Metadata agents were running. Once we restarted OVS agent, the tunnel was recreated properly.

The scenario failed only if L2POP mechanism was enabled.

I've looked thru the OVS agent logs and found out that the tunnel was dropped due to a legitimate fdb entry removal request coming from neutron-server side. So the fault is probably on neutron-server side, in l2pop mechanism driver.

I've then looked thru the patches in Juno to see whether there is something related to the issue already merged, and found the patch that gets rid of _precommit step when cleaning up fdb entries. Once we've applied the patch on the neutron-server node, we stopped to experience those connectivity failures.

After discussion with Vivekanandan Narasimhan, we came up with the following race condition that may result in tunnels being dropped while legitimate resources are still using them:

(quoting Vivek below)

'''
- - port1 delete request comes in;
- - port1 delete request acquires lock
- - port2 create/update request comes in;
- - port2 create/update waits on due to unavailability of lock
- - precommit phase for port1 determines that the port is the last one, so we should drop the FLOODING_ENTRY;
- - port1 delete applied to db;
- - port1 transaction releases the lock
- - port2 create/update acquires the lock
- - precommit phase for port2 determines that the port is the first one, so request FLOODING_ENTRY + MAC-specific flow creation;
- - port2 create/update request applied to db;
- - port2 transaction releases the lock

Now at this point postcommit of either of them could happen, because code-pieces operate outside the
locked zone.

If it happens, this way, tunnel would retain:

- - postcommit phase for port1 requests FLOODING_ENTRY deletion due to port1 deletion
- - postcommit phase requests FLOODING_ENTRY + MAC-specific flow creation for port2;

If it happens the below way, tunnel would break:
- - postcommit phase for create por2 requests FLOODING_ENTRY + MAC-specific flow
- - postcommit phase for delete port1 requests FLOODING_ENTRY deletion
'''

We considered the patch to get rid of precommit for backport to Icehouse [1] that seems to eliminate the race, but we're concerned that we reverted that to previous behaviour in Juno as part of DVR work [2], though we haven't done any testing to check whether the issue is present in Juno (though brief analysis of the code shows that it should fail there too).

Ideally, the fix for Juno should be easily backportable because the issue is currently present in Icehouse, and we would like to have the same fix for both branches (Icehouse and Juno) instead of backporting patch [1] to Icehouse and implementing another patch for Juno.

[1]: https://review.openstack.org/#/c/95165/
[2]: https://review.openstack.org/#/c/102398/

tags: added: icehouse-backport-potential
tags: added: l2pop
tags: added: l3-ipam-dhcp ml2
Changed in neutron:
assignee: nobody → Vivekanandan Narasimhan (vivekanandan-narasimhan)
Revision history for this message
Mathieu Rohon (mathieu-rohon) wrote :

this patch will have to wait for this one to merge first :
https://bugs.launchpad.net/neutron/+bug/1367391

after this bug get resolved, the port binding should be avalaible in delete_post_commit(), even when the port was bound to several hosts.

Revision history for this message
Mathieu Rohon (mathieu-rohon) wrote :

I think that once a port could be bound to several hosts, when such a port is deleted, l2pop MD should look at each hosts the port was bound to, to know if this host has others ports on the same segment, and to know if other hosts have to remove their flooding entries and consequently their tunnel.

Revision history for this message
Ihar Hrachyshka (ihar-hrachyshka) wrote :

@Mathieu, the problem with bug 1367391 is that it's not clear whether it's going to be fixed before Juno release (though it's currently targeted for RC-1). Also, the bug is generic and assumes a lot of code cleanup in DVR implementation that is not present in Icehouse that also suffers from the race condition. I would be glad if we consider backportability of the fix for this bug when trying to come up with a solution. Of course, we could backport the first patch by Mathieu in Icehouse and have another fix in Juno/Kilo, but that's not ideal. So I'd like to hear from interested parties whether planned Juno/Kilo fix is going to be backportable.

Also, in terms of backports, I would like to push a fix for the issue in Icehouse before the next release which is scheduled for the next week. Otherwise, we leave the bug unfixed for our users for another several months, for the consequent (and the last one) icehouse release date is still TBD.

Revision history for this message
Vivekanandan Narasimhan (vivekanandan-narasimhan) wrote :

I may not be able to hold on till some patchset appears for 1367391 as that is more of a refactor
than fixing a specific bug. So I am pursuing fixing this, regardless.

Changed in neutron:
status: New → In Progress
Revision history for this message
Vivekanandan Narasimhan (vivekanandan-narasimhan) wrote :

Ihar,

Can you please validate the patch put out for this bug here, in your environment and post an update:

https://review.openstack.org/#/c/123403/

Revision history for this message
Ihar Hrachyshka (ihar-hrachyshka) wrote :

Vivek,

we don't have cycles to check that at the moment, but the patch looks almost the same as the one that helped us to avoid the failure in Icehouse, so I guess it's ok.

tags: added: l3-dvr-backlog
Changed in neutron:
importance: Undecided → Medium
tags: added: juno-rc-potential
Kyle Mestery (mestery)
Changed in neutron:
milestone: none → kilo-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/123403
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3cd2163d5105faad389bee5175ef446f0bb90289
Submitter: Jenkins
Branch: master

commit 3cd2163d5105faad389bee5175ef446f0bb90289
Author: Vivekanandan Narasimhan <email address hidden>
Date: Tue Sep 23 02:25:16 2014 -0700

    Race for l2pop when ports go up/down on same host

    With l2pop enabled, race exists in delete_port_postcommit
    when both create/update_port and delete_port deal with
    different ports on the same host, where such ports are
    either the first (or) last on same network for that host.
    This race happens outside the DB locking zones in
    the respective methods of ML2 plugin.

    To fix this, we have moved determination of
    fdb_entries back to delete_port_postcommit and removed
    delete_port_precommit altogether from l2pop mechanism
    driver. In order to accomodate dvr interfaces, we
    are storing and re-using the mechanism-driver context
    which hold dvr-port-binding information while
    invoking delete_port_postcommit. We loop through
    dvr interface bindings invoking delete_port_postcommit
    similar to delete_port_precommit.

    Closes-Bug: #1372438
    Change-Id: If0502f57382441fdb4510c81a89794f57a38e696

Changed in neutron:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/128642

Thierry Carrez (ttx)
tags: added: juno-backport-potential
removed: juno-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/129166

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/lbaasv2)

Fix proposed to branch: feature/lbaasv2
Review: https://review.openstack.org/130864

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/lbaasv2)
Download full text (72.6 KiB)

Reviewed: https://review.openstack.org/130864
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c089154a94e5872efc95eab33d3d0c9de8619fe4
Submitter: Jenkins
Branch: feature/lbaasv2

commit 62588957fbeccfb4f80eaa72bef2b86b6f08dcf8
Author: Kevin Benton <email address hidden>
Date: Wed Oct 22 13:04:03 2014 -0700

    Big Switch: Switch to TLSv1 in server manager

    Switch to TLSv1 for the connections to the backend
    controllers. The default SSLv3 is no longer considered
    secure.

    TLSv1 was chosen over .1 or .2 because the .1 and .2 weren't
    added until python 2.7.9 so TLSv1 is the only compatible option
    for py26.

    Closes-Bug: #1384487
    Change-Id: I68bd72fc4d90a102003d9ce48c47a4a6a3dd6e03

commit 17204e8f02fdad046dabdb8b31397289d72c877b
Author: OpenStack Proposal Bot <email address hidden>
Date: Wed Oct 22 06:20:15 2014 +0000

    Imported Translations from Transifex

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I58db0476c810aa901463b07c42182eef0adb5114

commit d712663b99520e6d26269b0ca193527603178742
Author: Carl Baldwin <email address hidden>
Date: Mon Oct 20 21:48:42 2014 +0000

    Move disabling of metadata and ipv6_ra to _destroy_router_namespace

    I noticed that disable_ipv6_ra is called from the wrong place and that
    in some cases it was called with a bogus router_id because the code
    made an incorrect assumption about the context. In other case, it was
    never called because _destroy_router_namespace was being called
    directly. This patch moves the disabling of metadata and ipv6_ra in
    to _destroy_router_namespace to ensure they get called correctly and
    avoid duplication.

    Change-Id: Ia76a5ff4200df072b60481f2ee49286b78ece6c4
    Closes-Bug: #1383495

commit f82a5117f6f484a649eadff4b0e6be9a5a4d18bb
Author: OpenStack Proposal Bot <email address hidden>
Date: Tue Oct 21 12:11:19 2014 +0000

    Updated from global requirements

    Change-Id: Idcbd730f5c781d21ea75e7bfb15959c8f517980f

commit be6bd82d43fbcb8d1512d8eb5b7a106332364c31
Author: Angus Lees <email address hidden>
Date: Mon Aug 25 12:14:29 2014 +1000

    Remove duplicate import of constants module

    .. and enable corresponding pylint check now the only offending instance
    is fixed.

    Change-Id: I35a12ace46c872446b8c87d0aacce45e94d71bae

commit 9902400039018d77aa3034147cfb24ca4b2353f6
Author: rajeev <email address hidden>
Date: Mon Oct 13 16:25:36 2014 -0400

    Fix race condition on processing DVR floating IPs

    Fip namespace and agent gateway port can be shared by multiple dvr routers.
    This change uses a set as the control variable for these shared resources
    and ensures that Test and Set operation on the control variable are
    performed atomically so that race conditions do not occur among
    multiple threads processing floating IPs.
    Limitation: The scope of this change is limited to addressing the race
    condition described in the bug report. It may not address other issues
    such as pre-existing issue wit...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/juno)

Reviewed: https://review.openstack.org/129166
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4a84be59db123df8618e5c0d0614c8a927e68196
Submitter: Jenkins
Branch: stable/juno

commit 4a84be59db123df8618e5c0d0614c8a927e68196
Author: Vivekanandan Narasimhan <email address hidden>
Date: Tue Sep 23 02:25:16 2014 -0700

    Race for l2pop when ports go up/down on same host

    With l2pop enabled, race exists in delete_port_postcommit
    when both create/update_port and delete_port deal with
    different ports on the same host, where such ports are
    either the first (or) last on same network for that host.
    This race happens outside the DB locking zones in
    the respective methods of ML2 plugin.

    To fix this, we have moved determination of
    fdb_entries back to delete_port_postcommit and removed
    delete_port_precommit altogether from l2pop mechanism
    driver. In order to accomodate dvr interfaces, we
    are storing and re-using the mechanism-driver context
    which hold dvr-port-binding information while
    invoking delete_port_postcommit. We loop through
    dvr interface bindings invoking delete_port_postcommit
    similar to delete_port_precommit.

    Closes-Bug: #1372438
    Change-Id: If0502f57382441fdb4510c81a89794f57a38e696
    (cherry picked from commit 3cd2163d5105faad389bee5175ef446f0bb90289)

tags: added: in-stable-juno
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/icehouse)

Reviewed: https://review.openstack.org/128642
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=acb37285ccdd9caa3133188b738ae09fa1bc5913
Submitter: Jenkins
Branch: stable/icehouse

commit acb37285ccdd9caa3133188b738ae09fa1bc5913
Author: Vivekanandan Narasimhan <email address hidden>
Date: Tue Sep 23 02:25:16 2014 -0700

    Race for l2pop when ports go up/down on same host

    With l2pop enabled, race exists in delete_port_postcommit
    when both create/update_port and delete_port deal with
    different ports on the same host, where such ports are
    either the first (or) last on same network for that host.
    This race happens outside the DB locking zones in
    the respective methods of ML2 plugin.

    To fix this, we have moved determination of
    fdb_entries back to delete_port_postcommit and removed
    delete_port_precommit altogether from l2pop mechanism
    driver.

    Icehouse changes:
    - dropped DVR related changes;
    - L2populationMechanismDriver does not have attribute
      L2populationAgentNotify.
    - some minor pep8 changes.

    Conflicts:
     neutron/plugins/ml2/drivers/l2pop/mech_driver.py
     neutron/plugins/ml2/plugin.py
     neutron/tests/unit/ml2/drivers/test_l2population.py

    Closes-Bug: #1372438
    Change-Id: If0502f57382441fdb4510c81a89794f57a38e696
    (cherry picked from commit 3cd2163d5105faad389bee5175ef446f0bb90289)

tags: added: in-stable-icehouse
Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-1 → 2015.1.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.