Floating IP ops lock wait timeout

Bug #1410777 reported by Salvatore Orlando
Affects      Status        Importance  Assigned to        Milestone
neutron      Won't Fix     Undecided   Unassigned
  Juno       Fix Released  High        Salvatore Orlando
vmware-nsx   Fix Released  High        Salvatore Orlando

Bug Description

Under heavy load, floating IP operations can trigger a database lock wait timeout, which causes the operation to fail.

The cause is the usual untimely eventlet yield, which can be triggered in many places during the operation. The likelihood increases because _update_fip_assoc, which is called within a DB transaction, makes several calls to the NSX backend.
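
The failure mode can be approximated with a small model. The sketch below is an illustration under stated assumptions, not plugin code: it uses plain Python threads rather than eventlet, a threading.Lock stands in for a database row lock, time.sleep stands in for an NSX round trip, and the names row_lock, LOCK_WAIT_TIMEOUT, backend_call, and run_demo are hypothetical. The local update_fip_assoc only mimics the shape of _update_fip_assoc.

```python
import threading
import time

# Hypothetical stand-in for a row lock held by an open DB transaction.
row_lock = threading.Lock()
# Hypothetical timeout in seconds; real databases use their own setting.
LOCK_WAIT_TIMEOUT = 0.2


def backend_call(delay):
    """Stand-in for a backend round trip, during which the worker yields."""
    time.sleep(delay)


def update_fip_assoc(delay, results, name):
    """Mimic a backend call made while the 'transaction' still holds the lock."""
    if not row_lock.acquire(timeout=LOCK_WAIT_TIMEOUT):
        results[name] = "lock wait timeout"
        return
    try:
        backend_call(delay)  # lock stays held for the whole round trip
        results[name] = "ok"
    finally:
        row_lock.release()


def run_demo():
    """Run two overlapping operations and report how each one ended."""
    results = {}
    first = threading.Thread(target=update_fip_assoc, args=(0.5, results, "first"))
    second = threading.Thread(target=update_fip_assoc, args=(0.0, results, "second"))
    first.start()
    time.sleep(0.05)  # give the first worker time to take the lock
    second.start()
    first.join()
    second.join()
    return results
```

In this model the first caller succeeds and the second gives up waiting, because the lock is held for the entire backend round trip rather than only for the database work.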

Unfortunately, it is not practical to change the plugin logic so that _update_fip_assoc no longer calls the backend: the change would be so extensive that it could hardly be backported. An earlier attempt in this direction also failed to solve the problem: https://review.openstack.org/#/c/138078/

no longer affects: neutron
Changed in neutron:
status: New → Won't Fix
Changed in vmware-nsx:
importance: Undecided → High
Changed in vmware-nsx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to vmware-nsx (master)

Reviewed: https://review.openstack.org/145794
Committed: https://git.openstack.org/cgit/stackforge/vmware-nsx/commit/?id=0e3299d69ce6f81f6502aca9a15bb9daca90158b
Submitter: Jenkins
Branch: master

commit 0e3299d69ce6f81f6502aca9a15bb9daca90158b
Author: Salvatore Orlando <email address hidden>
Date: Thu Jan 8 06:37:34 2015 -0800

    NSX: synchronize floating IP operations

    This patch simply adds floating IP operations (create, update,
    and delete) to the VMware global mutex already employed for
    router gateway operations.

    This should prevent the occurrence of database lock wait timeout
    errors caused by untimely eventlet yields.

    Closes-Bug: #1410777

    Change-Id: I0e887b1401daec991f2244fb897a4b1dd206bf35
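
The approach described in the commit message, serializing floating IP create, update, and delete behind one shared mutex, can be sketched as follows. This is a minimal illustration and not the merged patch: the synchronized decorator, the _GLOBAL_MUTEX name, and the FakePlugin class are all hypothetical, and a standard-library threading.RLock stands in for whatever lock the plugin actually uses.

```python
import functools
import threading
import time

# Hypothetical process-wide mutex. An RLock is used so that a synchronized
# method can call another synchronized method without deadlocking.
_GLOBAL_MUTEX = threading.RLock()


def synchronized(func):
    """Run the wrapped callable while holding the shared mutex."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _GLOBAL_MUTEX:
            return func(*args, **kwargs)
    return wrapper


class FakePlugin:
    """Hypothetical stand-in for the plugin; not the vmware-nsx code."""

    def __init__(self):
        self.floating_ips = {}
        self.max_concurrent = 0
        self._concurrent = 0

    def _backend_call(self):
        # Track how many operations overlap; the sleep models a round trip.
        self._concurrent += 1
        self.max_concurrent = max(self.max_concurrent, self._concurrent)
        time.sleep(0.01)
        self._concurrent -= 1

    @synchronized
    def create_floatingip(self, fip_id, address):
        self._backend_call()
        self.floating_ips[fip_id] = address
        return fip_id

    @synchronized
    def update_floatingip(self, fip_id, address):
        self._backend_call()
        self.floating_ips[fip_id] = address
        return fip_id

    @synchronized
    def delete_floatingip(self, fip_id):
        self._backend_call()
        self.floating_ips.pop(fip_id, None)
```

The trade-off in this sketch is throughput: every floating IP operation waits for the previous one to finish, but no two of them can interleave at a yield point while one still holds database locks.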

Changed in vmware-nsx:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/147464

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/juno)

Reviewed: https://review.openstack.org/147464
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7c825121080db10ff5aeb1d90d3e434966c1645b
Submitter: Jenkins
Branch: stable/juno

commit 7c825121080db10ff5aeb1d90d3e434966c1645b
Author: Salvatore Orlando <email address hidden>
Date: Thu Jan 15 02:56:13 2015 -0800

    NSX: synchronize floating IP operations

    This patch simply adds floating IP operations (create, update,
    and delete) to the VMware global mutex already employed for
    router gateway operations.

    This should prevent the occurrence of database lock wait timeout
    errors caused by untimely eventlet yields.

    Patch applied from stackforge/vmware-nsx commit id
    defd249040edb45b49a9d7eb9451dde1dffef69e

    Change-Id: Iad794a354412221ba4085637e0622882bbfce82b
    Closes-Bug: #1410777

Adit Sarfaty (asarfaty)
Changed in vmware-nsx:
status: Fix Committed → Fix Released