vmware-nsx

NSX v3 excessive request back pressure on forced revalidate

Bug #1541591 reported by Boden R on 2016-02-03

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
vmware-nsx	Fix Released	Undecided	Boden R

Bug Description

With the current NSX v3 clustered client logic, endpoint selection forces a revalidation of endpoint states whenever all endpoints are down. While this can be ideal in lower-throughput scenarios where endpoint state is fluctuating, it is less than optimal in high request throughput scenarios. There, requests that find no UP endpoint each force another revalidation, and the cascading forced revalidations create back pressure.
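
To illustrate the effect described above, here is a minimal sketch in Python. It is not the vmware-nsx code: the `Endpoint` and `Cluster` classes, the `select_with_forced_revalidate` method, and the probe counter are all hypothetical names chosen for illustration. It only counts how many validation probes would be sent when every request that finds no UP endpoint forces a revalidation of the whole cluster.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    up: bool = False  # every endpoint starts (and stays) DOWN in this sketch


@dataclass
class Cluster:
    endpoints: list
    probes: int = 0  # total validation probes sent so far

    def revalidate(self):
        # Hypothetical stand-in for a real health check: one probe per
        # endpoint. The backend is still unavailable, so nothing comes UP.
        for _ in self.endpoints:
            self.probes += 1

    def select_with_forced_revalidate(self):
        up = [ep for ep in self.endpoints if ep.up]
        if not up:
            # The behaviour described in this bug: selection itself
            # forces a revalidation when no endpoint is UP.
            self.revalidate()
            up = [ep for ep in self.endpoints if ep.up]
        return up[0] if up else None


cluster = Cluster([Endpoint("a"), Endpoint("b"), Endpoint("c")])
for _ in range(1000):  # 1000 incoming requests while all endpoints are DOWN
    cluster.select_with_forced_revalidate()

probes_sent = cluster.probes  # grows with request rate, not with time
```

In this sketch the probe count scales with the number of requests multiplied by the number of endpoints, which is the kind of amplification that can turn into back pressure under high throughput.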

Boden R (boden) on 2016-02-03

Changed in vmware-nsx:
assignee:	nobody → Boden R (boden)
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2016-02-03: Fix proposed to vmware-nsx (master)

Fix proposed to branch: master
Review: https://review.openstack.org/275938

OpenStack Infra (hudson-openstack) wrote on 2016-02-09: Fix merged to vmware-nsx (master)

Reviewed: https://review.openstack.org/275938
Committed: https://git.openstack.org/cgit/openstack/vmware-nsx/commit/?id=e7acdfe91ae1e539fa89de4e161d06dde5ede427
Submitter: Jenkins
Branch: master

commit e7acdfe91ae1e539fa89de4e161d06dde5ede427
Author: Boden R <email address hidden>
Date: Wed Feb 3 14:39:27 2016 -0700

NSX-v3 update endpoint state only on timeout

    This patch removes the NSX v3 client cluster logic that
    forces a revalidate of all endpoints when endpoint
    selection only finds DOWN endpoints. The revalidate
    call can cause cascading backpressure under certain
    circumstances.

    Now DOWN endpoints are only returned to UP as part
    of the endpoint keepalive ping that is controlled via
    conn_idle_timeout config property. Thus, the default
    conn_idle_timeout is also decreased to 10s ensuring
    endpoint revalidation occurs (by default) on a frequent
    basis.

backport: liberty

Change-Id: I5423bce793892dd864353a23ca7c288b846a1ab6
Closes-Bug: #1541591
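
The approach in the commit message above can be sketched roughly as follows. This is a simplified illustration, not the contents of vmware_nsx/nsxlib/v3/cluster.py: the `Endpoint` and `Cluster` classes, the `select` and `keepalive` methods, and the injected `probe` callable are hypothetical. Only the `conn_idle_timeout` name and its 10s default come from the commit message. The clock is passed in explicitly so the sketch stays deterministic.

```python
class Endpoint:
    def __init__(self, name):
        self.name = name
        self.up = False  # start DOWN


class Cluster:
    def __init__(self, endpoints, probe, conn_idle_timeout=10):
        self.endpoints = endpoints
        self.probe = probe  # hypothetical health check: callable(endpoint) -> bool
        self.conn_idle_timeout = conn_idle_timeout  # seconds between keepalives
        self.last_check = 0.0
        self.probes = 0

    def select(self):
        # Selection never probes; it only reads the last known state.
        for ep in self.endpoints:
            if ep.up:
                return ep
        return None

    def keepalive(self, now):
        # Only the periodic keepalive may move an endpoint between DOWN and UP.
        if now - self.last_check < self.conn_idle_timeout:
            return
        self.last_check = now
        for ep in self.endpoints:
            self.probes += 1
            ep.up = self.probe(ep)


cluster = Cluster(
    [Endpoint("a"), Endpoint("b"), Endpoint("c")],
    probe=lambda ep: True,  # pretend the backend has recovered
)

failed = sum(cluster.select() is None for _ in range(1000))
probes_before = cluster.probes   # selection alone sent no probes
cluster.keepalive(now=10.0)      # one keepalive interval elapses
recovered = cluster.select()
```

The trade-off in this sketch is that requests arriving while all endpoints are DOWN fail fast until the next keepalive, which is presumably why the commit also lowers the default `conn_idle_timeout`.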

Changed in vmware-nsx:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2016-02-09: Fix proposed to vmware-nsx (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/277913

OpenStack Infra (hudson-openstack) wrote on 2016-02-10: Fix merged to vmware-nsx (stable/liberty)

Reviewed: https://review.openstack.org/277913
Committed: https://git.openstack.org/cgit/openstack/vmware-nsx/commit/?id=d4303335b2b1bd586ca227459fb8fa64b54482cb
Submitter: Jenkins
Branch: stable/liberty

commit d4303335b2b1bd586ca227459fb8fa64b54482cb
Author: Boden R <email address hidden>
Date: Wed Feb 3 14:39:27 2016 -0700

NSX-v3 update endpoint state only on timeout

backport: liberty

Closes-Bug: #1541591
(cherry picked from commit e7acdfe91ae1e539fa89de4e161d06dde5ede427)

    Conflicts:
     vmware_nsx/nsxlib/v3/cluster.py
    Change-Id: I5423bce793892dd864353a23ca7c288b846a1ab6

tags:	added: in-stable-liberty
