NSX-mh: bad retry behaviour on controller connection issues

Bug #1485883 reported by Salvatore Orlando
This bug affects 1 person
Affects      Status         Importance   Assigned to         Milestone
neutron      Invalid        Undecided    Unassigned
Juno         Fix Released   Undecided    Salvatore Orlando
vmware-nsx   Fix Released   High         Salvatore Orlando

Bug Description

If the connection to an NSX-mh controller fails (for instance, because of a network issue or because the controller is unreachable), the neutron plugin keeps retrying the connection to the same controller until it times out. The correct behaviour would be to try the other controllers in the cluster.

The issue can be reproduced with the following steps:
1. Set up a cluster with three controllers: 10.25.56.223, 10.25.101.133, 10.25.56.222.
2. Run "neutron net-create dummy-1" from the OpenStack CLI.
3. Connect to controller-1 over VNC and run "ifconfig eth0 down".
4. Run "neutron net-create dummy-2" from the OpenStack CLI.

The API requests were originally forwarded to 10.25.56.223. After the eth0 interface was shut down on 10.25.56.223, requests continued to be forwarded to that same controller and timed out.
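
The expected behaviour described above can be sketched roughly as follows. This is only an illustration, not the plugin's code: the function name connect_with_failover, the injectable connect parameter, and the port number used in the usage note are all assumptions made for the example.

```python
import socket


def connect_with_failover(controllers, connect=socket.create_connection, timeout=5.0):
    """Try each controller in turn and return (address, connection) for the first that answers.

    `controllers` is a list of (host, port) tuples. `connect` is injectable so the
    logic can be exercised without a real network (hypothetical helper, not plugin API).
    """
    errors = []
    for address in controllers:
        try:
            return address, connect(address, timeout)
        except OSError as exc:
            # Socket-level failure (refused, unreachable, timed out):
            # remember it and move on to the next controller instead of retrying.
            errors.append((address, exc))
    raise ConnectionError(f"all controllers unreachable: {errors}")
```

With the cluster from the steps above, a call might look like connect_with_failover([("10.25.56.223", 443), ("10.25.101.133", 443), ("10.25.56.222", 443)]), where port 443 is assumed. If the first controller is down, the second is tried immediately.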

Changed in vmware-nsx:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to vmware-nsx (master)

Fix proposed to branch: master
Review: https://review.openstack.org/214060

Changed in vmware-nsx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to vmware-nsx (master)

Reviewed: https://review.openstack.org/214060
Committed: https://git.openstack.org/cgit/openstack/vmware-nsx/commit/?id=1602baf661b7e2cd951bf1603c6e99ab9638e1b0
Submitter: Jenkins
Branch: master

commit 1602baf661b7e2cd951bf1603c6e99ab9638e1b0
Author: Salvatore Orlando <email address hidden>
Date: Tue Aug 18 00:55:32 2015 -0700

    NSX-mh: Failover controller connections on socket failures

    Upon a socket connection failure, release the current connection
    and acquire a new one to a different controller.
    This is achieved by treating socket connection failures as 503
    errors returned by the controller.

    Also, ensure an even distribution of initial connection priorities
    across controllers.

    Change-Id: I988b46a4d1f51e4ad6b22ed3d892eab6a96a3acd
    Closes-Bug: 1485883
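
One possible reading of the two ideas in this commit message is sketched below. It is not taken from the vmware-nsx source: the names build_pool, request_status and issue_request are hypothetical, and interleaving priorities round-robin is only an assumption about what "even distribution" means here.

```python
import itertools
import queue

SERVICE_UNAVAILABLE = 503  # status that socket failures are mapped to in this sketch


def build_pool(controllers, conns_per_controller):
    """Queue connection slots so that initial priorities interleave across controllers."""
    pool = queue.PriorityQueue()
    priority = itertools.count(1)
    for _ in range(conns_per_controller):
        for controller in controllers:
            # Consecutive priorities go to different controllers.
            pool.put((next(priority), controller))
    return pool


def request_status(send, controller):
    """Return the status from send(controller), or 503 if the socket fails."""
    try:
        return send(controller)
    except OSError:
        return SERVICE_UNAVAILABLE


def issue_request(pool, send):
    """Take slots in priority order, moving on whenever one yields a 503.

    Failed slots are simply dropped here to keep the sketch short.
    """
    while not pool.empty():
        _, controller = pool.get()
        status = request_status(send, controller)
        if status != SERVICE_UNAVAILABLE:
            return controller, status
    raise ConnectionError("no controller could serve the request")
```

Because a socket failure and a 503 response take the same path, a single retry rule covers both, and the interleaved priorities mean the next slot taken belongs to a different controller.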

Changed in vmware-nsx:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to vmware-nsx (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/216261

OpenStack Infra (hudson-openstack) wrote : Fix merged to vmware-nsx (stable/kilo)

Reviewed: https://review.openstack.org/216261
Committed: https://git.openstack.org/cgit/openstack/vmware-nsx/commit/?id=d8108779eaf65e04e88e958c105d6ccb7eae0ca4
Submitter: Jenkins
Branch: stable/kilo

commit d8108779eaf65e04e88e958c105d6ccb7eae0ca4
Author: Salvatore Orlando <email address hidden>
Date: Tue Aug 18 00:55:32 2015 -0700

    NSX-mh: Failover controller connections on socket failures

    Upon a socket connection failure, release the current connection
    and acquire a new one to a different controller.
    This is achieved by treating socket connection failures as 503
    errors returned by the controller.

    Also, ensure an even distribution of initial connection priorities
    across controllers.

    Cherry-picked from commit: 1602baf661b7e2cd951bf1603c6e99ab9638e1b0

    Change-Id: I988b46a4d1f51e4ad6b22ed3d892eab6a96a3acd
    Closes-Bug: 1485883

tags: added: in-stable-kilo
Alan Pevec (apevec)
Changed in neutron:
status: New → Invalid
Adit Sarfaty (asarfaty)
Changed in vmware-nsx:
status: Fix Committed → Fix Released