[RFE] [LBaaS] ssh connection timeout

Bug #1457556 reported by Kevin Fox
This bug affects 2 people
Affects: octavia
Status: Invalid
Importance: Wishlist
Assigned to: Unassigned
Milestone: (not set)

Bug Description

In the V2 API, we need a way to tune the LB connection timeouts so that we can have a pool of SSH servers with long-running TCP connections. SSH sessions can last days to weeks, and users get grumpy if a session times out while they are in the middle of doing something. Currently the timeouts are tuned to drop connections that run too long, regardless of whether there is traffic on the connection. This is good for HTTP, but bad for SSH.

Kevin Fox (kevpn)
tags: added: lbaas
Revision history for this message
Alan (kaihongd) wrote :

Please try the solutions below and see whether either works as a workaround.

1. Increase the MTU on all underlay interfaces, e.g. to 9000
or
2. Decrease the MTU on the client and backend servers to below 1500, e.g. 1450

tags: added: rfe
Revision history for this message
Michael Johnson (johnsom) wrote :

I think we need to expose the following HAProxy options as a minimum:
retries
timeout connect
timeout client
timeout server
timeout http-keep-alive
timeout http-request
timeout tunnel

Nice to have:
timeout check
timeout client-fin
timeout queue
timeout server-fin
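
For illustration, the options listed above could be rendered into an HAProxy defaults section roughly as follows. This is a minimal sketch, not neutron-lbaas or Octavia code: `render_defaults`, `KNOWN_TIMEOUTS`, and the `timeouts_ms` parameter are hypothetical names, and only the directive names come from the comment above. Values are assumed to be in milliseconds, which is how HAProxy reads a timeout given without a unit suffix.

```python
# Illustrative sketch only. render_defaults and KNOWN_TIMEOUTS are hypothetical
# names and do not exist in neutron-lbaas or Octavia.

# Directive suffixes taken from the list above ("minimum" first, then "nice to have").
KNOWN_TIMEOUTS = (
    'connect', 'client', 'server', 'http-keep-alive', 'http-request', 'tunnel',
    'check', 'client-fin', 'queue', 'server-fin',
)


def render_defaults(retries=3, timeouts_ms=None):
    """Return the lines of an HAProxy 'defaults' section as a list of strings."""
    requested = timeouts_ms or {}
    unknown = set(requested) - set(KNOWN_TIMEOUTS)
    if unknown:
        raise ValueError('unsupported timeout(s): %s' % ', '.join(sorted(unknown)))

    lines = ['defaults', '\tretries %d' % retries]
    # Emit in a fixed order so the generated config is stable between runs.
    for name in KNOWN_TIMEOUTS:
        if name in requested:
            lines.append('\ttimeout %s %d' % (name, requested[name]))
    return lines


if __name__ == '__main__':
    one_day_ms = 24 * 60 * 60 * 1000
    print('\n'.join(render_defaults(timeouts_ms={'connect': 5000,
                                                 'tunnel': one_day_ms})))
```

Rejecting unknown names up front is a design choice in this sketch: it keeps arbitrary text from being written into the generated configuration.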

Revision history for this message
Kevin Fox (kevpn) wrote :

perhaps:
option tcpka

too.

Revision history for this message
Reedip (reedip-banerjee-deactivatedaccount) wrote :

If this is currently open, can I take this up?

Revision history for this message
Kevin Fox (kevpn) wrote :

This is still a major problem for us... If you're willing, please do. :)

To work around it for now, I've had to patch the raw source code of the v1 lbaas with something like the patch below. Being able to do it per LB would be much, much better. Since we're using v1, it's relatively easy to patch in this hackish way, since we only have to do it in one place. In v2 with amphora, it will be much harder. :/

Patch:
--- /usr/lib/python2.7/site-packages/neutron_lbaas/services/loadbalancer/drivers/haproxy/cfg.py.orig 2015-10-01 13:04:50.364724468 -0700
+++ /usr/lib/python2.7/site-packages/neutron_lbaas/services/loadbalancer/drivers/haproxy/cfg.py 2015-10-01 13:11:19.943959753 -0700
@@ -76,14 +76,26 @@

 def _build_defaults(config):
-    opts = [
-        'log global',
-        'retries 3',
-        'option redispatch',
-        'timeout connect 5000',
-        'timeout client 50000',
-        'timeout server 50000',
-    ]
+    opts = []
+    if config['vip']['protocol'] == 'TCP' and (config['vip']['protocol_port'] == 2811 or config['vip']['protocol_port'] == 22):
+        opts = [
+            'log global',
+            'retries 3',
+            'option redispatch',
+            'timeout connect 5000',
+            "timeout client %s" %(1 * 24 * 60 * 60 * 1000), #1 day
+            "timeout server %s" %(1 * 24 * 60 * 60 * 1000),
+            "option tcpka",
+        ]
+    else:
+        opts = [
+            'log global',
+            'retries 3',
+            'option redispatch',
+            'timeout connect 5000',
+            'timeout client 50000',
+            'timeout server 50000',
+        ]

     return itertools.chain(['defaults'], ('\t' + o for o in opts))

Thanks,
Kevin
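
The patch above hardcodes ports 22 and 2811. A per-load-balancer variant of the same function might look like the sketch below. It is only an assumption about how this could be made configurable: the `idle_timeout_ms` and `tcp_keepalive` keys are hypothetical and are not fields of the v1 driver's config dict. The one-day value works out to 1 * 24 * 60 * 60 * 1000 = 86,400,000 ms.

```python
import itertools

# One day in milliseconds, the value used in the patch above.
DAY_MS = 1 * 24 * 60 * 60 * 1000

# The stock value from the unpatched defaults.
DEFAULT_IDLE_MS = 50000


def build_defaults(config):
    """Build the 'defaults' section, reading timeouts from the VIP config.

    'idle_timeout_ms' and 'tcp_keepalive' are hypothetical keys used for
    illustration; the real v1 config dict does not define them.
    """
    vip = config['vip']
    idle_ms = vip.get('idle_timeout_ms', DEFAULT_IDLE_MS)
    opts = [
        'log global',
        'retries 3',
        'option redispatch',
        'timeout connect 5000',
        'timeout client %d' % idle_ms,
        'timeout server %d' % idle_ms,
    ]
    if vip.get('tcp_keepalive'):
        opts.append('option tcpka')
    return itertools.chain(['defaults'], ('\t' + o for o in opts))


if __name__ == '__main__':
    ssh_vip = {'vip': {'protocol': 'TCP', 'protocol_port': 22,
                       'idle_timeout_ms': DAY_MS, 'tcp_keepalive': True}}
    print('\n'.join(build_defaults(ssh_vip)))
```

With no extra keys set, the function produces the same six options as the unpatched code, so existing load balancers would keep their current behavior.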

Revision history for this message
Reedip (reedip-banerjee-deactivatedaccount) wrote :

We can proceed with an option in both v1 and v2, but we need to discuss it with other members as well.

Changed in neutron:
assignee: nobody → Reedip (reedip-banerjee)
Revision history for this message
Reedip (reedip-banerjee-deactivatedaccount) wrote :

Note: V1 is deprecated in Liberty and will be removed in a future release.
So the priority of v2 should probably be a bit higher than that of v1.

Revision history for this message
Kevin Fox (kevpn) wrote :

V2 only would be OK, I think. We want to get to v2 anyway, and if it's fixed there, we can just move forward.

Thanks,
Kevin

Revision history for this message
Brandon Logan (brandon-logan) wrote :

Would adding a timeout option on the listener (for V2) be sufficient? This would set the timeout client and timeout server to the same value. Do you think it's worthwhile to have a timeout on the listener for the timeout client option, and a timeout on the pool for the timeout server option?

Revision history for this message
Kevin Fox (kevpn) wrote :

Not sure. I've never had to have different values, but maybe others will.

Revision history for this message
Brandon Logan (brandon-logan) wrote :

I think we can go with a timeout on the listener first, and if someone later wants to set them to different values, we can add it to the pool.
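
The proposal above can be sketched as a small mapping from API objects to HAProxy directives. This is a hedged illustration, not driver code: `haproxy_timeouts` is a hypothetical helper, the `timeout` key on the listener and pool dicts is an assumed attribute name, and the 50000 ms fallback is borrowed from the defaults quoted in the earlier patch.

```python
# Fallback taken from the 'timeout client 50000' default quoted in the thread.
DEFAULT_TIMEOUT_MS = 50000


def haproxy_timeouts(listener, pool=None):
    """Map a listener (and optionally a pool) to HAProxy timeout lines.

    The listener's 'timeout' drives both directives. A pool-level 'timeout',
    if one were added later, overrides only 'timeout server'.
    Both dict keys are hypothetical attribute names.
    """
    client_ms = listener.get('timeout', DEFAULT_TIMEOUT_MS)
    server_ms = (pool or {}).get('timeout', client_ms)
    return ['timeout client %d' % client_ms,
            'timeout server %d' % server_ms]


if __name__ == '__main__':
    # Listener-only setting: both sides get one day.
    print(haproxy_timeouts({'timeout': 86400000}))
    # Possible later extension: the pool overrides the server side.
    print(haproxy_timeouts({'timeout': 86400000}, {'timeout': 60000}))
```

Falling back to the listener value on the server side means a pool-level attribute could be introduced later without changing behavior for anyone who sets only the listener timeout.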

Revision history for this message
Akihiro Motoki (amotoki) wrote :

The current discussion sounds reasonable.

Changed in neutron:
importance: Undecided → Wishlist
status: New → Triaged
Henry Gessau (gessau)
summary: - lbaas ssh connection timeout
+ [RFE] [LBaaS] ssh connection timeout
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Let's make sure we have a better understanding of how you intend to expose these attributes (to make sure they are not particularly tied to a specific backend), but this is a nice to have.

tags: added: rfe-approved
removed: rfe
Changed in neutron:
milestone: none → mitaka-1
Changed in neutron:
milestone: mitaka-1 → mitaka-2
Revision history for this message
Brandon Logan (brandon-logan) wrote :

A timeout value on the listener should solve this problem, and I'm pretty sure all drivers could support this, though obviously they'd need to modify their drivers to do so. An extension would need to be made for this that just adds the timeout value to the listener in the RESOURCE_ATTRIBUTE_MAP. Then the plugin and the Octavia driver will need to support this as well at first.
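
As a rough idea of what such an extension entry might contain, the sketch below uses a plain dict in the general shape of a Neutron attribute map. It is an assumption, not the proposed patch: the attribute name `timeout`, the default, and the use of the built-in `int` as the converter are illustrative, and `apply_listener_attrs` is a toy stand-in for the API layer so the example runs without Neutron installed.

```python
# Hypothetical default, borrowed from the 50000 ms HAProxy defaults in the thread.
DEFAULT_TIMEOUT_MS = 50000

# Shaped like a Neutron RESOURCE_ATTRIBUTE_MAP entry. The attribute name and
# values are assumptions for illustration only.
RESOURCE_ATTRIBUTE_MAP = {
    'listeners': {
        'timeout': {
            'allow_post': True,    # may be set on create
            'allow_put': True,     # may be changed on update
            'validate': {'type:non_negative': None},
            'convert_to': int,     # stand-in for Neutron's own converter
            'default': DEFAULT_TIMEOUT_MS,
            'is_visible': True,
        },
    },
}


def apply_listener_attrs(body):
    """Toy stand-in for the API layer: fill defaults, convert, and validate."""
    result = dict(body)
    for name, spec in RESOURCE_ATTRIBUTE_MAP['listeners'].items():
        value = spec['convert_to'](result.get(name, spec['default']))
        if 'type:non_negative' in spec['validate'] and value < 0:
            raise ValueError('%s must be non-negative' % name)
        result[name] = value
    return result


if __name__ == '__main__':
    print(apply_listener_attrs({'protocol': 'TCP', 'protocol_port': 22,
                                'timeout': '86400000'}))
```

Keeping the attribute a plain non-negative integer keeps it backend-neutral: each driver decides how to translate it, rather than the API exposing HAProxy directive names.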

Changed in neutron:
milestone: mitaka-2 → mitaka-3
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

This looks like it's dead. Let's garbage collect it.

Changed in neutron:
milestone: mitaka-3 → none
status: Triaged → Incomplete
assignee: Reedip (reedip-banerjee) → nobody
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :
Revision history for this message
Kevin Fox (kevpn) wrote :

Still desperately needed. No solution yet. Don't drop the bug please.

Revision history for this message
Reedip (reedip-banerjee-deactivatedaccount) wrote :

Apologies, I was caught up in other work. I will put up a patch this coming weekend (though it will be a WIP for now).

Changed in neutron:
assignee: nobody → Reedip (reedip-banerjee)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-lbaas (master)

Fix proposed to branch: master
Review: https://review.openstack.org/273896

Changed in neutron:
status: Incomplete → In Progress
Changed in python-neutronclient:
assignee: nobody → Reedip (reedip-banerjee)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to python-neutronclient (master)

Fix proposed to branch: master
Review: https://review.openstack.org/273911

Changed in python-neutronclient:
status: New → In Progress
Changed in neutron:
milestone: none → mitaka-3
Changed in neutron:
milestone: mitaka-3 → mitaka-rc1
Changed in neutron:
milestone: mitaka-rc1 → newton-1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-neutronclient (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/273911
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

What's holding this one back?

Changed in neutron:
milestone: newton-1 → newton-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/273911
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Yet again on the backburner.

Changed in python-neutronclient:
importance: Undecided → Wishlist
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

Perhaps its time has passed and we should consider abandoning the attempt to deliver this feature.

Changed in neutron:
assignee: Reedip (reedip-banerjee) → nobody
Changed in python-neutronclient:
assignee: Reedip (reedip-banerjee) → nobody
status: In Progress → Incomplete
Changed in neutron:
status: In Progress → Incomplete
Revision history for this message
Kevin Fox (kevpn) wrote :

Still a very needed feature.... It will only become more apparent as things like k8s start integrating with lbaasv2, or they will avoid using it.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lbaas (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/273896
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

@Kevin: sure, if only someone wanted to work on it and get it to the finish line.

Revision history for this message
Kevin Fox (kevpn) wrote :

Yeah, this is hampering adoption. I haven't been able to consider Octavia, or switching away from lbaasv1, until there's a solution.

I've also been actively looking at other non-Neutron solutions. :/

Changed in neutron:
assignee: nobody → Reedip (reedip-banerjee)
status: Incomplete → In Progress
Changed in neutron:
milestone: newton-2 → newton-3
Changed in neutron:
milestone: newton-3 → newton-rc1
Changed in neutron:
milestone: newton-rc1 → none
status: In Progress → Incomplete
Changed in neutron:
status: Incomplete → In Progress
Revision history for this message
Ihar Hrachyshka (ihar-hrachyshka) wrote :

So it says it's in progress. Where are the patches?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on python-neutronclient (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/273911
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lbaas (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/273896
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

affects: neutron → octavia
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to octavia (master)

Fix proposed to branch: master
Review: https://review.openstack.org/412971

affects: python-neutronclient → python-openstackclient
Changed in python-openstackclient:
status: Incomplete → New
assignee: nobody → Reedip (reedip-banerjee)
Revision history for this message
Michael Johnson (johnsom) wrote :

Reedip,
Our client code is in python-octaviaclient repository and tracked under the octavia project.
This doesn't need to be against python-openstackclient.

Revision history for this message
Reedip (reedip-banerjee-deactivatedaccount) wrote :

Michael,
I can remove it... but the implementation in octaviaclient is for OSC itself, and besides, there is no project in Launchpad for octaviaclient.

no longer affects: python-openstackclient
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on octavia (master)

Change abandoned by Michael Johnson (<email address hidden>) on branch: master
Review: https://review.openstack.org/412971
Reason: This has not been updated in 4 months.
Abandoning. You can restore the patch later if you feel it is still valuable and you want to continue working on it.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lbaas (master)

Change abandoned by Reedip (<email address hidden>) on branch: master
Review: https://review.openstack.org/273896
Reason: Lack of time led me to drop this issue. If someone else can take this up, or if it still holds value, let me know.

Revision history for this message
Reedip (reedip-banerjee-deactivatedaccount) wrote :

Since I am not active anymore, I would like to remove myself as the assignee. It's open for anyone else to take up.

Changed in octavia:
assignee: Reedip (reedip-banerjee) → nobody
Revision history for this message
Gregory Thiemonge (gthiemonge) wrote : auto-abandon-script

Abandoned after re-enabling the Octavia launchpad.

Changed in octavia:
status: In Progress → Invalid
tags: added: auto-abandon