[RFE] [LBaaS] ssh connection timeout
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
octavia | In Progress | Wishlist | Unassigned | |
Bug Description
In the V2 API, we need a way to tune the LB connection timeouts so that we can have a pool of ssh servers with long-running TCP connections. ssh sessions can last days to weeks, and users get grumpy if the session times out while they are in the middle of doing something. Currently the timeouts are tuned to drop connections that run too long, regardless of whether there is traffic on the connection. This is good for http, but bad for ssh.
tags: | added: lbaas |
Alan (kaihongd) wrote : | #1 |
tags: | added: rfe |
Michael Johnson (johnsom) wrote : | #2 |
I think we need to expose the following HAProxy options as a minimum:
retries
timeout connect
timeout client
timeout server
timeout http-keep-alive
timeout http-request
timeout tunnel
Nice to have:
timeout check
timeout client-fin
timeout queue
timeout server-fin
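For illustration, the directives listed above could be rendered into HAProxy configuration lines roughly as follows. This is only a sketch: `render_timeouts` and `ALLOWED_TIMEOUTS` are hypothetical names, not part of any LBaaS or Octavia API, and values are assumed to be plain milliseconds (HAProxy's default unit for timeouts).

```python
# Hypothetical helper for illustration; not part of neutron-lbaas or Octavia.
# The timeout names are the ones listed in this comment.
ALLOWED_TIMEOUTS = {
    "connect", "client", "server", "http-keep-alive", "http-request",
    "tunnel", "check", "client-fin", "queue", "server-fin",
}


def render_timeouts(timeouts_ms, retries=3):
    """Return HAProxy config lines for the given timeouts (milliseconds)."""
    lines = ["retries %d" % retries]
    # Sort the names so the rendered output is deterministic.
    for name in sorted(timeouts_ms):
        if name not in ALLOWED_TIMEOUTS:
            raise ValueError("unsupported timeout: %s" % name)
        value = timeouts_ms[name]
        if value < 0:
            raise ValueError("timeout must be non-negative: %s" % name)
        lines.append("timeout %s %d" % (name, value))
    return lines


if __name__ == "__main__":
    for line in render_timeouts({"connect": 5000, "client": 50000, "server": 50000}):
        print(line)
```

Rejecting unknown names up front keeps a typo from silently producing a directive that HAProxy would refuse at reload time.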
Kevin Fox (kevpn) wrote : | #3 |
perhaps:
option tcpka
too.
Reedip (reedip-banerjee) wrote : | #4 |
If this is currently open, can I take this up?
Kevin Fox (kevpn) wrote : | #5 |
This is still a major problem for us... If you're willing, please do. :)
As a workaround, for now I've had to patch the raw source code of the v1 lbaas with something like the patch below. Being able to do it per LB would be much, much better. Since we're using v1, it's relatively easy to patch in this hackish way, since we only have to do it in one place. In v2 with amphora, it will be much harder. :/
Patch:
--- /usr/lib/
+++ /usr/lib/
@@ -76,14 +76,26 @@
def _build_
- opts = [
- 'log global',
- 'retries 3',
- 'option redispatch',
- 'timeout connect 5000',
- 'timeout client 50000',
- 'timeout server 50000',
- ]
+ opts = []
+ if config[
+ opts = [
+ 'log global',
+ 'retries 3',
+ 'option redispatch',
+ 'timeout connect 5000',
+ "timeout client %s" %(1 * 24 * 60 * 60 * 1000), #1 day
+ "timeout server %s" %(1 * 24 * 60 * 60 * 1000),
+ "option tcpka",
+ ]
+ else:
+ opts = [
+ 'log global',
+ 'retries 3',
+ 'option redispatch',
+ 'timeout connect 5000',
+ 'timeout client 50000',
+ 'timeout server 50000',
+ ]
return itertools.
Thanks,
Kevin
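The idea behind the patch can be restated as a standalone function for readability. This is a sketch only: `build_default_opts` and its `long_lived` flag are hypothetical stand-ins for the truncated function name and condition in the patch above, which are not reproduced here.

```python
# One day in milliseconds, computed the same way as in the patch.
ONE_DAY_MS = 1 * 24 * 60 * 60 * 1000  # 86400000


def build_default_opts(long_lived):
    """Return HAProxy 'defaults' lines.

    long_lived is a hypothetical flag standing in for the elided
    condition in the patch; it selects the one-day timeouts and
    TCP keepalives instead of the stock 50-second timeouts.
    """
    opts = [
        "log global",
        "retries 3",
        "option redispatch",
        "timeout connect 5000",
    ]
    if long_lived:
        opts += [
            "timeout client %d" % ONE_DAY_MS,
            "timeout server %d" % ONE_DAY_MS,
            "option tcpka",
        ]
    else:
        opts += [
            "timeout client 50000",
            "timeout server 50000",
        ]
    return opts


if __name__ == "__main__":
    print("\n".join(build_default_opts(long_lived=True)))
```

Sharing the common lines between both branches avoids the duplicated list in the patch, at the cost of moving `option tcpka` after the timeouts, which should not matter to HAProxy.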
Reedip (reedip-banerjee) wrote : | #6 |
We can proceed with an option for both v1 and v2, but we need to discuss it with other members as well.
Changed in neutron: | |
assignee: | nobody → Reedip (reedip-banerjee) |
Reedip (reedip-banerjee) wrote : | #7 |
Note: V1 is deprecated in Liberty and will be removed in a future release.
So the priority of v2 should probably be a bit higher than that of v1.
Kevin Fox (kevpn) wrote : | #8 |
V2 only would be OK, I think. We want to get to v2 anyway, and if it's fixed there, we can just move forward.
Thanks,
Kevin
Brandon Logan (brandon-logan) wrote : | #9 |
Would adding a timeout option on the listener (for V2) be sufficient? This would set the timeout client and timeout server to the same value. Or do you think it's worthwhile to have a timeout on the listener for the timeout client option, and a timeout on the pool for the timeout server option?
Kevin Fox (kevpn) wrote : | #10 |
Not sure. I've never had to have different values, but maybe others will.
Brandon Logan (brandon-logan) wrote : | #11 |
I think we can go with a timeout on the listener first, and if someone in the future wants to set them to different values, we can add one to the pool.
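One way to picture the proposal in this exchange: a single listener timeout feeds both `timeout client` and `timeout server`, with room for a pool-level override later. The `Listener` and `Pool` classes, their `timeout_ms` fields, and `resolve_timeouts` are invented for this sketch and do not reflect the actual LBaaS v2 data model; the 50000 ms default is borrowed from the v1 patch earlier in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listener:
    # Hypothetical field: one timeout applied to both sides by default.
    timeout_ms: int = 50000


@dataclass
class Pool:
    # Hypothetical future override for the server side only.
    timeout_ms: Optional[int] = None


def resolve_timeouts(listener, pool=None):
    """Map listener/pool settings to HAProxy client and server timeouts."""
    client = listener.timeout_ms
    server = client  # same value unless the pool overrides it
    if pool is not None and pool.timeout_ms is not None:
        server = pool.timeout_ms
    return {"timeout client": client, "timeout server": server}


if __name__ == "__main__":
    one_day_ms = 24 * 60 * 60 * 1000
    print(resolve_timeouts(Listener(timeout_ms=one_day_ms)))
```

Starting with the listener-only field keeps the API small, and the optional pool field can be added later without changing behavior for existing listeners.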
Akihiro Motoki (amotoki) wrote : | #12 |
The current discussion sounds reasonable.
Changed in neutron: | |
importance: | Undecided → Wishlist |
status: | New → Triaged |
summary: |
- lbaas ssh connection timeout
+ [RFE] [LBaaS] ssh connection timeout
Let's make sure we have a better understanding of how you intend to expose these attributes (to make sure they are not particularly tied to a specific backend), but this is a nice to have.
tags: |
added: rfe-approved removed: rfe |
Changed in neutron: | |
milestone: | none → mitaka-1 |
Changed in neutron: | |
milestone: | mitaka-1 → mitaka-2 |
Brandon Logan (brandon-logan) wrote : | #14 |
A timeout value on the listener should solve this problem, and I'm pretty sure all drivers could support it, though obviously they'd need to be modified to do so. An extension would need to be made for this that just adds the timeout value to the listener in the RESOURCE_
Changed in neutron: | |
milestone: | mitaka-2 → mitaka-3 |
This looks like it's dead. Let's garbage collect it.
Changed in neutron: | |
milestone: | mitaka-3 → none |
status: | Triaged → Incomplete |
assignee: | Reedip (reedip-banerjee) → nobody |
Kevin Fox (kevpn) wrote : | #17 |
Still desperately needed. No solution yet. Don't drop the bug please.
Reedip (reedip-banerjee) wrote : | #18 |
Apologies, I was caught up in other work. I will put up a patch this coming weekend (though it will be a WIP for now).
Changed in neutron: | |
assignee: | nobody → Reedip (reedip-banerjee) |
Fix proposed to branch: master
Review: https:/
Changed in neutron: | |
status: | Incomplete → In Progress |
Changed in python-neutronclient: | |
assignee: | nobody → Reedip (reedip-banerjee) |
Fix proposed to branch: master
Review: https:/
Changed in python-neutronclient: | |
status: | New → In Progress |
Changed in neutron: | |
milestone: | none → mitaka-3 |
Changed in neutron: | |
milestone: | mitaka-3 → mitaka-rc1 |
Changed in neutron: | |
milestone: | mitaka-rc1 → newton-1 |
Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
What's holding this one back?
Changed in neutron: | |
milestone: | newton-1 → newton-2 |
Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
Yet again on the backburner.
Changed in python-neutronclient: | |
importance: | Undecided → Wishlist |
Perhaps its time has passed and we should consider abandoning the attempt to deliver this feature.
Changed in neutron: | |
assignee: | Reedip (reedip-banerjee) → nobody |
Changed in python-neutronclient: | |
assignee: | Reedip (reedip-banerjee) → nobody |
status: | In Progress → Incomplete |
Changed in neutron: | |
status: | In Progress → Incomplete |
Kevin Fox (kevpn) wrote : | #26 |
Still a very needed feature.... It will only become more apparent as things like k8s start integrating with lbaasv2, or they will avoid using it.
Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
@Kevin: sure, if only someone wanted to work on it and get it to the finish line.
Kevin Fox (kevpn) wrote : | #29 |
Yeah, this is hampering adoption. I haven't been able to consider octavia yet, or switching away from lbaasv1, until there's a solution.
I've also been actively looking at other non-neutron solutions. :/
Changed in neutron: | |
assignee: | nobody → Reedip (reedip-banerjee) |
status: | Incomplete → In Progress |
Changed in neutron: | |
milestone: | newton-2 → newton-3 |
Changed in neutron: | |
milestone: | newton-3 → newton-rc1 |
Changed in neutron: | |
milestone: | newton-rc1 → none |
status: | In Progress → Incomplete |
Changed in neutron: | |
status: | Incomplete → In Progress |
Ihar Hrachyshka (ihar-hrachyshka) wrote : | #30 |
So it says it's in progress. Where are the patches?
Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
affects: | neutron → octavia |
Fix proposed to branch: master
Review: https:/
affects: | python-neutronclient → python-openstackclient |
Changed in python-openstackclient: | |
status: | Incomplete → New |
assignee: | nobody → Reedip (reedip-banerjee) |
Michael Johnson (johnsom) wrote : | #34 |
Reedip,
Our client code is in python-
This doesn't need to be against python-
Reedip (reedip-banerjee) wrote : | #35 |
Michael,
I can remove it... but the implementation in octaviaclient is for OSC itself, and besides, there is no project in launchpad for octaviaclient
no longer affects: | python-openstackclient |
Change abandoned by Michael Johnson (<email address hidden>) on branch: master
Review: https:/
Reason: This has not been updated in 4 months.
Abandoning. You can restore the patch later if you feel it is still valuable and you want to continue working on it.
Change abandoned by Reedip (<email address hidden>) on branch: master
Review: https:/
Reason: Lack of time led me to drop this issue. If someone else can take this up, or if it still holds value, let me know.
Reedip (reedip-banerjee) wrote : | #38 |
Since I am not active anymore, I would like to remove myself as the assignee. It's open for anyone else to take up.
Changed in octavia: | |
assignee: | Reedip (reedip-banerjee) → nobody |
Please try one of the workarounds below and see whether it helps:
1. Increase the MTU on all underlay interfaces, e.g. to 9000
or
2. Decrease the MTU on the client and backend servers to less than 1500, e.g. 1450