Autossh does not notice when connection "freezes", starts new connection without removing the old process

Bug #1700498 reported by Stuart Langridge
This bug affects 1 person
Affects            Status   Importance   Assigned to   Milestone
autossh (Ubuntu)   New      Undecided    Unassigned

Bug Description

I have three machines, called REMOTE, INTERNET, and LOCAL. REMOTE and LOCAL are both behind firewalls (in completely separate places). INTERNET is on the internet. I need to be able to SSH into REMOTE _from_ LOCAL. To do this I have done the following:

1. Used autossh on LOCAL to set up an ssh tunnel from LOCAL to INTERNET, which forwards INTERNET port 22222 back down the ssh tunnel to LOCAL port 22.
2. Used autossh on REMOTE to set up an ssh tunnel from REMOTE to INTERNET port 22222 (i.e., to LOCAL port 22), which forwards LOCAL port 10022 back down the ssh tunnel to REMOTE port 22.

This means that I can now, on LOCAL, do "ssh -p 10022 localhost" and be connected to REMOTE.

The first tunnel (point 1) is fine. However, the second tunnel keeps "freezing". A symptom of this is that port 10022 on LOCAL is "open" (I can ssh to it and don't get a "Connection refused") but the SSH session is never established. Looking at ps on LOCAL, I can see many "sshd [priv]" processes belonging to connections from the remote machine (44 right now). If I kill all of these processes, autossh on REMOTE seems to "notice" that this has happened and starts a new connection, which works fine.
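
The "open but frozen" state described above can be told apart from a refused connection by probing the forwarded port with a hard time limit. This is a minimal sketch, assuming GNU timeout(1) is installed on LOCAL and that key-based login to REMOTE works without a prompt. The function names and the 15-second limit are illustrative and not part of the original setup.

```shell
# Map the exit status of `timeout N ssh ...` to a short label.
# 124 is what GNU timeout returns when it had to kill the command;
# 255 is what ssh itself returns when the connection or login fails.
classify_probe() {
    case "$1" in
        0)   echo "ok" ;;
        124) echo "frozen" ;;
        255) echo "ssh-error" ;;
        *)   echo "remote-command-failed" ;;
    esac
}

# Hypothetical helper: probe the forwarded port on LOCAL and print a label.
# BatchMode=yes stops ssh from waiting for a password prompt.
probe_tunnel() {
    timeout 15 ssh -o BatchMode=yes -p 10022 localhost true
    classify_probe "$?"
}
```

Running probe_tunnel from cron on LOCAL and logging the label would give a rough timeline of when the tunnel goes stale, without touching the tunnel itself.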

So there are perhaps two issues here.

a. For some reason the ssh connection goes "stale" and can't be used to connect. I do not know why this is; in particular, I don't know whether it's caused by network dropouts or by configuration or what.
b. At some point, autossh seems to realise that the connection doesn't work and starts a new one (good, this is what it's for!), but it does not seem to kill the old one (which is why I've got 44 processes rather than one).
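
To put numbers on issue (b), the leftover sessions can be counted over time. The sketch below assumes the usual OpenSSH process title format "sshd: USER [priv]" and that the account REMOTE logs in to on LOCAL is "remoteuser" (taken from the tunnel command in this report). The function name is hypothetical.

```shell
# Count privileged sshd session processes for one login name.
# Reads process titles on stdin, one per line (for example from `ps -eo args`),
# so it can also be run against saved ps output.
count_priv_sshd() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
    grep -c "^sshd: $1 \[priv\]" || true
}
```

On LOCAL this could be used as: ps -eo args | count_priv_sshd remoteuser. A count that grows by one each time the tunnel is re-established would suggest the old sessions are never cleaned up on the server side.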

I do not know whether these are related; I do not know whether autossh is starting a new process because the old one is stale, or for some other reason. I do not know how to reproduce the "connection goes stale" issue; it happens over time, so if I come back to the tunnel after a day it will likely be stale, whereas if I connect and then reconnect five minutes later it has normally not died in the interim. Testing this is very slow, both because the problem can't be reproduced on demand (so I have to wait a day for it to show up) and because I am administering REMOTE over the very ssh tunnel that this system creates (so I am very wary of doing anything that breaks the tunnel, because I can't recover from that without a site visit).

The tunnel from REMOTE to LOCAL is created thus:
AUTOSSH_GATETIME=0
/usr/bin/autossh -M 0 -N -T -q -o ServerAliveInterval=30 -o ServerAliveCountMax=3 -R 10022:localhost:22 -p 22222 remoteuser@INTERNET

The tunnel from LOCAL to INTERNET is created thus:
AUTOSSH_GATETIME=0
ExecStart=/usr/bin/autossh -q -N -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" internetuser@INTERNET -R 0.0.0.0:22222:localhost:22

I am happy to do some diagnostics, but not at the expense of cutting me off from the remote endpoint. I can provide tcpdumps, log files, etc to support this bug if I'm given commands to run to produce them.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: autossh 1.4e-2
ProcVersionSignature: Ubuntu 4.4.0-79.100-generic 4.4.67
Uname: Linux 4.4.0-79-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.6
Architecture: amd64
CurrentDesktop: GNOME
Date: Mon Jun 26 11:16:19 2017
InstallationDate: Installed on 2014-04-07 (1176 days ago)
InstallationMedia: Ubuntu 13.10 "Saucy Salamander" - Release amd64 (20131016.1)
SourcePackage: autossh
UpgradeStatus: Upgraded to xenial on 2016-08-04 (326 days ago)

Stuart Langridge (sil) wrote :
description: updated
summary: Autossh does not notice when connection "freezes", starts new connection
- without removing the old one
+ without removing the old process
Axel Beckert (xtaran) wrote : Re: [Bug 1700498] [NEW] Autossh does not notice when connection "freezes", starts new connection without removing the old process

Hi Stuart,

thanks for the bug report.

I'll have a closer look at it later.

Stuart Langridge wrote:
> I have three machines, called REMOTE, INTERNET, and LOCAL. REMOTE and
> LOCAL are both behind firewalls (in completely separate places).
> INTERNET is on the internet. I need to be able to SSH into REMOTE _from_
> LOCAL. To do this I have done the following:
>
> 1. used autossh on LOCAL to set up a ssh tunnel from LOCAL to INTERNET, which forwards INTERNET port 22222 back down the ssh tunnel to LOCAL port 22
> 2. used autossh on REMOTE to set up an ssh tunnel from REMOTE to INTERNET port 22222 (i.e., to LOCAL port 22) which forwards LOCAL port 10022 back down the ssh tunnel to REMOTE port 22
>
> This means that I can now, on LOCAL, do "ssh -p 10022 localhost" and be
> connected to REMOTE.

While this seems a rather complicated setup where I can't say at
first glance what went wrong, I have a few ideas how it might work
better (or at least make it easier to find the culprit):

1. Don't chain multiple autossh tunnels; use autossh once plus SSH's
   jump-host feature.

   Via .ssh/config:

       Host REMOTE
            ProxyCommand ssh INTERNET -W REMOTE:22

   That way you just have to call "autossh REMOTE" and everything else
   is done automatically.

   This should also work on Xenial.

   On the command line there is a shortcut since OpenSSH 7.3 (only in
   Yakkety and newer):

       autossh -- -J INTERNET REMOTE

   (The "--" is unfortunately necessary because autossh treats -J as an
   invalid SSH option, probably because it is a rather new option.)

2. By default, autossh only checks every 10 minutes whether the
   connection is still alive. That interval is rather long, so you
   might want to reduce the check interval for each of your autossh
   connections so that autossh reacts sooner if a connection stalls:

   Replace every occurrence of "autossh" in your setup with e.g.
   "env AUTOSSH_POLL=5 autossh" to see if that already helps.

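For suggestion 1, whether the -J shortcut is available can be checked from the banner that "ssh -V" prints. This is only a sketch: the function name is made up, and it assumes the banner starts with "OpenSSH_<major>.<minor>", as it does on Ubuntu.

```shell
# Succeed if the given `ssh -V` banner reports OpenSSH 7.3 or newer,
# i.e. a version that understands the -J option.
supports_jump_flag() {
    banner="$1"
    # Extract "<major> <minor>" from e.g. "OpenSSH_7.2p2 Ubuntu-4ubuntu2.2, ..."
    version=$(printf '%s\n' "$banner" |
        sed -n 's/^OpenSSH_\([0-9][0-9]*\)\.\([0-9][0-9]*\).*/\1 \2/p')
    [ -n "$version" ] || return 1   # unrecognised banner: assume no -J
    set -- $version
    major="$1"
    minor="$2"
    [ "$major" -gt 7 ] || { [ "$major" -eq 7 ] && [ "$minor" -ge 3 ]; }
}
```

Note that "ssh -V" writes its banner to stderr, so a call would look like: supports_jump_flag "$(ssh -V 2>&1)" && autossh -- -J INTERNET REMOTE. On older versions, the ProxyCommand entry in .ssh/config shown above remains the fallback.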
Hope this helps!

  Regards, Axel
--
 ,''`. | Axel Beckert <email address hidden>, http://people.debian.org/~abe/
: :' : | Debian Developer, ftp.ch.debian.org Admin
`. `' | 4096R: 2517 B724 C5F6 CA99 5329 6E61 2FF9 CD59 6126 16B5
  `- | 1024D: F067 EA27 26B9 C3FC 1486 202E C09E 1D89 9593 0EDE
