sshd failure

Bug #6871 reported by Debian Bug Importer
12
Affects           Status        Importance  Assigned to   Milestone
openssh (Debian)  Fix Released  Unknown
openssh (Ubuntu)  Fix Released  Medium      Colin Watson

Bug Description

Automatically imported from Debian bug report #252676
http://bugs.debian.org/252676

Revision history for this message
In , Joey Hess (joeyh) wrote : more info

Dilinger also experienced the problem:

<dilinger> joeyh: the ssh breakage i was seeing was due to pam being upgraded under ssh
<dilinger> doing an lsof, all the pam links that ssh had been linked against had been deleted
<dilinger> s/links/libs/

--
see shy jo

Revision history for this message
In , Andres Salomon (dilinger-deactivatedaccount) wrote : ssh problems

I've had this ssh bug happen to me twice. The first time, I was already
logged into the machine, so I could examine sshd. My initial thought
(as I mentioned on IRC) was that it was some weird pam/sshd interaction;
I had dist-upgraded numerous times without restarting sshd, and sshd was
linked against deleted pam libs. However, it happened a second time a
few weeks ago; this time, I wasn't logged into the box, so I couldn't
examine it. The second time, I don't think I had done any
dist-upgrades. I'm waiting for it to happen again, but it doesn't
happen very often. The next time it does, I'll spend more time trying
to figure out what's going on (since the first time, I was in a rush to
get sshd restarted).
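
The lsof observation above (a running sshd still mapping PAM libraries that were deleted by an upgrade) can also be checked without lsof. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/maps marks unlinked files with a trailing " (deleted)"; the helper names is_deleted_mapping and count_deleted_mappings are hypothetical and not part of OpenSSH or PAM.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if one line of /proc/<pid>/maps ends in
 * " (deleted)", i.e. the mapped file has been unlinked, else 0. */
static int is_deleted_mapping(const char *line)
{
    static const char tag[] = " (deleted)";
    size_t tlen = sizeof(tag) - 1;
    size_t len = strlen(line);

    /* Ignore trailing newline characters before comparing the suffix. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;
    return len >= tlen && strncmp(line + len - tlen, tag, tlen) == 0;
}

/* Hypothetical helper: counts deleted mappings of a process.
 * Returns -1 if /proc/<pid>/maps cannot be opened. */
static int count_deleted_mappings(long pid)
{
    char path[64];
    char line[4096];
    int count = 0;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/maps", pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL)
        if (is_deleted_mapping(line))
            count++;
    fclose(fp);
    return count;
}
```

A non-zero count for the sshd pid is only a hint that the daemon predates a library upgrade; as the rest of this thread shows, it did not turn out to be the cause of this bug.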

Revision history for this message
In , Joey Hess (joeyh) wrote : more info

It happened to me again today. I just killed the stuck ssh processes and
restarted it again.

--
see shy jo

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Automatically imported from Debian bug report #252676
http://bugs.debian.org/252676

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Fri, 4 Jun 2004 13:20:54 -0400
From: Joey Hess <email address hidden>
To: Debian Bug Tracking System <email address hidden>
Subject: sshd failure

Package: ssh
Version: 1:3.8.1p1-4
Severity: serious

Note: I'm not 100% sure I was running ssh -4, and not -3, when I
experienced this bug, because the first thing I tried to do to fix it
was upgrade. Bug #248125 looks similar, and that was -3? My status-old
is dated June second, and has version -4 in it though, so I do think I
was running -4.

My colocated server was refusing both ssh and ssl telnet connections.
It looked like this:

joey:~>ssh -v kite
OpenSSH_3.8.1p1 Debian 1:3.8.1p1-4, OpenSSL 0.9.7d 17 Mar 2004
debug1: Reading configuration data /home/joey/.ssh/config
debug1: Applying options for kite
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Connecting to kite [64.62.161.42] port 22.
debug1: Connection established.
debug1: identity file /home/joey/.ssh/identity type -1
debug1: identity file /home/joey/.ssh/id_rsa type -1
debug1: identity file /home/joey/.ssh/id_dsa type 2
ssh_exchange_identification: Connection closed by remote host

Telnet also hung up before I got to a login prompt. The rest of the services
seemed ok. I got a root shell via other means, and tried restarting ssh. No
luck. Tried upgrading the whole system to current unstable, again, no luck.
Then I noticed something strange in ps:

14515 ? S 0:00 sshd: joey [pam]
32215 ? S 0:00 sshd: bdragon [pam]
 8978 ? S 0:00 sshd: joeyh [pam]

There were a few more that I've elided because they may contain privileged
information. I don't have a "bdragon" or "joeyh" user, and there were some
other weird users listed. None of these users were really logged in,
that I could tell.

I also found this in the log:

Jun 2 10:33:06 kitenet sshd[26977]: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
Jun 2 10:33:06 kitenet sshd[26977]: fatal: Cannot bind any address.

I killed all of these processes, and restarted ssh again. Now it worked, and
so did telnet.
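
For readers unfamiliar with the "Address already in use" log line above: it is the EADDRINUSE error that bind() reports when another socket already holds the address. The following is a minimal sketch of that condition on the loopback interface, using an ephemeral port instead of port 22. The helper names listen_loopback and bound_port are hypothetical, and the sketch does not establish which process was holding port 22 in this report.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind and listen on 127.0.0.1:port (0 = any free port).
 * Returns the socket fd, or -1 with errno set by the failing call. */
static int listen_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(fd, 1) == -1) {
        int saved = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Hypothetical helper: returns the local port of a bound socket, 0 on error. */
static unsigned short bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        return 0;
    return ntohs(addr.sin_port);
}
```

Calling listen_loopback(0), reading the chosen port with bound_port(), and then calling listen_loopback() again with that port makes the second call fail with errno set to EADDRINUSE for as long as the first socket stays open.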

I have to catch a plane, so I can't investigate further right now.

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.4.26
Locale: LANG=en_US, LC_CTYPE=en_US

Versions of packages ssh depends on:
ii adduser 3.56 Add and remove users and groups
ii debconf 1.4.25 Debian configuration management sy
ii dpkg 1.10.22 Package maintenance system for Deb
ii libc6 2.3.2.ds1-13 GNU C Library: Shared libraries an
ii libpam-modules 0.76-21 Pluggable Authentication Modules f
ii libpam-runtime 0.76-21 Runtime support for the PAM librar
ii libpam0g 0.76-21 Pluggable Authentication Modules l
ii libssl0.9.7 0.9.7d-3 SSL shared libraries
ii libwrap0 7.6.dbs-4 Wietse Venema's TCP wrappers libra
ii zlib1g ...


Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Fri, 4 Jun 2004 19:15:58 -0300
From: Joey Hess <email address hidden>
To: <email address hidden>
Subject: more info

Dilinger also experienced the problem:

<dilinger> joeyh: the ssh breakage i was seeing was due to pam being upgraded under ssh
<dilinger> doing an lsof, all the pam links that ssh had been linked against had been deleted
<dilinger> s/links/libs/

--
see shy jo

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Mon, 14 Jun 2004 11:11:16 -0400
From: Andres Salomon <email address hidden>
To: <email address hidden>
Subject: ssh problems

I've had this ssh bug happen to me twice. The first time, I was already
logged into the machine, so I could examine sshd. My initial thought
(as I mentioned on IRC) was that it was some weird pam/sshd interaction;
I had dist-upgraded numerous times without restarting sshd, and sshd was
linked against deleted pam libs. However, it happened a second time a
few weeks ago; this time, I wasn't logged into the box, so I couldn't
examine it. The second time, I don't think I had done any
dist-upgrades. I'm waiting for it to happen again, but it doesn't
happen very often. The next time it does, I'll spend more time trying
to figure out what's going on (since the first time, I was in a rush to
get sshd restarted).


Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Mon, 28 Jun 2004 14:12:34 -0400
From: Joey Hess <email address hidden>
To: <email address hidden>
Subject: more info

It happened to me again today. I just killed the stuck ssh processes and
restarted it again.

--
see shy jo

Revision history for this message
In , Andres Salomon (dilinger-deactivatedaccount) wrote : what kind of fucked up reportbug behavior...

severity 257514 serious
merge 252676 257514
thanks

--
Andres Salomon <email address hidden>

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Sat, 03 Jul 2004 20:17:49 -0400
From: Andres Salomon <email address hidden>
To: <email address hidden>
Subject: what kind of fucked up reportbug behavior...

severity 257514 serious
merge 252676 257514
thanks

--
Andres Salomon <email address hidden>

Revision history for this message
Debian Bug Importer (debzilla) wrote :

*** Bug 6943 has been marked as a duplicate of this bug. ***

Revision history for this message
Matt Zimmerman (mdz) wrote :

Remove myself from all these CCs now that we have the warty-bugs mailing list

Revision history for this message
In , Colin Watson (cjwatson) wrote : Re: Bug#252676: sshd failure

On Fri, Jun 04, 2004 at 01:20:54PM -0400, Joey Hess wrote:
> My colocated server was refusing both ssh and ssl telnet connections.
> It looked like this:
>
> joey:~>ssh -v kite
> OpenSSH_3.8.1p1 Debian 1:3.8.1p1-4, OpenSSL 0.9.7d 17 Mar 2004
> debug1: Reading configuration data /home/joey/.ssh/config
> debug1: Applying options for kite
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug1: Connecting to kite [64.62.161.42] port 22.
> debug1: Connection established.
> debug1: identity file /home/joey/.ssh/identity type -1
> debug1: identity file /home/joey/.ssh/id_rsa type -1
> debug1: identity file /home/joey/.ssh/id_dsa type 2
> ssh_exchange_identification: Connection closed by remote host
>
> Telnet also hung up before I got to a login prompt. The rest of the services
> seemed ok. I got a root shell via other means, and tried restarting ssh. No
> luck. Tried upgrading the whole system to current unstable, again, no luck.
> Then I noticed something strange in ps:
>
> 14515 ? S 0:00 sshd: joey [pam]
> 32215 ? S 0:00 sshd: bdragon [pam]
> 8978 ? S 0:00 sshd: joeyh [pam]
>
> There were a few more that I've elided because they may contain privileged
> information. I don't have a "bdragon" or "joeyh" user, and there were some
> other weird users listed. None of these users were really logged in,
> that I could tell.

We're also seeing these symptoms on a server at work, although they're
highly intermittent and very difficult to track down. Debian ssh
3.8.1p1-4 is basically OpenSSH 3.8.1p1 plus Darren Tucker's auth-pam.c
patch to kill the PAM thread if the privsep slave dies plus a few other
changes which I'm pretty sure are unrelated. In all cases where it goes
wrong, the [pam] processes are left lying around either after attempting
to log in as a nonexistent user or Ctrl-Cing ssh at a Password: prompt.
We're running with UsePrivilegeSeparation yes, UsePAM yes, and
PasswordAuthentication no.

We noticed this at the end of a diff of auth.log output between when the
[pam] processes were left lying around and when they aren't:

  debug3: ssh_msg_send: type 1
  debug3: ssh_msg_recv entering
  debug3: mm_request_send entering: type 51
  debug3: mm_request_receive entering
- debug1: do_cleanup
  fatal: PAM: authentication thread exited unexpectedly
  debug1: do_cleanup
+ debug1: PAM: cleanup
+ debug3: PAM: sshpam_thread_cleanup entering

It looks to me as if sshpam_cleanup() and sshpam_thread_cleanup() aren't
getting called under all circumstances when they should be, and that the
result of this is that the [pam] threads lie around forever until they
choke the server. Yet do_cleanup() *is* getting called. Since I believe
that neither KRB5 nor GSSAPI is compiled in, this means that either:

  (a) we're in the login shell child (should certainly hope not,
      authentication fails)

  (b) do_cleanup() has been called already in this process

  (c) authctxt is NULL (which I don't think can be possible, since
      do_cleanup() must be getting called from cleanup_exit())

So I think I see the problem: if do_cleanup() happens to get called from
the "wrong" thread (perhaps the authentication thread itsel...


Revision history for this message
In , Damien Miller (djm) wrote :

Colin Watson wrote:
> I wish I could provide you with a reliable reproduction recipe, but
> perhaps this is good enough for a pthreads expert on openssh-unix-dev to
> work it out?

Are you compiling with USE_POSIX_THREADS?

-d

Revision history for this message
In , Colin Watson (cjwatson) wrote :

On Fri, Jul 09, 2004 at 10:51:22PM +1000, Damien Miller wrote:
> Colin Watson wrote:
> > I wish I could provide you with a reliable reproduction recipe, but
> > perhaps this is good enough for a pthreads expert on openssh-unix-dev to
> > work it out?
>
> Are you compiling with USE_POSIX_THREADS?

No.

--
Colin Watson [<email address hidden>]

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Fri, 9 Jul 2004 13:35:13 +0100
From: Colin Watson <email address hidden>
To: Joey Hess <email address hidden>, <email address hidden>,
 Andres Salomon <email address hidden>, <email address hidden>
Cc: <email address hidden>
Subject: Re: Bug#252676: sshd failure

On Fri, Jun 04, 2004 at 01:20:54PM -0400, Joey Hess wrote:
> My colocated server was refusing both ssh and ssl telnet connections.
> It looked like this:
>
> joey:~>ssh -v kite
> OpenSSH_3.8.1p1 Debian 1:3.8.1p1-4, OpenSSL 0.9.7d 17 Mar 2004
> debug1: Reading configuration data /home/joey/.ssh/config
> debug1: Applying options for kite
> debug1: Reading configuration data /etc/ssh/ssh_config
> debug1: Connecting to kite [64.62.161.42] port 22.
> debug1: Connection established.
> debug1: identity file /home/joey/.ssh/identity type -1
> debug1: identity file /home/joey/.ssh/id_rsa type -1
> debug1: identity file /home/joey/.ssh/id_dsa type 2
> ssh_exchange_identification: Connection closed by remote host
>
> Telnet also hung up before I got to a login prompt. The rest of the services
> seemed ok. I got a root shell via other means, and tried restarting ssh. No
> luck. Tried upgrading the whole system to current unstable, again, no luck.
> Then I noticed something strange in ps:
>
> 14515 ? S 0:00 sshd: joey [pam]
> 32215 ? S 0:00 sshd: bdragon [pam]
> 8978 ? S 0:00 sshd: joeyh [pam]
>
> There were a few more that I've elided because they may contain privileged
> information. I don't have a "bdragon" or "joeyh" user, and there were some
> other weird users listed. None of these users were really logged in,
> that I could tell.

We're also seeing these symptoms on a server at work, although they're
highly intermittent and very difficult to track down. Debian ssh
3.8.1p1-4 is basically OpenSSH 3.8.1p1 plus Darren Tucker's auth-pam.c
patch to kill the PAM thread if the privsep slave dies plus a few other
changes which I'm pretty sure are unrelated. In all cases where it goes
wrong, the [pam] processes are left lying around either after attempting
to log in as a nonexistent user or Ctrl-Cing ssh at a Password: prompt.
We're running with UsePrivilegeSeparation yes, UsePAM yes, and
PasswordAuthentication no.

We noticed this at the end of a diff of auth.log output between when the
[pam] processes were left lying around and when they aren't:

  debug3: ssh_msg_send: type 1
  debug3: ssh_msg_recv entering
  debug3: mm_request_send entering: type 51
  debug3: mm_request_receive entering
- debug1: do_cleanup
  fatal: PAM: authentication thread exited unexpectedly
  debug1: do_cleanup
+ debug1: PAM: cleanup
+ debug3: PAM: sshpam_thread_cleanup entering

It looks to me as if sshpam_cleanup() and sshpam_thread_cleanup() aren't
getting called under all circumstances when they should be, and that the
result of this is that the [pam] threads lie around forever until they
choke the server. Yet do_cleanup() *is* getting called. Since I believe
that neither KRB5 nor GSSAPI is compiled in, this means that either:

  (a) we're in the login shell child (should...


Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Fri, 09 Jul 2004 22:51:22 +1000
From: Damien Miller <email address hidden>
To: Colin Watson <email address hidden>
Cc: Joey Hess <email address hidden>, <email address hidden>,
 Andres Salomon <email address hidden>, <email address hidden>, <email address hidden>
Subject: Re: Bug#252676: sshd failure

Colin Watson wrote:
> I wish I could provide you with a reliable reproduction recipe, but
> perhaps this is good enough for a pthreads expert on openssh-unix-dev to
> work it out?

Are you compiling with USE_POSIX_THREADS?

-d

Revision history for this message
In , Darren Tucker (dtucker) wrote :

Colin Watson wrote:
[snip bug details]
> We're also seeing these symptoms on a server at work, although they're
> highly intermittent and very difficult to track down.

I will look at this tomorrow. Could you please provide the sshd PAM
configs for the machine(s) exhibiting the problem? In particular, are
there any PAM modules that might use fork/exec themselves?

--
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69
     Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Fri, 9 Jul 2004 14:15:52 +0100
From: Colin Watson <email address hidden>
To: Damien Miller <email address hidden>
Cc: Joey Hess <email address hidden>, <email address hidden>,
 Andres Salomon <email address hidden>, <email address hidden>, <email address hidden>
Subject: Re: Bug#252676: sshd failure

On Fri, Jul 09, 2004 at 10:51:22PM +1000, Damien Miller wrote:
> Colin Watson wrote:
> > I wish I could provide you with a reliable reproduction recipe, but
> > perhaps this is good enough for a pthreads expert on openssh-unix-dev to
> > work it out?
>
> Are you compiling with USE_POSIX_THREADS?

No.

--
Colin Watson [<email address hidden>]

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Sat, 10 Jul 2004 00:36:00 +1000
From: Darren Tucker <email address hidden>
To: Colin Watson <email address hidden>
CC: Joey Hess <email address hidden>, <email address hidden>,
 Andres Salomon <email address hidden>, <email address hidden>, <email address hidden>
Subject: Re: Bug#252676: sshd failure

Colin Watson wrote:
[snip bug details]
> We're also seeing these symptoms on a server at work, although they're
> highly intermittent and very difficult to track down.

I will look at this tomorrow. Could you please provide the sshd PAM
configs for the machine(s) exhibiting the problem? In particular, are
there any PAM modules that might use fork/exec themselves?

--
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69
     Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.

Revision history for this message
In , Darren Tucker (dtucker) wrote :

Darren Tucker wrote:
> Colin Watson wrote:
> [snip bug details]
>
>> We're also seeing these symptoms on a server at work, although they're
>> highly intermittent and very difficult to track down.
>
> I will look at this tomorrow.

I was sometimes able to reproduce this on Debian by connecting to the
server with PreferredAuthentications=keyboard-interactive and then
*immediately* cancelling the authentication with Ctrl-C.

After some digging I think I have found the cause: waitpid will return
zero if the process has not exited and none of the conditions listed
under "ERRORS" in the man page have been met. Attached is a patch to
test for this too (which it should have done in the first place, sigh).

I have not been able to reproduce the problem with this patch.

(Interestingly, I was not able to reproduce it on Redhat by doing the
same thing. I'm not sure why, but Debian is running on a faster, dual-CPU
box, so it could be a timing issue.)

--
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69
     Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.
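
The waitpid() behaviour described in the message above can be sketched in isolation. With WNOHANG, waitpid() returns the child's pid once it has been reaped, 0 if the child is still running, and -1 on error, so a check for only -1 treats a live child as if it had already exited. The helpers below, spawn_sleeping_child and reap_or_kill, are hypothetical names used for illustration; they mirror the "<= 0" test in the patch but are not the OpenSSH code itself.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that blocks until it gets a signal.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
static pid_t spawn_sleeping_child(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        pause();      /* child: sleep until a signal arrives */
        _exit(0);
    }
    return pid;
}

/* Hypothetical helper: reap `child`, terminating it first if it is still
 * running. Returns 0 with *status filled in, or -1 if it could not wait. */
static int reap_or_kill(pid_t child, int *status)
{
    /*
     * 0 means "still running" and -1 means "error"; both need the kill
     * path. Testing only for -1 would skip it for a live child and leave
     * *status unset.
     */
    if (waitpid(child, status, WNOHANG) <= 0) {
        kill(child, SIGTERM);
        if (waitpid(child, status, 0) <= 0)
            return -1;
    }
    return 0;
}
```

In a caller, a child from spawn_sleeping_child() makes waitpid(child, &status, WNOHANG) return 0, and a following reap_or_kill(child, &status) leaves a status for which WIFSIGNALED is true and WTERMSIG is SIGTERM.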

Revision history for this message
In , Andres Salomon (dilinger-deactivatedaccount) wrote :

On Sat, 2004-07-10 at 13:14 +1000, Darren Tucker wrote:
> After some digging I think I have found the cause: waitpid will
> return
> zero if the process has not exited and none of the conditions listed
> under "ERRORS" in the man page have been met. Attached is a patch to
> test for this too (which it should have done in the first place,
> sigh).
>
> I have not been able to reproduce the problem with this patch.

Thanks. I've built and installed packages w/ the patch applied; if I
see the bug again, I'll let you folks know.

--
Andres Salomon <email address hidden>

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Sat, 10 Jul 2004 13:14:34 +1000
From: Darren Tucker <email address hidden>
To: Colin Watson <email address hidden>
CC: <email address hidden>, <email address hidden>,
 Joey Hess <email address hidden>, Andres Salomon <email address hidden>, <email address hidden>
Subject: Re: Bug#252676: sshd failure

Darren Tucker wrote:
> Colin Watson wrote:
> [snip bug details]
>
>> We're also seeing these symptoms on a server at work, although they're
>> highly intermittent and very difficult to track down.
>
> I will look at this tomorrow.

I was sometimes able to reproduce this on Debian by connecting to the
server with PreferredAuthentications=keyboard-interactive and then
*immediately* cancelling the authentication with Ctrl-C.

After some digging I think I have found the cause: waitpid will return
zero if the process has not exited and none of the conditions listed
under "ERRORS" in the man page have been met. Attached is a patch to
test for this too (which it should have done in the first place, sigh).

I have not been able to reproduce the problem with this patch.

(Interestingly, I was not able to reproduce it on Redhat by doing the
same thing. I'm not sure why, but Debian is running on a faster, dual-CPU
box, so it could be a timing issue.)

--
Darren Tucker (dtucker at zip.com.au)
GPG key 8FF4FA69 / D9A3 86E9 7EEE AF4B B2D4 37C9 C982 80C7 8FF4 FA69
     Good judgement comes with experience. Unfortunately, the experience
usually comes from bad judgement.

Attachment: openssh-pam-wait.patch

Index: auth-pam.c
===================================================================
RCS file: /usr/local/src/security/openssh/cvs/openssh_cvs/auth-pam.c,v
retrieving revision 1.110
diff -u -p -r1.110 auth-pam.c
--- auth-pam.c 1 Jul 2004 04:00:15 -0000 1.110
+++ auth-pam.c 10 Jul 2004 02:58:58 -0000
@@ -113,11 +113,11 @@ sshpam_sigchld_handler(int sig)
  if (cleanup_ctxt == NULL)
   return; /* handler called after PAM cleanup, shouldn't happen */
  if (waitpid(cleanup_ctxt->pam_thread, &sshpam_thread_status, WNOHANG)
- == -1) {
+ <= 0) {
   /* PAM thread has not exitted, privsep slave must have */
   kill(cleanup_ctxt->pam_thread, SIGTERM);
   if (waitpid(cleanup_ctxt->pam_thread, &sshpam_thread_status, 0)
- == -1)
+ <= 0)
    return; /* could not wait */
  }
  if (WIFSIGNALED(sshpam_thread_status) &&


Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Fri, 09 Jul 2004 23:48:11 -0400
From: Andres Salomon <email address hidden>
To: Darren Tucker <email address hidden>
Cc: Colin Watson <email address hidden>, <email address hidden>,
 <email address hidden>, Joey Hess <email address hidden>, <email address hidden>
Subject: Re: Bug#252676: sshd failure

On Sat, 2004-07-10 at 13:14 +1000, Darren Tucker wrote:
> After some digging I think I have found the cause: waitpid will
> return
> zero if the process has not exited and none of the conditions listed
> under "ERRORS" in the man page have been met. Attached is a patch to
> test for this too (which it should have done in the first place,
> sigh).
>
> I have not been able to reproduce the problem with this patch.

Thanks. I've built and installed packages w/ the patch applied; if I
see the bug again, I'll let you folks know.

--
Andres Salomon <email address hidden>

Revision history for this message
In , Colin Watson (cjwatson) wrote :

On Sat, Jul 10, 2004 at 01:14:34PM +1000, Darren Tucker wrote:
> I was sometimes able to reproduce this on Debian by connecting to the
> server with PreferredAuthentications=keyboard-interactive and then
> *immediately* cancelling the authentication with Ctrl-C.
>
> After some digging I think I have found the cause: waitpid will return
> zero if the process has not exited and none of the conditions listed
> under "ERRORS" in the man page have been met. Attached is a patch to
> test for this too (which it should have done in the first place, sigh).
>
> I have not been able to reproduce the problem with this patch.

That makes good sense to me, since in an strace here I'm seeing
waitpid() returning zero.

> (Interestingly, I was not able to reproduce it on Redhat by doing the
> same thing. I'm not sure why, but Debian is running on a faster, dual-CPU
> box, so it could be a timing issue.)

I can't reproduce it on Debian powerpc, which had been doing my head in;
I can well believe a timing issue.

I'm applying your patch and will upload shortly after a bit of testing.
Thanks!

--
Colin Watson [<email address hidden>]

Revision history for this message
In , Colin Watson (cjwatson) wrote : Bug#252676: fixed in openssh 1:3.8.1p1-5

Source: openssh
Source-Version: 1:3.8.1p1-5

We believe that the bug you reported is fixed in the latest version of
openssh, which is due to be installed in the Debian FTP archive:

openssh-client-udeb_3.8.1p1-5_powerpc.udeb
  to pool/main/o/openssh/openssh-client-udeb_3.8.1p1-5_powerpc.udeb
openssh-server-udeb_3.8.1p1-5_powerpc.udeb
  to pool/main/o/openssh/openssh-server-udeb_3.8.1p1-5_powerpc.udeb
openssh_3.8.1p1-5.diff.gz
  to pool/main/o/openssh/openssh_3.8.1p1-5.diff.gz
openssh_3.8.1p1-5.dsc
  to pool/main/o/openssh/openssh_3.8.1p1-5.dsc
ssh-askpass-gnome_3.8.1p1-5_powerpc.deb
  to pool/main/o/openssh/ssh-askpass-gnome_3.8.1p1-5_powerpc.deb
ssh_3.8.1p1-5_powerpc.deb
  to pool/main/o/openssh/ssh_3.8.1p1-5_powerpc.deb

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to <email address hidden>,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Colin Watson <email address hidden> (supplier of updated openssh package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing <email address hidden>)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Sat, 10 Jul 2004 13:57:27 +0100
Source: openssh
Binary: ssh-askpass-gnome openssh-client-udeb ssh openssh-server-udeb
Architecture: source powerpc
Version: 1:3.8.1p1-5
Distribution: unstable
Urgency: medium
Maintainer: Matthew Vernon <email address hidden>
Changed-By: Colin Watson <email address hidden>
Description:
 openssh-client-udeb - Secure shell client for the Debian installer (udeb)
 openssh-server-udeb - Secure shell server for the Debian installer (udeb)
 ssh - Secure rlogin/rsh/rcp replacement (OpenSSH)
 ssh-askpass-gnome - under X, asks user for a passphrase for ssh-add
Closes: 252226 252676 258517
Changes:
 openssh (1:3.8.1p1-5) unstable; urgency=medium
 .
   * Update German debconf template translation (thanks, Helge Kreutzmann;
     closes: #252226).
   * Remove Suggests: dnsutils, as it was only needed for
     make-ssh-known-hosts (#93265), which has been replaced by ssh-keyscan.
   * Disable shadow password support in openssh-server-udeb.
   * Fix non-portable shell constructs in maintainer scripts, Makefile, and
     ssh-copy-id (thanks, David Weinehall; closes: #258517).
   * Apply patch from Darren Tucker to make the PAM authentication SIGCHLD
     handler kill the PAM thread if its waitpid() call returns 0, as well as
     the previous check for -1 (closes: #252676).
   * Add scp and sftp to openssh-client-udeb. It might not be very 'u' any
     more; oh well.
Files:
 3202977c5bb0f8ad90f054490c897ee8 890 net standard openssh_3.8.1p1-5.dsc
 c1607db15c5c218a105ebeb283987c16 148208 net standard openssh_3.8.1p1-5.diff.gz
 7fd850f6eaa00a94bc20bd08bd47365f 732184 net standard ssh_3.8.1p1-5_powerpc.deb
 bcccadd0ae2ccdf5e392fdc0857c6440 51878 gnome optional ssh-askpass-gnome_3.8.1p1-5_powerpc.deb
 e7f35854be7a14906d2c003a881a979e 150892 debian-installer optional open...


Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Sat, 10 Jul 2004 13:33:00 +0100
From: Colin Watson <email address hidden>
To: Darren Tucker <email address hidden>
Cc: <email address hidden>, <email address hidden>,
 Joey Hess <email address hidden>, Andres Salomon <email address hidden>, <email address hidden>
Subject: Re: Bug#252676: sshd failure

On Sat, Jul 10, 2004 at 01:14:34PM +1000, Darren Tucker wrote:
> I was sometimes able to reproduce this on Debian by connecting to the
> server with PreferredAuthentications=keyboard-interactive and then
> *immediately* cancelling the authentication with Ctrl-C.
>
> After some digging I think I have found the cause: waitpid will return
> zero if the process has not exited and none of the conditions listed
> under "ERRORS" in the man page have been met. Attached is a patch to
> test for this too (which it should have done in the first place, sigh).
>
> I have not been able to reproduce the problem with this patch.

That makes good sense to me, since in an strace here I'm seeing
waitpid() returning zero.

> (Interestingly, I was not able to reproduce it on Redhat by doing the
> same thing. I'm not sure why, but Debian is running on a faster, dual-CPU
> box, so it could be a timing issue.)

I can't reproduce it on Debian powerpc, which had been doing my head in;
I can well believe a timing issue.

I'm applying your patch and will upload shortly after a bit of testing.
Thanks!

--
Colin Watson [<email address hidden>]

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Sat, 10 Jul 2004 09:32:03 -0400
From: Colin Watson <email address hidden>
To: <email address hidden>
Subject: Bug#252676: fixed in openssh 1:3.8.1p1-5

Source: openssh
Source-Version: 1:3.8.1p1-5

We believe that the bug you reported is fixed in the latest version of
openssh, which is due to be installed in the Debian FTP archive:

openssh-client-udeb_3.8.1p1-5_powerpc.udeb
  to pool/main/o/openssh/openssh-client-udeb_3.8.1p1-5_powerpc.udeb
openssh-server-udeb_3.8.1p1-5_powerpc.udeb
  to pool/main/o/openssh/openssh-server-udeb_3.8.1p1-5_powerpc.udeb
openssh_3.8.1p1-5.diff.gz
  to pool/main/o/openssh/openssh_3.8.1p1-5.diff.gz
openssh_3.8.1p1-5.dsc
  to pool/main/o/openssh/openssh_3.8.1p1-5.dsc
ssh-askpass-gnome_3.8.1p1-5_powerpc.deb
  to pool/main/o/openssh/ssh-askpass-gnome_3.8.1p1-5_powerpc.deb
ssh_3.8.1p1-5_powerpc.deb
  to pool/main/o/openssh/ssh_3.8.1p1-5_powerpc.deb

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to <email address hidden>,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Colin Watson <email address hidden> (supplier of updated openssh package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing <email address hidden>)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Sat, 10 Jul 2004 13:57:27 +0100
Source: openssh
Binary: ssh-askpass-gnome openssh-client-udeb ssh openssh-server-udeb
Architecture: source powerpc
Version: 1:3.8.1p1-5
Distribution: unstable
Urgency: medium
Maintainer: Matthew Vernon <email address hidden>
Changed-By: Colin Watson <email address hidden>
Description:
 openssh-client-udeb - Secure shell client for the Debian installer (udeb)
 openssh-server-udeb - Secure shell server for the Debian installer (udeb)
 ssh - Secure rlogin/rsh/rcp replacement (OpenSSH)
 ssh-askpass-gnome - under X, asks user for a passphrase for ssh-add
Closes: 252226 252676 258517
Changes:
 openssh (1:3.8.1p1-5) unstable; urgency=medium
 .
   * Update German debconf template translation (thanks, Helge Kreutzmann;
     closes: #252226).
   * Remove Suggests: dnsutils, as it was only needed for
     make-ssh-known-hosts (#93265), which has been replaced by ssh-keyscan.
   * Disable shadow password support in openssh-server-udeb.
   * Fix non-portable shell constructs in maintainer scripts, Makefile, and
     ssh-copy-id (thanks, David Weinehall; closes: #258517).
   * Apply patch from Darren Tucker to make the PAM authentication SIGCHLD
     handler kill the PAM thread if its waitpid() call returns 0, as well as
     the previous check for -1 (closes: #252676).
   * Add scp and sftp to openssh-client-udeb. It might not be very 'u' any
     more; oh well.
Files:
 3202977c5bb0f8ad90f054490c897ee8 890 net standard openssh_3.8.1p1-5.dsc
 c1607db15c5c218a105ebeb283987c16 148208 net standard openssh_3.8.1p1-5.diff.gz
 7fd850f6eaa00a94bc20bd08bd47365...


Revision history for this message
Colin Watson (cjwatson) wrote :

sync requested

Revision history for this message
Matt Zimmerman (mdz) wrote :

1:3.8.1p1-5 synched from Debian

Revision history for this message
Daniel Robitaille (robitaille) wrote :

Fixed in Debian in 2004

Changed in openssh:
status: Unconfirmed → Fix Released
Changed in openssh:
status: Unknown → Fix Released