sshd segfault at 38

Bug #534623 reported by jbw
This bug affects 1 person
Affects Status Importance Assigned to Milestone
openssh (Ubuntu)
High
Unassigned

Bug Description

sshd keeps dying on me.

On one box, here are the sshd segfaults as logged in /var/log/messages:

Mar 7 12:12:44 opssrv2 kernel: [189231.425897] sshd[18463]: segfault at 38 ip 0000000000000038 sp 00007fff7936d560 error 14 in libnss_files-2.10.1.so[7f891d75d000+c000]
Mar 7 12:17:53 opssrv2 kernel: [189540.896446] sshd[13358]: segfault at 38 ip 0000000000000038 sp 00007fffeb8ba510 error 14 in zero (deleted)[7fbbba3f8000+140000]
Mar 7 13:07:43 opssrv2 kernel: [192530.518617] sshd[11124]: segfault at 38 ip 0000000000000038 sp 00007fff14b5cd20 error 14 in libnss_files-2.10.1.so[7f4e4f57d000+c000]
Mar 7 13:14:08 opssrv2 kernel: [192915.534163] sshd[19750]: segfault at 38 ip 0000000000000038 sp 00007ffffcace330 error 14 in libnss_files-2.10.1.so[7f0ddc0d4000+c000]
Mar 7 13:17:36 opssrv2 kernel: [193123.471510] sshd[20752]: segfault at 38 ip 0000000000000038 sp 00007fff7fb10640 error 14 in zero (deleted)[7f25741d1000+140000]
Mar 7 14:12:15 opssrv2 kernel: [196402.624346] sshd[26539]: segfault at 38 ip 0000000000000038 sp 00007fff504eae80 error 14 in libnss_files-2.10.1.so[7f063111a000+c000]
Mar 7 15:12:02 opssrv2 kernel: [199989.471037] sshd[15912]: segfault at 38 ip 0000000000000038 sp 00007fffdb74e8b0 error 14 in libnss_files-2.10.1.so[7fb896b17000+c000]
Mar 7 15:21:58 opssrv2 kernel: [200586.033565] sshd[12210]: segfault at 38 ip 0000000000000038 sp 00007fffaefe4170 error 14 in zero (deleted)[7f9992eea000+140000]
Mar 7 16:17:00 opssrv2 kernel: [203887.563062] sshd[18585]: segfault at 38 ip 0000000000000038 sp 00007fff04329180 error 14 in zero (deleted)[7f5c0484a000+140000]
Mar 7 17:21:03 opssrv2 kernel: [207730.897392] sshd[7160]: segfault at 38 ip 0000000000000038 sp 00007fff2bcd3f50 error 14 in libnss_files-2.10.1.so[7f378dd7b000+c000]
Mar 7 17:47:04 opssrv2 kernel: [209291.421967] sshd[18117]: segfault at 38 ip 0000000000000038 sp 00007ffffd911510 error 14 in zero (deleted)[7f83ebb53000+140000]
Mar 7 18:11:57 opssrv2 kernel: [210784.938400] sshd[1533]: segfault at 38 ip 0000000000000038 sp 00007fff06726940 error 14 in zero (deleted)[7f9d896f2000+140000]
Mar 7 19:07:24 opssrv2 kernel: [214111.083084] sshd[8763]: segfault at 38 ip 0000000000000038 sp 00007fffdd5c24d0 error 14 in zero (deleted)[7f3f27904000+140000]
Mar 7 20:11:03 opssrv2 kernel: [217930.073629] sshd[22526]: segfault at 38 ip 0000000000000038 sp 00007ffff8534720 error 14 in libnss_files-2.10.1.so[7f42db622000+c000]
Mar 7 21:05:42 opssrv2 kernel: [221209.484340] sshd[26017]: segfault at 38 ip 0000000000000038 sp 00007fff4acc3790 error 14 in libnss_files-2.10.1.so[7f93079d8000+c000]
Mar 7 21:05:47 opssrv2 kernel: [221214.366166] sshd[26937]: segfault at 38 ip 0000000000000038 sp 00007fff5e4faeb0 error 14 in zero (deleted)[7f0c5f405000+140000]
Mar 7 22:05:45 opssrv2 kernel: [224812.451978] sshd[31521]: segfault at 38 ip 0000000000000038 sp 00007fff98505d90 error 14 in libnss_files-2.10.1.so[7f9cc4592000+c000]
Mar 7 22:40:47 opssrv2 kernel: [226914.986412] sshd[26234]: segfault at 38 ip 0000000000000038 sp 00007fff922c2070 error 14 in zero (deleted)[7fa1a5bf5000+140000]
Mar 7 22:55:42 opssrv2 kernel: [227809.477066] sshd[3101]: segfault at 38 ip 0000000000000038 sp 00007fffeab8cf80 error 14 in zero (deleted)[7f537b824000+140000]
Mar 7 22:55:44 opssrv2 kernel: [227811.303907] sshd[3544]: segfault at 38 ip 0000000000000038 sp 00007fff7ef87be0 error 14 in zero (deleted)[7fafe38ef000+140000]
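
To make lines like the ones above easier to read, here is a minimal Python sketch that splits a segfault line into its fields and decodes the error value. It assumes the kernel prints the x86 page-fault error code in hexadecimal, so "error 14" is 0x14; the names SEGFAULT_RE, ERROR_BITS, decode_error and parse_segfault are made up for this illustration and are not part of any tool.

```python
import re

# Hypothetical helper: matches the kernel's
# "comm[pid]: segfault at ADDR ip IP sp SP error CODE in MAPPING[base+size]" lines.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<mapping>.+?)\[[0-9a-f]+\+[0-9a-f]+\])?"
)

# x86 page-fault error code bits: (mask, meaning if set, meaning if clear).
ERROR_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "not a write"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_error(code_hex):
    """Decode the error field, assuming it is printed in hex."""
    code = int(code_hex, 16)
    parts = []
    for mask, if_set, if_clear in ERROR_BITS:
        if code & mask:
            parts.append(if_set)
        elif if_clear is not None:
            parts.append(if_clear)
    return parts


def parse_segfault(line):
    """Return a dict of fields for a segfault line, or None if it does not match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    info["decoded"] = decode_error(info["error"])
    return info


if __name__ == "__main__":
    sample = (
        "Mar 7 12:12:44 opssrv2 kernel: [189231.425897] sshd[18463]: "
        "segfault at 38 ip 0000000000000038 sp 00007fff7936d560 error 14 "
        "in libnss_files-2.10.1.so[7f891d75d000+c000]"
    )
    print(parse_segfault(sample))
```

Read this way, every line reports a user-mode instruction fetch from an unmapped page, with the fault address equal to the instruction pointer (0x38). That pattern would be consistent with a call or jump through a bad, near-NULL pointer, though the log alone does not prove it.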

I built a new 9.10 server (the one this bug report was filed from) and saw the same thing:

Mar 8 13:10:48 opssrv7 kernel: [ 3923.432132] sshd[888]: segfault at 38 ip 0000000000000038 sp 00007fffcf5a8c30 error 14 in libnss_files-2.10.1.so[7f47390c1000+c000]

Both boxes are VMs. There are also a large number of automated ssh connections (using keys) into the boxes, about 9,000 per hour. I'm trying to replace a much older non-Ubuntu VM box running OpenSSH_3.7.1p1 that handled the same ssh traffic for years without issue.

I commented out the following two modules in /etc/pam.d/sshd to reduce the sshd overhead, but that did not make any difference: sshd still dies after some amount of time and has to be restarted manually.

# Print the message of the day upon successful login.
#session optional pam_motd.so # [1]

# Print the status of the user's mailbox upon successful login.
#session optional pam_mail.so standard noenv # [1]

- Justin

ProblemType: Bug
Architecture: amd64
Date: Mon Mar 8 14:50:29 2010
DistroRelease: Ubuntu 9.10
ExecutablePath: /usr/sbin/sshd
Package: openssh-server 1:5.1p1-6ubuntu2
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-20.57-server
SourcePackage: openssh
Uname: Linux 2.6.31-20-server x86_64

jbw (jwilloughby) wrote :

Some new sshd faults, including general protection faults in addition to the segfaults:

Mar 8 15:10:42 opssrv7 kernel: [11118.173994] sshd[21990] general protection ip:38 sp:7fffae68f3e0 error:0 in libnss_files-2.10.1.so[7f889fce3000+c000]
Mar 8 15:25:42 opssrv7 kernel: [12017.672166] sshd[15451] general protection ip:38 sp:7fff306c4860 error:0 in zero (deleted)[7ff2b9bba000+140000]
Mar 8 15:50:46 opssrv7 kernel: [13521.836865] sshd[6964]: segfault at 38 ip 0000000000000038 sp 00007fff2d90c7f0 error 14 in libnss_files-2.10.1.so[7fee05ee0000+c000]

jbw (jwilloughby) wrote :

The whole server has also crashed, just now. I'm not sure whether sshd is causing the problem or is just reacting to some other issue with Ubuntu server 9.10. Does the screenshot shed any light on this issue?

- Justin

Mathias Gug (mathiaz) wrote :

Are any processes other than sshd segfaulting in the guest?

Changed in openssh (Ubuntu):
importance: Undecided → High
status: New → Incomplete
tags: added: karmic
jbw (jwilloughby) wrote :

The only other process that is segfaulting (and not as frequently as sshd) is sh:

Mar 9 00:00:50 opssrv7 kernel: [42925.569832] sh[8686]: segfault at 38 ip 0000000000000038 sp 00007fff3f861940 error 14 in dash[400000+18000]

For the past two days opssrv7 had 48 sshd segfaults and sh had 4.
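
A per-process tally like the one above can be produced with a short script. The following is only a sketch: it assumes the kernel lines have the two shapes quoted in this report ("comm[pid]: segfault ..." and "comm[pid] general protection ..."), and the names FAULT_RE and count_faults are invented for the example.

```python
import re
from collections import Counter

# Hypothetical pattern covering both fault formats seen in this report.
FAULT_RE = re.compile(
    r"kernel: \[\s*\d+\.\d+\] (?P<comm>[^\s\[]+)\[\d+\]:? "
    r"(?P<kind>segfault|general protection)\b"
)


def count_faults(lines):
    """Count faults per (process name, fault kind) over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[(match.group("comm"), match.group("kind"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Mar 8 15:10:42 opssrv7 kernel: [11118.173994] sshd[21990] general protection "
        "ip:38 sp:7fffae68f3e0 error:0 in libnss_files-2.10.1.so[7f889fce3000+c000]",
        "Mar 8 15:50:46 opssrv7 kernel: [13521.836865] sshd[6964]: segfault at 38 "
        "ip 0000000000000038 sp 00007fff2d90c7f0 error 14 in "
        "libnss_files-2.10.1.so[7fee05ee0000+c000]",
        "Mar 9 00:00:50 opssrv7 kernel: [42925.569832] sh[8686]: segfault at 38 "
        "ip 0000000000000038 sp 00007fff3f861940 error 14 in dash[400000+18000]",
    ]
    for (comm, kind), n in sorted(count_faults(sample).items()):
        print(comm, kind, n)
```

To run it against a real log, pass an open file object to count_faults; the log path depends on the system's syslog configuration.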

- Justin

Mathias Gug (mathiaz) wrote : Re: [Bug 534623] Re: sshd segfault at 38

On Wed, Mar 10, 2010 at 12:32:24PM -0000, jbw wrote:
>
> The only other process that is segfaulting (and not as frequently as sshd) is sh
>
> Mar 9 00:00:50 opssrv7 kernel: [42925.569832] sh[8686]: segfault at 38
> ip 0000000000000038 sp 00007fff3f861940 error 14 in dash[400000+18000]
>
> For the past two days opssrv7 had 48 sshd segfaults and sh had 4.
>

Hm, I'm not sure it's related only to openssh, then. It may be a hardware-related
problem. Have you tried running the VM on a different host?

--
Mathias Gug
Ubuntu Developer http://www.ubuntu.com

jbw (jwilloughby) wrote :

I have other versions of Ubuntu server running on the same host without issue, as well as some other Linux distributions.

The only difference is that this is only my second Ubuntu 9.10 server, whether 64-bit or 32-bit.

The workload (other Unix boxes ssh into the box to launch rrdtool) was running fine against a 32-bit 2.4.18 Linux kernel with sshd OpenSSH_3.7.1p1, also a VM on the same VMware Server host.

I first moved all incoming ssh connections to a new Ubuntu 9.10 64-bit server and had issues with sshd dying and sometimes the whole box dying.

I then built a new 9.10 box, also 64-bit, and moved the incoming ssh connections to this new box/VM. It had the same problems as the first 9.10 server. The first 9.10 server has had no issues since, now that I'm no longer hitting it with a huge number of ssh connections (they were moved to the second 9.10 box).

So I'm guessing something else could be at fault, but if ssh itself is not to blame, it seems the volume of ssh connections is triggering that other issue.

Is there any other troubleshooting I can do on my current 9.10 box that would get to the bottom of this? I'd hate to have to build a 9.04 box to see if it does the same thing, as it takes me a while to update all the Unix boxes that ssh into the Ubuntu server, and I've already done it twice.

Thanks,

- Justin

emilis.kuke@gmail.com (emilis-kuke) wrote :

I think this is a bigger problem than just server crashes. Please look at this exploit for FreeBSD: http://www.hack0wn.com/view.php?xroot=897.0&cat=exploits. One of my servers was also hacked a few days ago.

Launchpad Janitor (janitor) wrote :

[Expired for openssh (Ubuntu) because there has been no activity for 60 days.]

Changed in openssh (Ubuntu):
status: Incomplete → Expired