[10.04 LTS] emacs spins when ssh is uncleanly terminated

Bug #786730 reported by Benjamin Kaduk
This bug affects 3 people
Affects             Status         Importance  Assigned to  Milestone
GNU Emacs           Unknown        Unknown
Fedora              Fix Released   Critical
emacs23 (Ubuntu)    Fix Released   Medium      Unassigned
  Lucid             Fix Released   Medium      Unassigned
emacs23 (openSUSE)  Fix Released   Medium

Bug Description

When emacs is running in an ssh session and ssh is terminated uncleanly, it begins to use 100% CPU and slowly but constantly increase its used memory.

[Impact]

In a department with approximately 50 desktops, one publicly accessible server, and a couple hundred users, I have to kill on average one emacs process a day. If left alone, those processes not only slow down the machine (by using up CPU time) but also eventually trigger the OOM killer when their memory usage exceeds the amount of available swap space.

Unfortunately, the Linux OOM killer does not always pick the process that caused the problem, and I have seen it kill other long-running, high-memory processes instead. While I have heard no complaints about data loss, it is possible.
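As a stopgap while the package is unfixed, the daily hunt for runaway processes could be scripted. The following is only a minimal sketch, not something used in this report: the thresholds and function names are hypothetical, and it assumes the procps `ps -eo pid,pcpu,rss,comm` output format found on Linux.

```python
import subprocess

# Hypothetical thresholds; tune them for your own machines.
CPU_THRESHOLD = 90.0           # percent of one core
RSS_THRESHOLD_KIB = 1_000_000  # roughly 1 GiB resident


def find_runaway_emacs(ps_output, cpu_threshold=CPU_THRESHOLD,
                       rss_threshold_kib=RSS_THRESHOLD_KIB):
    """Return (pid, cpu, rss_kib, command) tuples for suspicious emacs processes.

    ps_output is the text printed by `ps -eo pid,pcpu,rss,comm`.
    Note that ps reports CPU as an average over the process lifetime, so a
    long-idle emacs that only recently started spinning may still show a low
    value; the memory check is there to catch that case.
    """
    suspects = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) != 4:
            continue
        pid, cpu, rss, command = fields
        if not command.startswith("emacs"):
            continue
        if float(cpu) >= cpu_threshold or int(rss) >= rss_threshold_kib:
            suspects.append((int(pid), float(cpu), int(rss), command))
    return suspects


def scan_local_machine():
    """Run ps on this machine and report suspects; nothing is killed."""
    output = subprocess.run(
        ["ps", "-eo", "pid,pcpu,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_runaway_emacs(output)
```

The sketch deliberately only reports candidates. Whether to kill a process is left to the administrator, since a legitimately busy emacs would also match.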

[Development Fix]

Newer Ubuntu releases ship newer versions of emacs that already contain the fix. The issue was fixed upstream in 23.2a, so all Ubuntu packages newer than 23.2 are fixed.

[Stable Fix]

The fix from emacs applies cleanly to the source for 23.1 and can be found in their git repository: http://git.savannah.gnu.org/cgit/emacs.git/patch/?id=1857569af70493e9dcd446c8bfb029096b355593

[Test Case]

1) ssh to localhost
2) start emacs
3) Control-Z
4) Enter, tilde, period (to kill the SSH session)
5) emacs starts spinning and eating memory (verified by top)

[Regression Potential]

Given that the fix only affects emacs when it is exiting, it is unlikely to cause any user regressions.

Revision history for this message
In , J-koenig (j-koenig) wrote :

Typically I run "emacs -nw" within xterm, and I often stop it with CTRL-Z and keep it in the background.

I have now seen at least twice that emacs loops and hogs both memory and CPU after I shut down X11 by going to runlevel 3. This was likely an emacs left over in the background.

from top:

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23902 harald 20 0 7222m 3.4g 608 R 100 88.9 59:28.72 emacs-gtk

from strace:

ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff1f32e320) = -1 EIO (Input/output error)
ioctl(3, SNDCTL_TMR_STOP or TCSETSW, {B0 -opost -isig -icanon -echo ...}) = -1 EIO (Input/output error)
write(3, "\7", 1) = -1 EIO (Input/output error)
brk(0x1bb35e000) = 0x1bb339000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x1bb35e000) = 0x1bb339000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x1bb35a000) = 0x1bb339000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff1f32e320) = -1 EIO (Input/output error)
ioctl(3, SNDCTL_TMR_STOP or TCSETSW, {B0 -opost -isig -icanon -echo ...}) = -1 EIO (Input/output error)
write(3, "\7", 1) = -1 EIO (Input/output error)
brk(0x1bb35e000) = 0x1bb339000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x1bb35e000) = 0x1bb339000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
rt_sigprocmask(SIG_BLOCK, [IO], [HUP TERM IO], 8) = 0
ioctl(3, FIONREAD, [0]) = -1 EIO (Input/output error)
kill(23902, SIGHUP) = 0
brk(0x1bb35a000) = 0x1bb339000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -...
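To make the repeating pattern in a trace like this easier to see, the failing calls can be tallied by syscall and error code. This is a minimal sketch, assuming strace's default one-call-per-line output as shown above; the function name is hypothetical, and lines that are truncated or that succeed are simply skipped.

```python
import re
from collections import Counter

# Matches lines such as:  write(3, "\7", 1) = -1 EIO (Input/output error)
STRACE_LINE = re.compile(
    r"^(?P<call>\w+)\(.*\)\s+=\s+(?P<ret>\S+)(?:\s+(?P<errno>E[A-Z]+))?"
)


def tally_failures(strace_text):
    """Count failed syscalls in strace output, keyed by (syscall, errno name)."""
    counts = Counter()
    for line in strace_text.splitlines():
        match = STRACE_LINE.match(line.strip())
        if match and match.group("ret") == "-1" and match.group("errno"):
            counts[(match.group("call"), match.group("errno"))] += 1
    return counts
```

Run over the excerpt above, such a tally would be dominated by `EIO` on the terminal file descriptor and `ENOMEM` from `mmap`, which matches the description of a process that keeps retrying terminal output while allocating memory.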


Revision history for this message
In , 1-w1rner-0 (1-w1rner-0) wrote :

If you have a problem with the Gtk GUI, you can use

  export EMACS_TOOLKIT=x11

or

  export EMACS_TOOLKIT=nox

Does this help? Besides this, you can use

          Help -> Send Bug Report

to report this upstream, as I do not have the time to fix
problems within the Gtk API.

Revision history for this message
In , 1-w1rner-0 (1-w1rner-0) wrote :

Just send a bug report upstream with emacs GUI.

Revision history for this message
In , 1-w1rner-0 (1-w1rner-0) wrote :

Added a patch from upstream and submitted the fixed emacs to Factory.

Revision history for this message
In , Swamp-a (swamp-a) wrote :

Update released for: emacs, emacs-debuginfo, emacs-debugsource, emacs-el, emacs-info, emacs-nox, emacs-x11
Products:
openSUSE 11.2 (debug, i586, x86_64)

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in emacs23 (Ubuntu):
status: New → Confirmed
Revision history for this message
Brian Gernhardt (benji-silverinsanity) wrote :

100% reliable method of reproducing on my 10.04 machine:

1) ssh to localhost
2) start emacs
3) Control-Z
4) Enter, tilde, period (to kill the SSH session)
5) emacs starts spinning and eating memory

emacs bug reports:
http://debbugs.gnu.org/cgi-bin/bugreport.cgi?bug=4970
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=7951

The fix was included in 23.2a, but the patch applies cleanly to 23.1 as well.
http://git.savannah.gnu.org/cgit/emacs.git/commit/?h=emacs-23&id=1857569af70493e9dcd446c8bfb029096b355593

Revision history for this message
Brian Gernhardt (benji-silverinsanity) wrote :

I don't know how to draw more attention to this bug but I would say that I'd classify it as serious. Emacs ends up silently consuming memory in the background until the out of memory killer is triggered. And the OOM killer doesn't always kill emacs, it sometimes kills other long running processes which can lead to data loss. I have to work on machines on LTS and I'd really appreciate this patch being applied in lucid-updates.

Revision history for this message
Andres Freund (andres-anarazel) wrote :

I know that "me too" comments aren't really appreciated...

Is there anything that we can do to help this issue go away?

Revision history for this message
In , RHEL (rhel-redhat-bugs) wrote :

This bug has been copied from bug #732157 and has been proposed
to be backported to 6.2 z-stream (EUS).

Revision history for this message
In , Miroslav (miroslav-redhat-bugs) wrote :

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.

    New Contents:
Emacs did not properly terminate if it was started remotely and the remote client session was closed while Emacs was suspended. Under these conditions, Emacs entered an infinite loop in the code and gradually consumed all available computer resources, which caused the system to become unstable. With this update, Emacs has been modified, and it now terminates correctly when the remote session is closed.
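The failure mode described in this note can be illustrated with a small simulation. This is not Emacs's code and not the upstream patch; it is only a sketch, with hypothetical class and function names, of the difference between retrying a terminal write that fails with EIO and treating EIO as a sign that the terminal is gone.

```python
import errno


class DeadTerminal:
    """Hypothetical stand-in for a tty whose remote end has gone away."""

    def write(self, data):
        raise OSError(errno.EIO, "Input/output error")


def naive_bell_loop(terminal, max_iterations):
    """Retry on every failure; capped here only so the demo terminates."""
    pending_errors = []
    for _ in range(max_iterations):
        try:
            terminal.write(b"\a")
            break
        except OSError as exc:
            # Each failure queues another message and tries again, so the
            # CPU stays busy and memory use keeps growing.
            pending_errors.append(str(exc))
    return len(pending_errors)


def guarded_bell(terminal):
    """Treat EIO as 'terminal is gone' and stop instead of retrying."""
    try:
        terminal.write(b"\a")
    except OSError as exc:
        if exc.errno == errno.EIO:
            return "terminal gone, shutting down"
        raise
    return "ok"
```

Calling `naive_bell_loop(DeadTerminal(), 1000)` accumulates 1000 queued errors, while `guarded_bell(DeadTerminal())` returns after a single failed write.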

Revision history for this message
In , errata-xmlrpc (errata-xmlrpc-redhat-bugs) wrote :

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0042.html

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

Ubuntu Precise contains version 23.3, which according to the above comment should contain the fix for this issue. Could you please try to upgrade your Ubuntu and see if the problem is solved?

Changed in emacs23 (Ubuntu):
status: Confirmed → Incomplete
Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

I tried reproducing it using comment #2. At point 5 there is no emacs process.

Changed in emacs23 (Ubuntu):
status: Incomplete → Fix Released
Revision history for this message
Brian Gernhardt (benji-silverinsanity) wrote :

While it's lovely that it's fixed in 12.04, 10.04 is supposed to be supported on the desktop for another year and on servers for another three. I don't find "Upgrade to the newest Ubuntu" to be a nice solution for a bug in an LTS package.

My first comment (#2) includes a link to the patch from emacs that fixes this issue. Why can't it just be applied to emacs23 for 10.04?

I hate to be a pain about this, but it's likely to take some time to update the machines I have to administer to a new version of Ubuntu and having to check all the machines periodically for spinning emacs is getting old.

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

I agree, but I doubt there will be someone that will do it. It might help if you can follow the procedure here:

https://wiki.ubuntu.com/StableReleaseUpdates#Procedure

as far as you are able.

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :
Changed in lucid-backports:
status: New → Confirmed
Iain Lane (laney)
no longer affects: lucid-backports
description: updated
description: updated
tags: added: patch
Revision history for this message
Brian Gernhardt (benji-silverinsanity) wrote :

Marius: I know how to perform the updates, but there are logistical problems too. Thanks for pointing out the SRU procedure. I updated the description to match the requested sections. The next step is to "Ask a bug supervisor to nominate for the appropriate Ubuntu releases", which I don't know how to do. (Other than ask here and on #ubuntu-bugs, which I've now done.)

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

Great. Alternatively, or in addition, build your own package with the patch applied and place it in a PPA, but you would then miss other bugfixes and security patches for emacs.

Changed in emacs23 (openSUSE):
importance: Unknown → Medium
status: Unknown → Fix Released
Revision history for this message
Marius B. Kotsbak (mariusko) wrote : Re: emacs spins when ssh is uncleanly terminated

Could you please run "apport-collect 786730" on one of the servers to add more system information to the bug report?

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

Attached is a patch that is supposed to add the upstream patch. Please try to apply it (after running apt-get build-dep emacs23 && apt-get source emacs23) and build the package (dpkg-buildpackage). It won't build here since I don't have the correct Ubuntu version.

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :
Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

This PPA is maybe a workaround: http://www.jeffbeeman.com/node/120

summary: - emacs spins when ssh is uncleanly terminated
+ [10.04 LTS] emacs spins when ssh is uncleanly terminated
Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

Is this ready for an SRU now? Both Red Hat and SuSE have released bugfixes for this one.

Revision history for this message
Clint Byrum (clint-fewbar) wrote :

Marius, thanks for the debdiff! I've sponsored this into lucid-proposed, it should be looked at by somebody on the SRU team soon. In the future, you should subscribe 'ubuntu-sponsors' to a bug when you have prepared a debdiff, as that will ensure somebody looks at it in a timely manner.

Changed in emacs23 (Ubuntu Lucid):
status: New → Fix Committed
importance: Undecided → Medium
Changed in emacs23 (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Joe Gross (jgross) wrote :

This just killed several of my production boxes. A timely release would be appreciated.

Revision history for this message
Marius B. Kotsbak (mariusko) wrote :

I guess the fixed package might be available in Proposed. Can anyone who is running Lucid test it? https://wiki.ubuntu.com/Testing/EnableProposed

Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Hello Benjamin, or anyone else affected,

Accepted emacs23 into lucid-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Revision history for this message
Benjamin Kaduk (kaduk-launchpad) wrote :

We've been running with the emacs23 version 23.1+1-4ubuntu7.3 package from lucid-proposed for several days, and I have not been able to reproduce this issue. My users are not reporting any regressions, either, so my experiences support moving it to production.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package emacs23 - 23.1+1-4ubuntu7.3

---------------
emacs23 (23.1+1-4ubuntu7.3) lucid-proposed; urgency=low

  * debian/patches/remove-check-for-last-terminal.diff: Cherry pick
    fix from upstream to fix issue with emacs spinning out of control
    when ssh session is terminated. (LP: #786730) Thanks to Marius
    Kotsbak!
 -- Clint Byrum <email address hidden> Wed, 09 May 2012 07:53:01 -0700

Changed in emacs23 (Ubuntu Lucid):
status: Fix Committed → Fix Released
Changed in fedora:
importance: Unknown → Critical
status: Unknown → Fix Released