screen unpredictably crashes

Bug #644825 reported by Jim Cheetham
This bug report is a duplicate of: Bug #375625: screen freezes up, turns zombie.
This bug affects 14 people
Affects          Status      Importance  Assigned to  Milestone
byobu            Invalid     High        Unassigned
eglibc (Ubuntu)  Confirmed   Undecided   Unassigned
screen (Ubuntu)  Incomplete  High        Unassigned

Bug Description

Recently on a wide range of unconnected servers, I've been noticing byobu sessions crashing out, with the message "Error: Cannot find master process to attach to!" appearing on the console/shell. I have not yet noticed any unifying factors; this happens when logged on to the physical console as well as over ssh, while the screen is in use as well as when it has been idle for a long time.

byobu 2.68-0ubuntu1.1 and screen 4.0.3-14ubuntu1 are in use in all the cases I've seen (i.e. current upgraded Ubuntu 10.04 servers)

{{{
$ screen -ls
There is a screen on:
 4384.pts-0.leela (14/09/10 23:55:39) (Dead ???)
Remove dead screens with 'screen -wipe'.
1 Socket in /var/run/screen/S-jim.
}}}
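
For anyone collecting data across several servers, a small script can pick the dead sessions out of `screen -ls` output like the listing above. This is only a sketch: the function name `dead_sessions` and the regular expression are illustrative, and it assumes the line layout shown in this report (PID, dot, session name, then a status in the final parentheses).

```python
import re

# Session lines look like:
#   " 4384.pts-0.leela (14/09/10 23:55:39) (Dead ???)"
# Assumption: the status is always the last parenthesised field on the line.
SESSION_RE = re.compile(r"^\s*(\d+)\.(\S+)\s+.*\(([^()]*)\)\s*$")


def dead_sessions(screen_ls_output):
    """Return (pid, name) pairs for sessions reported as dead."""
    dead = []
    for line in screen_ls_output.splitlines():
        match = SESSION_RE.match(line)
        if match and match.group(3).startswith("Dead"):
            dead.append((int(match.group(1)), match.group(2)))
    return dead


sample = """There is a screen on:
 4384.pts-0.leela (14/09/10 23:55:39) (Dead ???)
Remove dead screens with 'screen -wipe'.
1 Socket in /var/run/screen/S-jim.
"""

print(dead_sessions(sample))  # [(4384, 'pts-0.leela')]
```

The PID from each pair can then be matched against segfault lines in dmesg, if any were logged.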

I have never noticed this behaviour before, and I've been using vanilla screen for a long time; recently I've been switching to byobu because I find the on-screen info bars to be useful. For this reason I'm listing this bug against byobu, but I cannot prove that.

Suggestions for increasing debug visibility would be appreciated -- I don't want to have to run multi-day sessions under strace, but if that's what it takes I can do that on at least one server.

Changed in byobu:
importance: Undecided → High
Dustin Kirkland  (kirkland) wrote :

Thanks for the bug report.

Okay, I'm marking this bug "High", but can't yet Confirm it. I haven't seen the problem, myself, yet, but it sounds serious.

The precise error message itself is coming from a recent Debian patch to Screen:
./debian/patches/32misc_minor_fixes.dpatch:+ printf("\nError: Cannot find master process to attach to!\n");

Are you running with Nethack mode on?

The content of the patch just changes an error message:

--- screen-4.0.3.orig/attacher.c 2009-02-26 01:40:12.427283863 +0100
+++ screen-4.0.3/attacher.c 2009-02-26 01:43:02.258234277 +0100
@@ -579,7 +579,7 @@
         {
          fcntl(0, F_SETFL, 0);
          SetTTY(0, &attach_Mode);
-         printf("\nSuddenly the Dungeon collapses!! - You die...\n");
+         printf("\nError: Cannot find master process to attach to!\n");
          eexit(1);
         }
 #endif

Previously, did you see an equal number of "Suddenly the Dungeon collapses!! - You die..." messages?

Changed in byobu:
status: New → Incomplete
Changed in screen (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Jim Cheetham (jim.cheetham) wrote :

I'm just using the default Ubuntu screen, which as far as I know does not have nethack mode enabled; I have not seen any messages like this before (well, not since the late 1980s when I was part of an unsuccessful attempt to hack a new immortal character into nethack ...). Actually, I have rarely seen screen crash.

It may be notable that I'm usually running byobu on a gateway machine, and several of the screens within it are themselves running byobu in a remote ssh session. It is the byobu/screen on the gateway machine that crashes in all the cases that I remember (sadly, I have not remembered many specifics), and the remote sessions are available for reconnection later.

I'll make sure to check whether the remote sessions are attached or detached after the next crash I see; also whether it was a screen showing a remote screen, or a local one.

Dustin Kirkland  (kirkland) wrote : Re: [Bug 644825] Re: byobu/screen unpredictably crashes "Cannot find master process to attach to!"

Okay, so I can't quite figure out how you're getting this specific
error message. The code path looks to me to be nethack-mode specific.
 From the screen(1) manpage, here's how you enable/disable this:

       nethack on|off

       Changes the kind of error messages used by screen. When you are
       familiar with the game "nethack", you may enjoy the nethack-style
       messages which will often blur the facts a little, but are much
       funnier to read.
       Anyway, standard messages often tend to be unclear as well.
       This option is only available if screen was compiled with the NETHACK
       flag defined. The default setting is then determined by the presence of
       the environment variable $NETHACKOPTIONS and the file ~/.nethackrc - if
       either one is present, the default is on.

Can you just check and make sure all of these are disabled?
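
As a rough aid for that check, the default rule quoted from the manpage can be written down as a short script. This is a sketch only: the helper name `nethack_default_on` is made up for illustration, and the result is meaningful only if screen was compiled with the NETHACK flag. An explicit `nethack on|off` line in a screenrc would override the default and is not inspected here.

```python
import os


def nethack_default_on(environ=None, home=None):
    """Apply the manpage rule: the default is 'on' if $NETHACKOPTIONS is
    present in the environment or ~/.nethackrc exists."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    has_env_var = "NETHACKOPTIONS" in environ
    has_rc_file = os.path.exists(os.path.join(home, ".nethackrc"))
    return has_env_var or has_rc_file


# Check the current user's environment and home directory.
print(nethack_default_on())
```

The test is for presence of the variable, not for a non-empty value, because the manpage only speaks of "the presence of the environment variable".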

Dustin Kirkland  (kirkland) wrote :

Also, how much memory do you have in this gateway machine?

I've tried byobu on a router's firmware myself and its memory
footprint is quite a bit heavier than screen alone (because it logs so
many lines per window, and runs so many status scripts).

Jim Cheetham (jim.cheetham) wrote : Re: byobu/screen unpredictably crashes "Cannot find master process to attach to!"

Summary: no nethack options active, and "sufficient" memory.

{{{
jim@moonshine:~$ echo $NETHACKOPTIONS

jim@moonshine:~$ ls -l .nethackrc
ls: cannot access .nethackrc: No such file or directory
jim@moonshine:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          2511       2475         35          0         14       2140
-/+ buffers/cache:        320       2191
Swap:         7357          0       7357
}}}

Actually, it seems odd that the screen patch you identified would have changed a nethack message into a generic message ... it looks more like a nethack-specific message had been left in attacher.c by accident.

So, looking at screen's attacher.c, it seems as if this message can only happen if the MasterPid becomes unresponsive (i.e. doesn't respond to a kill 0), or something to do with AttacherChld happens ...

Now, is there a byobu process that invokes screen? And would that qualify for control over whatever MasterPid is supposed to be?
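
For reference, the "kill 0" probe mentioned above is the usual POSIX way to ask whether a PID still exists: signal 0 runs the existence and permission checks but delivers nothing. The following is a generic sketch of that idiom, not a copy of what attacher.c does; `process_exists` is a name invented here.

```python
import errno
import os


def process_exists(pid):
    """Probe a PID with signal 0; no signal is actually delivered."""
    try:
        os.kill(pid, 0)
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False  # no such process
        if exc.errno == errno.EPERM:
            return True   # process exists but is owned by another user
        raise
    return True


# The current process certainly exists, so this prints True.
print(process_exists(os.getpid()))
```

Run against the PID shown in a "(Dead ???)" line of `screen -ls`, a probe like this would be expected to return False once the master process has gone away.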

Shaun Bouckaert (train-meditations) wrote :

I've had this occur twice in the last 2 days on a new server setup. I do not have nethack style messages enabled. The most recent crash occurred as I was closing htop in one of my windows, not actually switching windows or closing any windows. The last time it happened, IIRC, I had a split window and was tabbing between them. The machine is set up as a LAMP server (primarily a database server, running a couple of web apps), with 8GB of memory, of which it is currently using under 600MB.

Jim Cheetham (jim.cheetham) wrote :

The last two times this has happened to me, the server in question had had a new kernel installed as part of system upgrades, but had not yet been rebooted. Probably a red herring.

I tried to run an instance under strace -- however the program behaves very differently under these circumstances, and I'm not sure how indicative it will be (e.g. by default, "strace byobu" results in byobu immediately terminating with the error "Directory '/var/run/screen' must have mode 777.")

John Millikin (jmillikin) wrote :

I've had this occur a few times recently; I am using the default byobu profile, on a 64-bit system with 4 GiB of RAM.

Each time the crash has occurred, I'm exiting vim when signed in from SSH, and the vim/screen session was started from the local desktop.

The following line appears in dmesg:

[919845.653676] screen[22768]: segfault at 18 ip 00007f55568ab120 sp 00007ffff3f35b30 error 4 in libc-2.12.1.so[7f5556831000+17a000]
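
A line like this can be unpacked a little further. The sketch below assumes the x86 page-fault error-code layout (bit 0: page was present, bit 1: write access, bit 2: fault raised in user mode) and computes the faulting instruction's offset inside the mapped object as `ip` minus the mapping base. The names `SEGFAULT_RE` and `decode_segfault` are illustrative only.

```python
import re

# Matches the kernel's "segfault at ..." log format seen in this report.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\s\[\]]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one dmesg segfault line; returns None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "object": m.group("obj"),
        # Offset of the faulting instruction within the mapped object.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        # Assumed x86 error-code bits, see the note above.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }


line = ("[919845.653676] screen[22768]: segfault at 18 ip 00007f55568ab120 "
        "sp 00007ffff3f35b30 error 4 in libc-2.12.1.so[7f5556831000+17a000]")
info = decode_segfault(line)
print(hex(info["offset"]), info["access"], info["user_mode"])  # 0x7a120 read True
```

Under those assumptions, "error 4" is a user-mode read of an unmapped page, and the fault address 0x18 is close enough to zero to suggest a NULL pointer dereference inside libc. With matching libc debug symbols installed, the offset could be looked up to name the function involved.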

Dustin Kirkland  (kirkland) wrote :

I tried your method (byobu/vim from desktop, ssh in, exit vim), but was not able to make it crash in the same way.

Also, googling around for "error 4 in libc-2.12.1.so", it seems that this error is popping up in various places around the interwebs. Possibly a library problem? I'm going to add a task for libc6.

summary: - byobu/screen unpredictably crashes "Cannot find master process to attach to!"
+ screen unpredictably crashes
eddified (eddified) wrote :

I just saw this error when I wasn't using byobu... just screen.

Jordan (jordanu) wrote :

I have seen this error only once, and I have been using byobu for years (currently running Ubuntu 10.10).

The one time I saw this message I was connecting from Android's connectbot and was asked if I wanted to start a nested screen session. I answered no and quickly lost the connection for unrelated reasons (flaky internet connection). After reconnecting, answering "yes" to start a nested screen session, and then trying to find the screen window containing irssi, I was surprised to find that it no longer existed. When I came back home and looked at a terminal window that had previously had irssi in byobu, I saw this error message. I have not enabled nethack mode and I have more than enough free memory:

jordan@jordan-desktop:~$ echo $NETHACKOPTIONS

jordan@jordan-desktop:~$ ls ~/.nethackrc
ls: cannot access /home/jordan/.nethackrc: No such file or directory
jordan@jordan-desktop:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          7750       4356       3393          0        618       1981
-/+ buffers/cache:       1755       5994
Swap:            0          0          0

Dustin Kirkland  (kirkland) wrote :

I'm subscribed to this bug and will continue following this closely.

But I'm marking the task against Byobu as 'invalid', as I haven't been able to reproduce it there, and it appears that this crash has happened for some people outside of byobu.

It would be nice to get to the bottom of this ;-)

Changed in byobu:
status: Incomplete → Invalid
David Fraser (davidf) wrote :

In #8, John Millikin wrote:
> I've had this occur a few times recently; I am using the default byobu profile, on a 64-bit system with 4 GiB of RAM.
>
> Each time the crash has occurred, I'm exiting vim when signed in from SSH, and the vim/screen session was started from the local desktop.
>
> The following line appears in dmesg:
>
> [919845.653676] screen[22768]: segfault at 18 ip 00007f55568ab120 sp 00007ffff3f35b30 error 4 in libc-2.12.1.so[7f5556831000+17a000]

I've also had this when exiting vim, but not from SSH; I had reattached to an existing screen session. And I get a similar segfault message, although I'm on 32-bit:

[67596.757458] screen[3297]: segfault at c ip 003d38b8 sp bfa0263c error 4 in libc-2.12.1.so[366000+157000]

I'm curious as to why apport doesn't pick up this segfault and write a core dump. Any ideas?

Malcolm Scott (malcscott) wrote :

Try "dpkg -S /usr/bin/screen". If it states "diversion by screen-profiles to: /usr/bin/screen.real", but you don't actually have screen-profiles installed, you may be experiencing bug #727741 which may have left you with an old and crashy version of screen.

Kevin (kevin-matson) wrote :

I experience this problem often when working with medium to large log files (>200K). The 10.04 servers I'm experiencing this on have never had screen-profiles installed. This seems like an unnecessary annoyance at best and an impediment to work at worst. I'll check dmesg going forward to see if I can add some data to this bug.

Chuck Frain (chuckfrain-deactivatedaccount) wrote :

I just had this happen to my machine. I was working in mutt (the mutt-patched version), looking at email. The system is Kubuntu running the latest Natty beta. apt-get update/upgrade was run within the last 12 hours. The machine has 1.5 GB of RAM, with not much running on it at the time.

From dmesg:
[188818.744522] screen[3683]: segfault at 914fa40 ip 00f02deb sp bf9ce7d0 error 4 in libc-2.13.so[e96000+15a000]

'dpkg -S /usr/bin/screen' results in "screen: /usr/bin/screen"

uname -a
Linux chuck-desktop 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:50 UTC 2011 i686 athlon i386 GNU/Linux

Alex Grubb (agrubb) wrote :

I've been having this same issue intermittently, usually appearing when exiting a full terminal editor like emacs after re-attaching an old byobu session. I do not have the issue listed in #727741.

A related issue (possibly the root cause) is given in: https://bugs.launchpad.net/ubuntu/+source/screen/+bug/375625. I can reliably reproduce the issue listed there and the failure has the exact same behavior as when I've had byobu/screen crash in normal use.

Yung-Chin Oei (yungchin) wrote :

I have experienced both this bug (where the client complains it can't find the master process anymore) and the bug now marked a duplicate, #375625 (where the client freezes, but indeed the master process also disappears). The comment in the other bug, https://bugs.launchpad.net/ubuntu/+source/screen/+bug/375625/comments/7 outlines steps that make that bug very reproducible for me, but I have not yet been able to actively reproduce this bug. All I know is this seems to strike when exiting vim.

I guess what I'm asking is whether it's necessarily obvious that the two are duplicates?

Changed in eglibc (Ubuntu):
status: New → Confirmed
Dr. Azrael Tod (i41b-launchpad-net-hpgu) wrote :

Happens here too.
No, it's not nethack mode.

It seems to occur for me mostly when mcabber or similar child processes crash.

James Reily (james-reilly-a) wrote :

Happened to me twice today. Once while in irssi and the second time in mutt. I have nethack mode on and have for a long time.

From dmesg:
[743662.124228] screen[11639]: segfault at 0 ip 000000000040af4d sp 00007fffd6ebec10 error 4 in screen[400000+58000]
[745451.819338] screen[26757] general protection ip:40af4d sp:7fff1e2f1620 error:0 in screen[400000+58000]

fun@morpheus:~$ screen -ls
There are screens on:
        12546.pts-16.morpheus (06/03/13 13:49:28) (Attached)
        26757.pts-49.morpheus (06/03/13 12:57:53) (Dead ???)
        11639.pts-15.morpheus (25/02/13 22:23:05) (Dead ???)
Remove dead screens with 'screen -wipe'.
3 Sockets in /var/run/screen/S-fun.

fun@morpheus:~$ uname -a
Linux morpheus 2.6.32-45-server #104-Ubuntu SMP Tue Feb 19 21:35:01 UTC 2013 x86_64 GNU/Linux

fun@morpheus:~$ screen -v
Screen version 4.00.03jw4 (FAU) 2-May-06

Mark Smith (mark-launchpad-d) wrote :

screen not freezing up nor turning zombie, but i got the error described above. it happens reliably. i just manufactured Ubuntu 14.04. when exiting using ^A^D it does exactly what this bug describes. my screen is then lost completely (no longer exists), then i get logged out. the commands i'm running are:

byobu-screen -d -m -L -a -t amp703 -S amp703 -T xterm-color
byobu-screen -d -r amp703

i then type ^A^D it hangs for about 10 seconds. it then returns this:

Error: Cannot find master process to attach to!

my PS1 appears as would normally happen after a command completes, but then i am logged out immediately through no action of my own complete with the word "logout" as if the ^D part was interpreted by my shell.

sure enough, if i just type ^A by itself within byobu, after 10s it drops to my shell with the master process error above.

Mark Smith (mark-launchpad-d) wrote :

also, if i use screen directly, e.g. replace byobu-screen with just screen, it works as expected.
