bogus background command often causes shell to exit

Bug #127399 reported by Tuukka Tolvanen on 2007-07-21
Affects: bash (Ubuntu)
Importance: Medium
Assigned to: Michael Vogt

Bug Description

feisty, GNU bash, version 3.2.13(1)-release (i486-pc-linux-gnu)

steps:
1. start interactive bash shell, e.g. open gnome-terminal
2. repeatedly attempt to run nonexistent program in background, e.g.
       iusfbdfvb &

expected: the interactive shell should not exit.

result: shell exists, about 9 times out of 100 just now. If you have the terminal stay open after the shell exits, (e.g. either edit -> profile -> command -> on exit -> leave open; or run 'bash' for an interactive shell within the interactive shell) you'll see 'exit' as the last command, as if you had typed it in.

Tuukka Tolvanen (sp3000) wrote :

> result: shell exists
er, shell _exits_ rather

Zygmunt Krynicki (zyga) wrote :

This is really strange. Thanks for reporting. I will look into it.

Changed in command-not-found:
importance: Undecided → Medium
status: New → Confirmed

I can also reproduce this bug fairly easily after a couple of tries on my feisty laptop.

I ran bash under the valgrind memory checker, which detects several errors.

I started bash with:

$ valgrind --trace-children=yes bash

... then tried to run a nonexistent command in the background, and got several kinds of errors, including:

==17299== Invalid read of size 4
==17299== at 0x808BB0F: PyObject_Free (in /usr/bin/python2.5)
==17299== by 0x810AAFB: (within /usr/bin/python2.5)
==17299== by 0x8110F89: (within /usr/bin/python2.5)
==17299== by 0x80EFAEB: (within /usr/bin/python2.5)
==17299== by 0x80D7821: PyErr_Clear (in /usr/bin/python2.5)
==17299== by 0x80E88E4: (within /usr/bin/python2.5)
==17299== by 0x80E8AE4: PyErr_PrintEx (in /usr/bin/python2.5)
==17299== by 0x80E9302: PyRun_SimpleFileExFlags (in /usr/bin/python2.5)
==17299== by 0x805932F: Py_Main (in /usr/bin/python2.5)
==17299== by 0x8058861: main (in /usr/bin/python2.5)
==17299== Address 0x4361010 is not stack'd, malloc'd or (recently) free'd
==17299==
==17299== Conditional jump or move depends on uninitialised value(s)
==17299== at 0x808BB18: PyObject_Free (in /usr/bin/python2.5)
==17299== by 0x810AAE2: (within /usr/bin/python2.5)
==17299== by 0x81121A0: (within /usr/bin/python2.5)
==17299== by 0x8085268: (within /usr/bin/python2.5)
==17299== by 0x8086BAD: PyDict_SetItem (in /usr/bin/python2.5)
==17299== by 0x8088807: _PyModule_Clear (in /usr/bin/python2.5)
==17299== by 0x80DCC9C: PyImport_Cleanup (in /usr/bin/python2.5)
==17299== by 0x80E8D9E: Py_Finalize (in /usr/bin/python2.5)
==17299== by 0x80E88E9: (within /usr/bin/python2.5)
==17299== by 0x80E8AE4: PyErr_PrintEx (in /usr/bin/python2.5)
==17299== by 0x80E9302: PyRun_SimpleFileExFlags (in /usr/bin/python2.5)
==17299== by 0x805932F: Py_Main (in /usr/bin/python2.5)

Not sure whether this stack trace is right or not. I'd be surprised if bash used Python libraries?!

Zygmunt Krynicki (zyga) wrote :

Bash is not using Python, but Python is invoked from bash's command_not_found handler (command-not-found is implemented in Python). This may be a Python issue of some sort, but it still does not explain why bash is exiting.

I still observe this bug after upgrade to Gutsy.

I still observe this bug after upgrade to Hardy.

I get this with Hardy too. I strace'd the bash running in another window. Here are some selected highlights after entering `sfkjsfskfjsd &'.

    write(2, "[1] 22001\n", 10) = 10
    write(1, "\33]0;ralph@blake: ~\7", 19) = 19
    write(2, "$ ", 2) = 2
    read(0, 0xbfb0382f, 1) = -1 EIO (Input/output error)
    write(2, "logout\n", 7) = 7
    open("/home/ralph/.bash_logout", O_RDONLY|O_LARGEFILE) = 3

So bash prints the background PID, updates the terminal title bar, prints $PS1, and then attempts to read from fd 0 but gets an EIO and bails out.
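To pick that failing read out of a longer trace, a small filter may help. This is only a sketch: the function name `stdin_eio` is made up for illustration, and it assumes the usual strace line format shown above (the pattern also tolerates the leading PID column that `strace -f -o` adds).

```shell
# Hypothetical helper: print strace lines where a read on fd 0 failed with EIO.
# Assumes strace's usual "read(0, ...) = -1 EIO (...)" format, optionally
# preceded by a PID column as written by "strace -f -o FILE".
stdin_eio() {
    grep -E '^[[:space:]]*([0-9]+[[:space:]]+)?read\(0, .*= -1 EIO ' "$1"
}
```

Run it as `stdin_eio <trace-file>`, where the argument is whatever file the trace was saved to.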

Pavol Rusnak (prusnak) wrote :

We created our own implementation of command-not-found for SUSE and have the same problem. So it is not Ubuntu-specific; something is broken somewhere between the command-not-found handler and the Python script.

Pavol Rusnak (prusnak) wrote :

Update: this has nothing to do with python.

If I put

command_not_found_handle() {
    ls
}

into /etc/bash_command_not_found or something similar, and then run

foobar &

bash will also crash (from time to time).

When the handler contains only bash builtins (echo ...), the crashes don't happen. This has something to do with forking ...
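For comparison, a handler restricted to builtins might look like the following. This is a sketch based only on the observation above, not the handler that ships with command-not-found; the message text is invented, and returning 127 simply mirrors the shell's usual "command not found" exit status.

```shell
# Sketch of a handler that never forks an external program:
# printf and return are shell builtins, so no child process is started
# by the handler itself.
command_not_found_handle() {
    printf '%s: command not found\n' "$1" >&2
    return 127
}
```

Swapping the `printf` line for an external command such as `ls` gives the variant that triggers the exit.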

No, it isn't specifically to do with Python. See my comment above where read(2) on stdin returns EIO; I think that's a route to tracking it down.

Pavol Rusnak (prusnak) wrote :
Michael Vogt (mvo) on 2009-06-26
Changed in command-not-found (Ubuntu):
assignee: nobody → Michael Vogt (mvo)
milestone: none → karmic-alpha-3
status: Confirmed → In Progress
Michael Vogt (mvo) on 2009-06-26
affects: command-not-found (Ubuntu) → bash (Ubuntu)
Changed in bash (Ubuntu):
status: In Progress → Fix Released
Malte Helmert (helmert) wrote :

I can confirm that this bug still exists in karmic.

Here's how I can reproduce it 80% of the time:
=====================================================
helmert@alfons:~$ bash && echo bash is done
helmert@alfons:~$ cd tmp/
helmert@alfons:~/tmp$ rubbish &
[1] 26946
helmert@alfons:~/tmp$ exit
bash is done
=====================================================

Note that I don't type the "exit" -- it appears automatically. bash decides to exit after I try to invoke a nonexistent command ("rubbish" above) in the background. Interestingly, the problem only occurs for me if I'm *not* in my home dir, hence the cd at the start. (No idea what this has to do with anything.)

I'm not a bash expert, but to me this looks like there might be some sort of race condition between the command-not-found-handler and the background process (PID 26946 above) at the core of this.

Malte Helmert (helmert) wrote :

Anyone reading this? Should I open a new bug? (Not sure about the correct policy for bugs that are already marked as resolved.)

Anyway, an update: the fix in comment #6 of https://bugzilla.novell.com/show_bug.cgi?id=470844 (with the paths to command-not-found suitably adapted) works for me. At least I haven't had any crashes in a while.

I can't reproduce your symptoms on 9.10. For me, the original problem is fixed; I no longer see the strace output I quoted above. Your need to chdir away from your home directory is probably pertinent. I'd suggest investigating that and perhaps using "strace -p <pid-of-bash> -s 3000 -f -o /tmp/st" in another window to see what's going on to cause the exit.

I can still reproduce this bug with Ubuntu 9.10 Karmic (tested in an xterm).
It happens rarely, but I managed to reproduce it at least twice.

It happens more often if I run it under strace.

I ran...

$ strace -s 3000 -f -o /tmp/trace-xterm xterm

And in the xterm, I ran a nonexistent command:

$ foobar&

xterm exited immediately! (It does not happen every time I try again, though.)

In the log file /tmp/trace-xterm, I see something interesting shortly before it exits:

334 write(2, "Sorry, command-not-found has crashed! Please file a bug report at:", 66) = -1 EIO (Input/output error)

I attach the full /tmp/trace-xterm log file.
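Since a trace taken with `-f` interleaves several processes, it can help to see which PIDs hit EIO and on which syscalls. The following is a rough sketch, not part of any fix; the function name `eio_by_pid` is hypothetical, and it assumes every line starts with the PID followed directly by the syscall (so `<... resumed>` continuation lines are not handled specially).

```shell
# Hypothetical helper: count EIO failures per PID and syscall in a log
# written by "strace -f -o FILE", where every line starts with the PID.
eio_by_pid() {
    awk '/ = -1 EIO / {
             name = $2
             sub(/\(.*/, "", name)      # keep only the syscall name
             count[$1 " " name]++
         }
         END { for (k in count) print k, count[k] }' "$1" | sort
}
```

For example, `eio_by_pid /tmp/trace-xterm` prints one line per PID and syscall, with the number of EIO failures as the last column.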

I agree, it does look as if it still fails in a similar manner in the
case you strace'd. Perhaps the fix above works, but there's a race
condition where the checks don't always work.
