Helper apps inherit open file descriptors

Bug #102408 reported by Ulrich Hobelmann on 2007-04-03
22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Evince
Unknown
Medium
Mozilla Firefox
Invalid
High
firefox-3.0 (Ubuntu)
Low
Unassigned
firefox-3.5 (Ubuntu)
Medium
Unassigned

Bug Description

Binary package hint: evince

(for no good reason, I think)

I'm noticing this, because right now my sound system isn't working and I get error messages like (this is no big deal; I'll have it working again in no time):
ALSA lib pcm_direct.c:224:(make_local_socket) connect failed: /tmp/alsa-dmix-12345: No such file or directory
ALSA lib pcm_dmix.c:894:(snd_pcm_dmix_open) unable to connect client

I'm starting evince from the command line with a .pdf file as parameter, and as soon as the window opens, the message appears.

What in the world is a PDF viewer doing with my sound device? I'm just wondering. Maybe this is one call and one dependency that can be optimized away. Or maybe there *is* a reason for using ALSA, but I've never heard any sounds when opening PDF files, and I don't see why that application should have any business in the sound area at all. Click when I follow a hyperlink?

Created an attachment (id=85280)
helper app data

This is an nsIProcess bug...

ccing some people who may know who it belongs to.

Just in case there's any confusion, this bug is about closing all file
descriptors except stdin, stdout, and stderr.

I think it might be a necko bug. We should set FD_CLOEXEC on all the fds we open.

Or maybe people are getting these fds via the OpenANSIDescriptor thingy? Then
nsLocalFileUnix needs to set FD_CLOEXEC on the underlying fd, which is perhaps
_mildly_ exciting to do portably, but hey.

LXR finds FD_CLOEXEC in nsprpub/pr/src/md/unix/unix.c which mumbles
about inheritance or some such. But nspr #defines so many things that I
can't figure out what it really means.

NSPR has a PR_SetFDInheritable function but no one seems to use it.

Maybe I'm more paranoid than most but the potential for mischief is
enormous here.

How so? A helper app that's running as the user can do anything Mozilla can do.
 It can write to any files, open /proc/$mozillapid/fd/*, etc.

If you don't trust your helper apps, you're screwed in ways we can't begin to
protect against.

An external app can't read the pipes or sockets. I'm not sure how much
of a problem that is but I certainly don't want to find out.

Yes, there are many ways to get screwed. I just prefer to eliminate as
many as possible. Armor plating is nice.

And no, I don't trust helper apps or anything else that isn't guaranteed
bug free. I've been burned too often.

Take a look at bug 125505. There I proposed a mime assistant that could
execv the real helper. You could use the same idea to sanitize the
helper environment. It would be a lot faster than fixing all of mozilla.

On Linux, an external app _can_ read the pipes or sockets, via the
/proc/$mozillapid/fd/* mechanism I described earlier. There are almost
certainly similar mechanisms on other platforms, since debuggers provide such
facilities.

I'm all for mitigating the damage buggy apps can do by accident, but protection
against "mischief" on the part of a malicious helper-app author is something
we're not going to be able to provide, and I'd rather we spent the effort elsewhere.

Closing file descriptors is a correctness issue, not a security/protection one.

You're better than I am. I tried and I can't get an app that isn't a
mozilla child to read from anonymous pipes or socketpairs. Sure you can
attach a process with ptrace but that's almost certainly going to be
noticed by the user. On Linux, the kernel stops an attached process.

So maybe I shouldn't have used the word "mischief." I didn't mean it in
quite the nefarious sense you seem to. I meant for it to include the
typical buggy app. That's what gotten me in the past. I've never
actually been burned by a malicious app but I have been severely burned
by buggy ones.

This might be a correctness issue but protection against data loss makes
it much more important, at least to me.

wtc, what is the default inherit trait of a file descriptor opened via
PR_Open()? Is there a way to set all file descriptors opened by an application
not-inheritable by a child process without explictly calling
|PR_SetFDInheritable| after every PR_Open?

The default inherit trait of a file descriptor opened via
PR_Open() is inheritable on Unix and not-inheritable on Windows.

There is no convenient way to set all file descriptors opened
by an application not-inheritable by a child process on Unix.
I've seen people who resort to a for loop that closes all the
possible Unix fd's. That for loop is rather expensive.

A possible solution is to use a helper process whose only
purpose is to fork and exec helper apps for the Mozilla client.
You know exactly which file descriptors are open in this helper
process and you can set FD_CLOEXEC on all these fd's.

Created an attachment (id=86274)
initial mime-helper

Here's a first try at a mime-helper. Rather than closing descriptors it sets
FD_CLOEXEC which some problems. It prefers to read /proc/self/exe but falls
back to a brute force strategy.

It works on Linux but that's not saying much.

An alternative XP solution to this problem is to try to utilize the IPC daemon.
 Since it is spawned early at startup it should not have access to most (or all)
of the file descriptors we are concerned with. It'd be relatively easy to add a
protocol extension for spawning processes via the IPC daemon. We'd of course
have to take care to not make XPCOM (nsIProcess) depend on the IPC daemon.
Thoughts?

How about we just use the NSPR function that is meant for this purpose:

PR_NewProcessAttr
http://www.mozilla.org/projects/nspr/reference/html/prprocess.html#24416
(see the description)

and use that processattr in PR_CreateProcess:
http://www.mozilla.org/projects/nspr/reference/html/prprocess.html#24535

at this place in mozilla code:
http://lxr.mozilla.org/seamonkey/source/xpcom/threads/nsProcessCommon.cpp#283

Created an attachment (id=127178)
like this

note, this is currently untested

biesi: well, NSPR process attr only allows us to control specific values passed
on to the child process, like stdin for example. currently, it doesn't provide
a way to make other file descriptors close-on-exec, which is what we'd need. my
concern is with portability... basically, we need to iterate over any open file
descriptors (other than those for stdin, et al.) and mark them as close-on-exec.
 the portability problem is how to determine the set of open file descriptors.
we know how to do this under linux using the /proc filesystem, but that
obviously doesn't work on other platforms. for example, what about OSX?

if we use the IPC daemon then that problem is avoided.

darin: well, the first url I mentioned in my comment contains this sentence:

 No file descriptors are inherited by the new process.

Is that not what we want here?

i may be mistaken, but i think NSPR's notion of inheritable file descriptors is
just a mechanism to allow a child process to get access to a PRFileDesc handle
that references the OS file descriptor. in other words, it is just a XP
mechanism for accessing inherited file descriptors. it does not seem to get in
the way of the native mechanisms. it just makes it possible to get a PRFileDesc
from a native file descriptor. in fact, it seems to create a name to PRFileDesc
dictionary, which is passed to the child process as an environment variable (at
least that's how the UNIX impl works). so, i think the comments in the NSPR
reference are just misleading. but, you should probably give your patch a try ;)

(From update of attachment 127178)
aww... it seems you are right, unfortunately :( the child process still has the
file descriptors open

One solution is to execute the following for loop in
the child process before it calls exec:

    int fd;

    for (fd = 3; fd <= MaxFdOfTheProcess; fd++) {
        (void) close(fd);
    }

You can use PR_GetSysfdTableMax as an example of getting
the value of 'MaxFdOfTheProcess'. (PR_GetSysfdTableMax
is an unsupported function, so its code may be incorrect.)

In the time since I field this bug I realized that open file descriptors
are just once aspect of the problem of gracefully exiting the
mozilla/gre/whatever environment. In particular, you can't leave it
properly unless you know the start-up environment, which you don't. you
really need to think about the proper way to start before you can leave.

I don't see how you can have a clean secure XP way to do any of this so
I don't think how IPC (whatever that is) can help. I also don't think an
XP solution is appropriate anyway.

See bug 125505, attachment 105336. I try to handle setting
FD_CLOEXEC by testing /proc/self but fall back to looping over all fd's.

I noticed today that the esd (ESound Daemon) spawned by TBird playing the "new mail" sound (mail.biff.play_sound.url) had also inherited all the descriptors. Is this another instance of this bug, or a different one?

Different bug.

*** Bug 368003 has been marked as a duplicate of this bug. ***

Created an attachment (id=252742)
Patch - sets close-on-exec for the profile lock

That patch looks like it should go in a separate bug, really, since it doesn't solve this bug. It's a good special case that we should check in, though.

OK I will think about making this a separate bug. It is a little depressing that nsProfileLock.cpp is a preprocessor mess. I had assumed that the #ifdef jungle was supposed to be confined to NSPR. Alas.

Also, I developed this after getting user complaints that helper applications spawned by previous invocations would prevent Firefox from starting, but now I have to go investigate that more, because on my platform (Linux) fcntl locks are not inherited by the child.

on ubuntu 6.06 with firefox 2.0 with an NFS $HOME i've had the problem where i've killed every firefox instance but still have an acroread instance going (which was spawned via firefox -- not the plugin, i hate the plugin) and firefox won't restart... and the acroread has a good 60 fds open which were inherited from the firefox, including a couple .parentlock fds.

when i kill the acroread i can start firefox again.

so something is inherited here.

you know, it occurs to me that NSPR should endeavour to present the same close-on-exec behaviour on all platforms... and honestly, the unix API is totally broken in this area -- this is something which NT (actually VMS, which NT inherits from) get right. the unix behaviour is OK when you're dealing with a half dozen fds... but real programs (especially multithreaded programs) have way more than a half dozen open. when you spawn a process you should explicitly specify the fds to be inherited. NSPR should default to setting F_CLOEXEC on everything on unix.

Unfortunately in NSPR the same path is used to spawn an unrelated process and to fork your own process. If you explicitly close all the FDs when NSPR forks, half the unit tests fail.

Sounds like we could use an NSPR bug about passing more information into that code, eh?

Now that this is blocking a bug I'm not allowed to view, we can only assume that the security implications here have finally been realized?

They were realized all along. All it means in this case is someone filed a duplicate of this bug with the security flag set and someone else didn't mark it duplicate.

Binary package hint: evince

(for no good reason, I think)

I'm noticing this, because right now my sound system isn't working and I get error messages like (this is no big deal; I'll have it working again in no time):
ALSA lib pcm_direct.c:224:(make_local_socket) connect failed: /tmp/alsa-dmix-12345: No such file or directory
ALSA lib pcm_dmix.c:894:(snd_pcm_dmix_open) unable to connect client

I'm starting evince from the command line with a .pdf file as parameter, and as soon as the window opens, the message appears.

What in the world is a PDF viewer doing with my sound device? I'm just wondering. Maybe this is one call and one dependency that can be optimized away. Or maybe there *is* a reason for using ALSA, but I've never heard any sounds when opening PDF files, and I don't see why that application should have any business in the sound area at all. Click when I follow a hyperlink?

Silospen (silospen) wrote :

I've experienced the same problem a few times.

Like just a few mins ago, for instance. Beep media player is working fine, then suddenly "Could not open audio device". lsof -n |grep "/dev/snd" showed evince was using sound and ending the process freed the card.

I have no idea why this is going on, but i'd like to know.

Sebastien Bacher (seb128) wrote :

Thank you for your bug. What version of Ubuntu do you use? Does it happen if you don't open anything? Does it happen with other applications?

Changed in evince:
assignee: nobody → desktop-bugs
importance: Undecided → Low
status: Unconfirmed → Needs Info
Ulrich Hobelmann (u-hobelmann) wrote :

This was Ubuntu 6.10. It was only that one time that my sound device couldn't be opened (I think some non-Ubuntu app was accessing the device directly and blocked it), so I don't know where else it might happen.

Daniel Holbach (dholbach) wrote :

Does this happen with Feisty too?

Changed in evince:
status: Needs Info → Rejected
Q. (qberdugo) wrote :

YES, it does

Changed in evince:
status: Rejected → Confirmed
Changed in evince:
status: Confirmed → Triaged
Changed in evince:
status: Unknown → New
21 comments hidden view all 101 comments
Dave Neary (dneary) wrote :

Also happens for me - both on Feisty and Gutsy. I usually open up pdfs as attachments to mail messages in Thunderbird, these also sometimes block the sound card.

Dave.

This bug hits me *every single day*. I play mp3's (usually internet radio using rhythmbox or songbird) and use firefox extensively, reading lots of pdf documents with xpdf. After some time, xpdf will block the sound device and I have to go track down which pid is blocking it and kill all the documents I have open (usually I have several). The choice by ubuntu to use pulseaudio seems to make the situation worse. I usually find it's the first thing I have to kill, then it's xpdf sessions, or firefox itself.

This may be a deeper bug. Clearly it works with 2 pid's holding the same file handle, but shows up as more and more pid's hold the filehandle. Exactly which circumstance leads to the sound device being blocked? The reported programs which cause this (xpdf, evince, gv) do not read or write the sound device at all. So why does it get blocked in the first place?

Closing the file handle may or may not fix the problem depending on the ultimate reason that the sound device gets blocked. Has anyone actually tested the theory that it's just holding file handles that causes the problem? (e.g. with a helper script as suggested above?)

Changed in evince:
status: New → Invalid

upstream says that's a firefox issue

Changed in evince:
assignee: desktop-bugs → nobody
status: Triaged → New
Ulrich Hobelmann (u-hobelmann) wrote :

I'm pretty sure it's not, because I started evince from the command line, not from within Firefox (I always save files to the desktop). Even if, it would still be evince that opened my sound device, not Firefox.

Of course there might be another (different) bug in evince, which might be related to Firefox. My original observation was related to stand-alone evince, however.

pierz (p-ohanlon) wrote :

The fact that Evince appears to access audio is probably some side effect of libasound/esd linkage with some of the gnome libraries: Whilst a listing from ldd /usr/bin/evince shows up various audio related libraries (e.g. libasound), however readelf -d /usr/bin/evince does not, which would indicate that it's not a direct dependency.

John Vivirito (gnomefreak) wrote :

does this happen with firefox-2 and firefox-3?
if so this fix will not be added to firefox-2 but will land for firefox-3.
I didnt see a post about what versions you are seeing htis on.
Also let me know what Ubuntu version it is happening on if the above versions changed.

Changed in firefox:
status: New → Incomplete
status: Incomplete → Confirmed
John Vivirito (gnomefreak) wrote :

Is there a reason we are tracking CentOS for this bug? I havent seena comment that relates to CentOS

Changed in firefox:
status: Unknown → Confirmed

Have also seen this bug on gentoo and ubuntu (32 and 64 bit). If I open a pdf from firefox (using kpdf), the next day the pdf will be stealing the sound devices and trying to play flash sound from mozilla causes a spike in cpu. I usually resort to killing firefox and the app stealing the sound device.

This bug makes the "open with" functionality potentially dangerous, since it leaks file descriptors concerning hardware access (raw sound devices). Maybe "open with" should be disabled on affected systems to avoid security issues.

Graham Binns (gmb) wrote :

Marking the centos task invalid at Pedro's request.

Changed in centos-dead:
status: New → Invalid
Changed in firefox-3.0 (Ubuntu):
status: Confirmed → Triaged

This bug is very annoying, as I regularly find Ktorrent (launched from Firefox) stealing away my soundcard (I debug with "fuser"). I am pretty sure many users actually don't realize it's due to Firefox that their sound doesn't work, and merely blame 'Linux' in general. I am convinced that fixing this bug would help many: it would be great to get this fixed after 7 years.

Micah Gersten (micahg) wrote :

Firefox 3.0 is only receiving Security Updates and major bug fixes at this point.

Changed in firefox-3.0 (Ubuntu):
status: Triaged → Won't Fix
Micah Gersten (micahg) wrote :

Moving tracking to Firefox 3.5

Changed in firefox-3.5 (Ubuntu):
importance: Undecided → Medium
status: New → Triaged

This bug is embarassing. Of all the applications I'm using, Firefox is the /only/ one that reliably manages to break sound, of all functionality. Sound doesn't work? Find out what application was spawned by FF, terminate it, sound resumes.

It can't be that hard to iterate over and close all file descriptors >= 2.

(In reply to comment #51)
> This bug hits me *every single day*. I play mp3's (usually internet radio
> using rhythmbox or songbird) and use firefox extensively, reading lots of pdf
> documents with xpdf. After some time, xpdf will block the sound device and I
> have to go track down which pid is blocking it and kill all the documents I
> have open (usually I have several). The choice by ubuntu to use pulseaudio
> seems to make the situation worse. I usually find it's the first thing I have
> to kill, then it's xpdf sessions, or firefox itself.

I am pretty sure that the PA libraries set O_CLOEXEC on all its fds. What makes you think PA is at fault here?

That said, regardless whether PA is misbehaving here or not, firefox definitely needs to close all open fds after fork()ing and before exec()ing an application. On linux this is best done by iterating through /proc/self/fd. Something like this:
http://git.0pointer.de/?p=pulseaudio.git;a=blob;f=src/pulsecore/core-util.c;h=b64c51e18fb71d2a111b0a0236bc81bc5d849950;hb=HEAD#l2230

That code is fast on Linux and still compatible with non-Linux Unixes, so should be a safe choice.

*** Bug 457272 has been marked as a duplicate of this bug. ***

*** Bug 419489 has been marked as a duplicate of this bug. ***

Just think how fun it will be to have a tenth birthday party for this bug in 2012 when it's still not fixed. Security and platform parity used to be important features of the Mozilla project but not these days.

Changed in firefox:
importance: Unknown → High
Changed in evince:
importance: Unknown → Medium
status: Invalid → Unknown

Roc, why is this blocking? It's been around forever and we have a TON of bugs blocking 4.0 at this point. Something has to give.

It's blocking because it's a pretty bad bug, we have no other procedure for getting bugs like this fixed, and I was hoping Karl would have time to do it without impacting anything else important.

Comment 59 is nasty but correct. Sometimes you just have to put your foot down and fix it.

We do need to fix this, but we're not going to hold Gecko 2 for it. Moving to the 2.x list, pending Karl actually having time...

yeah let's not fix this before it's a decade old. not much longer to wait!

*** Bug 460936 has been marked as a duplicate of this bug. ***

Hey, we only have 13 months before the 10-year anniversary of this bug!

Here's a doodle for scheduling the party:

http://www.doodle.com/9uav5v7igabaxndv

*Maybe* we could get the ticket status updated to a status that actually exists for the occasion :) "NEW" doesn't seem to be on the list anymore: https://bugzilla.mozilla.org/page.cgi?id=fields.html#status

I won't go so far as to suggest RESOLVED, but IN_PROGRESS would be *awesome*.

So this is the culprit?
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/unix/uxproces.c#263

How exactly do you trigger this? I thought download code used GIO or GnomeVFS now to spawn processes.

Is there any valid case for wanting to keep the file descriptors open after a fork? I was thinking that, in that function ForkAndExec(), if attr is null then I close the fds. karl, thoughts?

Michael, I'd love it if you could take this.

(In reply to comment #67)
> So this is the culprit?
> http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/unix/
> uxproces.c#263
>
> How exactly do you trigger this? I thought download code used GIO or
> GnomeVFS now to spawn processes.
>
> Is there any valid case for wanting to keep the file descriptors open after
> a fork? I was thinking that, in that function ForkAndExec(), if attr is null
> then I close the fds. karl, thoughts?

My 2 cents:
- The problem is not keeping the file descriptors open after a fork, but after an exec.
- There are a lot of valid reasons to keep file descriptors open after a fork
- There are a few valid reasons to keep file descriptors open after an exec, but I don't think any of these reasons are in the scope of a browser
- Modifying ForkAndExec() behaviour in nspr means a change in behaviour in all applications using NSPR, which may or may not expect file descriptors not to be closed.
- As you pointed out, nowadays GIO and GnomeVFS is used (except for mailcap entries), but I don't think they close file descriptors either.

wtc doesn't want the proposed approach as he mentioned in bug 372734. We need to find a way to do this without touching NSPR.

This might possibly mean auditing all PRFileDesc* in mozilla-central and calling PR_SetFDInheritable(..., PR_FALSE) on them....

Doing some more code searching, the most important place to put this call might be nsLocalFile::OpenNSPRFileDesc (and possibly OpenANSIFileDesc). Writing that here for my own records.

gnome_vfs_mime_application_launch_with_env calls g_spawn_async without
G_SPAWN_LEAVE_DESCRIPTORS_OPEN so helper apps should be fine with GnomeVFS.

Non-GNOME (and newer GNOME) systems without --enable-gio will use nsIProcess
which uses PR_CreateProcess, thus hitting this bug.

Helper apps should be fine in --enable-gio builds because
g_desktop_app_info_launch_uris also calls g_spawn_async without
G_SPAWN_LEAVE_DESCRIPTORS_OPEN.

For plugin/content processes this is addressed by CloseSuperfluousFds.

Most systems should have GIO these days, so it looks like the appropriate fix
here is to update glib on our build systems so we can --enable-gio.

(I haven't done an audit of other nsIProcess consumers.)

Ideally at the same time we can install GTK3 on our build systems and migrate to that asap.

(In reply to Karl Tomlinson (:karlt) from comment #73)
> Helper apps should be fine in --enable-gio builds because
> g_desktop_app_info_launch_uris also calls g_spawn_async without
> G_SPAWN_LEAVE_DESCRIPTORS_OPEN.

Actually, Non-GNOME systems won't be covered with --enable-gio, because the gio service is in the mozgnome component, which depends on GNOME libs, which will most likely fail to load on non-GNOME systems. And if mozgnome loads anyway, it contains the GnomeVFSService, which would already use gnome_vfs_mime_application_launch. So --enable-gio would change nothing.

(In reply to Mike Hommey [:glandium] from comment #75)
> Actually, Non-GNOME systems won't be covered with --enable-gio, because the
> gio service is in the mozgnome component, which depends on GNOME libs, which
> will most likely fail to load on non-GNOME systems.

We'd have to either --disable-gconf --disable-gnomevfs (as I expect GIO distributions will do), or make mozgnome dynamically check for gconf and gnomevfs if still useful.

Also note that mozgnome also may depend on libnotify. (mozilla builds don't but distros' do)

Mozilla builds depend on libnotify.so.1. The ABI and soname has since changed, so that may be a problem on newer distros. If there's no stable notification library we can use instead, then we could dynamically detect.

Actually, I was thinking about this last week, but why do we have the gio support in the mozgnome component? gio exists on any system with a gtk version of 2.14 or newer (the version there is just plucked from a brief look through the git history of gtk)

And if we modified the mozgnome component to dynamically check for gconf/gnomevfs, then is there any point to having the separate component for them anyway? (perhaps I'm missing the point of putting them in libmozgnome in the first place though?)

Ah, I answered my own question - the minimum gtk requirement is 2.10

Something else: mailcap support uses nsIProcess to start commands. These wouldn't be covered either.

Changed in firefox:
status: Confirmed → Invalid
Displaying first 40 and last 40 comments. View all 101 comments or add a comment.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Related questions

Remote bug watches

Bug watches keep track of this bug in other bug trackers.