MASTER Firefox crashes on instant X server shutdown
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Mozilla Firefox |
Confirmed
|
Unknown
|
|||
firefox (Ubuntu) |
Won't Fix
|
Medium
|
Unassigned | ||
firefox-3.0 (Ubuntu) |
Triaged
|
Medium
|
Unassigned |
Bug Description
Firefox crashes when X server is forcefully torn down (e.g. by pressing ctrl-alt-backspace) and a crash report gets generated on next login.
(Original Report:
I've reproduced it once on my machine with the following steps. With, oh, about 5 tabs open, I just pressed ctrl-alt-backspace, logged back in when the X server restarted, and FF crashed with a bug report when Gnome tried to restore the session.
It's not terribly important, I don't think anyone does this very often, but maybe it'll be helpful.
)
In Mozilla Bugzilla #336193, Benjamin Smedberg (Mozilla) [:bs] (benjamin-smedbergs) wrote : | #1 |
In Mozilla Bugzilla #336193, Dietrich-mozilla (dietrich-mozilla) wrote : | #2 |
I believe SIGTERM can be synchronous or asynchronous. It's a signal that can by hanled by the target process, with the basic intent of: "Please terminate yourself, whatever that may entail". This is in contrast to SIGKILL, which cannot be handled by the target process: "Hey kernel, immediately terminate that process over there".
For example, IIRC, in the case of SIGTERM Apache returns from the signal, and then goes about shutting down it's child processes.
In Mozilla Bugzilla #336193, Johnsreid (johnsreid) wrote : | #3 |
If the app is left running when the machine is shut down then shutdown actually hangs. In practice the OS steps in and asks the user what to do about the app which is not responding (Firefox 2).
I've seen this problem with 10.4.8 and 10.3.9
kfinity (kfinity) wrote : Firefox crashes after abrupt X server restart & login | #4 |
I've reproduced it once on my machine with the following steps. With, oh, about 5 tabs open, I just pressed ctrl-alt-backspace, logged back in when the X server restarted, and FF crashed with a bug report when Gnome tried to restore the session.
It's not terribly important, I don't think anyone does this very often, but maybe it'll be helpful.
kfinity (kfinity) wrote : | #5 |
In Mozilla Bugzilla #336193, Mozilla (mozilla) wrote : | #6 |
FWIW: That's the same on Linux.
People are closing their session often w/o closing every single application.
(In reply to comment #1)
> I don't know a lot about unix signal handling, but don't unix signals need to
> be handled synchronously? Or can we watch for SIGTERM, call
> nsIAppStartup.
> the eventloop and stack to unwind gracefully later?
You can watch for SIGTERM since that signal should do the same as a normal exit. So you can catch the signal, start the shutdown process and return from the signal handler.
> Should SIGTERM cause eAttemptQuit (prompting the user if there is unsaved
> work/multiple tabs), or eForceQuit?
That's a harder question and I think there is no rule for that. I tend to vote for eForceQuit since if the application gets a SIGTERM the user may not be in front of the application (or display) at all.
Martin Morrison (isoschiz) wrote : Re: Firefox crashes after abrupt X server restart & login | #7 |
- Firefox Crash report. Edit (13.7 MiB, text/plain)
I also just restarted X with Ctrl-Alt-backspace, and when I logged back in, there was a crash report for Firefox. However, unlike the original submitter, Firefox isn't automatically restored by my gnome session, so it is probably the Firefox that was open before I restarted X that crashed.
Firefox restarted fine, and restored my session, so no biggy really.
Freddy Martinez (freddymartinez9) wrote : | #8 |
Can not reproduce. KDE 3.5.6 on Herd 2. Fx 2.0
Changed in firefox: | |
status: | Unconfirmed → Needs Info |
Changed in firefox: | |
importance: | Undecided → Medium |
Alexander Sack (asac) wrote : Re: Firefox crashes on instant X server shutdown | #9 |
Martin is right, this is most likely a report generated for the instance running when X server is shut down. Retitled and updated description accordingly.
However, its really minor since forcing X meld down is rather an exceptional operation.
description: | updated |
John Vivirito (gnomefreak) wrote : | #10 |
Taking for retrace.
Changed in firefox: | |
assignee: | nobody → gnomefreak |
John Vivirito (gnomefreak) wrote : | #11 |
- Stacktrace Edit (792 bytes, text/plain)
Kfinity's retrace
https:/
John Vivirito (gnomefreak) wrote : | #12 |
- Stacktrace Edit (803 bytes, text/plain)
Martin's Retrace
https:/
John Vivirito (gnomefreak) wrote : | #13 |
Finished retrace. Let me know if you need more.
Changed in firefox: | |
assignee: | gnomefreak → mozillateam |
Changed in firefox: | |
assignee: | mozillateam → mozilla-bugs |
Axel (axelbrink) wrote : | #14 |
In Mozilla Bugzilla #336193, Dries Desmet (dries) wrote : | #15 |
There is a bug report in ubuntu related to this bug. I'm just providing a link to it here:
https:/
Alexander Sack (asac) wrote : | #16 |
this is known upstream. Someone should dig the right bug number out of bugzilla.
Changed in firefox: | |
status: | Needs Info → Confirmed |
Changed in firefox: | |
status: | Unknown → Confirmed |
Alexander Sack (asac) wrote : | #17 |
thank you for upstream bug, TrioTorus.
Changed in firefox: | |
status: | Confirmed → In Progress |
In Mozilla Bugzilla #336193, Zeniko (zeniko) wrote : | #18 |
*** Bug 381828 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #336193, maxauthority (stubenschrott) wrote : | #19 |
(In reply to comment #4)
> > Should SIGTERM cause eAttemptQuit (prompting the user if there is unsaved
> > work/multiple tabs), or eForceQuit?
>
> That's a harder question and I think there is no rule for that. I tend to vote
> for eForceQuit since if the application gets a SIGTERM the user may not be in
> front of the application (or display) at all.
I also think eForceQuit is better, although it would be fine to have at least some SIGTERM handling at all. I often just start ./firefox from the commandline when developing my extension and ctrl-c it when done with the result of getting a "do you want to restore tabs" message at next start :(
In Mozilla Bugzilla #336193, Linux2-6 (linux2-6) wrote : | #20 |
(In reply to comment #7)
> (In reply to comment #4)
> > > Should SIGTERM cause eAttemptQuit (prompting the user if there is unsaved
> > > work/multiple tabs), or eForceQuit?
> >
> > That's a harder question and I think there is no rule for that. I tend to vote
> > for eForceQuit since if the application gets a SIGTERM the user may not be in
> > front of the application (or display) at all.
>
> I also think eForceQuit is better, although it would be fine to have at least
> some SIGTERM handling at all. I often just start ./firefox from the commandline
> when developing my extension and ctrl-c it when done with the result of getting
> a "do you want to restore tabs" message at next start :(
>
There is sent SIGINT on Ctrl+C instead of SIGTERM.
In Mozilla Bugzilla #336193, maxauthority (stubenschrott) wrote : | #21 |
(In reply to comment #8)
> There is sent SIGINT on Ctrl+C instead of SIGTERM.
Right, my mistake. But usually it is handled the same as SIGTERM in most UNIX
programs, and so should Firefox to also "feel" like a true UNIX program which
can also be run/terminated from the command line.
In Mozilla Bugzilla #336193, Bshanks2 (bshanks2) wrote : | #22 |
Is this bug really restricted to Mac OS?
Running Debian GNU/Linux, I find it annoying that when I shutdown my computer via my window manager, Firefox thinks it crashed and asks me if I want to restore tabs the next time I boot up my computer. So I suppose I think that eAttemptQuit sounds better -- when you shut down your computer normally, you don't expect your programs to think that they crashed, rather you want them to exit normally. When you shut down your computer normally, it's okay for applications to ask you if you want to save your work.
In Mozilla Bugzilla #336193, Vsync (vsync) wrote : | #23 |
My vote: SIGTERM should force quit. I just used "kill" because Firefox was doing its spinlocking thing and taking 5min to get to each window to ask if I really wanted to close it. (I'll file a separate bug if that doesn't exist... quit should ask ONCE, not once for every window.) Anyway, leaving aside the CPU usage problem, "kill" bypasses all that. It should definitely force a quit.
However: when it receives a SIGTERM it should NOT ask me on next start that it quit "unexpectedly". No, I killed you on purpose. Don't make me kill you again.
In Mozilla Bugzilla #336193, maxauthority (stubenschrott) wrote : | #24 |
@comment 11: I think 'kill' should do the same as just clicking the "X" in most window managers. For a really forceful quit when firefox is hanging in an infinite loop we still have 'kill -9'
In Mozilla Bugzilla #336193, Zeniko (zeniko) wrote : | #25 |
*** Bug 459282 has been marked as a duplicate of this bug. ***
Changed in firefox: | |
assignee: | mozilla-bugs → nobody |
status: | In Progress → Triaged |
Changed in firefox-3.0: | |
importance: | Undecided → Medium |
status: | New → Triaged |
mmware (massimo-mmware) wrote : | #26 |
well, if you are talking about kde/firefox integration, this is a work-around :
http://
Micah Gersten (micahg) wrote : | #27 |
No more fixes for Firefox 2.
Changed in firefox (Ubuntu): | |
status: | Triaged → Won't Fix |
Changed in firefox: | |
importance: | Unknown → Medium |
meven (meven29) wrote : | #28 |
This bug still occurs as it has for years.
This bug doesn't deserve to finish in the bug's graveyard, forgotten and denied by every one.
The "Won't Fix" status is not encouraging.
In Mozilla Bugzilla #336193, Bugzilla-mozilla-org-5 (bugzilla-mozilla-org-5) wrote : | #29 |
Six years later... Firefox 10 has the same bug Firefox 2 had. It is ridiculous that Firefox shows the "This is embarrassing" message when it receives a normal SIGTERM. People are hacking around the problem by disabling session restore completely:
http://
Hasn't there been enough embarrassment? Let's fix this.
In Mozilla Bugzilla #336193, Benjamin Smedberg (Mozilla) [:bs] (benjamin-smedbergs) wrote : | #30 |
ok, fix it! I'd happily review a patch!
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #31 |
Created attachment 607268
patch
Here's a patch which appears to DTRT on Linux. I don't have a Mac machine and I even less familiar with the OS X event loop than I am with the Linux one, so that bit would have to come from someone else.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #32 |
Comment on attachment 607268
patch
Actually, this does the right thing, but it would drastically inflate the version of GTK we require; g_unix_signal_add is said in the docs to be available from 2.30 onwards (though this version isn't actually available from the website...?), and we require 2.14. So I guess the better way to do it is to roll our own pipes and such.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #33 |
Created attachment 608074
patch
Here's a better, somewhat more verbose patch that doesn't require anything beyond what we already use. When receiving SIG{INT,TERM} and restarting, we now display about:home with an option to restore your last session instead of the "this is embarrassing" page.
As with any modular app using signals, I am unsure of how this interacts with code that already tries to deal with signals:
http://
nsAppShell appears to install its handlers after this.
There's also some code in breakpad, which I assume doesn't come into play here.
In Mozilla Bugzilla #336193, Ted Mielczarek (ted-mielczarek) wrote : | #34 |
Breakpad doesn't handle SIGTERM, so you should be okay there. You will probably need to make sure that your code interacts properly with the profile locking code, since the purpose of that signal handler is to clean up the profile lock correctly upon exit.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #35 |
(In reply to Ted Mielczarek [:ted] from comment #19)
> Breakpad doesn't handle SIGTERM, so you should be okay there. You will
> probably need to make sure that your code interacts properly with the
> profile locking code, since the purpose of that signal handler is to clean
> up the profile lock correctly upon exit.
Doing a clean exit (nsAppShell::Exit) from handling the signal won't cause the profile unlocking code to be run? I do still see a .parentlock in my profile.
This is going to be tricky: we can't run the old signal handler before nsAppShell::Exit, because nsProfileLock's signal handlers run _exit or the SIG_DFL handler, which would exit. But we can't run the old signal handler after nsAppShell::Exit because we've exited.
In Mozilla Bugzilla #336193, Ted Mielczarek (ted-mielczarek) wrote : | #36 |
I suppose the right thing to do here then is either:
a) Drop SIGTERM support from nsProfileLock, make your new code call into the nsProfileLock code to unlock the profile before exiting.
or
b) Forget about your new code, just tweak nsProfileLock to special-case SIGTERM and do the handling you're doing.
In Mozilla Bugzilla #336193, Mwu-mozilla (mwu-mozilla) wrote : | #37 |
Two things:
1. You should probably use nsAppStartup::Quit because it will do the Right Things. This includes sending the proper shutdown notifications to code that needs to clean up.
2. Should probably make this generic enough to work on all unixish systems instead of just GTK2. Sounds like ted's second suggestion does that.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #38 |
(In reply to Michael Wu [:mwu] from comment #22)
> 1. You should probably use nsAppStartup::Quit because it will do the Right
> Things. This includes sending the proper shutdown notifications to code that
> needs to clean up.
Mmm, good point.
> 2. Should probably make this generic enough to work on all unixish systems
> instead of just GTK2. Sounds like ted's second suggestion does that.
It does, but it runs rather a lot of code from inside the signal handler, which seems like a Bad Idea. To make sure things run from outside the signal handler, we need integration with the event loop. I guess we could have nsAppShell:
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #39 |
Created attachment 609331
patch
New patch using nsIAppStartup; also fewer printfs.
In Mozilla Bugzilla #336193, Benjamin Smedberg (Mozilla) [:bs] (benjamin-smedbergs) wrote : | #40 |
I am not the right person to review this, having never written anything useful with signal handlers. If I understand this correctly, the signal handler itself is signaling a pipe which wakes up the event loop. Is this because it is not safe to make any GTK calls from within the signal handler? I do think that eForceQuit is correct for SIGTERM, but I wonder if we shouldn't be using eAttemptQuit for SIGQUIT, so that it can properly prompt for unsaved work.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #41 |
(In reply to Benjamin Smedberg [:bsmedberg] from comment #15)
> ok, fix it! I'd happily review a patch!
(In reply to Benjamin Smedberg [:bsmedberg] from comment #25)
> I am not the right person to review this, having never written anything
> useful with signal handlers.
I am confused. :)
> If I understand this correctly, the signal
> handler itself is signaling a pipe which wakes up the event loop. Is this
> because it is not safe to make any GTK calls from within the signal handler?
Doing almost anything non-trivial from a signal handler is a Bad Idea, unless you write your code very very carefully. I do not think GTK code follows the necessary rules for being callable from a signal handler.
> I do think that eForceQuit is correct for SIGTERM, but I wonder if we
> shouldn't be using eAttemptQuit for SIGQUIT, so that it can properly prompt
> for unsaved work.
Wikipedia suggests that SIGQUIT is sent "when the user requests that the process perform a core dump." The glibc info pages agree, but also suggest that "certain kinds of cleanups are best omitted in handling SIGQUIT." (e.g. you might want to examine temporary files along with the core dump you just received.) Given this information, I think eForceQuit is acceptable for both signals...and I doubt people SIGQUIT'ing applications is all that common anyway.
In Mozilla Bugzilla #336193, Benjamin Smedberg (Mozilla) [:bs] (benjamin-smedbergs) wrote : | #42 |
Oh, I confused SIGINT with SIGQUIT. SIGINT should presumably do normal prompting, I guess SIGQUIT doesn't need to as you said.
Somebody who knows signal handlers and GTK well should be the primary code reviewer for this patch (which isn't me). I suggest karlt.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #43 |
Created attachment 618671
patch
Redirecting f? to karlt as bsmedberg suggests. Updated patch with SIGQUIT handling as well.
I left SIGINT doing a eForceQuit; from comments in this bug, it sounds like people start firefox from the command line and then C-c it. For those sorts of usecases, it would be annoying to click a dialog. (Firefox should just save the data always, without asking, but that's a separate discussion.)
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #44 |
Comment on attachment 618671
patch
Communicating via pipe is the only reasonable async-signal-safe way that I
know to signal the event loop.
eForceQuit on SIGINT sounds sensible to me.
I wouldn't implement this for SIGQUIT for the reasons you list in comment 26.
It should be possible to share the single existing pipe by sending the
appropriate tokens down the same mPipeFDs.
Is there a reason you chose to use SA_RESTART? Not sure it makes much
difference in practice, but I would have expected an interrupt signal to try
to abort ASAP rather than restarting a system call.
These signal handlers should by removed once one is called. Once they've sent
the eForceQuit, sending that again will not achieve much. When they remove
themselves, they should restore the previous signal handler. See how this is
done in nsProfileLock.cpp, and nsProfileLock is really the reason why the
previous handler needs to be restored.
I assume embedders have their own event loop and so this code won't be used.
That is good as this could interfere with their signal handlers.
The app is meant to exit by reraising the signal. The glibc page says "should
end by specifying the default action for the signal that happened and then
reraising it", but I think it is actually more correct use the previous action
to chain up to other signal handlers, which will eventually raise with the
default action. If there is no appropriate place to raise the right signal,
then this may be more trouble than it is worth. If we don't do this, then
that is another good reason not to handle SIGQUIT.
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #45 |
Think also about what happens when a signal is received after the nsAppShell is deleted.
In Mozilla Bugzilla #336193, Nfroyd (nfroyd) wrote : | #46 |
(In reply to Karl Tomlinson (:karlt, offline til July 2) from comment #29)
> Is there a reason you chose to use SA_RESTART? Not sure it makes much
> difference in practice, but I would have expected an interrupt signal to try
> to abort ASAP rather than restarting a system call.
I would expect that too, but since we're not actually quitting immediately, enabling automatic restart of anything still in-progress seems reasonable, giving those calls a chance to complete, etc. I'd be really surprised if all the code in Gecko &co. was prepared to handle interrupted system calls anyway.
> The app is meant to exit by reraising the signal. The glibc page says
> "should
> end by specifying the default action for the signal that happened and then
> reraising it", but I think it is actually more correct use the previous
> action
> to chain up to other signal handlers, which will eventually raise with the
> default action. If there is no appropriate place to raise the right signal,
> then this may be more trouble than it is worth. If we don't do this, then
> that is another good reason not to handle SIGQUIT.
I assume you mean re-raising once we've handled application shutdown? Oof, that's quite some complexity. I appreciate the philosophical point behind trying to exit with the proper codes, but I'm not sure it makes much pragmatic difference here. And for things like nsProfileLock, it seems like a proper invocation of the shutdown process, as triggered by this patch, makes much more sense than trying to invoke nsProfileLock's signal handlers.
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #47 |
(In reply to Nathan Froyd (:froydnj) from comment #31)
> I assume you mean re-raising once we've handled application shutdown?
Yes.
> Oof, that's quite some complexity.
Yes, it could well be too much trouble. It can be attacked separately if/when that turns out to be important for anything.
> And for things like nsProfileLock, it seems like
> a proper invocation of the shutdown process, as triggered by this patch,
> makes much more sense than trying to invoke nsProfileLock's signal handlers.
Yes, a proper shutdown is definitely the appropriate response to the first signal.
If the lock has been removed, then subsequently running nsProfileLock's handlers should be harmless.
In Mozilla Bugzilla #336193, Choller (choller) wrote : | #48 |
Getting this fixed should fix some issues with coverage measurement and external applications using the browser. My current problem is that an external testing application uses SIGTERM to signal the browser to terminate but GCOV data is not written out because no regular exit is performed.
Are there any remaining problems with the attached patch?
In Mozilla Bugzilla #336193, Nateredding (nateredding) wrote : | #49 |
I have the same problem, essentially with JS Test Driver trying to close Firefox (I assume using SIGTERM). Sometimes when JS Test Driver tries to restart the browser, the This is embarrassing dialog pops up and JS Test Driver is unable to capture the browser to run unit tests. I see this problem on my Mac and we think we see this problem when we try to run headlessly in a Solaris environment using XVFB. Any word on when this might be fixed? It sounds like there is already a potential patch for it.
In Mozilla Bugzilla #336193, oneguycoding (steeve-oneguycoding) wrote : | #50 |
Still not resolved? FWIW, I would deal with SIGINT and SIGTERM using the same handler. Functionally this should be no different than whatever happens when the user chooses File/Quit or hits Ctrl-Q.
In Mozilla Bugzilla #336193, Gijskruitbosch+bugs (gijskruitbosch+bugs) wrote : | #51 |
*** Bug 1088008 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #336193, gothbert (omega-y) wrote : | #52 |
(In reply to Christian Holler (:decoder) from comment #33)
>
> Are there any remaining problems with the attached patch?
What happened back in December 2012? Did the patch make it to the official source?
The issue is still present in the current version of Firefox for Linux.
In Mozilla Bugzilla #336193, Choller (choller) wrote : | #53 |
Someone needs to rebase the patch here and push it to try. I'd be willing to mentor this, but I don't have the time to do it myself.
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #54 |
Created attachment 8626516
bug336193.patch
Summary:
* nsProfileLock is already rigged for handling signals and is already terminating (fatally) so it is only a small change to make it clean, this is preferable to a GTK-specific implementation.
* A clean quit should be performed for SIGINT and SIGTERM, but not SIGQUIT as it is not intended to be clean.
* nsAppStartup::Quit calls nsIAppShell::Exit asynchronously which does most of the work so it doesn't block the signal handler.
* raise() needs to be called at some point after calling Quit().
* To avoid SessionStore attempting to restore a crashed session, SessionFile needs to be closed properly, complete a final write, notify CrashMonitor with "sessionstore-
This patch avoids SessionStore problems with SIGINT/SIGTERM by calling Quit(eForceQuit) in the nsProfileLock signal handler and observing "profile-
In Mozilla Bugzilla #336193, Jmathies (jmathies) wrote : | #55 |
Comment on attachment 8626516
bug336193.patch
Review of attachment 8626516:
-------
Karl, mind reviewing this? I'm not qualified to review unix signal handling code.
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #56 |
Comment on attachment 8626516
bug336193.patch
Unfortunately, the last paragraph of comment 23 is still relevant.
(In reply to Kestrel from comment #39)
> * nsAppStartup::Quit calls nsIAppShell::Exit asynchronously which does most
> of the work so it doesn't block the signal handler.
It is more complicated than just not blocking the signal handler.
See man 7 signal:
"Async-
A signal handler function must be very careful, since processing
elsewhere may be interrupted at some arbitrary point in the execution
of the program. POSIX has the concept of "safe function". If a
signal interrupts the execution of an unsafe function, and handler
calls an unsafe function, then the behavior of the program is
undefined."
One of the most likely problems is taking a lock, if the process is interrupted with the lock already held. For example, nsComponentMana
Also, unless the signal mask of every thread is carefully managed, the signals may be received on any thread. Therefore any methods called that have state would need to be thread-safe, but locks can't be used for the reason described above.
"A process-directed signal may be delivered to any one of the threads that
does not currently have the signal blocked. If more than one of the threads
has the signal unblocked, then the kernel chooses an arbitrary thread to
which to deliver the signal."
In Mozilla Bugzilla #336193, Jed Davis (jld-moz) wrote : | #57 |
The SignalPipeWatcher class defined in xpcom/base/
In Mozilla Bugzilla #336193, R-fab (r-fab) wrote : | #58 |
Created attachment 8750107
Bug 336193 handle SIGTERM in SignalPipeWatcher then in the main thread to quit
What about the attempt in this patch? I intercept SIGTERM using "SignalPipeWatcher" as suggested, then I pass a task to the main thread so it calls "nsIAppStartup:
That may seem convoluted, it would look better if it was done in one go.
I don't have any experience on the Firefox codebase, so I may completely misunderstand the issues. If "Quit" is executed by the main thread (like it would be after a keyboard key for example), wouldn't it avoid the aforementioned problems with the locks.
In Mozilla Bugzilla #336193, Thomas (t-hartwig) wrote : | #59 |
Still a major problem, 9 years after raising this I only can see discussions about? Firefox is not able to handle a standard SIGTERM? For me a reason to change. Sorry to be this rude, but it is really a shame.
Changed in firefox: | |
status: | Confirmed → In Progress |
In Mozilla Bugzilla #336193, 8ames (8ames) wrote : | #60 |
This is a problem for geckodriver and for anything else that's using marionette to control the browser (e.g. various test harnesses including wpt). If we don't shut down the browser gracefully we lose out on the ability to do various things that depend on clean shutdown (certainly leak logging, perhaps code coverage). Marionette has a method to invoke shutdown, but sending the response races with closing the marionette connection, so it isn't reliable. Signalling the process with SIGTERM would be the obvious solution, except for this bug. So we either need a way to ensure that the marionette response is sent before the corresponding socket is closed, or a fix here.
In Mozilla Bugzilla #336193, 8ames (8ames) wrote : | #61 |
Comment on attachment 8750107
Bug 336193 handle SIGTERM in SignalPipeWatcher then in the main thread to quit
Let's at least review all the patches here.
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #62 |
Comment on attachment 8750107
Bug 336193 handle SIGTERM in SignalPipeWatcher then in the main thread to quit
I think this approach will work well, thanks.
SIGTERM will have no effect if Firefox gets stuck during shutdown, but I guess
that's reasonable because SIGTERM is intended to effect normal shutdown. If
Firefox shutdown is broken in such a way that it gets stuck, then other
signals are available to terminate the process. The alternative of
reinstating the previous handler after the first SIGTERM would lead to the
disadvantage that two SIGTERMs would terminate the app before shutdown
completes even if shutdown would otherwise complete as expected.
> #endif
>
>+
> *aRetVal = true;
No extra newline here please.
>+ die_cb(NULL, NULL);
nullptr
https:/
>+namespace {
>+ void termSignalHandl
Please use static instead of the anonymous namespace.
https:/
Gecko capitalizes the first letter of C++ method names (distinct from variable
names).
>+ RefPtr<QuitTask> task = new QuitTask();
>+ NS_DispatchToMa
NS_DispatchToMa
Please also remove the trailing whitespace in this patch.
Some changes will be needed to merge with changes for bug 1372405.
NS_NewRunnableF
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #63 |
(In reply to James Graham [:jgraham] from comment #45)
> Marionette has a method to invoke shutdown,
> but sending the response races with closing the marionette connection, so it
> isn't reliable. Signalling the process with SIGTERM would be the obvious
> solution, except for this bug. So we either need a way to ensure that the
> marionette response is sent before the corresponding socket is closed, or a
> fix here.
What kind of response are you expecting from SIGTERM that you would not get
from a marionette shutdown?
In Mozilla Bugzilla #336193, Tolf Tolfsen (tolf) wrote : | #64 |
(In reply to Karl Tomlinson (:karlt) from comment #48)
> (In reply to James Graham [:jgraham] from comment #45)
>
> > Marionette has a method to invoke shutdown, but sending the
> > response races with closing the marionette connection, so it
> > isn't reliable. Signalling the process with SIGTERM would be
> > the obvious solution, except for this bug. So we either need a
> > way to ensure that the marionette response is sent before the
> > corresponding socket is closed, or a fix here.
>
> What kind of response are you expecting from SIGTERM that you
> would not get from a marionette shutdown?
As I understand it, handling SIGTERM would ensure a graceful
shutdown allowing leak logs to be generated before the process
exits.
The problem with the Marionette shutdown command (Marionette:Quit)
is that the TCP socket is not guaranteed to be flushed before
calling Services.
was a way through the nsIServerSocket XPCOM interface to set
SO_LINGER or similar this would block process termination, but the
Marionette issue here is essentially irrelevant to the context of
this bug.
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #65 |
*** Bug 365749 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #66 |
*** Bug 1032010 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #67 |
Created attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
Review commit: https:/
See other reviews: https:/
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #68 |
Here's a revised version of the SignalPipeWatcher patch.
* SIGTERM/
* SignalPipeWatcher does not chain to previous handler and registers after nsProfileLock so it takes over for these signals. The nsProfileLock signal handler is unwanted since we are doing a normal exit which unlocks the profile in the destructor.
* After the first termination attempt the previous handler is restored so subsequent attempts can unlock the profile and kill the process immediately like before.
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #69 |
Comment on attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
https:/
(I'm not an XPCOM peer and so I considered passing the nsDumpUtils part to
Nathan, but the initial implementation of SignalPipeWatcher was not reviewed
by an XPCOM peer either, and so I guess it doesn't really matter who reviews.)
::: toolkit/
(Diff revision 1)
> + static bool dispatchQuit = false;
> + if (!dispatchQuit) {
> + dispatchQuit = true;
> + ForceAppQuitAsy
> + }
Is the |dispatchQuit| single quit logic necessary?
It looks like nsAppStartup:
::: xpcom/base/
(Diff revision 1)
> PipeCallback mCallback;
> + struct sigaction oldAction;
Please follow existing convention with |mOldAction|.
::: xpcom/base/
(Diff revision 1)
> }
> - SignalInfo signalInfo = { aSignal, aCallback };
> +
> + // Save current signal action so it can be restored later
> + struct sigaction oldAction = { 0 };
> + if (sigaction(aSignal, nullptr, &oldAction) != 0 ||
> + oldAction.
::: xpcom/base/
(Diff revision 1)
> }
> - SignalInfo signalInfo = { aSignal, aCallback };
> +
> + // Save current signal action so it can be restored later
> + struct sigaction oldAction = { 0 };
> + if (sigaction(aSignal, nullptr, &oldAction) != 0 ||
> + oldAction.
Why prevent registering callbacks for signals where the old action is SIG_IGN?
I guessing this is not necessary.
::: xpcom/base/
(Diff revision 1)
> + // Restore previous signal action
> + MutexAutoLock lock(mSignalInf
> + for (SignalInfoArra
> + if (aSignal == mSignalInfo[
> + struct sigaction* oldAction = &mSignalInfo[
> + if (oldAction-
SIG_DFL is typically 0.
I assume this is intended to detect when oldAction has not been set, but mSignalInfo elements are added only when the old action is successfully retrieved.
Can this check just be dropped?
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #70 |
Comment on attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
https:/
> Is the |dispatchQuit| single quit logic necessary?
> It looks like nsAppStartup:
>
> https:/
It makes sense that it was designed for handling repeated calls and I can't find any other code that does this extra check so it is indeed redundant.
> Why prevent registering callbacks for signals where the old action is SIG_IGN?
>
> I guessing this is not necessary.
I based this off nsProfileLock's CATCH_SIGNAL() which does not register for signal handlers that have already been set to SIG_IGN. I guess the logic is that if some other component has already chosen to ignore the signal then there is probably a good reason for that and we shouldn't interfere. It would also be inconsistent if we treated it like normal, started shutdown and then restored SIG_IGN which would not follow through with killing the process for subsequent termination attempts. It is always possible to change the signal handler to something else before calling RegisterCallback() to get around this if need be.
> SIG_DFL is typically 0.
>
> I assume this is intended to detect when oldAction has not been set, but mSignalInfo elements are added only when the old action is successfully retrieved.
>
> Can this check just be dropped?
Yeah this evolved into something bad, check is redundant and SIG_DFL needs to be restorable (oops!).
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #71 |
Comment on attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
Review request updated; see interdiff: https:/
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #72 |
Comment on attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
https:/
> I based this off nsProfileLock's CATCH_SIGNAL() which does not register for signal handlers that have already been set to SIG_IGN. I guess the logic is that if some other component has already chosen to ignore the signal then there is probably a good reason for that and we shouldn't interfere. It would also be inconsistent if we treated it like normal, started shutdown and then restored SIG_IGN which would not follow through with killing the process for subsequent termination attempts. It is always possible to change the signal handler to something else before calling RegisterCallback() to get around this if need be.
SignalPipeWatcher is more general than just handling termination signals (although the SignalPipeWatcher interface is not documented and so it is not clear what it is suitable for, and the whole situation with others also calling sigaction for the same signals is already delicate), and so I wouldn't treat the SIG_IGN case special if I were writing this myself.
Changing the signal handler to something else before calling RegisterCallback, doesn't work so well with UnregisterCallback.
But I checked "The default action for an unhandled real-time signal is to terminate the receiving process." and so the existing client, nsMemoryInfoDumper should not trigger this case. That makes this harmless for now at least, and so feel free to drop this issue if you prefer. At least we have a record of the reason for the check, so that someone can re-evaluate later if required.
In Mozilla Bugzilla #336193, Karlt (karlt) wrote : | #73 |
Comment on attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
https:/
r+ with or without the SIG_IGN case.
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #74 |
Created attachment 8963553
Bug 336193 - P2: Use a second SIGTERM in killPid() to ensure parent process is killed.
killPid() needs to do more than send a single SIGTERM to the parent process now that it only triggers graceful shutdown, sending a second SIGTERM terminates it forcefully like before.
Review commit: https:/
See other reviews: https:/
In Mozilla Bugzilla #336193, Plurtu (plurtu) wrote : | #75 |
Comment on attachment 8959924
Bug 336193 - P1: Use SignalPipeWatcher to detect signals SIGTERM/
Review request updated; see interdiff: https:/
In Mozilla Bugzilla #336193, Gbrown-s (gbrown-s) wrote : | #76 |
Comment on attachment 8963553
Bug 336193 - P2: Use a second SIGTERM in killPid() to ensure parent process is killed.
https:/
r+ with improved logging.
It seems odd that this is the only place in our test automation requiring this change. Other test harnesses have similar measures in place to ensure the browser is killed. Do none of them use sigterm?
::: testing/
(Diff revision 1)
> for p in gone:
> log.info('psutil found pid %s dead' % p.pid)
> for p in alive:
> log.info('failed to kill pid %d after 30s' % p.pid)
> + # Try again, second SIGTERM is more forceful
> + p.send_
It is important for debugging intermittent failures that logging accurately identifies when a process cannot be killed.
I suggest you add another wait_procs call after the second sigterm and log its results before exiting killPid.
In Mozilla Bugzilla #336193, Hskupin (hskupin) wrote : | #77 |
Comment on attachment 8963553
Bug 336193 - P2: Use a second SIGTERM in killPid() to ensure parent process is killed.
https:/
Harnesses built on top of Marionette client as test runner will use SIGKILL to force killing the browser process. It is only used if the normal shutdown as requested through the application itself via `Services.
In Mozilla Bugzilla #336193, 9onas (9onas) wrote : | #78 |
Does this bug also include the session not being properly counted as regularly closed when I shut down the computer normally with firefox open? (And firefox offering me a misguided tab restore after a supposed "crash" on next run) Because that's what I'm seeing here as a normal Linux desktop user. Or should I file another bug for that?
In Mozilla Bugzilla #336193, Victor Engmark (victor-engmark) wrote : | #79 |
(In reply to jonas from comment #63)
That is a bug (IMO) in most distros' shutdown/reboot sequences, where they give applications no time to shut down before sending SIGKILL. I would report to your distro.
In Mozilla Bugzilla #336193, 9onas (9onas) wrote : | #80 |
'killall firefox' has no timeout as far as I know, so I think that is completely irrelevant
In Mozilla Bugzilla #336193, 9onas (9onas) wrote : | #81 |
(sorry, by which I meant to say more specifically: 'killall firefox' has the same issue, it prompts a session crash dialog for me on next startup. It's just not something I use often since I usually use the close button which works, but on shutdown the same mechanism is used - of course with a timeout as you correctly state, but that doesn't help when it doesn't even work correctly with no timeout)
In Mozilla Bugzilla #336193, Markus-egg (markus-egg) wrote : | #82 |
Any news on this?
killall firefox does not work at all (Ubuntu).
pkill -x firefox-bin gives a crash dialog when starting firefox next time.
A simple and gentle way to end firefox from Linux commandline would be nice.
In Mozilla Bugzilla #336193, Liam-0 (liam-0) wrote : | #83 |
I'd like this feature, however if this is difficult to implement then a workaround for my use case would be if the firefox command could support a '--close' or similar option to exit gracefully, even if handled asynchronously and I had to poll to wait for exit.
Changed in firefox: | |
status: | In Progress → Confirmed |
In Mozilla Bugzilla #336193, Stephen-donner (stephen-donner) wrote : | #84 |
Mass-removing myself from cc; search for 12b9dfe4-
In Mozilla Bugzilla #336193, Kbrosnan-mozilla (kbrosnan-mozilla) wrote : | #85 |
*** Bug 1634905 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #336193, Kbrosnan-mozilla (kbrosnan-mozilla) wrote : | #86 |
*** Bug 1629049 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #336193, Thatoo (thatoo) wrote : | #87 |
I'm quiet surprised to discover that the bug I'm facing is 14 years old...
Hopefully it will be solved some day.
On Gnome, would a gnome extension creating a "close Firefox and shutdown button" work ? What would be the command? quit Firefox && sleep 5 && shutdown - h now? I don't know how to "quit" properly Firefox with a command line...
In Mozilla Bugzilla #336193, Derek (bugs-m8y) wrote : | #88 |
Once upon a time Firefox had -remote for executing commands on linux like that, but it was removed back in Firefox 39. This also allowed doing useful stuff like telling firefox to reload a url.
but also:
-remote "xfeDoCommand(
Regrettably there's no replacement for this super useful legacy functionality that I know of apart from Selenium Webdriver (have fun) or xdotool scripting.
Of those options, xdotool sending commands to window with Firefox name to ctrl-q is probably sanest, with possible confirmation prompt handling.
In Mozilla Bugzilla #336193, Thatoo (thatoo) wrote : | #89 |
So no easy way to solve this issue for most users... Hopefully some day...
Thank you for answering @nemo.
In Mozilla Bugzilla #336193, Thatoo (thatoo) wrote : | #90 |
By the way, yes, I shut down well by clicking on the system shutdown function, I do not do this by a long press on the power button.
I did not get that behavior under Ubuntu, only since I migrated to Debian 10 but I experience it on 2 computers under Debian 10.
Trye, "quitting Firefox properly" is not difficult but why should we make an exception for Firefox and bow to its need.
Firefox should know how to quit properly on its own when the operating system gives it the order because it has itself received the order to shut down.
All other applications and services can do this under Debian. Why not Firefox?
I am surprised at the lack of expectation and the acceptance of such a simple fault on the part of the community.
I always used Firefox, I always defended Firefox. I expect excellence from it. This defect is not serious indeed but it's not excellence.
In Mozilla Bugzilla #336193, Thatoo (thatoo) wrote : | #91 |
Is it not on contradiction with the idea of free software to force users to quit exclusively using the UX interface and forbid them to close thanks to CLI?
Changed in firefox: | |
importance: | Medium → Unknown |
In Mozilla Bugzilla #336193, Release-mgmt-account-bot (release-mgmt-account-bot) wrote : | #92 |
The severity field for this bug is relatively low, S3. However, the bug has 7 duplicates, 34 votes and 66 CCs.
:mossop, could you consider increasing the bug severity?
For more information, please visit [auto_nag documentation](https:/
In Mozilla Bugzilla #336193, Romain-failliot-p (romain-failliot-p) wrote : | #93 |
(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #77)
> The severity field for this bug is relatively low, S3. However, the bug has 7 duplicates, 34 votes and 66 CCs.
> :mossop, could you consider increasing the bug severity?
>
> For more information, please visit [auto_nag documentation](https:/
I'd also argue that the bug has been opened 17 years ago, and has a proposed patch.
In Mozilla Bugzilla #336193, Autonag-nomail-bot (autonag-nomail-bot) wrote : | #94 |
The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.
In Mozilla Bugzilla #336193, Sledru (sledru) wrote : | #95 |
SIGTERM is now handled thanks to bug 1837907
In Mozilla Bugzilla #336193, Hskupin (hskupin) wrote : | #96 |
(In reply to Sylvestre Ledru [:Sylvestre] from comment #80)
> SIGTERM is now handled thanks to bug 1837907
Bug 1837907 was only about Linux. Reading through the comments I can see that we may already support this on Windows, but does that also include MacOS?
In Mozilla Bugzilla #336193, Hskupin (hskupin) wrote : | #97 |
(In reply to Henrik Skupin [:whimboo][⌚️UTC+1] from comment #81)
> (In reply to Sylvestre Ledru [:Sylvestre] from comment #80)
> > SIGTERM is now handled thanks to bug 1837907
>
> Bug 1837907 was only about Linux. Reading through the comments I can see that we may already support this on Windows, but does that also include MacOS?
Haik, maybe you have some input here? Thanks.
In Mozilla Bugzilla #336193, Sledru (sledru) wrote : | #98 |
my bad, thanks
I don't know a lot about unix signal handling, but don't unix signals need to be handled synchronously? Or can we watch for SIGTERM, call nsIAppStartup. quit(), and return from the signal handler immediately, allowing the eventloop and stack to unwind gracefully later?
Should SIGTERM cause eAttemptQuit (prompting the user if there is unsaved work/multiple tabs), or eForceQuit?