qrunner stops for no apparent reason

Bug #265855 reported by Midrangeman
Affects: GNU Mailman
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

About once every day or so, qrunner will stop for no
apparent reason.

The qrunner log file contains the following:

Jan 18 14:29:09 2003 (3443) IncomingRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:09 2003 (3443) IncomingRunner qrunner exiting.
Jan 18 14:29:09 2003 (3441) BounceRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:09 2003 (3441) BounceRunner qrunner exiting.
Jan 18 14:29:09 2003 (3445) OutgoingRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:09 2003 (3445) OutgoingRunner qrunner exiting.
Jan 18 14:29:09 2003 (3442) CommandRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:09 2003 (3442) CommandRunner qrunner exiting.
Jan 18 14:29:09 2003 (3446) VirginRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:09 2003 (3446) VirginRunner qrunner exiting.
Jan 18 14:29:09 2003 (3440) ArchRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:09 2003 (3440) ArchRunner qrunner exiting.
Jan 18 14:29:10 2003 (3444) NewsRunner qrunner caught SIGTERM. Stopping.
Jan 18 14:29:12 2003 (3444) NewsRunner qrunner exiting.

No other log has any indication of what might be
happening.

Is there a way to increase the logging somewhere so the
cause can be identified?

[http://sourceforge.net/tracker/index.php?func=detail&aid=670535&group_id=103&atid=100103]

Barry Warsaw (barry) wrote :

I'm not sure what kind of logging would help. Some process
somewhere is SIGTERMing the mailmanctl controller process.
There's no way to know where a signal is coming from, so I'm
not sure what more you could do in mailmanctl.

Midrangeman (midrangeman) wrote :

After some further research, qrunner seems to stop after
exactly 24 hours of operation. That is, 24 hours after qrunner
starts, it ends as if someone had killed it with SIGTERM. I know
for a fact that nobody is actually doing this, and no process
on my system should even be aware that qrunner is running.

I will not discount the possibility that this is an environmental
factor, but it seems to me that a daemon process should not
be affected by environmental factors.
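
The 24-hour interval described above can be checked directly from the log timestamps. Here is a minimal sketch, assuming the timestamp layout shown in the qrunner log excerpt. The start time is a made-up value for illustration (the report does not quote it), and `elapsed_between` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the qrunner log excerpt, e.g. "Jan 18 14:29:09 2003".
LOG_TIME_FORMAT = "%b %d %H:%M:%S %Y"


def elapsed_between(start_stamp, stop_stamp):
    """Return the time between two log timestamps as a timedelta."""
    start = datetime.strptime(start_stamp, LOG_TIME_FORMAT)
    stop = datetime.strptime(stop_stamp, LOG_TIME_FORMAT)
    return stop - start


# Hypothetical start time; the stop time is the one from the log excerpt.
uptime = elapsed_between("Jan 17 14:29:09 2003", "Jan 18 14:29:09 2003")
print(uptime == timedelta(hours=24))
```

Comparing the first log line written at startup with the first "caught SIGTERM" line in this way would show whether the interval is exactly one day or only roughly so.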

Midrangeman (midrangeman) wrote :

I added some debug code to mailmanctl and found out that the
SIGALRM handler fires just before the qrunners terminate.
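
The debug code itself is not shown in this report. One way such tracing could look is sketched below: a wrapper that logs the signal name, PID, and time, then calls whatever Python handler was installed before. `install_tracing_handler` and the log line format are assumptions for illustration, not Mailman code, and the sketch targets modern Python 3 rather than the Python 2.2 used here.

```python
import os
import signal
import time


def install_tracing_handler(signum, log):
    """Wrap the current handler for signum so every delivery is logged first.

    `log` is any callable taking one string (hypothetical, e.g. a logger method).
    Returns the wrapper so callers can inspect or invoke it.
    """
    previous = signal.getsignal(signum)

    def tracing_handler(received, frame):
        name = signal.Signals(received).name
        log("%s (%d) received %s" % (time.ctime(), os.getpid(), name))
        # Chain to the previous handler only if it is a Python callable.
        # SIG_DFL and SIG_IGN are not re-applied, so a default action
        # (such as terminating on SIGTERM) is suppressed by this wrapper.
        if callable(previous):
            previous(received, frame)

    signal.signal(signum, tracing_handler)
    return tracing_handler
```

Installing the wrapper for both `signal.SIGALRM` and `signal.SIGTERM` after the program has set up its own handlers would show the order in which the two arrive.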

Midrangeman (midrangeman) wrote :

Additional environment details:

Red Hat Linux 8.0, uname = "Linux xxx.midrange.com 2.4.18-26.8.0 #1 Mon Feb 24 10:21:42 EST 2003 i686 i686 i386 GNU/Linux"

Python: 2.2.1

CPU: P4 2.4 GHz, 512 MB RAM

Dunno if this makes a difference, but I have the following
directories ...

/usr/lib/python1.5
/usr/lib/python2.1
/usr/lib/python2.2

Any chance there is a conflict?

Thomas Wouters (thomas) wrote :

No, having multiple versions of Python should not be causing
this. Nor should the SIGALRM handler being triggered cause
it, unless something is seriously broken in your setup --
but we've already been there.

The only way to see if a SIGTERM is actually being delivered
is running the processes under strace or gdb, but this
seriously disrupts regular operation. There is no way that I
know of to find out where a signal is coming from, once you
find out that it really is a signal. If it *isn't* a real
signal, I would start looking at libc bugs and other
platform bugs. You can try upgrading Python to 2.2.2 (the
latest bugfix release) but I would be very surprised if it
fixed your problem. RedHat does not have a great reputation
for stability, so be sure to check for any RedHat updates.
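
For readers hitting a similar problem on a current system: with Python 3 on Linux, a process can sometimes learn who sent a signal, because `signal.sigtimedwait` returns a `struct_siginfo` whose `si_pid` field holds the sender's PID for signals sent with `kill`. This was not available in the Python 2.2 discussed in this report, and whether mailmanctl could be adapted to wait for signals this way is an assumption, so the following is only an illustrative sketch. `wait_for_sender` is a hypothetical helper name.

```python
import os
import signal


def wait_for_sender(signum, timeout):
    """Wait up to `timeout` seconds for signum and return the sender's PID.

    Returns None if no signal arrives in time. Unix only (not available on macOS).
    """
    # Block the signal so it stays pending until sigtimedwait() collects it.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        info = signal.sigtimedwait({signum}, timeout)
    finally:
        # Restore whatever signal mask the caller had before.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return None if info is None else info.si_pid
```

A controller process that called something like `wait_for_sender(signal.SIGTERM, 60)` in its main loop could log the returned PID and then look it up in the process table while the sender is still alive.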

Barry Warsaw (barry) wrote :

David, have you been able to dig up more information about
this problem?

I'm moving this to Pending as we have no clue why it's
happening for you and cannot reproduce it on any systems we
have available to us.
