OutgoingRunner gets in expensive recursive loop
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
GNU Mailman | New | Medium | Unassigned |
Bug Description
I have had a problem spring up the past few weeks where
the OutgoingRunner gets in a loop which effectively
brings down the machine by spiking the CPU to 99%.
Running "strace" on the process I see it constantly
deleting and recreating the same queue file in
qfiles/out/ over and over, many times per second.
The initial problem occurred with a 2.1.2 install, but
installing 2.1.6b4 shows the same.
Unfortunately the problem is somewhat ephemeral when
trying to diagnose it - if I manage to kill the
OutgoingRunner between a read and write, the queue file
gets lost and the problem disappears for a while.
I don't know if it is useful, but attached is the
strace output of a complete read/write cycle. I haven't
had the opportunity to debug it further (by stepping
through the Python) as the machine is not currently in
this state. I am not sure how long it will be until it
is triggered again, but it has happened about 4 times in
the past two weeks. It had never occurred in the more
than 3 years before this.
I consider this issue fairly problematic - the machine
becomes unusable when it reaches this state due to CPU
exhaustion.
Any tips to help isolate the problem are welcome. I
have modified mailmanctl to run all queuerunners with a
verbose flag, so next time maybe there will be useful
information logged.
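
A rough way to confirm this kind of churn without strace is to poll
the queue directory and count how often each file name shows up again
after having been absent. The sketch below is not part of Mailman; the
function names are hypothetical, and it assumes the file really does
come back under the same name, as the strace output suggests. Polling
can miss cycles that are faster than the polling interval, so the
counts are a lower bound.

```python
import os
import time
from collections import Counter


def count_appearances(snapshots):
    """Count, per file name, how many times it appears in a snapshot
    after being absent from the previous one. Every name present in
    the first snapshot starts with a count of 1."""
    counts = Counter()
    previous = set()
    for listing in snapshots:
        current = set(listing)
        for name in current - previous:
            counts[name] += 1
        previous = current
    return counts


def poll_directory(path, duration=10.0, interval=0.05):
    """Collect directory listings of `path` for `duration` seconds,
    sleeping `interval` seconds between listings."""
    snapshots = []
    deadline = time.time() + duration
    while time.time() < deadline:
        snapshots.append(os.listdir(path))
        time.sleep(interval)
    return snapshots
```

For example, `count_appearances(poll_directory("qfiles/out")).most_common(5)`
(with the path adjusted to the local install) lists the names that
reappeared most often. A count well above 1 for a single name would be
consistent with the delete/recreate loop described above.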