OutgoingRunner gets in expensive recursive loop

Bug #266209 reported by Kjd-users
Affects: GNU Mailman
Status: New
Importance: Medium
Assigned to: Unassigned

Bug Description

Over the past few weeks I have had a problem where
the OutgoingRunner gets into a loop that effectively
brings down the machine by spiking the CPU to 99%.
Running "strace" on the process, I see it constantly
deleting and re-creating the same queue file in
qfiles/out/ over and over, many times per second.

The problem first occurred with a 2.1.2 install, but
installing 2.1.6b4 shows the same behavior.

Unfortunately the problem is somewhat ephemeral when
trying to diagnose it - if I manage to kill the
OutgoingRunner between a read and a write, the queue file
gets lost and the problem disappears for a while.
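
One way to keep the evidence before killing the runner is to copy the queue files aside while the loop is active. The following is a minimal sketch in standalone Python 3, not part of Mailman; the function names are made up for illustration, and the queue path in the usage note is an assumption based on the install prefix and the qfiles/out/ directory mentioned in this report.

```python
import os
import shutil
import time


def snapshot_queue(queue_dir, dest_dir):
    """Copy every regular file currently in queue_dir into dest_dir.

    Returns the list of names that were copied. Files that disappear
    between the directory listing and the copy are silently skipped,
    since the runner may delete them at any moment.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in os.listdir(queue_dir):
        src = os.path.join(queue_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories and files already gone
        try:
            shutil.copyfile(src, os.path.join(dest_dir, name))
        except FileNotFoundError:
            continue  # deleted between the check and the copy
        copied.append(name)
    return copied


def watch(queue_dir, dest_dir, interval=0.05, rounds=200):
    """Take repeated snapshots and return every file name seen."""
    seen = set()
    for _ in range(rounds):
        seen.update(snapshot_queue(queue_dir, dest_dir))
        time.sleep(interval)
    return seen
```

It could be run as, for example, watch("/usr/local/mailman/qfiles/out", "/tmp/qfiles-copy"). A copy taken while a file is being rewritten may be incomplete, so it is worth keeping several snapshots.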

I don't know if it is useful, but attached is the
strace output of a complete read/write cycle. I haven't
had the opportunity to debug it further (by stepping
through the Python) as the system is not currently in
this state. I am not sure how long it will be until it is
triggered again, but it has happened about 4 times in the
past two weeks. It had never occurred before this in over
3 years.

I consider this issue fairly problematic - the machine
becomes unusable when it reaches this state due to CPU
exhaustion.

Any tips to help isolate the problem are welcome. I
have modified mailmanctl to run all queue runners with a
verbose flag, so next time there may be useful
information logged.

[http://sourceforge.net/tracker/index.php?func=detail&aid=1168999&group_id=103&atid=100103]

Kjd-users (kjd-users) wrote : strace output of OutgoingRunner cycle

Kjd-users (kjd-users) wrote :

I have managed to capture a number of qfiles that are
causing this phenomenon (which has been recurring more often
in the past few weeks). They have the following properties:

- They are all "Post by non-member to a members-only list"
responses to spam that has gone to a moderated list.
- They come from nonexistent domains.

Here is a sample dump of one of the .db files from the queue
that is looping:

$ /usr/local/mailman/bin/dumpdb
1114500259.9807329+a1518ee474d8edb0e83615df60270885165d5f83.db
{ 'deliver_after': 1114503867.0899999,
    'deliver_until': 1114932267.0899999,
    'lang': 'en',
    'last_recip_count': 1,
    'listname': 'ga',
    'nodecorate': 1,
    'original_sender': '<email address hidden>',
    'personalize': 1,
    'pipeline': [],
    'received_time': 1114500259.9807329,
    'recips': ['<email address hidden>'],
    'reduced_list_headers': 1,
    'verp': 1,
    'version': 3}
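
The three timestamps in the dump are easier to compare once converted. The sketch below only does arithmetic on the values copied from the dump; reading the gaps as a retry delay and a retry window is a guess from the key names, not something confirmed against the Mailman source.

```python
from datetime import datetime, timezone

# Values copied verbatim from the dumpdb output above.
entry = {
    "received_time": 1114500259.9807329,
    "deliver_after": 1114503867.0899999,
    "deliver_until": 1114932267.0899999,
}


def as_utc(timestamp):
    """Convert a Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


# Gap between receipt and the earliest next delivery attempt, in hours.
wait_hours = (entry["deliver_after"] - entry["received_time"]) / 3600

# Gap between receipt and the point where delivery is abandoned, in days.
window_days = (entry["deliver_until"] - entry["received_time"]) / 86400

if __name__ == "__main__":
    for key in ("received_time", "deliver_after", "deliver_until"):
        print(key, as_utc(entry[key]).isoformat())
    print("wait before next attempt: %.2f hours" % wait_hours)
    print("delivery window: %.2f days" % window_days)
```

The gaps come out at roughly one hour and roughly five days. If deliver_after really is a "do not retry before" time, an entry like this would sit in the queue for about an hour before the runner is supposed to touch it again, which may be relevant to the looping.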
