Duplicate messages from mailman (PR#88)

Bug #265226 reported by Bug Importer
2
Affects Status Importance Assigned to Milestone
GNU Mailman
Fix Released
Medium
Unassigned

Bug Description

Jitterbug-Id: 88
Submitted-By: <email address hidden>
Date: Wed, 21 Jul 1999 06:31:22 -0400 (EDT)
Version: 1.0rc2
OS: Linux - Redhat 6.0

exim feeds one message into the mailman wrapper, mailman feeds out two.
Very
occasional, only noticable on a fairly high-volume list. Example from the
"post" log:

Jul 20 18:04:48 1999 post to sluts from <email address hidden> size=542
Jul 20 18:05:03 1999 post to sluts from <email address hidden> size=542

Here's another (ignore the middle entry, just interesting that it's
processing
another between the duplicates):

Jul 20 17:04:59 1999 post to sluts from <email address hidden> size=979
Jul 20 17:10:20 1999 post to sluts from <email address hidden> size=655
Jul 20 17:13:11 1999 post to sluts from <email address hidden> size=979

Headers on received mails confirm that only one is hitting the wrapper, but
there
are two coming out, a little time apart, with different exim msg-IDs.

More details on request, am prefectly willing to run it in debug mode (if
there
is
one) if that might help.

====================================================================
Audit trail:
Wed Jul 21 12:57:10 1999 hmeland moved from incoming to open

[http://sourceforge.net/tracker/index.php?func=detail&aid=214103&group_id=103&atid=100103]

Revision history for this message
Bug Importer (bug-importer) wrote :

From: Harald Meland <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: 27 Jul 1999 16:35:48 +0200

[<email address hidden>]

> After an annoyingly-timed pause, finally another duplicate. The
mailman
> log entries:
>
> [post]
> Jul 27 11:04:56 1999 post to sluts from address@concealed size=146
pid=28264
> Jul 27 11:07:40 1999 post to sluts from address@concealed size=146
pid=28464
>
> [debug]
> Jul 27 11:04:56 1999 List 'sluts': Running 'post' under pid 28264
> Jul 27 11:07:40 1999 List 'sluts': Running 'post' under pid 28464
>
> And from exim:
>
> [root@ibm1 exim]# grep 11946S-0007Lp-00 main.log # The ID from both
posts
> 1999-07-27 11:04:56 11946S-0007Lp-00 <= <email address hidden>
> H=library.lspace.org [195.200.1.58] U=mail P=esmtp S=1174
> id=014f01bed818$abc6cda0$040109c0@gastuk_nt4.gastltd_pd T="Re:
[sluts] Re:
> Saturday" for <email address hidden>
> 1999-07-27 11:07:52 11946S-0007Lp-00 =>
|/home/mailman/mail/wrapper post
> sluts <email address hidden> D=mailman_alias T=address_pipe
> 1999-07-27 11:07:52 11946S-0007Lp-00 Completed

So, to sum this up:

 1. At 11:04:56, Exim logs reception of the message.
 2. At the same time, Mailman logs startup of the "post" script
in the
    debug log.
 3. Also at the same time, Mailman logs a delivery to the list from
    pretty far down in the MailList.Post() method.

 4. Then, at 11:07:40, Mailman logs startup of the "post" script
once
    more, now with a different pid.
 5. Also at 11:07:40, Mailman logs a new delivery to the list, with
    the same pid as the latter "post" script.

 6. Exim logs that the delivery is completed at 11:07:52. Delivery
    completion means that the Mailman pipe has exited with status zero
    (indicating a successful delivery).

> So, two seperate (but similar - _surely_ coincidence?)

As the printed PIDs are taken directly from calls to os.getpid(), I
think it is rather safe to assume that the similarity is coincidental.

> PIDs, which is clearly what you were looking for.

Indeed.

> I presume that this has to be exim barfing and retrying for some
> bizzarre reason?

If not exactly "barfing", then at least spawning more than one
pipe
delivery while apparently only logging one of them.

> Although I'd have expected it to reflect that in it's own
> log... perhaps the wrapper doing much the same?

I've grepped the entire Mailman source for anything that might re-exec
the "post" script, but found nothing. And, a simple re-exec
would not
be enough -- the new "post" instance would also need to be fed
the
message on it's stdin.

I haven't had the time to keep up with the Exim-users list recently,
but I (now) strongly suspect that this problem belongs there rather
than with Mailman.

If the problem is of such a nature that it might also bite other
Mailman/Exim users, please keep us (<email address hidden>)
posted on any progress/solution.

Cheers,
--
Harald

Revision history for this message
Bug Importer (bug-importer) wrote :

From: <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: Tue, 27 Jul 1999 13:38:37 +0100

Harald Meland wrote:

> Hmm... this one's a toughie. I've gone over the internal message
flow
> in Mailman without seeing any obvious reason why this happens...
Could
> you try installing the below debugging patch?

After an annoyingly-timed pause, finally another duplicate. The mailman
log entries:

[post]
Jul 27 11:04:56 1999 post to sluts from address@concealed size=146
pid=28264
Jul 27 11:07:40 1999 post to sluts from address@concealed size=146
pid=28464

[debug]
Jul 27 11:04:56 1999 List 'sluts': Running 'post' under pid 28264
Jul 27 11:07:40 1999 List 'sluts': Running 'post' under pid 28464

And from exim:

[root@ibm1 exim]# grep 11946S-0007Lp-00 main.log # The ID from both posts
1999-07-27 11:04:56 11946S-0007Lp-00 <= <email address hidden>
H=library.lspace.org [195.200.1.58] U=mail P=esmtp S=1174
id=014f01bed818$abc6cda0$040109c0@gastuk_nt4.gastltd_pd T="Re:
[sluts] Re:
Saturday" for <email address hidden>
1999-07-27 11:07:52 11946S-0007Lp-00 => |/home/mailman/mail/wrapper
post
sluts <email address hidden> D=mailman_alias T=address_pipe
1999-07-27 11:07:52 11946S-0007Lp-00 Completed

So, two seperate (but similar - _surely_ coincidence?) PIDs, which is
clearly what you were looking for. I presume that this has to be exim
barfing and retrying for some bizzarre reason? Although I'd have expected
it to reflect that in it's own log... perhaps the wrapper doing much the
same?

Anyhow, I wait on your response...
--
  Kieran Hartnett - Frontier Internet Network Operations
  Tel: +44 20 7531 9840 Fax: +44 20 7531 9930

 All statements made and agreements come to by means of email are
 at all times subject to Frontier's Terms and Conditions of service
 and product descriptions/sales literature. Representations made
 above and beyond those contained therein are not to be relied
 upon and are at no time contractually binding.

Revision history for this message
Bug Importer (bug-importer) wrote :

From: Harald Meland <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: 22 Jul 1999 13:13:32 +0200

[<email address hidden>]

> > Hmm... this one's a toughie. I've gone over the internal
message flow
> > in Mailman without seeing any obvious reason why this happens...
Could
> > you try installing the below debugging patch?
>
> Yup, but FYI, two of the lines you said to leave unchanged in
MailList.py
> weren't there in the first place:

Oops, sorry -- I'm making patches against current CVS. See
<URL:http://www.python.org/mailman/listinfo/mailman-checkins> for
info
on how you can get current CVS, if you'd like.

However, I think it's quite safe to assume that the lines not present
in your copy are unrelated to the problem you're seeing.

> I did at one point, after noticing this duplication prob, try (and I
> think fail, jusging by the mailman-version header in mails since) to
> "make update" to rc3, which may (or may not) explain this.

You will have to "make install" _before_ you "make
update" (and
besides, "make update" shouldn't be necessary for going from rc2
to
rc3, although it won't hurt, either).
--
Harald

Revision history for this message
Bug Importer (bug-importer) wrote :

From: <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: Thu, 22 Jul 1999 12:06:06 +0100

> Hmm... this one's a toughie. I've gone over the internal message
flow
> in Mailman without seeing any obvious reason why this happens...
Could
> you try installing the below debugging patch?

Yup, but FYI, two of the lines you said to leave unchanged in MailList.py
weren't there in the first place:

> diff -u -r1.132 MailList.py
> --- MailList.py 1999/07/21 02:19:21 1.132
> +++ MailList.py 1999/07/21 16:53:21
> @@ -1344,8 +1344,9 @@
> # (not metoo), but delivery to their address is
disabled
> # (nomail)
> pass
> - self.LogMsg("post", "post to %s from %s
size=%d",
> - self._internal_name, msg.GetSender(),
len(msg.body))
> + self.LogMsg("post", "post to %s from %s
size=%d, pid=%d",
> + self._internal_name, msg.GetSender(),
len(msg.body)
> + os.getpid())
> d = self.__dict__.copy() # THIS ONE
> d['cgiext'] = mm_cfg.CGIEXT # AND THIS ONE

I did at one point, after noticing this duplication prob, try (and I think
fail, jusging by the mailman-version header in mails since) to "make
update" to rc3, which may (or may not) explain this.
--
  Kieran Hartnett - Frontier Internet Network Operations
  Tel: +44 20 7531 9840 Fax: +44 20 7531 9930

 All statements made and agreements come to by means of email are
 at all times subject to Frontier's Terms and Conditions of service
 and product descriptions/sales literature. Representations made
 above and beyond those contained therein are not to be relied
 upon and are at no time contractually binding.

Revision history for this message
Bug Importer (bug-importer) wrote :

From: Harald Meland <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: 21 Jul 1999 18:55:39 +0200

[<email address hidden>]

> > > Jul 20 18:04:48 1999 post to sluts from
<email address hidden> size=542
> > > Jul 20 18:05:03 1999 post to sluts from
<email address hidden> size=542
> >
> > Hmm, those log lines are written from MailList.Post() which
either is
> > invoked from the "$prefix/mail/post" script (via
Exim's alias file),
> > or from "$prefix/Mailman/Cgi/admindb.py" (when
approving held
> > messages).
>
> The former, in this case then. The messages in question were never
passed
> for moderation.

I suspected as much.

> > I can't see how any of those two calls to the Post() method
could get
> > called twice from a single Exim pipe delivery.
> >
> > What do the Exim logs look like for these incidents?
[snippety snip]

Thanks -- I didn't get any more clues out of those, unfortunately.

Hmm... this one's a toughie. I've gone over the internal message flow
in Mailman without seeing any obvious reason why this happens... Could
you try installing the below debugging patch?

Index: post
===================================================================
RCS file: /export/public/cvsroot/mailman/scripts/post,v
retrieving revision 1.22
diff -u -r1.22 post
--- post 1999/04/16 02:31:35 1.22
+++ post 1999/07/21 16:52:45
@@ -29,7 +29,7 @@
 # Todo: check size of To: list < 100
 # Send back why the post was rejected.

-import sys
+import sys, os
 import paths
 from Mailman import MailList
 from Mailman import Message
@@ -43,6 +43,8 @@

 # TODO: This can fail, and should send back an error message when it
does.
 mlist = MailList.MailList(sys.argv[1])
+mlist.LogMsg("debug", "List '%s': Running 'post' under pid
%d",
+ sys.argv[1], os.getpid())
 if len(sys.argv) > 2:
     # assert sys.argv[1] == 'fromusenet'
     fromusenet = 1
Index: MailList.py
===================================================================
RCS file: /export/public/cvsroot/mailman/Mailman/MailList.py,v
retrieving revision 1.132
diff -u -r1.132 MailList.py
--- MailList.py 1999/07/21 02:19:21 1.132
+++ MailList.py 1999/07/21 16:53:21
@@ -1344,8 +1344,9 @@
                 # (not metoo), but delivery to their address is disabled
                 # (nomail)
                 pass
- self.LogMsg("post", "post to %s from %s
size=%d",
- self._internal_name, msg.GetSender(), len(msg.body))
+ self.LogMsg("post", "post to %s from %s size=%d,
pid=%d",
+ self._internal_name, msg.GetSender(), len(msg.body)
+ os.getpid())
         d = self.__dict__.copy()
         d['cgiext'] = mm_cfg.CGIEXT
  self.DeliverToList(msg, recipients,

--
Harald

Revision history for this message
Bug Importer (bug-importer) wrote :
Download full text (4.5 KiB)

From: <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: Wed, 21 Jul 1999 15:40:45 +0100

> > Jul 20 18:04:48 1999 post to sluts from <email address hidden>
size=542
> > Jul 20 18:05:03 1999 post to sluts from <email address hidden>
size=542
>
> Hmm, those log lines are written from MailList.Post() which either is
> invoked from the "$prefix/mail/post" script (via Exim's
alias file),
> or from "$prefix/Mailman/Cgi/admindb.py" (when approving
held
> messages).

The former, in this case then. The messages in question were never passed
for moderation.

> I can't see how any of those two calls to the Post() method could get
> called twice from a single Exim pipe delivery.
>
> What do the Exim logs look like for these incidents?

First, parts of the headers from the two posts...
----
Received: from ibm1.ftech.net (relay2.ftech.net) [::ffff:212.32.16.71]
        by library.lspace.org with esmtp (Exim 3.02 #1 (Debian))
        id 116dRZ-0000sM-00; Tue, 20 Jul 1999 18:12:41 +0100
Received: from localhost ([127.0.0.1] helo=ibm1.ftech.net)
        by relay2.ftech.net with esmtp (Exim 2.12 #3)
        id 116dKD-00081T-00; Tue, 20 Jul 1999 18:05:05 +0100
Received: from library.lspace.org ([195.200.1.58] ident=mail)
        by relay2.ftech.net with esmtp (Exim 2.12 #3)
        id 116dJu-00080X-00
        for <email address hidden>; Tue, 20 Jul 1999 18:04:46 +0100
Message-ID: <email address hidden>
From: "Nef. Blue" <email address hidden>
To: <email address hidden>
Subject: Re: Whats with the Pruple Turtle?
Date: Tue, 20 Jul 1999 17:05:36 GMT
----
Received: from ibm1.ftech.net (relay2.ftech.net) [::ffff:212.32.16.71]
        by library.lspace.org with esmtp (Exim 3.02 #1 (Debian))
        id 116dRL-0000sE-00; Tue, 20 Jul 1999 18:12:27 +0100
Received: from localhost ([127.0.0.1] helo=ibm1.ftech.net)
        by relay2.ftech.net with esmtp (Exim 2.12 #3)
        id 116dJx-00080m-00; Tue, 20 Jul 1999 18:04:49 +0100
Received: from library.lspace.org ([195.200.1.58] ident=mail)
        by relay2.ftech.net with esmtp (Exim 2.12 #3)
        id 116dJu-00080X-00
        for <email address hidden>; Tue, 20 Jul 1999 18:04:46 +0100
Message-ID: <email address hidden>
From: "Nef. Blue" <email address hidden>
To: <email address hidden>
Subject: Re: Whats with the Pruple Turtle?
Date: Tue, 20 Jul 1999 17:05:36 GMT
----
And from the exim logs:

[root@ibm1 old]# zgrep 116dJu-00080X-00 main.log.21-07-1999.gz
1999-07-20 18:04:47 116dJu-00080X-00 <= <email address hidden>
H=library.lspace.org [195.200.1.58] U=mail P=esmtp S=1340
<email address hidden> T="Re: Whats with the
Pruple
Turtle?" for <email address hidden>
1999-07-20 18:05:33 116dJu-00080X-00 => |/home/mailman/mail/wrapper
post
sluts <email address hidden> D=mailman_alias T=address_pipe
1999-07-20 18:05:33 116dJu-00080X-00 Completed

[root@ibm1 old]# zgrep 116dKD-00081T-00 main.log.21-07-1999.gz
1999-07-20 18:05:21 116dKD-00081T-00 <= <email address hidden>
H=localhost (ibm1.ftech.net) [127.0.0.1] P=esmtp S=1726
<email address hidden> T="Re: Whats with the
Pruple
Turtle?" for <email address hidden> <email address hidden>...

Read more...

Revision history for this message
Bug Importer (bug-importer) wrote :

From: Harald Meland <email address hidden>
Subject: Re: [Mailman-cabal] Duplicate messages from mailman (PR#88)
Date: 21 Jul 1999 16:22:30 +0200

[<email address hidden>]

> Full_Name: Kieran Hartnett
> Version: 1.0rc2
> Python: 1.5.1
> OS: Linux - Redhat 6.0
> Submission from: kieran.tech.ftech.net (212.32.17.4)
>
>
> exim feeds one message into the mailman wrapper, mailman feeds out
two. Very
> occasional, only noticable on a fairly high-volume list. Example from
the
> "post" log:
>
> Jul 20 18:04:48 1999 post to sluts from <email address hidden> size=542
> Jul 20 18:05:03 1999 post to sluts from <email address hidden> size=542

Hmm, those log lines are written from MailList.Post() which either is
invoked from the "$prefix/mail/post" script (via Exim's alias
file),
or from "$prefix/Mailman/Cgi/admindb.py" (when approving held
messages).

I can't see how any of those two calls to the Post() method could get
called twice from a single Exim pipe delivery.

What do the Exim logs look like for these incidents?
--
Harald

Revision history for this message
Barry Warsaw (barry) wrote :

Almost assuredly no longer a problem in Mailman 2.0

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.