attachments archived even when archiving disabled

Bug #266317 reported by Rscottbailey
This bug affects 4 people
Affects: GNU Mailman
Importance: Medium
Assigned to: Unassigned

Bug Description

I just noticed disappearing disk space under /var on
my Debian system running mailman 2.1.7... :-)

Investigation reveals lots of space tied up under
/var/lib/mailman/archives/private/<list>/attachments/<yyyymmdd>/<blah>
-- it appears that any message containing an attachment
causes the attachment to be stashed here in the archive
tree, even when archiving is disabled (and nothing else
in the archives tree is getting updated).

I do not believe it is correct behavior for
attachments to be saved in these circumstances.

Thanks,

  Scott Bailey
  <email address hidden>

[http://sourceforge.net/tracker/index.php?func=detail&aid=1442639&group_id=103&atid=100103]

Mark Sapiro (msapiro) wrote :

This is expected behavior. The scrubber saves attachments in
the archives/private/<listname>/attachments/ directory. This
happens for all messages if scrub_nondigest is Yes, and for
all plain digests in any case even if the list does not do
archiving.

If you allow attachments at all, the only way to avoid this
is to set both scrub_nondigest and digestable to No, i.e.,
don't scrub individual messages and don't allow digests.
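
As a minimal sketch of that workaround, assuming Mailman 2.1's bin/config_list is used with a Python-syntax input file (the file name no_scrub.py is hypothetical), the two settings could be written as:

```python
# no_scrub.py: hypothetical input file for Mailman 2.1's bin/config_list.
# It would be applied with something like:
#   bin/config_list -i no_scrub.py <listname>

# Do not scrub attachments from individually delivered messages.
scrub_nondigest = 0

# Do not offer digest delivery, so no plain digest is ever scrubbed.
digestable = 0
```

The same two options can be changed in the list's web admin pages instead; the file form is mainly useful when many lists need the same change.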

Mark Sapiro (msapiro) wrote :

Closing per my previous comment.

James Ralston (ralston) wrote :

We're affected by this issue as well.

Let's be clear what's happening here: ToDigest.send_i18n_digests() needs to scrub attachments from the RFC1153 version of the digest. To do that, it calls Handlers.Scrubber.process(), which saves the scrubbed attachments to disk (in the list's archive directory)—regardless of whether the list is configured to archive.

Thus, I must respectfully disagree that this behavior is "expected". The fact that ToDigest.send_i18n_digests() silently fills up the list's archive directory with scrubbed attachments (as a side-effect of producing the RFC1153 version of each digest) may be "expected" behavior from the point of view of the Mailman developers, but it is NOT expected behavior from the point of view of Mailman administrators. Side-effects are bugs, regardless of whether they are unintended, unwanted, or undocumented—and this bug hits at least 2 out of 3.

Another way to look at this issue is that Mailman is dropping temporary files that are never referenced again and never removed. If any other program did that, the behavior would be properly described as a bug.

Even worse, there is no way to stop this behavior. We've configured our lists not to archive, but we cannot configure our lists not to permit digests, as we have subscribers who prefer digest mode. As the code stands right now, if I want to stop Mailman from continually trying to fill up my /var filesystem, I'm going to have to write a cron job to go clean up after Mailman. But that will be very challenging to do correctly for lists that were configured to archive at any point in the past.

We've been pleased with Mailman, and we're grateful to all of the developers who work on it. But come on, guys... this behavior is a bug. Handlers.Scrubber.process() shouldn't leave temporary scrubbed attachments on disk when it is called by ToDigest.send_i18n_digests(). Please fix this.

Barry Warsaw (barry) wrote :

Setting back to New state because I do think that when archiving is disabled, the scrubber shouldn't store the attachments on disk. IOW, it's a legitimate bug.

Changed in mailman:
status: Invalid → New

Mark Sapiro (msapiro) wrote :

The question is what to do? The attachments are not really stored "in the archive". They are just stored in a web accessible place with a known, hopefully valid URL. In fact, if the list does do archiving and is digestable, the scrubbed attachments are stored twice, once for the archive and linked from the archive and once for the digest and linked from the digest. (This duplication probably is a bug, but not one I know how to fix for MM 2.1.)

If the list is digestable, any MIME parts which are not text/plain with known character set cannot be directly included in the plain (RFC 1153) format digest. They either have to be stored aside in some accessible place and linked from the digest, or a note can be put in the digest that an attachment has been removed and if you want attachments in the future switch to messages or the MIME digest, or the attachments can be silently ignored. I suppose this choice could be a site or list setting, instead of being fixed at the first alternative, but I think this is properly called a design decision, not a bug.
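
The three alternatives above can be illustrated with a small sketch. This is not Mailman's Scrubber code; it only uses Python's standard email package, and the function names, the policy names ("link", "note", "drop") and the store callback are all hypothetical:

```python
import codecs
from email import message_from_string


def is_inline_text(part):
    """True if the part is text/plain in a character set Python knows."""
    if part.get_content_type() != "text/plain":
        return False
    charset = part.get_content_charset("us-ascii")
    try:
        codecs.lookup(charset)
    except LookupError:
        return False
    return True


def render_for_plain_digest(msg, policy, store=None):
    """Flatten one message to text under a hypothetical policy.

    "link": store the part via the store callback and include its URL.
    "note": include a notice that an attachment was removed.
    "drop": silently ignore the part.
    """
    if policy not in ("link", "note", "drop"):
        raise ValueError("unknown policy: %r" % (policy,))
    out = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if is_inline_text(part):
            charset = part.get_content_charset("us-ascii")
            out.append(part.get_payload(decode=True).decode(charset, "replace"))
        elif policy == "link":
            # store is assumed to save the part somewhere and return a URL
            out.append("An attachment was scrubbed, URL: %s" % store(part))
        elif policy == "note":
            out.append("[%s attachment removed]" % part.get_content_type())
        # policy == "drop": nothing is written to disk or to the digest
    return "\n".join(out)


if __name__ == "__main__":
    raw = (
        'Content-Type: multipart/mixed; boundary="b"\n\n'
        "--b\nContent-Type: text/plain\n\nhello\n"
        "--b\nContent-Type: application/pdf\n\nxyz\n--b--\n"
    )
    print(render_for_plain_digest(message_from_string(raw), "note"))
```

Only the "link" policy needs anything written to disk, which is why a per-site or per-list choice between these would address the disk usage reported here.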

Denis Roy (eclipse.webmaster) wrote :

I've been struggling with this for years. If there is no fix in sight, is it safe to remove these attachments directories? Or will all the attachment files just come back?

Looking for a workaround here...

Mark Sapiro (msapiro) wrote :

> is it safe to remove these attachments directories? Or will all the attachment files just come back?

Assuming the list doesn't archive (archive = No), the only things stored in the attachments directory are attachments scrubbed from the 'plain' format digest. You can remove the attachments directory and its contents, and those specific files won't come back, but as long as your list is digestable, attachments from subsequent digests will continue to be stored.

You might think creating an attachments directory as a symlink to /dev/null would work, but it won't because Mailman tries to create and access subdirectories under attachments.

Probably the best you can do is create a cron job to 'rm -rf archives/private/listname/attachments' or maybe 'find archives/private/listname/attachments -mindepth 1 -maxdepth 1 -mtime +2 -exec rm -rf \{\} \;'. The 'find' example should just remove those subdirectories older than 2 days (-mtime +2); -mindepth 1 keeps find from matching the attachments directory itself, and rm needs -r because the matches are directories.
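
For anyone who would rather run the cleanup from a script than from a shell one-liner, here is a rough Python equivalent. It is only a sketch: the function name and its parameters are made up for illustration, and, as noted earlier in this thread, it is only safe for lists that have never archived, since an archiving list's attachments directory also holds files linked from the archive.

```python
import os
import shutil
import time


def prune_attachment_dirs(attachments_dir, max_age_days=2, now=None):
    """Remove immediate subdirectories not modified for max_age_days.

    Returns the list of removed paths. The attachments directory itself
    is never removed, and symlinks and plain files are left alone.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * 86400
    removed = []
    if not os.path.isdir(attachments_dir):
        return removed
    for name in sorted(os.listdir(attachments_dir)):
        path = os.path.join(attachments_dir, name)
        if os.path.islink(path) or not os.path.isdir(path):
            continue
        if os.stat(path).st_mtime < cutoff:
            shutil.rmtree(path)
            removed.append(path)
    return removed


if __name__ == "__main__":
    # The path is an example only; adjust it to the list's archive location.
    for path in prune_attachment_dirs("archives/private/listname/attachments"):
        print("removed", path)
```

Note that links in digests that have already been sent will stop working once the corresponding directories are removed.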

claus (claus2) wrote :

I suggest marking this bug as a security issue.
While testing a mail server I came across this bug: admins and mailing list admins were not aware that years of archived attachments were stored on their server and could be accessed by an attacker gaining access to that system. They explicitly set archiving to "No" so confidential information would not be left lying around on the mail server.
=> This is not expected behaviour
=> This can become a very serious security issue for some users
