Messages can take days to appear in the MhonArc archive

Bug #779915 reported by Curtis Hovey
This bug affects 2 people
Affects: lp-mailman
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Users have noted that a message they receive from an Lp mailing list may take hours or days to appear in the archive. A backlog of messages was observed in mailman/qfiles/archive, possibly thousands of messages from the previous week.

This issue appears to be caused by giant archives, which need to regenerate all the index pages and many message pages when a new message is added. Lp indexes are in reverse order, newest to oldest, so when a message is added, each index page pushes a message onto the adjacent index page. Process names on the server imply a lot of time is spent adding messages to https://lists.launchpad.net/ubuntu-x-swat/date.html which has more than 100,000 messages.
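The cost difference between the two index orders can be illustrated with a small simulation. This is only a sketch: the fixed page size and the simple pagination scheme are assumptions for illustration, not a description of how MHonArc actually lays out its index pages.

```python
def pages_rewritten(total_before, page_size, newest_first):
    """Count index pages whose contents change when one message is appended.

    Assumes a simple fixed-size pagination (hypothetical, for illustration).
    """
    def paginate(n):
        ids = list(range(n))  # message ids in arrival order
        if newest_first:
            ids.reverse()
        return [tuple(ids[i:i + page_size])
                for i in range(0, len(ids), page_size)]

    before = paginate(total_before)
    after = paginate(total_before + 1)

    changed = 0
    for i, page in enumerate(after):
        # A page must be regenerated if it is new or its contents differ.
        if i >= len(before) or before[i] != page:
            changed += 1
    return changed


print(pages_rewritten(1000, 100, newest_first=True))   # 11
print(pages_rewritten(1000, 100, newest_first=False))  # 1
```

Under these assumptions, a newest-first index touches every page on each new message, while an oldest-first index touches only the last page.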

Lp could consider changing the index order for large lists so less work is needed.

Another route is to modify Lp's mailman configuration to process a larger slice of archive messages. Lp processes one message at a time in each queue, but it clearly takes longer to archive one message than it does to send 1,000 messages. The archive qrunner could process 2 or 3 messages to keep up, but this will slow down the other queues slightly.

Curtis Hovey (sinzui)
tags: added: mailing-lists
Curtis Hovey (sinzui) wrote :

We can improve the queuing issue by raising the ArchRunner count in the QRUNNERS tuple in
    lib/lp/services/mailman/monkeypatches.py
to a power of 2 (only documented in Mailman itself), e.g.:
    ('ArchRunner', 4)
will archive 4 messages per sent message.
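A minimal sketch of what such a setting might look like, together with a sanity check on the counts. The list below is a stand-in, not the actual contents of monkeypatches.py; the second runner entry and both helper functions are hypothetical, and the check assumes the constraint is that each count is a power of 2.

```python
# Illustrative stand-in for the QRUNNERS setting. The real list may
# name different runners and counts.
QRUNNERS = [
    ('ArchRunner', 4),      # archiver: raised from 1 to 4
    ('OutgoingRunner', 1),  # delivery: left unchanged (assumed entry)
]


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... and False otherwise."""
    return n > 0 and (n & (n - 1)) == 0


def invalid_runners(qrunners):
    """Return the names of runners whose count is not a power of 2."""
    return [name for name, count in qrunners if not is_power_of_two(count)]


print(invalid_runners(QRUNNERS))  # []
```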

test_mm_cfg.TestMMCfgDefaultsTestCase.test_qrunners needs revision. The current test only verifies that the queue is enabled. We may want to verify that we gave it a larger slice.
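A revised test could look roughly like the sketch below. It is not the existing test: the class name is hypothetical, and the module-level QRUNNERS list is a stand-in for whatever the real test reads from the Mailman configuration.

```python
import unittest

# Stand-in for the configured value; the real test would read it from
# the Mailman configuration instead of defining it here.
QRUNNERS = [('ArchRunner', 4), ('OutgoingRunner', 1)]


class QRunnersSliceTest(unittest.TestCase):
    """Hypothetical test: the archive runner is enabled with count > 1."""

    def test_archrunner_has_larger_slice(self):
        counts = dict(QRUNNERS)
        # The queue must be enabled at all ...
        self.assertIn('ArchRunner', counts)
        # ... and must have been given more than the default single slot.
        self.assertGreater(counts['ArchRunner'], 1)
```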

tags: added: trivial
Curtis Hovey (sinzui)
tags: added: ml-archive-sucks
removed: ml-archives-sucks
Francis J. Lacoste (flacoste) wrote :

Note that a workaround is to use mail-archives.org, which has a copy of all our public lists and processes things in much better time. (They don't use Mhonarc on ancient hardware.)

Francis J. Lacoste (flacoste) wrote :
tags: removed: trivial
tags: added: feature
Simon Quigley (tsimonq2) wrote :

This is still an issue, but the delay has decreased.

Colin Watson (cjwatson)
affects: launchpad → lp-mailman