Unable to delete messages from list archive
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | | High | Curtis Hovey |
Bug Description
I've been trying to delete messages from a list archive using the instructions defined on https:/
tags: | added: canonical-losa-lp |
affects: | launchpad → launchpad-foundations |
affects: | launchpad-foundations → launchpad-registry |
tags: | added: ml-archive-sucks |
Changed in launchpad-registry: | |
assignee: | nobody → Curtis Hovey (sinzui) |
importance: | Undecided → High |
status: | New → In Progress |
tags: | added: docs |
Curtis Hovey (sinzui) wrote : | #1 |
Michael Barnett (mbarnett) wrote : | #2 |
I added the --nokeeponrmm flag to the previous removal run. It bitched again about the messages not being in the db:
https:/
After running with that flag, I can still pull up messages that were "removed" on the web:
Curtis Hovey (sinzui) wrote : | #3 |
I can see from the last pastebin that it generated 727 messages. We should check the timestamps. I suspect that we may need to delete all the html files before calling `mhonarc --nokeeponrmm -rmm <ids>`.
Curtis Hovey (sinzui) wrote : | #4 |
I updated the howto to include the --nokeeponrmm flag and inlined the regenerate instructions. I think we should verify there is not a cache issue involved here. The pastebins report that threads.html was written. We expect the last-modified to be "Tue, 07 Dec, 2010", but the response headers for the file state:
Last-Modified: Sun, 07 Nov 2010 15:40:03 GMT
Someone should look at contents of <root>/
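The staleness check described above can be sketched in a few lines. This is only an illustration, not part of the howto: the function name `is_stale` is hypothetical, the header value is the one quoted in this comment, and the expected date is the regeneration date mentioned above.

```python
from datetime import date
from email.utils import parsedate_to_datetime


def is_stale(last_modified: str, expected_regen: date) -> bool:
    """Return True if a Last-Modified header predates the expected regeneration date."""
    modified = parsedate_to_datetime(last_modified)
    return modified.date() < expected_regen


# Header value quoted in the comment; the archive was expected to be
# rewritten on 2010-12-07.
header = "Sun, 07 Nov 2010 15:40:03 GMT"
print(is_stale(header, date(2010, 12, 7)))  # prints True
```

A stale result only says the served file is older than expected. It does not distinguish between a cache in front of the archive and a regeneration that wrote to a different directory.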
Michael Barnett (mbarnett) wrote : | #5 |
launchpad@
-rw-r--r-- 1 launchpad launchpad 5830 2010-10-11 10:32 msg00772.html
launchpad@
-rw-r--r-- 1 launchpad launchpad 5389 2010-10-11 10:31 msg00771.html
launchpad@
[<a href="msg00770.
<strong><a href="msg00772.
Curtis Hovey (sinzui) wrote : | #6 |
I wonder if we need to delete the files before asking mhonarc to regenerate them. I suppose we could test this by tarring up the messages first, then attempt the delete again. If the messages are regenerated, we declare success, otherwise we untar the files.
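A rough sketch of the backup-then-test idea, assuming the message pages match `msg*.html` as in the directory listing above. The function names and the tarball path are hypothetical, and the mhonarc run itself is deliberately left out.

```python
import glob
import os
import tarfile


def backup_messages(archive_dir: str, tar_path: str) -> list:
    """Tar up every msg*.html page in archive_dir and return the file names saved."""
    paths = sorted(glob.glob(os.path.join(archive_dir, "msg*.html")))
    with tarfile.open(tar_path, "w:gz") as tar:
        for path in paths:
            # Store bare file names so the tarball can be restored into any directory.
            tar.add(path, arcname=os.path.basename(path))
    return [os.path.basename(path) for path in paths]


def restore_messages(archive_dir: str, tar_path: str) -> None:
    """Untar the saved pages back into archive_dir if the test delete did not work."""
    with tarfile.open(tar_path, "r:gz") as tar:
        tar.extractall(archive_dir)
```

The intended order would be: call `backup_messages`, remove the pages, rerun the delete, and call `restore_messages` only if the pages are not regenerated.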
Curtis Hovey (sinzui) wrote : | #7 |
Michael confirmed that regenerating the entire web archive does indeed update all pages, but the problem messages are still there. So how can they be deleted from the archive/db, yet still be regenerated?
Curtis Hovey (sinzui) wrote : | #8 |
I think I understand the nature of the problem. MHonArc silently falls back to bad defaults. We must always specify the -dbfile and -outdir because most operations involve a comparison of the two. In this case we will see reports that the messages were not found in the archive (the html directory), then it wrote out new html files. We did not see an update because we were looking at the expected -outdir that we never specified; I do not know where we generated the other archives ;).
I updated the howto with the explicit options that allowed me to perform a delete that updates the correct archive.
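For illustration only, here is one way to make sure a delete never runs without both options. This is a sketch, not the howto's actual command: `build_rmm_command` is a hypothetical helper, the paths in the usage lines are placeholders, and the flags are spelled as they appear in this thread.

```python
import shlex
from typing import Iterable, List


def build_rmm_command(outdir: str, dbfile: str, msg_numbers: Iterable[int]) -> List[str]:
    """Build an mhonarc delete command that always names the archive and its database."""
    numbers = sorted(set(msg_numbers))  # drop duplicate message numbers
    if not numbers:
        raise ValueError("no message numbers given")
    return [
        "mhonarc",
        "-outdir", outdir,   # the html directory that should be updated
        "-dbfile", dbfile,   # the database that belongs to that archive
        "--nokeeponrmm",
        "-rmm", *[str(n) for n in numbers],
    ]


# Placeholder paths; substitute the real archive locations.
cmd = build_rmm_command("/path/to/html", "/path/to/mhonarc.db", [771, 772, 772])
print(shlex.join(cmd))
```

Building the argument list in one place means a missing -outdir or -dbfile becomes a Python error rather than a silent fallback to defaults.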
Changed in launchpad-registry: | |
milestone: | none → 10.12 |
Changed in launchpad-registry: | |
status: | In Progress → Fix Released |
The messages are not in the thread index. As was reported many weeks ago, the messages were indeed removed from the archive, and the threads and messages were rewritten without the removed message. The output does not say it regenerated the messages, sans the deleted message.
The pages for the message *were not* updated, nor were the pages for the deleted messages deleted. Deleting the messages. I suspect that KEEPONRMM is enabled in the archive db or M2H_KEEPONRMM=1 in the env.
Also, the loop is very slow. This is faster and may do the right thing:
mhonarc --nokeeponrmm -rmm 722 724 725 726 727 728 729 73 731 732 737 742 747 752 756 762 767 772 772