pop3 grabber leaves old emails on server
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Quotient | New | Undecided | Jean-Paul Calderone |
Bug Description
The POP3 grabber never deletes messages from the server it retrieves them from. It keeps track of what it has downloaded and doesn't download it again. This has several downsides:
* Local tracking state grows forever. Overhead for each individual message is pretty minimal, but tracking hundreds of thousands or millions of emails eventually adds up to a noticeable cost.
* The network conversation necessary to download new messages gets longer and longer over time due to shortcomings of POP3. Each old message which has already been downloaded contributes to some traffic in this conversation.
* The disk cost to the server for the POP3 account grows as old messages, already downloaded by Quotient, pile up higher and higher in the mailbox.
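To make the first two points concrete, here is a minimal sketch of UID-based tracking. It is not Quotient's actual grabber code; `parse_uidl` and `select_new` are hypothetical names, and the input is assumed to be the list of lines a POP3 `UIDL` listing returns (for example, the second element of the tuple returned by `poplib.POP3.uidl()`). Every message still in the mailbox shows up in the listing and must be checked against the local set, whether or not it was downloaded before.

```python
def parse_uidl(lines):
    """Turn UIDL listing lines such as b"3 abc123" into (number, uid) pairs."""
    pairs = []
    for line in lines:
        number, uid = line.decode("ascii").split(" ", 1)
        pairs.append((int(number), uid))
    return pairs


def select_new(lines, seen_uids):
    """Return the message numbers whose UID has not been downloaded yet.

    `seen_uids` is the local tracking state (hypothetical: a set of UID
    strings). The listing covers the whole mailbox, so the work done here
    grows with the number of old messages left on the server.
    """
    return [number for number, uid in parse_uidl(lines) if uid not in seen_uids]
```

With a listing of three messages and two already-seen UIDs, `select_new` returns only the number of the third message, but the server still had to send all three lines.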
The downsides to deleting old messages:
* There is no longer a duplicate copy of the message on the server to use to recover from data loss.
* Another POP3 client will not be able to download messages after Quotient has deleted them.
The first of these should be mitigated by taking backups. Backups are necessary in practice anyway, and because of the growing costs of the current behavior, messages will likely have been deleted from the mailbox already (perhaps by some manual process).
The second can be mitigated by leaving a time delay - perhaps a week - between when a message appears in a mailbox and when Quotient decides it should be deleted.
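One way the delay could work is sketched below. This is an illustration under stated assumptions, not the implementation in the related branch: `plan_deletions`, `first_seen`, and `downloaded` are hypothetical names, and the one-week default simply mirrors the figure suggested above. The sketch records when each UID was first observed on the server, marks a message for deletion only once it has been downloaded and the grace period has passed, and forgets tracking state for UIDs that are no longer on the server.

```python
from datetime import datetime, timedelta


def plan_deletions(server_uids, first_seen, downloaded, now,
                   grace=timedelta(days=7)):
    """Return the UIDs that are old enough to delete from the server.

    server_uids: UIDs currently listed by the server.
    first_seen:  dict mapping uid -> datetime first observed (updated in place).
    downloaded:  set of UIDs already retrieved locally.
    now:         current time as a datetime.
    grace:       how long a message stays on the server (assumed: one week).
    """
    present = list(server_uids)
    to_delete = []
    for uid in present:
        # Record the first time this message was seen in the mailbox.
        seen_at = first_seen.setdefault(uid, now)
        if uid in downloaded and now - seen_at >= grace:
            to_delete.append(uid)

    # Drop state for messages that have disappeared from the server,
    # so local tracking does not grow without bound.
    present_set = set(present)
    for uid in list(first_seen):
        if uid not in present_set:
            del first_seen[uid]
    return to_delete
```

A caller would issue `DELE` for each returned UID during the same session. Other POP3 clients then have the full grace period to fetch a message before it disappears.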
Related branches
- Divmod-dev: Pending requested
- Diff: 909 lines (+611/-85), 4 files modified
  - Quotient/xquotient/grabber.py (+166/-51)
  - Quotient/xquotient/test/historic/stub_pop3uid1to2.py (+37/-0)
  - Quotient/xquotient/test/historic/test_pop3uid1to2.py (+32/-0)
  - Quotient/xquotient/test/test_grabber.py (+376/-34)
Changed in quotient:
assignee: nobody → Jean-Paul Calderone (exarkun)