we would like to be able to purge deleted documents (eventually)

Bug #1000731 reported by John A Meinel
This bug affects 2 people
Affects: U1DB
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Deleted documents still take up some space in the database (document_id, document_rev).

Those fields are likely to be counted against the U1DB-server storage quota (document_id is user-defined, so we don't want to give users an avenue to get around their quota by injecting all their 'content' as a document_id).

That implies that a user who creates lots of documents, and then deletes them, could still run into quota limitations. In the limiting case, they could have a 'full' database with nothing but deleted documents.
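
A rough sketch of that accounting, to make the limiting case concrete. The `quota_bytes` helper and the rule it applies (charge doc_id and doc_rev for every row, plus content for live rows) are assumptions for illustration only; nothing here is the actual U1DB-server quota code.

```python
def quota_bytes(docs):
    """Sum the bytes charged for (doc_id, doc_rev, content) rows.

    Hypothetical rule: doc_id and doc_rev always count; content counts
    only when the document is live (content is not None).
    """
    total = 0
    for doc_id, doc_rev, content in docs:
        total += len(doc_id.encode("utf-8")) + len(doc_rev.encode("utf-8"))
        if content is not None:
            total += len(content.encode("utf-8"))
    return total


# 500 deleted documents, each with a 1000-character user-chosen doc_id.
tombstones = [(f"{i:04d}" + "x" * 996, "replica:2", None) for i in range(500)]
used = quota_bytes(tombstones)  # about half a megabyte, with no live content
```

Under these assumptions the database holds no readable documents at all, yet the user is still charged for every identifier and revision string.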

One option is that when we purge history (bug #1000726) we could also look for documents that are marked as deleted, and purge them completely from the database.

This has the side-effect of changing the actual semantics of purging. Consider the case:

1) Replicas A and B.
2) A and B sync at A:10, both sides have a doc-foo
3) A deletes doc-foo at generation A:20
4) A purges all history from before A:30.
5) B goes to sync with A
 a) If A saved the deleted document's information, B could tell that doc-foo's revision in A supersedes the old revision, and B should delete the document.
 b) If we purge the record, then doc-foo just looks like a document that B has, that needs to be added to A.

As such, purging deleted document history from the table does mean that those documents will magically reappear if you sync with a client that has never seen the deletion.
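
The scenario above can be reproduced with a toy model. This is only a sketch: it uses plain dicts and a single integer revision per document, whereas real U1DB revisions are vector clocks, and none of the function names below come from the u1db API.

```python
# Toy model: each replica maps doc_id -> (rev, content).
# content None marks a deletion record (tombstone).

def sync_into(target, source):
    """Copy into target anything it lacks or holds at an older revision."""
    for doc_id, (rev, content) in source.items():
        if doc_id not in target or target[doc_id][0] < rev:
            target[doc_id] = (rev, content)


def sync(a, b):
    sync_into(a, b)
    sync_into(b, a)


def live_docs(replica):
    return sorted(d for d, (_, c) in replica.items() if c is not None)


def scenario(purge_tombstones):
    a = {"doc-foo": (10, "hello")}
    b = {"doc-foo": (10, "hello")}      # A and B synced at A:10
    a["doc-foo"] = (20, None)           # A deletes doc-foo at A:20
    if purge_tombstones:                # A purges, dropping the deletion record
        a = {d: v for d, v in a.items() if v[1] is not None}
    sync(a, b)                          # B goes to sync with A
    return live_docs(a), live_docs(b)


kept = scenario(purge_tombstones=False)    # case (a): deletion propagates to B
purged = scenario(purge_tombstones=True)   # case (b): doc-foo comes back on A
```

With the deletion record kept, both replicas end up with no live documents. With it purged, A has nothing to compare against, so B's old copy is treated as a new document and doc-foo is live on both sides again.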

One option is to try and make deleted things 'cheaper' to store. For example, they could be saved 'off-to-the-side', or somehow in the rollup content for the transaction log (store just the doc_id,doc_rev in the rollup, indicating it was deleted, but have that information not be indexed, etc.).
The main downside is you are storing information in different ways, which adds complexity for possibly modest/no gain.
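
One way the 'off-to-the-side' idea might look, sketched with SQLite. The table names, columns, and helper functions are all hypothetical and are not the schema U1DB uses; the point is only that deletion records move to a second table with no index on doc_id, so they cost less to keep but need a scan to look up, and every code path now has two places to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (
        doc_id  TEXT PRIMARY KEY,
        doc_rev TEXT NOT NULL,
        content TEXT
    );
    -- Hypothetical side table: no primary key or index on doc_id,
    -- so lookups here are a full scan.
    CREATE TABLE deleted_rollup (
        doc_id  TEXT NOT NULL,
        doc_rev TEXT NOT NULL
    );
""")


def delete_doc(conn, doc_id, new_rev):
    """Move a document out of the indexed table into the side table."""
    with conn:  # commits both statements together, or rolls both back
        conn.execute("DELETE FROM document WHERE doc_id = ?", (doc_id,))
        conn.execute("INSERT INTO deleted_rollup VALUES (?, ?)",
                     (doc_id, new_rev))


def deleted_rev(conn, doc_id):
    """Return the revision at which doc_id was deleted, or None."""
    row = conn.execute(
        "SELECT doc_rev FROM deleted_rollup WHERE doc_id = ?",
        (doc_id,)).fetchone()
    return row[0] if row else None


conn.execute("INSERT INTO document VALUES ('doc-foo', 'A:1', '{}')")
delete_doc(conn, "doc-foo", "A:2")
```

After this, `deleted_rev(conn, "doc-foo")` still lets a sync decide that the deletion supersedes an older revision, which is the information scenario (a) depends on.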
