Comment 2 for bug 680467

Jason Gerard DeRose (jderose) wrote :

Okay, wrapping this up. Ended up becoming more involved than originally planned, but it was highly instructive to work on the very formal, test-driven schema definition. Compared to what was originally roughed out in the bug report, the schema is less verbose yet more intuitive. For example:

'durability' => 'copies'
'priority' => 'origin'
'original' => 'user'

So {'origin': 'user'} is the irreplaceable stuff we must protect so carefully. Much better than {'priority': 'original'}, I think. This bug has opened up all sorts of design issues and additional features, but they will be addressed in additional bugs, as the work scoped in this bug is in fact complete. Some high-priority issues to note:

  1) 'dmedia/store' records need to include the permanent id of the physical storage device they're located on

  2) We need to be careful not to overestimate durability when there are multiple FileStores on the same physical storage device. E.g., if you have two FileStores on the same hard drive, and file X is stored in both, you still only have {'copies': 1}, not {'copies': 2}

  3) 'dmedia/file' records should have an 'atime' attribute storing the last access time so that we can better guess what will be needed and what can be reclaimed. We probably shouldn't update it at every access, but an algorithm like "update atime if the current atime was more than a day ago" should give enough precision without creating too many conflicts or too much traffic
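
Points 2 and 3 could be sketched roughly like this. It is only an illustration, not dmedia's actual code: the 'device_id' key on 'dmedia/store' records and the helper names durable_copies() and maybe_update_atime() are hypothetical, and taking the best copy count per device is an assumption about how stores sharing a device should be combined.

```python
import time

DAY = 24 * 60 * 60  # One day, in seconds


def durable_copies(stored, stores):
    """
    Estimate copies without double-counting stores on the same device.

    `stored` is the 'stored' dict from a 'dmedia/file' record; `stores` maps
    each store ID to its 'dmedia/store' record.  The 'device_id' key is a
    hypothetical name for the permanent ID from point 1.
    """
    per_device = {}
    for (store_id, info) in stored.items():
        device = stores[store_id]['device_id']
        # Assumption: stores sharing a device contribute only their best count
        per_device[device] = max(per_device.get(device, 0), info['copies'])
    return sum(per_device.values())


def maybe_update_atime(doc, now=None, threshold=DAY):
    """
    Set doc['atime'] only if it is missing or older than `threshold` seconds.

    Returns True if `doc` was modified and so needs to be saved.
    """
    now = (time.time() if now is None else now)
    if now - doc.get('atime', 0) > threshold:
        doc['atime'] = now
        return True
    return False
```

With two FileStores on the same 'device_id', durable_copies() reports 1 rather than 2, and maybe_update_atime() leaves the record alone (so nothing needs saving) unless the stored atime is more than a day old.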

Anyway, the essential required metadata for a dmedia file now looks like this:

{
   "_id": "ZJAJLVLG5FHKEGFL5VJPNM4UHGWLVGVQ",
   "type": "dmedia/file",
   "time": 1297061923.757858,
   "bytes": 2659418521,
   "ext": "mov",
   "origin": "user",
   "stored": {
       "FLKMHJL2E2WIV4FXAIWLKSTR": {
           "copies": 1,
           "time": 1297061923.757858
       }
   }
}
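
As a rough illustration of what checking these essential fields might look like (this is not the actual dmedia schema code; check_dmedia_file() is a hypothetical name, and the type rules are inferred only from the example record above):

```python
def check_dmedia_file(doc):
    """
    Raise ValueError or TypeError if `doc` lacks the essential metadata.

    Hypothetical sketch: the expected types below are guessed from the
    example record, not taken from the real schema definition.
    """
    required = {
        '_id': str,
        'type': str,
        'time': (int, float),
        'bytes': int,
        'ext': str,
        'origin': str,
        'stored': dict,
    }
    for (key, types) in required.items():
        if key not in doc:
            raise ValueError('missing required key {!r}'.format(key))
        if not isinstance(doc[key], types):
            raise TypeError('bad type for {!r}'.format(key))
    if doc['type'] != 'dmedia/file':
        raise ValueError("'type' must be 'dmedia/file'")
    # Each entry in 'stored' is keyed by store ID and needs 'copies' and 'time'
    for (store_id, info) in doc['stored'].items():
        for (key, types) in (('copies', int), ('time', (int, float))):
            if key not in info:
                raise ValueError(
                    'stored[{!r}] missing {!r}'.format(store_id, key)
                )
            if not isinstance(info[key], types):
                raise TypeError(
                    'bad type for stored[{!r}][{!r}]'.format(store_id, key)
                )
```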