Add proper "job" support

Bug #1019704 reported by Jason Gerard DeRose
This bug affects 1 person
Affects: Dmedia
Status: Fix Released
Importance: High
Assigned to: Jason Gerard DeRose

Bug Description

From the beginning our plan has been to use CouchDB as a way to coordinate compute across any mix of local and remote nodes:

http://vimeo.com/15463106

In general, deciding when to run background tasks, and which tasks to run, has been a tricky problem, and one of the last things that needs to be solved to get Dmedia production ready.

I've been daydreaming on this lately and think I'm making progress. First, we have two quite different types of "background tasks":

1) Housekeeping tasks Dmedia must perform in order to keep data safe and fulfil our social contract with the user. For example, the quick "scans" to make sure the files we expect on an HDD are still there and have the correct size and mtime, creating additional copies of user files with low durability, reclaiming space when needed, downloading files from peer devices, etc. All these tasks are very high priority and should generally run as soon as needed. Then there is also the full content-hash verification. This will be done periodically, but must be scheduled with more care as it's both IO-intensive and fairly compute-intensive. For example, we shouldn't go crazy with full verification when on battery power, or when the system workload is already high (say because someone is editing with Novacut). Note these housekeeping tasks are *not* distributed.

2) Jobs in which other apps want to do some compute-intensive task involving one or more files stored in Dmedia. For example, transcoding, rendering, and automatic audio sync are all "jobs". We want to spread these jobs across local computers and the cloud. We want Dmedia to decide when and where to start them based on system activity and whether the files are already present on a given node. But we also want to keep the details of these jobs well abstracted so that 3rd party apps can add jobs into Dmedia in a pluggable fashion.

So I'm going to rough out the schema for this 2nd type of background task so we can start experimenting with it. Importantly, we want to start dispatching Novacut renders through these jobs. The idea is that we create a CouchDB doc for each job that needs to be performed. Before a node starts executing a job, it updates this doc to show the job is now active on that node (so other nodes don't run the same job). Once the job finishes, the doc is updated again.
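To make the claim-then-finish idea concrete, here is a rough sketch of how it might look. It leans on the fact that CouchDB rejects an update carrying a stale `_rev` with a 409 conflict, so only one node can win the claim. Everything else is an assumption for illustration: `FakeDB` is an in-memory stand-in (not a real CouchDB client), and the field names `status`, `machine_id`, `time_start`, and `time_end` are hypothetical, not the final Dmedia schema.

```python
import time


class Conflict(Exception):
    """Raised when a doc is saved with a stale _rev (CouchDB answers 409)."""


class FakeDB:
    """In-memory stand-in for a CouchDB database (illustration only)."""

    def __init__(self):
        self._docs = {}

    def get(self, _id):
        # Return a copy so callers cannot mutate the stored doc directly.
        return dict(self._docs[_id])

    def save(self, doc):
        current = self._docs.get(doc['_id'])
        # Mimic CouchDB's optimistic concurrency: the _rev must match.
        if current is not None and current['_rev'] != doc.get('_rev'):
            raise Conflict(doc['_id'])
        new = dict(doc)
        new['_rev'] = (current['_rev'] if current else 0) + 1
        self._docs[doc['_id']] = new
        return dict(new)


def claim_job(db, job_id, machine_id):
    """Try to mark a waiting job as executing on this node.

    Returns the updated doc if this node won the claim, otherwise None.
    """
    doc = db.get(job_id)
    if doc['status'] != 'waiting':
        return None  # Another node already claimed or finished it.
    doc['status'] = 'executing'
    doc['machine_id'] = machine_id
    doc['time_start'] = time.time()
    try:
        return db.save(doc)
    except Conflict:
        return None  # Lost the race to another node.


def finish_job(db, doc, ok=True):
    """Update the job doc again once the worker has exited."""
    doc = dict(doc)
    doc['status'] = 'completed' if ok else 'failed'
    doc['time_end'] = time.time()
    return db.save(doc)
```

A node would call `claim_job()` and only start the worker when it gets a doc back; a `None` means some other node is already handling the job.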

There are 2 key schema bits that Dmedia needs to be aware of:

* the executable that will run the job. For security reasons, we don't want Dmedia executing arbitrary paths, so we'll require these workers to be installed in a special location like /var/lib/dmedia/workers. Not all nodes will have the same workers installed, so that's a deciding factor in whether to execute a job.

* the list of files the job is going to operate on. Say you have a local cluster: you'd want to execute jobs on nodes that already have most (if not all) the files available locally. A big Novacut render might require a thousand files, but we want to abstract Dmedia from the details of the Novacut edit description... so the job will just have a flat list of all the file IDs needed.
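As a rough sketch of how a node might use these two schema bits: the worker name is resolved only inside the workers directory (so a job doc can't point at an arbitrary path), and nodes that have the worker installed are ranked by what fraction of the job's files they already hold. The `worker` and `files` keys, the helper names, and the shape of the `nodes` dict are all hypothetical, and ranking every node in one place is just for illustration; in practice each node would more likely score itself.

```python
import os

# Example location taken from the description above; not necessarily final.
WORKERS_DIR = '/var/lib/dmedia/workers'


def worker_path(name, workers_dir=WORKERS_DIR):
    """Resolve a worker name to a path inside workers_dir.

    Raises ValueError if the name could escape the directory.
    """
    has_sep = os.sep in name or (os.altsep is not None and os.altsep in name)
    if name in ('', '.', '..') or has_sep:
        raise ValueError('bad worker name: {!r}'.format(name))
    return os.path.join(workers_dir, name)


def can_run(job, installed_workers):
    """A node can only take a job if it has the job's worker installed."""
    return job['worker'] in installed_workers


def locality(job, local_files):
    """Fraction of the job's file IDs already available on a node."""
    files = job['files']
    if not files:
        return 1.0
    return sum(1 for _id in files if _id in local_files) / len(files)


def best_node(job, nodes):
    """Pick the eligible node holding the largest share of the job's files.

    `nodes` maps a node ID to {'workers': set_of_names, 'files': set_of_ids}.
    Returns None when no node has the required worker installed.
    """
    candidates = [
        (locality(job, info['files']), node_id)
        for (node_id, info) in nodes.items()
        if can_run(job, info['workers'])
    ]
    if not candidates:
        return None
    return max(candidates)[1]
```

Because the job carries only a flat list of file IDs, none of this needs to know anything about the Novacut edit description itself.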

David Jordan (dmj726) wrote :

This sounds fairly sensible. Off the top of my head I can't see anything wrong with the reasoning here.

Jason Gerard DeRose (jderose) wrote :

Okay, so there is a bit more work to do still, but I'm going to merge what I have and then finish this up in 12.08.

I feel I've done a lot to clarify the problem, and we have something that's usable in a test harness at least, despite not yet being hooked up live in Dmedia.

Changed in dmedia:
status: In Progress → Fix Committed
Changed in dmedia:
status: Fix Committed → Fix Released