Define interface and protocol for synchronizing data from external data sources

Bug #429562 reported by Chris Rossi
This bug affects 1 person
Affects: KARL3
Status: Fix Released
Importance: Medium
Assigned to: Chris Rossi
Milestone: (none listed)

Bug Description

A mechanism is needed to let Karl pull data from external resources. Real-world use cases include syncing user/profile data with an organization's data source (LDAP, Active Directory, etc.), as is currently done for OSI via custom scripting, and allowing bulk import of content from a different content management system.

The following are assumed to be true:

 1) The external data source is authoritative as far as its data is concerned. Updates in the external data source are reflected in Karl after a refresh.

 2) The external data source is read only. Data may be pulled from the data source but no provision is made for pushing changes back upstream.

It is conceivable that in the future Karl instances could offer datasource views that other systems (including other Karl instances) could use to pull data from Karl in the same way. This is out of scope for this ticket but is mentioned here as a possibly useful, related corollary.

This ticket covers:

 1) Defining an interface for Python objects that can represent the data in an external resource for purposes of synchronization.

 2) Defining an XML schema for data to be synced.

 3) Writing an adapter class which implements the interface defined in 1 by reading XML feeds in the format defined in 2. This represents the default, canonical way to import content into Karl.

Note that it will be possible to extend Karl to sync with any arbitrary data source by either writing an adapter in Python that provides the interface defined in 1 *or* transforming the data source into an XML file that conforms to the schema defined in 2.
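
To make the three pieces concrete, here is a minimal sketch of how they might fit together. Nothing in it comes from the Karl code base: the names SyncItem, DataSource and XMLDataSource, the feed layout (source/item/attr elements) and the sample data are all hypothetical, and an abstract base class stands in for whatever interface mechanism the implementation actually uses.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Iterator


@dataclass
class SyncItem:
    """One record pulled from the external source (hypothetical shape)."""
    id: str
    modified: str
    attributes: Dict[str, str] = field(default_factory=dict)


class DataSource(ABC):
    """Hypothetical read-only view of an external source (item 1)."""

    @abstractmethod
    def items(self) -> Iterator[SyncItem]:
        """Yield every item the source currently holds."""


class XMLDataSource(DataSource):
    """Adapter providing DataSource by parsing an XML feed (item 3)."""

    def __init__(self, xml_text: str) -> None:
        self._root = ET.fromstring(xml_text)

    def items(self) -> Iterator[SyncItem]:
        # Assumed layout (item 2): <item> children of the root element,
        # each holding <attr name="..."> children with text values.
        for node in self._root.findall("item"):
            attrs = {a.get("name"): (a.text or "") for a in node.findall("attr")}
            yield SyncItem(node.get("id"), node.get("modified"), attrs)


# Made-up feed in the assumed format.
FEED = """\
<source id="directory">
  <item id="jdoe" modified="2009-09-15T12:00:00">
    <attr name="email">jdoe@example.org</attr>
  </item>
</source>
"""

items = list(XMLDataSource(FEED).items())
```

In this sketch a sync routine would depend only on DataSource, so a source that cannot produce XML could be supported by writing another subclass, while one that can be transformed into the feed format would reuse XMLDataSource unchanged.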

Chris Rossi (chris-archimedeanco) wrote :

There is an obvious edge case here: what do we do in the case where data is pulled from an external source, edited in Karl, and then a newer version comes down the pipe from the external source? There are a few ways we can attack this:

1) Make all externally synced content read-only in Karl. This effectively avoids the problem altogether.

2) Overwrite any local edits on update.

3) If content has been edited locally, stop updating it from the external source.

4) Merge the two versions. This is rather challenging from a technical point of view and is not really being seriously considered at this point.

Option 1 above will be considered the "right" way to handle this for the time being.
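
For comparison, options 1 through 3 can be sketched as policies applied at edit time and at update time. This is only an illustration: Policy, LocalContent, edit_locally and apply_update are hypothetical names, not Karl APIs, and option 4 (merging) is left out.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    READ_ONLY = 1   # option 1: synced content cannot be edited locally
    OVERWRITE = 2   # option 2: updates replace any local edits
    DETACH = 3      # option 3: locally edited content stops syncing


@dataclass
class LocalContent:
    body: str
    synced: bool = True           # still tracking the external source
    locally_edited: bool = False


def edit_locally(content: LocalContent, new_body: str, policy: Policy) -> None:
    """Apply a local edit, refusing it for synced content under READ_ONLY."""
    if policy is Policy.READ_ONLY and content.synced:
        raise PermissionError("externally synced content is read-only")
    content.body = new_body
    content.locally_edited = True


def apply_update(content: LocalContent, incoming_body: str, policy: Policy) -> bool:
    """Apply a newer version from the source; return True if it was applied."""
    if policy is Policy.DETACH and content.locally_edited:
        content.synced = False    # keep the local edit, stop tracking
        return False
    content.body = incoming_body
    content.locally_edited = False
    return True
```

Under READ_ONLY the conflict can never arise, because edit_locally refuses the edit before an update has anything to collide with.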

Changed in karl3:
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Chris Rossi (chris-archimedeanco)
milestone: none → m32
Changed in karl3:
milestone: m32 → m33
Changed in karl3:
status: In Progress → Fix Committed
Changed in karl3:
status: Fix Committed → Fix Released