set up read/write edit API

Bug #135428 reported by Aaron Swartz
Affects: Open Library
Status: Fix Released
Importance: High
Assigned to: Anand Chitipothu

Bug Description

Creative Commons and the Wikimedia Foundation want to integrate with us, adding copyright data and allowing us to be added to Wikipedia pages. But to do this we need to have a read-write API. They'd prefer something based on the Atom Publishing Protocol, like GData (see http://code.google.com/apis/gdata/basics.html). This is a key task for integrating with them.

Aaron Swartz (aaronsw)
Changed in openlibrary:
assignee: nobody → anandology
importance: Undecided → High
status: New → Confirmed
Daniel B. Giffin (daniel-mybuttocks) wrote :

so you mean they want a REST-like, XML/HTTP-based protocol for inserting and updating records in our system?

that sounds good. maybe i'll just make the importer work this way.

Aaron Swartz (aaronsw) wrote :

> so you mean they want a REST-like, XML/HTTP-based protocol for inserting
> and updating records in our system?
>
> that sounds good. maybe i'll just make the importer work this way.

Yes.

Anand Chitipothu (anandology) wrote :

The GData API might not be sufficient to express the structured data of Infogami.
How about something similar to the Mac's plist format?

<thing version="1.0">
    <key>name</key>
    <string>a/Mark_Twain</string>
    <key>created</key>
    <timestamp>2006-01-23T16:26:03-08:00</timestamp>
    <key>author</key>
    <ref>user/webchick</ref>
    <key>type</key>
    <ref>type/author</ref>
    <key>revision</key>
    <int>7</int>
    <key>data</key>
    <dict>
        <key>name</key>
        <string>Mark Twain</string>
        <key>alternate_names</key>
        <string>Samuel Clemens</string>
        <key>birth_date</key>
        <date>November 30, 1835</date>
        <key>death_date</key>
        <date>April 21, 1910</date>
        <key>location</key>
        <string>Mark Twain is buried in Elmira, New York</string>
        <key>bio</key>
        <text>blah bah blah</text>
    </dict>
</thing>
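
A minimal sketch of how a client might read the plist-like format proposed above, assuming values always follow their keys in alternating order. The helper names `parse_value` and `parse_pairs` are hypothetical, the sample is a shortened version of the proposal, and only `dict` and `int` are given special treatment; every other value type is kept as a string.

```python
import xml.etree.ElementTree as ET

# Shortened version of the proposed <thing> document.
SAMPLE = """<thing version="1.0">
    <key>name</key>
    <string>a/Mark_Twain</string>
    <key>type</key>
    <ref>type/author</ref>
    <key>revision</key>
    <int>7</int>
    <key>data</key>
    <dict>
        <key>name</key>
        <string>Mark Twain</string>
        <key>bio</key>
        <text>blah bah blah</text>
    </dict>
</thing>"""


def parse_value(elem):
    """Convert one value element into a Python object."""
    if elem.tag == "dict":
        return parse_pairs(elem)
    if elem.tag == "int":
        return int(elem.text)
    # string, text, ref, date and timestamp are kept as plain strings here.
    return elem.text or ""


def parse_pairs(container):
    """Read alternating <key>/<value> children into a dict."""
    children = list(container)
    if len(children) % 2 != 0:
        raise ValueError("expected alternating key/value elements")
    result = {}
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        if key_elem.tag != "key":
            raise ValueError("expected <key>, got <%s>" % key_elem.tag)
        result[key_elem.text] = parse_value(value_elem)
    return result


thing = parse_pairs(ET.fromstring(SAMPLE))
print(thing["data"]["name"])
```

The position-dependent pairing of keys and values is what makes this format awkward for generic XML tools, since the meaning of an element depends on its preceding sibling.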

Aaron Swartz (aaronsw) wrote :

Is the data dict the only non-flat thing? If so, it might be worth
flattening it just for compatibility.

Anand Chitipothu (anandology) wrote :

On 31-Aug-07, at 6:12 PM, Aaron Swartz wrote:

> Is the data dict the only non-flat thing? If so, it might be worth
> flattening it just for compatibility

There could be problems. For example, author.name and author.d.name
would both map to the same key.

Aaron Swartz (aaronsw) wrote :

> There could be problems. For example, author.name and author.d.name
> would both map to the same key.

We can use a different namespace for things in d.
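
The collision and the namespace fix can be illustrated with a small sketch. It assumes a thing is held as a Python dict whose nested data lives under a `d` key, as in the `author.d.name` example; the function names and the `d:` prefix are hypothetical choices for illustration.

```python
def flatten_naive(thing):
    """Merge the nested 'd' dict into the top level without any prefix."""
    flat = {k: v for k, v in thing.items() if k != "d"}
    flat.update(thing["d"])  # nested keys silently overwrite top-level keys
    return flat


def flatten_namespaced(thing, prefix="d:"):
    """Merge the nested 'd' dict into the top level under a key prefix."""
    flat = {k: v for k, v in thing.items() if k != "d"}
    for key, value in thing["d"].items():
        flat[prefix + key] = value
    return flat


author = {"name": "a/Mark_Twain", "revision": 7, "d": {"name": "Mark Twain"}}

naive = flatten_naive(author)            # the page name is lost
namespaced = flatten_namespaced(author)  # both names survive
print(naive["name"], "|", namespaced["name"], "|", namespaced["d:name"])
```

In XML the same separation would come from binding the two sets of keys to different namespace URIs rather than from a string prefix.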

Anand Chitipothu (anandology) wrote :

On 31-Aug-07, at 6:32 PM, Aaron Swartz wrote:

>> There could be problems. For example, author.name and author.d.name
>> would both map to the same key.
>
> We can use a different namespace for things in d.

Does this sound okay?
http://infogami.org/dev/idata

Aaron Swartz (aaronsw) wrote :

> Does this sound okay?
> http://infogami.org/dev/idata

1. How about foo.atom instead of foo/feed
2. You're missing some namespaces. How about making the second example:

<entry xmlns="http://www.w3.org/2005/Atom"
xmlns:i="http://infogami.org/schema"
xmlns:d="http://demo.openlibrary.org/type/page">
    <i:type>type/page</i:type>
    <d:title>Foo</d:title>
    <d:body>Bar</d:body>
</entry>

3. When you say <author>user/anand</author> it should probably be:

<author>
   <name>Anand Chitipothu</name>
   <uri>http://demo.openlibrary.org/user/anand</uri>
</author>

(cf. http://atompub.org/rfc4287.html#atomPersonConstruct)

4. Some of the URLs are wrong, e.g.:

In the following example, we're changing the entry's body from its old value.

PUT /myFeed/1/1/

Shouldn't that be just /foo.atom?

If so, the edit links are wrong too.
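
As a rough sketch of points 2 and 3, the entry could be generated with Python's standard `xml.etree.ElementTree`, which keeps prefixes and closing tags consistent automatically. The namespace URIs are the ones from the example above; the helper names `q` and `build_entry` are hypothetical. A strictly valid Atom entry would also need `id`, `title` and `updated` elements, which are omitted here to mirror the example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
INFOGAMI = "http://infogami.org/schema"
PAGE = "http://demo.openlibrary.org/type/page"

# Ask ElementTree to serialize with the same prefixes as the example.
ET.register_namespace("", ATOM)
ET.register_namespace("i", INFOGAMI)
ET.register_namespace("d", PAGE)


def q(namespace, tag):
    """Build a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return "{%s}%s" % (namespace, tag)


def build_entry(title, body, author_name, author_uri):
    """Build an Atom entry with Infogami fields and an Atom Person author."""
    entry = ET.Element(q(ATOM, "entry"))
    ET.SubElement(entry, q(INFOGAMI, "type")).text = "type/page"
    ET.SubElement(entry, q(PAGE, "title")).text = title
    ET.SubElement(entry, q(PAGE, "body")).text = body
    author = ET.SubElement(entry, q(ATOM, "author"))
    ET.SubElement(author, q(ATOM, "name")).text = author_name
    ET.SubElement(author, q(ATOM, "uri")).text = author_uri
    return entry


entry = build_entry("Foo", "Bar", "Anand Chitipothu",
                    "http://demo.openlibrary.org/user/anand")
xml_text = ET.tostring(entry, encoding="unicode")
reparsed = ET.fromstring(xml_text)
print(xml_text)
```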

Anand Chitipothu (anandology) wrote :

> 1. How about foo.atom instead of foo/feed

I think foo/atom is better than foo.atom.
Every /foo page can be exposed at /foo/atom or /foo/feed.

> 2. You're missing some namespaces. How about making the second
> example:

Yep.
Will xmlns:d always depend on the type of the page?

So for author pages, it will be
xmlns:d="http://demo.openlibrary.org/type/author"

> 3. When you say <author>user/anand</author> it should probably be:
>
> <author>
> <name>Anand Chitipothu</name>
> <uri>http://demo.openlibrary.org/user/anand</uri>
> </author>

OK.

> 4. Some of the URLs are wrong, e.g.:
>
> In the following example, we're changing the entry's body from its old
> value.
>
> PUT /myFeed/1/1/

Yes. It was a typo.

> Shouldn't that be just /foo.atom?

No, it should be /foo/atom/1, so that version conflicts can be detected.
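
One way to read this, sketched below under assumptions: the trailing number is the revision the client last fetched, and a PUT is rejected if the page has moved on since then. The `PageStore` class and `ConflictError` are hypothetical in-memory stand-ins, not Infogami code; over HTTP the rejection would typically be a 409 Conflict response.

```python
class ConflictError(Exception):
    """Raised when an edit is based on a stale revision."""


class PageStore:
    """In-memory store that only accepts edits against the latest revision."""

    def __init__(self):
        self._pages = {}  # path -> (revision, body)

    def get(self, path):
        """Return (revision, body); a missing page is revision 0."""
        return self._pages.get(path, (0, None))

    def put(self, path, base_revision, body):
        """Save body if base_revision is still current; return the new revision."""
        current_revision, _ = self.get(path)
        if base_revision != current_revision:
            raise ConflictError(
                "%s is at revision %d, edit was based on %d"
                % (path, current_revision, base_revision)
            )
        new_revision = current_revision + 1
        self._pages[path] = (new_revision, body)
        return new_revision


store = PageStore()
first = store.put("/foo", 0, "Bar")          # create the page
second = store.put("/foo", first, "Baz")     # edit based on revision 1
try:
    store.put("/foo", first, "stale edit")   # still based on revision 1
    conflict_detected = False
except ConflictError:
    conflict_detected = True
print(first, second, conflict_detected)
```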

Anand Chitipothu (anandology) wrote :

What about the open issues?

Anand Chitipothu (anandology) wrote :

>
> <entry xmlns="http://www.w3.org/2005/Atom"
> xmlns:i="http://infogami.org/schema"
> xmlns:d="http://demo.openlibrary.org/type/page">
> <i:type>type/page</i:type>
> <d:title>Foo</d:title>
> <d:body>Bar</d:body>
> </entry>

There is a problem with this approach: how do we represent lists?
Because we are not showing the parent information, there is also
sometimes a need to represent dictionaries.
For example, the value of the properties field of /type/page/feed is a
list of dicts.

['title': {'type': 'type/string', 'unique': True, 'description':''},
'body' : {'type': 'type/text', 'unique':True, 'description': ''}]

I think a plist-like format suits this better than the flat format above.

<entry xmlns="http://www.w3.org/2005/Atom">
     <key>name</key>
     <string>type/page</string>
     <key>created</key>
     <timestamp>2006-01-23T16:26:03-08:00</timestamp>
     <key>author</key>
     <ref>user/anand</ref>
     <key>type</key>
     <ref>type/type</ref>
     <key>revision</key>
     <int>7</int>
     <key>data</key>
     <dict>
         <key>description</key>
         <string></string>
         <key>is_primitive</key>
         <boolean>false</boolean>
         <key>properties</key>
         <list>
             <key>title</key>
             <dict>
                 <key>type</key>
                 <ref>type/string</ref>
                 <key>unique</key>
                 <boolean>true</boolean>
                 <key>description</key>
                 <string></string>
             </dict>
             <key>body</key>
             <dict>
                 <key>type</key>
                 <ref>type/text</ref>
                 <key>unique</key>
                 <boolean>true</boolean>
                 <key>description</key>
                 <string></string>
             </dict>
         </list>
     </dict>
</entry>

Aaron Swartz (aaronsw) wrote :

On 9/20/07, Anand <email address hidden> wrote:
> >
> > <entry xmlns="http://www.w3.org/2005/Atom"
> > xmlns:i="http://infogami.org/schema"
> > xmlns:d="http://demo.openlibrary.org/type/page">
> > <i:type>type/page</i:type>
> > <d:title>Foo</d:title>
> > <d:body>Bar</d:body>
> > </entry>
>
> There is a problem with this approach. How to represent lists?

Just repeat the element:

<d:tag>food</d:tag>
<d:tag>drink</d:tag>

> Because we are not showing the parent information, there is also a
> need to represent dictionaries sometimes.
> For example, value of properties field of /type/page/feed is list of
> dicts.
>
> ['title': {'type': 'type/string', 'unique': True, 'description':''},
> 'body' : {'type': 'type/text', 'unique':True, 'description': ''}]

<t:title>
   <t:type>type/string</t:type>
   <t:unique>true</t:unique>
   <t:description></t:description>
</t:title>

> I think plist kind of format suits better than above flat format.

I can definitely see the appeal of plist format, but I think it will
be really hard for most people to parse. I think the above proposal
will be more compatible.

Anand Chitipothu (anandology) wrote :

Why not provide a JSON API like Freebase?
I strongly feel that with structured data, it is going to be very difficult if we stick with a flat format.

Soon, we are going to need something like the Compound Value Type[1,2] in Freebase. Properties is just one such example.

[1]: http://blog.freebase.com/?p=5
[2]: http://www.freebase.com/view/%239202a8c04000641f8000000003caca8e

Aaron Swartz (aaronsw) wrote :

Sure, a Freebase-style REST JSON interface sounds great.
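
To illustrate why JSON fits this data, here is a sketch that serializes the type/page example from earlier in the thread with Python's `json` module. The field layout, including the `name` key inside each property, is an assumption made for illustration and is not Freebase's or Infogami's actual wire format.

```python
import json

# Lists and nested dicts map directly onto JSON arrays and objects.
page_type = {
    "name": "type/page",
    "type": "type/type",
    "revision": 7,
    "data": {
        "description": "",
        "is_primitive": False,
        "properties": [
            {"name": "title", "type": "type/string",
             "unique": True, "description": ""},
            {"name": "body", "type": "type/text",
             "unique": True, "description": ""},
        ],
    },
}

encoded = json.dumps(page_type, indent=2, sort_keys=True)
decoded = json.loads(encoded)
print(encoded)
```

The round trip preserves lists, booleans and integers without any extra type markup, which is the part the flat XML proposals struggle with.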

rejon (rejon) wrote :

On Tue, 2007-10-09 at 16:02 +0000, Aaron Swartz wrote:
> Sure, a Freebase-style REST JSON interface sounds great.
>

Yes, this is brilliant to add...

Jon


David Strauss (davidstrauss) wrote :

Just a note on integration with Wikipedia: a read-only API will serve our needs just fine. I'd hate to significantly delay a read-only API with the complexities of figuring out the write API.

solrize (solrize) wrote :

The current (new) implementation is pretty close to the stuff described above. We need documentation...

rejon (rejon) wrote :

Copying David's last message to the list:

I've started the initial work on a module to integrate with Wikipedia (MediaWiki). This first version will use screen-scraping to pull data from OL in order to avoid delaying the project while Open Library develops its API.

Specifically, Wikimedia's API priorities are:
* An OpenSearch foundation for querying the system
  * Most importantly, this should allow querying by unique IDs in Open Library
  * Later, we will need full-text searching and querying by Open Library record timestamps for creation and updates
* Basic XML records delivered as results from the system
  * Enough to fill in Wikipedia's citation templates

We'd like to move forward on API implementation over the next few weeks.
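
As a sketch of what "enough to fill in Wikipedia's citation templates" could mean in practice, the function below turns a record into a `{{cite book}}` call. The record field names and the sample record are made up for illustration and do not reflect an actual Open Library response; values containing a `|` character would need escaping, which is not handled here.

```python
# Template parameters emitted, in order; each is also the assumed record key.
CITE_FIELDS = ["title", "author", "publisher", "year", "isbn"]


def cite_book(record):
    """Format a record dict as a MediaWiki {{cite book}} template call."""
    parts = ["cite book"]
    for field in CITE_FIELDS:
        value = record.get(field)
        if value:  # skip fields the record does not provide
            parts.append("%s=%s" % (field, value))
    return "{{" + " |".join(parts) + "}}"


# Hypothetical sample record, not real API output.
record = {
    "title": "Adventures of Huckleberry Finn",
    "author": "Mark Twain",
    "year": "1884",
}
citation = cite_book(record)
print(citation)
```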

###

Yes, totally moving on this ASAP. Anand, any help with the documentation for the API?

Anand Chitipothu (anandology) wrote :

> Specifically, Wikimedia's API priorities are:
> * An OpenSearch foundation for querying the system
> * Most importantly, this should allow querying by unique IDs in Open Library
> * Later, we will need full-text searching and querying by Open Library record timestamps for creation and updates
> * Basic XML records delivered as results from the system
> * Enough to fill in Wikipedia's citation templates
>
> We'd like to move forward on API implementation over the next few weeks.

Great!

> Yes, totally moving on this ASAP. Anand, any help with the documentation
> for the API?

I am working on the documentation. It will be ready within a day.

webchick (webchick) wrote :

Anand sent me some docs yesterday that I am working on. I am going to put them together in a cohesive package, and send it to him for review. I should have them ready by this weekend.

rejon (rejon) wrote :

Awesome!


webchick (webchick) wrote :

(disregard my message then - if you can get them done sooner without me, by all means - go for it! I can do my part later ...)

raj (raj-archive)
Changed in openlibrary:
milestone: 1.0 → 1.6
Changed in openlibrary:
status: Confirmed → Fix Committed
Changed in openlibrary:
status: Fix Committed → Fix Released