Open Library

set up read/write edit API

Bug #135428 reported by Aaron Swartz on 2007-08-29

Affects:	Open Library
Status:	Fix Released
Importance:	High
Assigned to:	Anand Chitipothu
Milestone:	Open Library 1.6

Bug Description

Creative Commons and the Wikimedia Foundation want to integrate with us, adding copyright data and allowing us to be added to Wikipedia pages. But to do this we need to have a read-write API. They'd prefer something based on the Atom Publishing Protocol, like GData (see http://code.google.com/apis/gdata/basics.html). This is a key task for integrating with them.

Aaron Swartz (aaronsw) on 2007-08-29

Changed in openlibrary:
assignee:	nobody → anandology
importance:	Undecided → High
status:	New → Confirmed

Daniel B. Giffin (daniel-mybuttocks) wrote on 2007-08-29:

so you mean they want a REST-like, XML/HTTP-based protocol for inserting and updating records in our system?

that sounds good. maybe i'll just make the importer work this way.

Aaron Swartz (aaronsw) wrote on 2007-08-29:

> so you mean they want a REST-like, XML/HTTP-based protocol for inserting
> and updating records in our system?
>
> that sounds good. maybe i'll just make the importer work this way.

Yes.

Anand Chitipothu (anandology) wrote on 2007-08-31:

The GData API might not be sufficient to express the structured data of infogami.
How about something similar to the Mac's plist format?

<thing version="1.0">
    <key>name</key>
    <string>a/Mark_Twain</string>
    <key>created</key>
    <timestamp>2006-01-23T16:26:03-08:00</timestamp>
    <key>author</key>
    <ref>user/webchick</ref>
    <key>type</key>
    <ref>type/author</ref>
    <key>revision</key>
    <int>7</int>
    <key>data</key>
    <dict>
        <key>name</key>
        <string>Mark Twain</string>
        <key>alternate_names</key>
        <string>Samuel Clemens</string>
        <key>birth_date</key>
        <date>November 30, 1835</date>
        <key>death_date</key>
        <date>April 21, 1910</date>
        <key>location</key>
        <string>Mark Twain is buried in Elmira, New York</string>
        <key>bio</key>
        <text>blah bah blah</text>
    </dict>
</thing>
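
For readers wondering how a client might consume a plist-like document such as the one above, here is a minimal Python sketch. It assumes the alternating `<key>`/value layout shown in the example; the function names `parse_value` and `parse_pairs` are made up for illustration, and the sample is a shortened copy of the record above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the <thing> example; not a real API response.
SAMPLE = """<thing version="1.0">
    <key>name</key>
    <string>a/Mark_Twain</string>
    <key>revision</key>
    <int>7</int>
    <key>data</key>
    <dict>
        <key>name</key>
        <string>Mark Twain</string>
        <key>alternate_names</key>
        <string>Samuel Clemens</string>
    </dict>
</thing>"""


def parse_value(elem):
    """Convert one value element (<string>, <int>, <dict>, ...) to Python."""
    if elem.tag == "dict":
        return parse_pairs(elem)
    if elem.tag == "int":
        return int(elem.text)
    if elem.tag == "boolean":
        return elem.text == "true"
    # string, text, ref, date, timestamp: keep the raw text
    return elem.text or ""


def parse_pairs(container):
    """Read alternating <key>/value children into a dict."""
    children = list(container)
    if len(children) % 2:
        raise ValueError("expected alternating <key>/value elements")
    result = {}
    for key_elem, value_elem in zip(children[0::2], children[1::2]):
        if key_elem.tag != "key":
            raise ValueError("expected <key>, got <%s>" % key_elem.tag)
        result[key_elem.text] = parse_value(value_elem)
    return result


thing = parse_pairs(ET.fromstring(SAMPLE))
print(thing["data"]["name"])
```

The parser has to track position (every value depends on the `<key>` before it), which is the main cost of this layout compared with one element per field.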

Aaron Swartz (aaronsw) wrote on 2007-08-31:

Is the data dict the only non-flat thing? If so, it might be worth
flattening it just for compatibility

Anand Chitipothu (anandology) wrote on 2007-08-31:

On 31-Aug-07, at 6:12 PM, Aaron Swartz wrote:

> Is the data dict the only non-flat thing? If so, it might be worth
> flattening it just for compatibility

There could be problems. For example, author.name and author.d.name
will both map to the same key.

Aaron Swartz (aaronsw) wrote on 2007-08-31:

> There could be problems. for example author.name and author.d.name
> will both map to the same key.

We can use a different namespace for things in d.
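
To make the collision and the namespace fix concrete, here is a rough Python sketch. The `flatten` helper and the `d:` prefix are hypothetical, chosen only to illustrate the idea; raising on a collision instead of silently overwriting is a choice made for this sketch.

```python
def flatten(thing, data_prefix=""):
    """Merge the nested 'data' dict into the top level.

    data_prefix is prepended to every key that came from 'data'.
    A key that already exists at the top level raises KeyError.
    """
    flat = {k: v for k, v in thing.items() if k != "data"}
    for key, value in thing.get("data", {}).items():
        flat_key = data_prefix + key
        if flat_key in flat:
            raise KeyError("collision on %r" % flat_key)
        flat[flat_key] = value
    return flat


author = {
    "name": "a/Mark_Twain",
    "type": "type/author",
    "data": {"name": "Mark Twain", "birth_date": "November 30, 1835"},
}

# Without a prefix, author.name and author.d.name land on the same key.
try:
    flatten(author)
    collided = False
except KeyError:
    collided = True

# With a separate namespace for things in d, both values survive.
flat = flatten(author, data_prefix="d:")
print(collided, flat["name"], flat["d:name"])
```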

Anand Chitipothu (anandology) wrote on 2007-08-31:

On 31-Aug-07, at 6:32 PM, Aaron Swartz wrote:

>> There could be problems. for example author.name and author.d.name
>> will both map to the same key.
>
> We can use a different namespace for things in d.

Does this sound okay?
http://infogami.org/dev/idata

Aaron Swartz (aaronsw) wrote on 2007-08-31:

> Does this sound okay?
> http://infogami.org/dev/idata

1. How about foo.atom instead of foo/feed
2. You're missing some namespaces. How about making the second example:

3. When you say <author>user/anand</author> it should probably be:

<author>
<name>Anand Chitipothu</name>
<uri>http://demo.openlibrary.org/user/anand</uri>
</author>

(cf. http://atompub.org/rfc4287.html#atomPersonConstruct)

4. Some of the URLs are wrong, e.g.:

In the following example, we're changing the entry's body from its old value.

PUT /myFeed/1/1/

Shouldn't that be just /foo.atom?

If so, the edit links are wrong too.

Anand Chitipothu (anandology) wrote on 2007-08-31:

> 1. How about foo.atom instead of foo/feed

I think foo/atom is better than foo.atom.
Every /foo page can be exposed at /foo/atom or /foo/feed.

> 2. You're missing some namespaces. How about making the second
> example:

Yep.
Will xmlns:d always depend on the type of the page?

So for author pages, it will be
xmlns:d="http://demo.openlibrary.org/type/author"

> 3. When you say <author>user/anand</author> it should probably be:
>
> <author>
> <name>Anand Chitipothu</name>
> <uri>http://demo.openlibrary.org/user/anand</uri>
> </author>

OK.

> 4. Some of the URLs are wrong, e.g.:
>
> In the following example, we're changing the entry's body from its old
> value.
>
> PUT /myFeed/1/1/

Yes. It was a typo.

> Shouldn't that be just /foo.atom

No, it should be /foo/atom/1, for detecting version conflicts.
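
One way to read the revision-in-URL idea is as an optimistic concurrency check: the number at the end of the edit URL names the revision the client started from, and the server refuses the PUT if the page has moved on. The Python sketch below assumes that reading. `PageStore`, `handle_put` and the starting revision of 0 are hypothetical and are not infogami's actual code.

```python
class VersionConflict(Exception):
    """Raised when an edit was based on a revision that is no longer current."""


class PageStore:
    """Hypothetical in-memory store, standing in for the real backend."""

    def __init__(self):
        self._pages = {}  # path -> (revision, body)

    def revision(self, path):
        return self._pages.get(path, (0, None))[0]

    def put(self, path, base_revision, body):
        """Save body only if base_revision is still the current revision."""
        current = self.revision(path)
        if base_revision != current:
            raise VersionConflict(
                "%s is at revision %d, edit was based on %d"
                % (path, current, base_revision)
            )
        self._pages[path] = (current + 1, body)
        return current + 1


def handle_put(store, url, body):
    """Handle PUT /<path>/atom/<revision>; the URL layout is an assumption."""
    path, marker, revision = url.strip("/").rsplit("/", 2)
    if marker != "atom":
        raise ValueError("not an edit URL: %r" % url)
    return store.put(path, int(revision), body)


store = PageStore()
handle_put(store, "/foo/atom/0", "first draft")       # creates revision 1
handle_put(store, "/foo/atom/1", "edit by client A")  # revision 1 -> 2
try:
    handle_put(store, "/foo/atom/1", "edit by client B")  # stale base revision
except VersionConflict as exc:
    print("conflict:", exc)
```

With a single /foo.atom URL the server could not tell that client B's edit was based on an outdated copy.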

Anand Chitipothu (anandology) wrote on 2007-08-31:

#10

What about the open issues?

Anand Chitipothu (anandology) wrote on 2007-09-21:

#11

>
> <entry xmlns="http://www.w3.org/2005/Atom"
> xmlns:i="http://infogami.org/schema"
> xmlns:d="http://demo.openlibrary.org/type/page">
> <i:type>type/page</i:type>
> <d:title>Foo</d:title>
> <d:body>Bar</d:body>
> </entry>

There is a problem with this approach. How do we represent lists?
Because we are not showing the parent information, there is also a
need to represent dictionaries sometimes.
For example, the value of the properties field of /type/page/feed is a
list of dicts.

['title': {'type': 'type/string', 'unique': True, 'description':''},
'body' : {'type': 'type/text', 'unique':True, 'description': ''}]

I think a plist-like format suits this better than the flat format above.

<entry xmlns="http://www.w3.org/2005/Atom">
     <key>name</key>
     <string>type/page</string>
     <key>created</key>
     <timestamp>2006-01-23T16:26:03-08:00</timestamp>
     <key>author</key>
     <ref>user/anand</ref>
     <key>type</key>
     <ref>type/type</ref>
     <key>revision</key>
     <int>7</int>
     <key>data</key>
     <dict>
         <key>description</key>
         <string></string>
         <key>is_primitive</key>
         <boolean>false</boolean>
         <key>properties</key>
         <list>
             <key>title</key>
             <dict>
                 <key>type</key>
                 <ref>type/string</ref>
                 <key>unique</key>
                 <boolean>true</boolean>
                 <key>description</key>
                 <string></string>
             </dict>
             <key>body</key>
             <dict>
                 <key>type</key>
                 <ref>type/text</ref>
                 <key>unique</key>
                 <boolean>true</boolean>
                 <key>description</key>
                 <string></string>
             </dict>
         </list>
     </dict>
</entry>

Aaron Swartz (aaronsw) wrote on 2007-10-03:

#12

On 9/20/07, Anand <email address hidden> wrote:
> >
> > <entry xmlns="http://www.w3.org/2005/Atom"
> > xmlns:i="http://infogami.org/schema"
> > xmlns:d="http://demo.openlibrary.org/type/page">
> > <i:type>type/page</i:type>
> > <d:title>Foo</d:title>
> > <d:body>Bar</d:body>
> > </entry>
>
> There is a problem with this approach. How to represent lists?

Just repeat the element:

<d:tag>food</d:tag>
<d:tag>drink</d:tag>

> Because we are not showing the parent information, there is also a
> need to represent dictionaries sometimes.
> For example, value of properties field of /type/page/feed is list of
> dicts.
>
> ['title': {'type': 'type/string', 'unique': True, 'description':''},
> 'body' : {'type': 'type/text', 'unique':True, 'description': ''}]

<t:title>
   <t:type>type/string</t:type>
   <t:unique>true</t:unique>
   <t:description></t:description>
</t:title>

> I think plist kind of format suits better than above flat format.

I can definitely see the appeal of plist format, but I think it will
be really hard for most people to parse. I think the above proposal
will be more compatible.
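
As a rough illustration of the flat proposal (repeat an element for each list item, nest child elements for a dict), here is a Python sketch using the standard library's ElementTree. The namespace URIs are the ones from the example entry; `add_value` and `build_entry` are hypothetical helper names, and unlike the `t:` example above the sketch puts nested keys in the same `d:` namespace for simplicity.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
INFOGAMI = "http://infogami.org/schema"
DATA = "http://demo.openlibrary.org/type/page"

# Register prefixes so the output uses xmlns, xmlns:i and xmlns:d.
ET.register_namespace("", ATOM)
ET.register_namespace("i", INFOGAMI)
ET.register_namespace("d", DATA)


def add_value(parent, ns, name, value):
    """Append value under parent: lists repeat the element, dicts nest."""
    if isinstance(value, list):
        for item in value:
            add_value(parent, ns, name, item)
        return
    child = ET.SubElement(parent, "{%s}%s" % (ns, name))
    if isinstance(value, dict):
        for key, item in value.items():
            add_value(child, ns, key, item)
    elif isinstance(value, bool):
        child.text = "true" if value else "false"
    else:
        child.text = str(value)


def build_entry(page_type, data):
    """Build an Atom <entry> carrying the page type and its data fields."""
    entry = ET.Element("{%s}entry" % ATOM)
    ET.SubElement(entry, "{%s}type" % INFOGAMI).text = page_type
    for key, value in data.items():
        add_value(entry, DATA, key, value)
    return entry


entry = build_entry(
    "type/page",
    {"title": "Foo", "body": "Bar", "tag": ["food", "drink"]},
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

A consumer can then read list values with an ordinary `findall` on the repeated element, which is the compatibility argument made above.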

Anand Chitipothu (anandology) wrote on 2007-10-09:

#13

Why not provide a JSON API like Freebase?
I strongly feel that with structured data, it is going to be very difficult if we stick with a flat format.

Soon we are going to need something like Freebase's Compound Value Type [1,2]. Properties is just one such example.

[1]: http://blog.freebase.com/?p=5
[2]: http://www.freebase.com/view/%239202a8c04000641f8000000003caca8e
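
To show why JSON is attractive for nested data, here is a small Python sketch that writes the type/page record from earlier in this thread as JSON and reads it back. The field names (`key`, `name`, `properties`) are guesses made for illustration, not the format that was eventually shipped.

```python
import json

# Hypothetical JSON shape for the type/page record discussed above.
type_page = {
    "key": "type/page",
    "type": {"key": "type/type"},
    "revision": 7,
    "properties": [
        {
            "name": "title",
            "type": {"key": "type/string"},
            "unique": True,
            "description": "",
        },
        {
            "name": "body",
            "type": {"key": "type/text"},
            "unique": True,
            "description": "",
        },
    ],
}

# Lists, dicts, booleans and integers survive the round trip unchanged.
encoded = json.dumps(type_page, indent=2, sort_keys=True)
decoded = json.loads(encoded)
print(decoded["properties"][1]["type"]["key"])
```

No extra convention is needed for lists of dicts, which is the case the flat XML format struggled with.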

Revision history for this message

Aaron Swartz (aaronsw) wrote on 2007-10-09:

#14

Sure, a Freebase-style REST JSON interface sounds great.

rejon (rejon) wrote on 2007-10-09:

#15

On Tue, 2007-10-09 at 16:02 +0000, Aaron Swartz wrote:
> Sure, a Freebase-style REST JSON interface sounds great.
>

Yes, this is brilliant to add...

Jon

David Strauss (davidstrauss) wrote on 2008-04-15:

#16

Just a note on integration with Wikipedia: a read-only API will serve our needs just fine. I'd hate to significantly delay a read-only API with the complexities of figuring out the write API.

solrize (solrize) wrote on 2008-04-15:

#17

The current (new) implementation is pretty close to the stuff described above. We need documentation...

rejon (rejon) wrote on 2008-04-18:

#18

Copying David's last message to the list:

I've started the initial work on a module to integrate with Wikipedia (MediaWiki). This first version will use screen-scraping to pull data from OL in order to avoid delaying the project while Open Library develops its API.

Specifically, Wikimedia's API priorities are:
* An OpenSearch foundation for querying the system
  * Most importantly, this should allow querying by unique IDs in Open Library
  * Later, we will need full-text searching and querying by Open Library record timestamps for creation and updates
* Basic XML records delivered as results from the system
  * Enough to fill in Wikipedia's citation templates
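
As a sketch of the last priority, the Python below turns a basic XML record into a `{{cite book}}` template call. The `<record>` layout and its placeholder values are invented for illustration, since the thread does not define the record format; the parameter names (`title`, `author`, `publisher`, `year`, `isbn`) are the usual ones for Wikipedia's cite book template.

```python
import xml.etree.ElementTree as ET

# Invented record layout with placeholder values; not an Open Library response.
RECORD = """<record>
    <id>OL0000000M</id>
    <title>Example Title</title>
    <author>Example Author</author>
    <publisher>Example Press</publisher>
    <year>2008</year>
</record>"""

# Fields copied into the citation, in output order.
CITE_FIELDS = ("title", "author", "publisher", "year", "isbn")


def to_cite_book(record_xml):
    """Build a {{cite book}} call from the fields present in the record."""
    record = ET.fromstring(record_xml)
    parts = ["cite book"]
    for field in CITE_FIELDS:
        value = record.findtext(field)
        if value and value.strip():
            parts.append("%s=%s" % (field, value.strip()))
    return "{{" + " |".join(parts) + "}}"


citation = to_cite_book(RECORD)
print(citation)
```

Fields missing from the record (here `isbn`) are simply left out of the template call.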

We'd like to move forward on API implementation over the next few weeks.

###

Yes, totally moving on this ASAP... Anand, any help on the documentation for the API?

Anand Chitipothu (anandology) wrote on 2008-04-18:

#19

> Specifically, Wikimedia's API priorities are:
> * An OpenSearch foundation for querying the system
> * Most importantly, this should allow querying by unique IDs in Open Library
> * Later, we will need full-text searching and querying by Open Library record timestamps for creation and updates
> * Basic XML records delivered as results from the system
> * Enough to fill in Wikipedia's citation templates
>
> We'd like to move forward on API implementation over the next few weeks.

Great!

> Yes, totally moving on this ASAP... Anand, any help on the documentation
> for the API?

I am working on the documentation. It will be ready within a day.

webchick (webchick) wrote on 2008-04-18:

#20

Anand sent me some docs yesterday that I am working on. I am going to put them together in a cohesive package, and send it to him for review. I should have them ready by this weekend.

rejon (rejon) wrote on 2008-04-18:

#21

Awesome!

On Fri, 2008-04-18 at 16:10 +0000, Anand Chitipothu wrote:
> > Specifically, Wikimedia's API priorities are:
> > * An OpenSearch foundation for querying the system
> > * Most importantly, this should allow querying by unique IDs in Open Library
> > * Later, we will need full-text searching and querying by Open Library record timestamps for creation and updates
> > * Basic XML records delivered as results from the system
> > * Enough to fill in Wikipedia's citation templates
> >
> > We'd like to move forward on API implementation over the next few weeks.
>
> Great!
>
> > Yes, totally moving on this ASAP... Anand, any help on the documentation
> > for the API?
>
> I am working on the documentation. It will be ready within a day.
>

webchick (webchick) wrote on 2008-04-18:

#22

(disregard my message then - if you can get them done sooner without me, by all means - go for it! I can do my part later ...)

raj (raj-archive) on 2009-02-05

Changed in openlibrary:
milestone:	1.0 → 1.6

Anand Chitipothu (anandology) on 2009-03-03

Changed in openlibrary:
status:	Confirmed → Fix Committed

Anand Chitipothu (anandology) on 2009-04-27

Changed in openlibrary:
status:	Fix Committed → Fix Released
