Open Library

FR: Create books.txt XML file from edit page

Bug #343744 reported by Matt Work on 2009-03-16

Affects		Status	Importance	Assigned to	Milestone
	Open Library	Won't Fix	High	Matt Work	Open Library may-release

Bug Description

When on a book's edit page, we would like to give the user the option to generate a books.txt file (definition to follow), an XML-style file.

example edit page: http://openlibrary.org/b/OL7277480M/Cryptonomicon?m=edit
This could simply be another button or link that generates the file for the user.

An idea for books.txt (from www.lexcycle.com/developer):

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Online Catalog</title>
  <id>urn:uuid:09aeccc1-c633-aa48-22ab-000052cbd81c</id>
  <updated>2008-09-12T00:44:20+00:00</updated>
  <link rel="self" type="application/atom+xml" href="http://www.billybobsbooks.com/catalog/top.atom"/>
  <link rel="search" title="Search Billy Bob's Books" type="application/atom+xml" href="http://www.billybobsbooks.com/catalog/search.php?search={searchTerms}"/>
  <author>
    <name>Billy Bob</name>
    <uri>http://www.billybobsbooks.com</uri>
    <email>(email address hidden)</email>
  </author>
  <entry>
    <title>1984</title>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml"> Published: 1949 Subject: Novels Language: en</div>
    </content>
    <id>urn:billybobsbooks:1166</id>
    <author>
      <name>Orwell, George</name>
    </author>
    <updated>2008-09-12T00:44:20+00:00</updated>
    <link type="application/epub+zip" href="http://www.billybobsbooks.com/book/1984.epub"/>
    <link rel="x-stanza-cover-image-thumbnail" type="image/png" href="http://www.billybobsbooks.com/book/1984.png"/>
    <link rel="x-stanza-cover-image" type="image/png" href="http://www.billybobsbooks.com/book/1984.png"/>
  </entry>
  <entry>
    <title>The Art of War</title>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">Published: -500 Subject: Non-Fiction Language: en</div>
    </content>
    <id>urn:billybobsbooks:168</id>
    <author>
      <name>Sun Tzu</name>
    </author>
    <updated>2008-09-12T00:44:20+00:00</updated>
    <link type="application/epub+zip" href="http://www.billybobsbooks.com/book/artofwar.epub"/>
    <link rel="x-stanza-cover-image-thumbnail" type="image/png" href="http://www.billybobsbooks.com/book/artofwar.png"/>
    <link rel="x-stanza-cover-image" type="image/png" href="http://www.billybobsbooks.com/book/artofwar.png"/>
  </entry>
</feed>

Matt Work (mwork) on 2009-03-19

Changed in openlibrary:
milestone:	none → may-release

raj (raj-archive) wrote on 2009-03-30:

Make every edition page have a <link rel="alternate" type="application/atom+xml"> element that links to an atom feed for that edition.

The atom feed should correspond to what Lexcycle has defined: http://www.lexcycle.com/developer

Talk to Peter to make sure we are doing this right.
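
For illustration only, a minimal Python sketch of the `<link rel="alternate">` element described above. The function name and the `.atom` URL suffix are assumptions, not an existing Open Library URL scheme; only the `rel` and `type` values come from this comment.

```python
from xml.sax.saxutils import quoteattr


def atom_alternate_link(edition_key: str,
                        base_url: str = "http://openlibrary.org") -> str:
    """Return a <link rel="alternate"> tag for an edition page's <head>."""
    # Assumption: the Atom feed lives at the edition URL plus ".atom".
    href = f"{base_url}/b/{edition_key}.atom"
    # quoteattr escapes the value and wraps it in quotes for use as an attribute.
    return ('<link rel="alternate" type="application/atom+xml" '
            f'href={quoteattr(href)}/>')


print(atom_alternate_link("OL7277480M"))
```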

Changed in openlibrary:
assignee:	nobody → edward-debian
importance:	Undecided → High
status:	New → Confirmed

Edward Betts (edwardbetts) wrote on 2009-03-30:

What is the problem we are trying to solve?

Is the JSON at http://openlibrary.org/b/OL7277480M.json not good enough?

Edward Betts (edwardbetts) wrote on 2009-03-30:

Who is Peter?

solrize (solrize) wrote on 2009-03-30:

Peter Brantley is a publishing expert (among other things) who recently joined the IA. The idea is to generate an Atom XML record for each book and put a link on the edition page, using the Lexcycle format that Peter apparently is familiar with.

I think I see how to do this: basically extend the edition plugin to emit the contents of the edition node in XML format, and add a query format that runs the extension. Raj assigned this to you because he didn't like my initial idea of doing it in the solr update daemon (which was a dumb idea, but was the first thing I thought of because I'm already converting OL data to XML that way). Maybe I should ask for it back.
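
A rough sketch of what "emit the contents of the edition node in XML format" might look like, using only Python's standard library. The dict field names (`key`, `title`, `authors`, `last_modified`) and the `urn:openlibrary:` id prefix are hypothetical stand-ins, not the actual edition schema or plugin API; the element names follow the Atom sample in the bug description.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def _atom(tag: str) -> str:
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM_NS}}}{tag}"


def edition_to_atom_entry(edition: dict) -> ET.Element:
    """Build an Atom <entry> element from a (hypothetical) edition dict."""
    entry = ET.Element(_atom("entry"))
    ET.SubElement(entry, _atom("title")).text = edition["title"]
    # Assumption: a URN derived from the edition key serves as the entry id.
    ET.SubElement(entry, _atom("id")).text = "urn:openlibrary:" + edition["key"]
    for name in edition.get("authors", []):
        author = ET.SubElement(entry, _atom("author"))
        ET.SubElement(author, _atom("name")).text = name
    ET.SubElement(entry, _atom("updated")).text = edition["last_modified"]
    return entry


entry = edition_to_atom_entry({
    "key": "OL7277480M",
    "title": "Cryptonomicon",
    "authors": ["Neal Stephenson"],
    "last_modified": "2009-03-30T00:00:00+00:00",
})
print(ET.tostring(entry, encoding="unicode"))
```

A real version would also need the `<link>` elements for the book file and cover images shown in the sample feed, which this sketch leaves out.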

Edward Betts (edwardbetts) wrote on 2009-03-30:

I'm just wondering what is going to be consuming the books.txt

Matt Work (mwork) wrote on 2009-03-30:

books.txt may not be the final name of the file, but the idea is that a search engine would be consuming the information

Edward Betts (edwardbetts) wrote on 2009-03-30:

An existing search engine that reads this format, or a new search engine?

solrize (solrize) wrote on 2009-03-30:

The idea is to export a standard format that other book cataloguers could index in their own search engines. (So is this really about re-inventing MARC?)

solrize (solrize) wrote on 2009-03-30:

Edward, I can take this bug unless you're eager to do it.

Aaron Swartz (aaronsw) wrote on 2009-03-31: Re: [Bug 343744] Re: FR: Create books.txt XML file from edit page

#10

Let's keep Edward on it; he's distant enough from the insanity to push
back on it until the request is sane.

raj (raj-archive) wrote on 2009-03-31:

#11

Matt, I think the Atom feeds should be on archive.org details pages, not OpenLibrary pages.

OpenLibrary doesn't have any books, just links to books.

The books we've scanned are on archive.org, so that's where the books.txt thing should live.

OpenLibrary might be useful for indexing the books.txt files out on the net, but most likely that would be a different project, since OL is just fed by MARC records and doesn't do crawling.

Changed in openlibrary:
assignee:	edward-debian → mwork

Matt Work (mwork) wrote on 2009-03-31:

#12

I agree, Raj. Can we move this to Archive.org?

Edward Betts (edwardbetts) on 2009-11-18

Changed in openlibrary:
status:	Confirmed → Won't Fix

raj (raj-archive) wrote on 2009-11-18:

#13

We did this as part of the BookServer project. It's amazing to see what we got done on BookServer in the eight months since this bug was filed!

raj (raj-archive) wrote on 2011-03-03:

#14

Open Library now supports OPDS for edition records!
