Internet Archive BookReader

Bug #606506
Comment #2

Hank Bromley (hank-archive) wrote on 2010-07-18:

All of that except the leaf number => page index mapping is available now, in JSON, via URLs like:

http://www.archive.org/details/{itemID}&output=json

for instance,

http://www.archive.org/details/amisamile00kelmuoft&output=json

The returned JSON provides the location (server and directory path) of the image stack, the entire contents of meta.xml, files.xml, and reviews.xml, and a few other bits of information. It does not include the page mapping, because that would require fetching and parsing scandata.xml, which is generally much larger than the other XML files. I don't think it would be difficult to include the scandata on request, though (perhaps &output=json&scandata=1 ?).

The existing API does, by the way, also support a callback parameter:

http://www.archive.org/details/amisamile00kelmuoft&output=json&callback=my_func
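
As a rough illustration of the two URL forms above, here is a minimal Python sketch that builds the request URL and parses a response body whether or not it was wrapped in a callback. The helper names (details_json_url, unwrap_jsonp) are made up for this example, and the wrapper format `my_func({...})` is an assumption based on ordinary JSONP conventions rather than on anything stated above.

```python
import json
import re

BASE = "http://www.archive.org/details/"


def details_json_url(item_id, callback=None):
    """Build the details URL in the form quoted in this comment."""
    url = f"{BASE}{item_id}&output=json"
    if callback is not None:
        url += f"&callback={callback}"
    return url


def unwrap_jsonp(text):
    """Parse plain JSON, or a JSONP body such as my_func({...}).

    ASSUMPTION: a callback response is the JSON payload wrapped in
    "name(" ... ")" with an optional trailing semicolon.
    """
    text = text.strip()
    match = re.fullmatch(r"[A-Za-z_$][\w$.]*\s*\((.*)\)\s*;?", text, re.DOTALL)
    if match:
        text = match.group(1)
    return json.loads(text)
```

A caller would fetch details_json_url("amisamile00kelmuoft") with any HTTP client and pass the body to unwrap_jsonp. The key names inside the returned object are not listed in this comment, so the sketch does not assume any.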

Other related info:

Essentially the same information is also available in XML form via a find_file.php request:

http://www.archive.org/services/find_file.php?file=amisamile00kelmuoft

And if all one needs is the item location info (server and directory path), add "&loconly=1" to the request:

http://www.archive.org/services/find_file.php?file=amisamile00kelmuoft&loconly=1
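
The find_file.php requests could be scripted along these lines. This is only a sketch: the function names are hypothetical, and because the element names of the XML response are not given here, fetch_find_file just returns the parsed root element rather than pulling out specific fields.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

FIND_FILE = "http://www.archive.org/services/find_file.php"


def find_file_url(item_id, loc_only=False):
    """Build a find_file.php URL; loc_only adds the loconly=1 parameter."""
    params = {"file": item_id}
    if loc_only:
        params["loconly"] = 1
    return f"{FIND_FILE}?{urlencode(params)}"


def fetch_find_file(item_id, loc_only=False, timeout=30):
    """Fetch the XML response and return its root element.

    The response schema is not described in this comment, so no
    element or attribute names are assumed here.
    """
    url = find_file_url(item_id, loc_only)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return ET.fromstring(response.read())
```

Passing loc_only=True corresponds to the shorter location-only request.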

Given the server, the scandata is available through the getScandata.php script:

http://ia341012.us.archive.org/getScandata.php?identifier=amisamile00kelmuoft

(The advantage of this route is that getScandata.php understands that scandata sometimes lives inside a zip file, and knows how to extract just the scandata portion from the zip.)
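
For the missing leaf number => page index mapping, something like the following Python sketch could work once the scandata has been fetched. It rests on assumptions that are not confirmed in this comment: that the scandata lists pages as `<page leafNum="N">` elements without an XML namespace, that a child element `<addToAccessFormats>` marks whether a leaf is shown, and that a page with no such element counts as shown. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def scandata_url(server, item_id):
    """Build the getScandata.php URL for an item on a given server."""
    return f"http://{server}/getScandata.php?{urlencode({'identifier': item_id})}"


def leaf_to_index(scandata_xml):
    """Map leaf numbers to page indices.

    ASSUMPTIONS (not stated in this comment): pages are <page leafNum="N">
    elements in reading order, and only pages whose <addToAccessFormats>
    is "true" (or absent) receive a page index.
    """
    root = ET.fromstring(scandata_xml)
    mapping = {}
    for page in root.iter("page"):
        leaf = page.get("leafNum")
        if leaf is None:
            continue
        flag = (page.findtext("addToAccessFormats") or "true").strip().lower()
        if flag == "true":
            # Indices are assigned in order to the leaves that are kept.
            mapping[int(leaf)] = len(mapping)
    return mapping
```

The server name would come from the location info returned by one of the requests above, for example scandata_url("ia341012.us.archive.org", "amisamile00kelmuoft").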