[Enhancement] Get tags from Amazon

Bug #1206763 reported by James
This bug affects 1 person
Affects: calibre | Status: Fix Released | Importance: Undecided | Assigned to: Martha Herr

Bug Description

Could the Amazon metadata plug-in pull the categories listed in the "Look for Similar Items by Category" section as tags?

The section appears on most book details pages on every language site, and the categories are well defined and fairly stable, so it would make a good source of tags.
There are some extraneous "tags" that differ by site language, because each site uses different core categories (e.g. "Books", "Kindle Store" and "Kindle eBooks" frequently appear on the .com site, while "Special Features", "By Authors" and "Authors & Illustrators, A-Z" appear on the .jp site). Still, getting tags for the many books not listed by other sources would be well worth the effort of cleaning out a few extraneous tags in the tag editor.
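A minimal sketch of the clean-up step described above, assuming the category section has already been reduced to ">"-separated path strings. The function name `tags_from_category_paths` and the stoplist are hypothetical; the stoplist holds only the examples quoted in this report, and a real one would likely need more entries per site.

```python
# Hypothetical stoplist, taken from the examples in this report (compared case-insensitively).
EXTRANEOUS = {
    "books", "kindle store", "kindle ebooks",
    "special features", "by authors", "authors & illustrators, a-z",
}


def tags_from_category_paths(paths, extraneous=EXTRANEOUS):
    """Split '>'-separated category paths into an ordered, de-duplicated tag list."""
    tags, seen = [], set()
    for path in paths:
        for part in path.split(">"):
            tag = part.strip()
            key = tag.lower()
            # Skip empty pieces, site-level categories, and repeats.
            if not tag or key in extraneous or key in seen:
                continue
            seen.add(key)
            tags.append(tag)
    return tags


# Made-up sample paths, not copied from a real Amazon page.
paths = [
    "Books > Science Fiction & Fantasy > Fantasy > Epic",
    "Kindle Store > Kindle eBooks > Science Fiction & Fantasy > Fantasy",
]
print(tags_from_category_paths(paths))
# ['Science Fiction & Fantasy', 'Fantasy', 'Epic']
```

Filtering at import time would not replace the tag editor, but it would keep the most common site-level categories out of the library.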

Kovid Goyal (kovid) wrote : Fixed in master

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released
Martha Herr (the503girl)
Changed in calibre:
assignee: nobody → Martha Herr (the503girl)
James (ecwfrk) wrote :

Any chance of this functionality being restored? It disappeared with the recent Amazon updates. The tags have moved to Amazon Author Rank under the More About the Author section.

Kovid Goyal (kovid) wrote : Re: calibre bug 1206763

I can't get the author rank feature to appear reliably in the HTML from
Amazon; it either uses JavaScript or is present only for very few books
or only for certain book types/geographical regions. As such, it is too
unreliable to parse.

Kovid Goyal (kovid) wrote :

Confirmed, the entire More About the Author section is loaded
dynamically with JavaScript. There is, however, an Amazon Best Sellers
Rank section for some (presumably best-selling) books that can serve as
a source for tags, which I will implement.
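As an illustration only, here is a sketch of how category links could be collected from such a section using Python's standard `html.parser`. The container id `SalesRank`, the sample markup, and the class name `RankLinkCollector` are assumptions made for this example; they are not taken from calibre's source or from a verified Amazon page.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID = {"br", "img", "hr", "input", "meta", "link"}


class RankLinkCollector(HTMLParser):
    """Collect the text of every <a> nested inside the element with a given id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0          # > 0 while inside the container element
        self.in_link = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self.depth:
            self.depth += 1
            if tag == "a":
                self.in_link = True
                self.links.append("")
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID or not self.depth:
            return
        self.depth -= 1
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.links[-1] += data


# Hypothetical markup standing in for a Best Sellers Rank section.
SAMPLE = """
<li id="SalesRank"><b>Amazon Best Sellers Rank:</b> #1,234 Paid in Kindle Store
<ul>
  <li>#12 in <a href="#">Kindle Store</a> &gt; <a href="#">Kindle eBooks</a>
      &gt; <a href="#">Fantasy</a> &gt; <a href="#">Epic</a></li>
</ul>
</li>
<div id="other"><a href="#">Not a tag</a></div>
"""

parser = RankLinkCollector("SalesRank")
parser.feed(SAMPLE)
print(parser.links)
# ['Kindle Store', 'Kindle eBooks', 'Fantasy', 'Epic']
```

The collector assumes reasonably well-formed markup inside the container; real pages may need a more forgiving parser.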

James (ecwfrk) wrote :

Yeah, it also looks like they only appear on details pages for paper books. I should have looked deeper before asking.

The tags under the Best Sellers Rank seem to be mostly limited to the common, very high-ranking books.

But it looks like everything found in the "Kindle Store" category, as well as quite a few of the older and self-published books, still displays the Look for Similar Items by Category section (URLs to a couple of examples below).
Those are the kinds of books it's really valuable to get Amazon tags for, as they include a lot of free, self-published and obscure books that often don't show up on any other data source. Would it be possible to have calibre pick those up when they exist? Maybe even have an option somewhere to limit title/author searches to the Kindle Store to ensure they do (if possible and practical).

http://www.amazon.com/Knights-Divinity-Fantasy-Series-ebook/dp/B005620I2M
http://www.amazon.com/Historical-Atlas-World-Rand-McNally/dp/0528004913/
http://www.amazon.com/UnEnchanted-Unfortunate-Fairy-Tale-ebook/dp/B006ROK1UM/

Kovid Goyal (kovid) wrote :

Unfortunately, it isn't that simple. For example, when I visit
http://www.amazon.com/Knights-Divinity-Fantasy-Series-ebook/dp/B005620I2M

I see a page with no "Look for Similar Items by Category" section, but I
do see an Amazon Best Sellers Rank. I can only assume that one or more of
the following are true:

1) Amazon deliberately serves different data based on the geographic
location of the connecting IP address
2) Amazon's database is severely fragmented globally, with inconsistent
data

This makes it rather difficult to automatically scrape tags in even a
semi-reliable fashion.
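One way to cope with pages that expose different sections to different visitors is to try several sources in order and accept whichever yields tags. The sketch below is a hypothetical illustration, not calibre's implementation: `first_available_tags`, the extractor names, and the dict-based stand-in pages are all assumptions.

```python
def first_available_tags(page, extractors):
    """Return (source_name, tags) from the first extractor that finds any tags."""
    for name, extract in extractors:
        try:
            tags = extract(page)
        except Exception:
            # A missing or changed section should not abort the whole lookup.
            continue
        if tags:
            return name, tags
    return None, []


# Stand-in extractors working on plain dicts instead of real HTML.
extractors = [
    ("similar_items", lambda page: page["similar_items"]),
    ("best_sellers_rank", lambda page: page.get("best_sellers_rank", [])),
]

page_with_categories = {"similar_items": ["Fantasy", "Epic"]}
page_with_rank_only = {"best_sellers_rank": ["Fantasy"]}
page_with_neither = {}

print(first_available_tags(page_with_categories, extractors))
# ('similar_items', ['Fantasy', 'Epic'])
print(first_available_tags(page_with_rank_only, extractors))
# ('best_sellers_rank', ['Fantasy'])
print(first_available_tags(page_with_neither, extractors))
# (None, [])
```

This does not make the results consistent across regions; it only ensures that a book with no usable section ends up with no tags rather than causing an error.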
