Amazon source picks up ratings from ANY book on the page (suggested etc)
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
calibre | Fix Released | Undecided | Unassigned |
Bug Description
The XPaths in lib/calibre/
def parse_rating (self, root, isbn):
    # pick up link "N customer reviews" that links to the current book page
    nrev = root.xpath ("//a [contains (@href, '/{}')]/text() [contains (., ' customer review'
    if nrev: # number of reviews
        nrev = nrev [0]
        # find number of stars
        stars = nrev.xpath ("..//* [contains (@title, ' out of ') and contains (@title, ' stars')]/@title")
        nrev = nrev.xpath ("text()") [0].split (" ", 1) [0]
        if stars:
            stars = re.match ("([0-9.]+) out of ([0-9.]+) stars$", stars [0])
            if stars:
                stars = float (stars.group (1)) / float (stars.group (2))
    # return stars, nrevs
It picks up any link with the text "N customer reviews" *that links to the current book page* (not to some other book). Of course, this code still needs i18n, etc.
If you wish to exclude the ratings from the recommended books, a better
approach is to detect and remove that section from root before running
the xpaths. The xpaths are loose for a reason: Amazon's various servers
and software generations across the world serve up very different
markup.
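
The "remove that section from root first" approach can be sketched roughly as below. This is only an illustration, not calibre's code: it uses the standard library's ElementTree as a stand-in for lxml so it runs without dependencies, and the markup, the `recommended` id, and the helper names `strip_sections` and `rating_titles` are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the ids and class names are invented for this sketch
# and do not reflect what Amazon actually serves.
PAGE = """
<html><body>
  <div id="main">
    <span class="rating" title="4.5 out of 5 stars"/>
  </div>
  <div id="recommended">
    <span class="rating" title="2.0 out of 5 stars"/>
    <span class="rating" title="3.5 out of 5 stars"/>
  </div>
</body></html>
"""


def strip_sections(root, section_ids):
    """Remove every element whose id is in section_ids, in place."""
    # ElementTree has no getparent(), so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    for elem in list(root.iter()):
        if elem.get("id") in section_ids and elem in parents:
            parents[elem].remove(elem)
    return root


def rating_titles(root):
    """Loose match: any element whose title looks like a star rating."""
    return [
        e.get("title")
        for e in root.iter()
        if " out of " in e.get("title", "") and e.get("title", "").endswith(" stars")
    ]


root = ET.fromstring(PAGE)
before = rating_titles(root)          # loose match sees all three ratings
strip_sections(root, {"recommended"})
after = rating_titles(root)           # only the main book's rating is left
```

With lxml, the parent map should not be needed, since elements expose `getparent()`. The loose matching stays untouched; only the tree it runs against gets smaller.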