Metadata request to Amazon fails for some user agents

Bug #1827027 reported by Peter Richter on 2019-04-30
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

I noticed that some metadata queries to Amazon fail even though the ID supplied is correct.
Here is an example error log for such a case:

----------------------------------------------------------------------------------------------------
calibre, version 3.41.3
ERROR: No matches found: <p>No books were found for the current search. Try making the search <b>less specific</b>. For example, use only the author's last name and a single distinctive word from the title.<p>To read the full log, click "Show details".

Running identify query with parameters:
{u'authors': [u'Alfred Bekker'], u'timeout': 30, u'title': u'John Devlin, Blood Empire - Widerg\xe4nger: Cassiopeiapress Vampir Roman (German Edition)', u'identifiers': {u'mobi-asin': u'B07L1T6JP5'}}
Using plugins: Amazon.com (1, 2, 7)
The log from individual plugins is below

****************************** Amazon.com (1, 2, 7) ******************************
Found 0 results
Downloading from Amazon.com took 3.26099991798
User-agent: Mozilla/5.0 (Linux; Android 8.0.0; VTR-L29; rv:63.0) Gecko/20100101 Firefox/63.0
Server: amazon
Getting details from: https://www.amazon.de/dp/B07L1T6JP5
Error parsing title for url: u'https://www.amazon.de/dp/B07L1T6JP5'
Traceback (most recent call last):
  File "<string>", line 359, in parse_details
  File "<string>", line 496, in parse_title
ValueError: No title block found

Could not find title/authors/asin for u'https://www.amazon.de/dp/B07L1T6JP5'
ASIN: u'B07L1T6JP5' Title: None Authors: []

********************************************************************************
The identify phase took 3.41 seconds
The longest time (3.261000) was taken by: Amazon.com
Merging results from different sources
We have 0 merged results, merging took: 0.00 seconds
----------------------------------------------------------------------------------------------------

Some tests have shown that this depends on the User-Agent string used. For every user agent that reports an iPhone or Android platform, the result page has no h1 tag for the title, so parsing the title fails.
Those pages contain this instead:

----------------------------------------------------------------------------------------------------
<span id="ebooksTitle" class="a-size-base a-text-bold" role="heading">
John Devlin, Blood Empire - Widerg&auml;nger: Cassiopeiapress Vampir Roman
</span>
----------------------------------------------------------------------------------------------------
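One way the parser could be extended is sketched below. This is only an illustration built on Python's standard html.parser module, not calibre's actual parse_title code: the names TitleFinder and find_title are hypothetical, and treating any h1 as the desktop title is a simplifying assumption. It tries the h1 text first and falls back to the element with id="ebooksTitle" seen on the mobile pages.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Hypothetical helper: collects text inside <h1> elements and inside
    the element whose id is "ebooksTitle" (the mobile page markup)."""

    def __init__(self):
        super().__init__()  # default convert_charrefs=True decodes &auml; etc.
        self.h1_text = []
        self.mobile_text = []
        self._target = None  # list currently receiving text, if any
        self._tag = None     # tag name of the element being captured
        self._depth = 0      # nesting depth of that tag name

    def handle_starttag(self, tag, attrs):
        if self._target is not None:
            # Already capturing: only track nesting of the same tag name.
            if tag == self._tag:
                self._depth += 1
            return
        if tag == "h1":
            self._target, self._tag, self._depth = self.h1_text, tag, 1
        elif dict(attrs).get("id") == "ebooksTitle":
            self._target, self._tag, self._depth = self.mobile_text, tag, 1

    def handle_endtag(self, tag):
        if self._target is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                self._target = None

    def handle_data(self, data):
        if self._target is not None:
            self._target.append(data)


def find_title(html):
    """Return the h1 title if present, else the ebooksTitle text."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    for chunks in (finder.h1_text, finder.mobile_text):
        title = " ".join("".join(chunks).split())  # collapse whitespace
        if title:
            return title
    raise ValueError("No title block found")
```

Calling find_title on the span shown above would return the title with the entity decoded and the surrounding line breaks removed; a page with neither element still raises the same ValueError as in the log.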

The result is the same for the server amazon.com.
So I think either the affected user agents should be excluded or the parser should be extended to handle this markup.
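The first option, excluding the affected user agents, could look roughly like the sketch below. It assumes the plugin picks its User-Agent from a list of candidate strings; the names MOBILE_MARKERS, is_mobile_user_agent and desktop_user_agents are hypothetical and not part of calibre.

```python
# Substrings that identified the failing user agents in the tests above.
MOBILE_MARKERS = ("iphone", "android")


def is_mobile_user_agent(user_agent):
    """Return True if the User-Agent string names an iPhone or Android platform."""
    ua = user_agent.lower()
    return any(marker in ua for marker in MOBILE_MARKERS)


def desktop_user_agents(candidates):
    """Keep only the candidate User-Agent strings that are not mobile."""
    return [ua for ua in candidates if not is_mobile_user_agent(ua)]


if __name__ == "__main__":
    failing = ("Mozilla/5.0 (Linux; Android 8.0.0; VTR-L29; rv:63.0) "
               "Gecko/20100101 Firefox/63.0")
    print(is_mobile_user_agent(failing))
```

A plain substring check is the simplest choice here; it would also drop Android tablets, which may or may not receive the same page layout.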

Best regards, Peter

Fixed in branch master. The fix will be in the next release. calibre is usually released every alternate Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released