BibleGateway importer crashes on non unicode urls
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenLP | Status tracked in Trunk | | |
2.0 | Fix Released | Medium | Phill |
Trunk | Fix Released | Medium | Tomas Groth |
Bug Description
The user tried to import "Dette er bibelen på dansk" from BibleGateway and got the following error. Reported on 2.0.3, confirmed on trunk.
See: http://
*OpenLP bug report*
Version: {u'full': u'2.0.3', u'version': u'2.0.3', u'build': None}
--- Exception details ---
tried to import a bible
--- Exception traceback ---
Traceback (most recent call last):
File D:\OpenLP_ line 188, in onCurrentIdChanged
File D:\OpenLP_ line 714, in performWizard
File D:\OpenLP_ line 537, in do_import
File D:\OpenLP_ line 276, in get_books_from_http
File D:\OpenLP_ line 467, in get_web_
byte 0xc3 in position 54: ordinal not in range(128)
--- System information ---
Platform: Windows-
---
Python: 2.7.3
Qt4: 4.8.3
Phonon: 4.4.0
PyQt4: 4.9.5
QtWebkit: 534.34
SQLAlchemy: 0.7.7
SQLAlchemy Migrate: 0.7.2
BeautifulSoup: 3.2.1
lxml: 2.3.0
Chardet: 1.0.1
PyEnchant: 1.6.5
PySQLite: 1.0.1
Mako: 0.7.0
pyUNO bridge: -
Please write the contents of the bug report in English, as the developers of OpenLP are of many different nationalities.
Related branches
- Andreas Preikschat (community): Approve
- Raoul Snyman: Approve
- Diff: 16 lines (+5/-1), 1 file modified: openlp/core/utils/__init__.py (+5/-1)
- Tim Bentley: Approve
- Raoul Snyman: Approve
- Diff: 72 lines (+32/-1), 3 files modified:
openlp/core/utils/__init__.py (+19/-0)
openlp/plugins/bibles/lib/http.py (+0/-1)
tests/interfaces/openlp_plugins/bibles/test_lib_http.py (+13/-0)
tags: added: bible bible-import support-system
summary:
- BibleGateway importer crashes on non ASCII names
+ BibleGateway importer crashes on non unicode urls
It appears that when a URL passed as unicode is requested with urllib2.urlopen, geturl() returns unicode. However, the URL that we request in the BibleGateway importer is redirected. In this case the redirect uses a UTF-8 URL, and (I'm guessing) urllib2 takes it as a UTF-8 encoded URL, so geturl() returns a UTF-8 encoded byte string rather than unicode. For the 2.0 branch I've just detected whether geturl() returns unicode or not. In trunk it might be wise to encode URLs as UTF-8 before we request them?
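The two approaches mentioned above can be sketched roughly as follows. This is not the code from the merged branches: the helper names `to_text` and `quote_url` and the example.com URL are made up for illustration, and the sketch uses Python 3 syntax (bytes/str) rather than the Python 2.7 str/unicode types from the report.

```python
from urllib.parse import quote


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as text.

    Byte strings are decoded with the given encoding; text is returned
    unchanged. This mirrors the "detect what geturl() returned" idea.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def quote_url(url):
    """Hypothetical helper: percent-encode non-ASCII characters as UTF-8.

    URL delimiters and existing percent escapes are left untouched, so the
    URL handed to the HTTP library is plain ASCII.
    """
    return quote(url, safe=":/?&=#%")


# A redirected URL that came back as UTF-8 bytes (0xc3 0xa5 is "å").
redirected = b"http://example.com/p\xc3\xa5"

# Decoding as ASCII fails in the same way as in the report.
try:
    redirected.decode("ascii")
except UnicodeDecodeError as error:
    print(error.reason)

# Either form of the URL normalises to the same text.
assert to_text(redirected) == to_text("http://example.com/p\u00e5")

# Encoding up front yields an ASCII-only URL.
print(quote_url("http://example.com/p\u00e5"))
```

Normalising the return value is the smaller change, which fits a stable branch. Encoding the URL before the request avoids mixed types in the first place, but it touches every caller that builds a URL.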