Crashes with UnicodeDecodeError when trying to handle paths with non-ascii chars

Bug #1694261 reported by Mattia Rizzolo
This bug affects 2 people
Affects                   Status        Importance  Assigned to     Milestone
python-werkzeug (Ubuntu)  Fix Released  Medium      Unassigned
Trusty                    Won't Fix     Medium      Mattia Rizzolo
Xenial                    Won't Fix     Medium      Mattia Rizzolo

Bug Description

[Impact]

 * I discovered this in a MoinMoin instance I administer, which has pages with non-ASCII characters in their names (e.g. àèéìòù). Trying to open such a page yields:

[:info] mod_wsgi (pid=29522, process='', application='wiki.ubuntu-it.org:8801|'): Loading WSGI script '/srv/wiki.ubuntu-it.org/www/moin.wsgi'.
[:error] mod_wsgi (pid=29522): Exception occurred processing WSGI script '/srv/wiki.ubuntu-it.org/www/moin.wsgi'.
[:error] Traceback (most recent call last):
[:error] File "/srv/wiki.ubuntu-it.org/www/moin.wsgi", line 71, in __call__
[:error] return self.app(environ, start_response)
[:error] File "/usr/lib/python2.7/dist-packages/werkzeug/wsgi.py", line 558, in __call__
[:error] cleaned_path = cleaned_path.encode(sys.getfilesystemencoding())
[:error] UnicodeEncodeError: 'ascii' codec can't encode character u'\xe0' in position 37: ordinal not in range(128)
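
The failing step in the traceback can be reproduced in isolation. The following is a minimal sketch in Python 3 (the original failure occurred under Python 2.7); the path and the `encode_path` helper are hypothetical and only stand in for the `cleaned_path.encode(...)` call shown above.

```python
# Minimal sketch: encoding a path that contains U+00E0 (à) with the ASCII
# codec fails, while UTF-8 succeeds. The path below is made up.
path = "/wiki/citt\u00e0"


def encode_path(path, encoding):
    """Stand-in for cleaned_path.encode(sys.getfilesystemencoding())."""
    return path.encode(encoding)


try:
    encode_path(path, "ascii")
except UnicodeEncodeError as exc:
    # Same exception type as in the mod_wsgi log above.
    print(type(exc).__name__ + ":", exc)

# With a UTF-8 filesystem encoding the same call works.
utf8_bytes = encode_path(path, "utf-8")
print(utf8_bytes)
```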

[Test Case]

 * https://bugs.launchpad.net/ubuntu/+source/python-werkzeug/+bug/1694261/+attachment/4886272/+files/test.py

[Regression Potential]

 * The patch only replaces all sys.getfilesystemencoding() calls with calls to a local function that wraps the same original call. The overall changeset seems low on regression potential.
 * I've installed the proposed package in the wiki.ubuntu-it.org server and it is working :)
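
For illustration, a wrapper of the kind described above might look like the sketch below. This is not the actual patch: the function names are hypothetical, and the sketch assumes (as the later discussion about a behaviour change implies) that the wrapper substitutes UTF-8 whenever the reported filesystem encoding is ASCII.

```python
import codecs
import sys


def pick_path_encoding(reported):
    """Return the encoding to use for paths (hypothetical helper).

    Assumption: an ASCII (or missing) filesystem encoding is treated as a
    sign of an unset locale, and UTF-8 is used instead.
    """
    # codecs.lookup() normalises aliases, e.g. the name glibc reports for
    # the C locale ("ANSI_X3.4-1968") resolves to "ascii".
    if reported is None or codecs.lookup(reported).name == "ascii":
        return "utf-8"
    return reported


def get_filesystem_encoding():
    """Wrap the original sys.getfilesystemencoding() call."""
    return pick_path_encoding(sys.getfilesystemencoding())


if __name__ == "__main__":
    print(get_filesystem_encoding())
```

Any non-ASCII encoding that the system reports is passed through unchanged, so only the ASCII case behaves differently from the unpatched code.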

[Other Info]

 * This is upstream bug https://github.com/pallets/werkzeug/issues/635, fixed by upstream commit https://github.com/pallets/werkzeug/commit/bba0cdcc67d4a1160d4ed9d3f99aef170a79dd88. The fix was released in version 0.11 and is present in yakkety and later.

Mattia Rizzolo (mapreri) wrote :
Changed in python-werkzeug (Ubuntu Trusty):
assignee: nobody → Mattia Rizzolo (mapreri)
importance: Undecided → Medium
status: New → In Progress
Changed in python-werkzeug (Ubuntu):
status: In Progress → Fix Released
Mattia Rizzolo (mapreri)
Changed in python-werkzeug (Ubuntu):
assignee: Mattia Rizzolo (mapreri) → nobody
Pietro Albini (pietroalbini) wrote :

Wrote a test for this bug, it's attached to this comment.

description: updated
Mattia Rizzolo (mapreri) wrote :
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python-werkzeug (Ubuntu Xenial):
status: New → Confirmed
Changed in python-werkzeug (Ubuntu Xenial):
assignee: nobody → Mattia Rizzolo (mapreri)
status: Confirmed → In Progress
Mattia Rizzolo (mapreri)
Changed in python-werkzeug (Ubuntu Xenial):
importance: Undecided → Medium
Robie Basak (racb) wrote :

Thank you for driving the fix for this bug.

Reading the upstream description though, it seems to me that the root cause is a misconfigured server with an unset locale? It seems that setting LANG=C.UTF-8 or similar would fix the problem for affected users, and is more correct. Is this correct? If so, is this SRU really necessary?

Could there be a case where a user deliberately intends the locale to be ASCII-based, and this SRU will cause a regression?

Robie Basak (racb) wrote :

10:53 <mapreri> rbasak: besides that I can't see why anybody would want to limit its filesystem to ASCII-only (which, btw, would be a purely artificial limitation I can't even think where to set), reading it as UTF-8 would Just Work.

10:57 <mapreri> Besides, I'm not convinced that I should be setting LANG=C.UTF-8 anywhere to just open a file that happens to have a non-ASCII filename.

10:57 <mapreri> and yes, setting LANG=C.UTF-8 would workaround the bug, but it's very much not the fix here.

10:58 <mapreri> (i.e. it worked before, it works after with the new upstream, it just doesn't work in-between…)

12:49 <rbasak> mapreri: would setting LANG correctly be a bug workaround, or would it be fixing the root cause? Do you know how it is that your system came to have LANG be unset?

12:49 <rbasak> mapreri: I'm concerned because the patch introduces a behaviour change, and some users may have problems with that in a way I can't necessarily predict.

12:50 <rbasak> mapreri: on the other hand, if you're impacted (I think by having a misconfigured server in the first place), then you can fix the behaviour on your own server without having something forcibly pushed out to everyone else.

12:51 <rbasak> mapreri: if you're concerned about doing it globally, you could arrange to set LANG just for that process even.
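
As a rough sketch of the per-process workaround suggested in the log above, the snippet below starts a child process with LANG set only in that child's environment. The `env_with_lang` helper is hypothetical, and the sketch assumes the C.UTF-8 locale is available on the system. How the variable is set for a mod_wsgi application depends on the Apache configuration and is not shown here.

```python
import os
import subprocess
import sys


def env_with_lang(lang="C.UTF-8", base=None):
    """Return a copy of the environment with LANG set (hypothetical helper).

    Note: LC_ALL or LC_CTYPE, if set, take precedence over LANG.
    """
    env = dict(os.environ if base is None else base)
    env["LANG"] = lang
    return env


if __name__ == "__main__":
    # Run a child interpreter with the modified environment and report the
    # filesystem encoding it sees; the parent environment is left untouched.
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; print(sys.getfilesystemencoding())"],
        env=env_with_lang(),
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```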

Robie Basak (racb) wrote :

I am not convinced that an SRU is warranted here, but comments welcome, or if another SRU team member disagrees, go ahead and accept.

I could be convinced if it turns out that the cause isn't a system misconfiguration (or otherwise happens in some default case), or if there's some wider user impact, or if this isn't trivial to work around by setting LANG.

Pietro Albini (pietroalbini) wrote :

I reproduced the bug in a fresh xenial container: I simply installed moin and apache in a new container, added moin as a wsgi script in apache and tried to create an /à page in the wiki. I guess the same also happens on trusty, but I don't have time to install moin in another container.

As for the possible regression: as Mattia said in the chat log, I don't think there will be one. ASCII is a subset of UTF-8, so even if someone forces ASCII for whatever reason, nothing should break.
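
The subset argument can be checked directly. The sketch below (with a hypothetical helper and made-up page names) shows that an ASCII-only path encodes to the same bytes under both codecs, so only names that previously raised an error are handled differently.

```python
def same_bytes_in_both(text):
    """True if text encodes identically as ASCII and as UTF-8.

    Raises UnicodeEncodeError if text is not pure ASCII.
    """
    return text.encode("ascii") == text.encode("utf-8")


# An ASCII-only page name yields identical bytes either way.
ascii_only_ok = same_bytes_in_both("/wiki/FrontPage")
print(ascii_only_ok)

# A non-ASCII page name has no ASCII encoding at all, so UTF-8 is the only
# one of the two codecs that can represent it.
non_ascii_bytes = "/\u00e0".encode("utf-8")
print(non_ascii_bytes)
```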

Robie Basak (racb) wrote :

I still fail to see why setting LANG correctly won't solve the problem for affected users. Nobody has given any explanation as to why this is not possible. Solving the problem this way eliminates all regression risk. So I'm rejecting this from the queue.

Changed in python-werkzeug (Ubuntu Trusty):
status: In Progress → Invalid
Changed in python-werkzeug (Ubuntu Xenial):
status: In Progress → Invalid
Changed in python-werkzeug (Ubuntu Trusty):
status: Invalid → Won't Fix
Changed in python-werkzeug (Ubuntu Xenial):
status: Invalid → Won't Fix