unportable assumption about default character set
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxml | Fix Released | Low | scoder |
Bug Description
lxml-3.5.0 on NetBSD 7.99.22 with python-3.4.3 in the default C locale fails two tests:
Doctest: xpathxslt.txt
=======
ERROR: test_etree_
-------
Traceback (most recent call last):
File "/usr/pkg/
yield
File "/usr/pkg/
testMethod()
File "/disk/
dn = tempfile.
File "/usr/pkg/
_os.mkdir(file, 0o700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-18: ordinal not in range(128)
=======
ERROR: test_etree_
-------
Traceback (most recent call last):
File "/usr/pkg/
yield
File "/usr/pkg/
testMethod()
File "/disk/
dn = tempfile.
File "/usr/pkg/
_os.mkdir(file, 0o700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-18: ordinal not in range(128)
-------
Ran 1735 tests in 24.823s
If I understand the test correctly, it tries to encode a Russian string as ASCII (since the default C locale on NetBSD is ASCII) and fails, which makes the test fail. As far as I understand, the test works on Linux because the default C locale is UTF-8 there, so the conversion succeeds.
I don't know what the test really wants to test. Please change this test to be more portable.
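One possible way to make such a test portable, sketched here only as an illustration and not as the fix that was actually applied in lxml: check up front whether the filesystem encoding can represent the non-ASCII name, and skip the test otherwise. The helper `can_encode_for_filesystem`, the name `NON_ASCII_NAME`, and the test class are all hypothetical; the real test name and path are truncated in the traceback above and are not reproduced here.

```python
import os
import shutil
import sys
import tempfile
import unittest


def can_encode_for_filesystem(name, encoding=None):
    """Return True if `name` can be encoded with the given encoding.

    When no encoding is passed, the interpreter's filesystem encoding
    is used (plain ASCII under an ASCII-only C locale on Python 3.4).
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical non-ASCII directory prefix, standing in for the Russian
# string used by the failing test.
NON_ASCII_NAME = "каталог"


class NonAsciiPathTest(unittest.TestCase):
    @unittest.skipUnless(
        can_encode_for_filesystem(NON_ASCII_NAME),
        "filesystem encoding cannot represent non-ASCII names",
    )
    def test_non_ascii_tempdir(self):
        # Only runs when the name is encodable, so mkdir() cannot hit
        # the UnicodeEncodeError shown in the traceback.
        dn = tempfile.mkdtemp(prefix=NON_ASCII_NAME)
        try:
            self.assertTrue(os.path.isdir(dn))
        finally:
            shutil.rmtree(dn)
```

Skipping keeps the non-ASCII path coverage on systems that support it while avoiding a spurious error where the locale cannot express the name at all.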
Changed in lxml:
milestone: 5.0 → 4.9.4
Changed in lxml:
status: Fix Committed → Fix Released
Having debugged a similar problem for cookiecutter, I'm pretty sure that this problem will also appear on Linux when you unset LANG* and LC_*, since the Linux C locale is ASCII too.
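A rough way to check this on a given machine is to start a child interpreter with the locale variables removed and ask it which filesystem encoding it chose. This is only a sketch; the function name `filesystem_encoding_without_locale` is made up for illustration. Note that Python 3.7 and later may still report UTF-8 in the C locale because of locale coercion and UTF-8 mode, so the result can differ from what python-3.4.3 would print.

```python
import os
import subprocess
import sys


def filesystem_encoding_without_locale():
    """Report the filesystem encoding of a child Python with no locale set."""
    # Drop LANG, LANGUAGE and every LC_* variable; keep the rest of the
    # environment so the child interpreter can still start normally.
    env = {
        key: value
        for key, value in os.environ.items()
        if not (key.startswith("LANG") or key.startswith("LC_"))
    }
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"],
        env=env,
    )
    # Encoding names are plain ASCII, so decoding the output is safe.
    return output.decode("ascii").strip()


if __name__ == "__main__":
    print(filesystem_encoding_without_locale())
```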