unportable assumption about default character set

Bug #1522052 reported by Thomas Klausner on 2015-12-02
This bug affects 1 person
Affects: lxml
Importance: Undecided
Assigned to: Unassigned

Bug Description

lxml-3.5.0 on NetBSD 7.99.22 with python-3.4.3 in the default C locale fails two tests:

Doctest: xpathxslt.txt
======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/pkg/lib/python3.4/unittest/case.py", line 58, in testPartExecutor
    yield
  File "/usr/pkg/lib/python3.4/unittest/case.py", line 577, in run
    testMethod()
  File "/disk/3/archive/obj/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/usr/pkg/lib/python3.4/tempfile.py", line 295, in mkdtemp
    _os.mkdir(file, 0o700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-18: ordinal not in range(128)

======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ElementTreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/pkg/lib/python3.4/unittest/case.py", line 58, in testPartExecutor
    yield
  File "/usr/pkg/lib/python3.4/unittest/case.py", line 577, in run
    testMethod()
  File "/disk/3/archive/obj/textproc/py-lxml/work/lxml-3.5.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/usr/pkg/lib/python3.4/tempfile.py", line 295, in mkdtemp
    _os.mkdir(file, 0o700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-18: ordinal not in range(128)

----------------------------------------------------------------------
Ran 1735 tests in 24.823s

If I understand the test correctly, it creates a temporary directory whose name contains Russian characters. The name has to be converted to the locale's character set, which is ASCII in the default C locale on NetBSD, so the conversion fails and the test errors out. In my understanding the test works on Linux because the default C locale is UTF-8 there and the conversion works.

I don't know what the test really wants to test. Please change this test to be more portable.
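One possible way to make such a test portable, sketched here under assumptions rather than taken from the lxml sources: check up front whether the filesystem encoding can represent the non-ASCII name, and skip the test if it cannot. The helper `fs_can_encode`, the constant `DIRNAME_RU`, and the test class are hypothetical names used only for illustration.

```python
import os
import sys
import tempfile
import unittest


def fs_can_encode(name, encoding=None):
    """Return True if `name` can be encoded with the filesystem encoding.

    Hypothetical helper; `encoding` defaults to what Python uses for
    file names on this platform.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical stand-in for the Russian prefix used by the failing test.
DIRNAME_RU = "\u041a\u0430\u0442\u0430\u043b\u043e\u0433"


class PortableIOTest(unittest.TestCase):
    @unittest.skipUnless(fs_can_encode(DIRNAME_RU),
                         "filesystem encoding cannot represent the name")
    def test_non_ascii_dir(self):
        # Only reached when the name is encodable, so mkdtemp should not
        # raise UnicodeEncodeError here.
        dn = tempfile.mkdtemp(prefix=DIRNAME_RU)
        try:
            self.assertTrue(os.path.isdir(dn))
        finally:
            os.rmdir(dn)
```

Skipping keeps the test meaningful on UTF-8 systems while avoiding a spurious error in an ASCII-only locale. Whether skipping or using an ASCII-only name is the better fix depends on what the test is meant to cover.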

Thomas Klausner (tk-giga) wrote :

Having debugged a similar problem for cookiecutter, I'm pretty sure that this problem will also appear on Linux when you unset LANG* and LC_*, since the Linux C locale is ASCII too.
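A rough way to check this on any system is to start a child Python with the locale variables removed and ask it which encoding it uses for file names. This is only a sketch; the function name `fs_encoding_without_locale` is made up for illustration. With the Python versions from this report (3.4, 3.6) I would expect an ASCII result, while Python 3.7 and later may coerce the C locale and report UTF-8 instead.

```python
import os
import subprocess
import sys


def fs_encoding_without_locale():
    """Return the filesystem encoding of a child Python run without
    LANG, LANGUAGE, or any LC_* variable in its environment."""
    env = {
        key: value
        for key, value in os.environ.items()
        if key not in ("LANG", "LANGUAGE") and not key.startswith("LC_")
    }
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import sys; print(sys.getfilesystemencoding())"],
        env=env,
    )
    return out.decode("ascii").strip()


if __name__ == "__main__":
    print(fs_encoding_without_locale())
```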

Thomas Klausner (tk-giga) wrote :

Still there in 3.6.1.

Thomas Klausner (tk-giga) wrote :

Same in 3.6.4.

Thomas Klausner (tk-giga) wrote :

This bug is still there in 3.8.0, with slightly different line numbers:

======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/pkg/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/pkg/lib/python3.6/unittest/case.py", line 605, in run
    testMethod()
  File "/scratch/textproc/py-lxml/work/lxml-3.8.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/usr/pkg/lib/python3.6/tempfile.py", line 368, in mkdtemp
    _os.mkdir(file, 0o700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-18: ordinal not in range(128)

======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ElementTreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/pkg/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/usr/pkg/lib/python3.6/unittest/case.py", line 605, in run
    testMethod()
  File "/scratch/textproc/py-lxml/work/lxml-3.8.0/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/usr/pkg/lib/python3.6/tempfile.py", line 368, in mkdtemp
    _os.mkdir(file, 0o700)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 5-18: ordinal not in range(128)
