codecs.open(errors='strict') doesn't fail on invalid encoding with python3.7.4 RC1

Bug #1834236 reported by Olivier Tilloy
This bug affects 1 person
Affects: python3.7 (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (not set)

Bug Description

The autopkgtests for package sphinx passed with Python 3.7.3, and regressed with 3.7.4 RC1.

See http://autopkgtest.ubuntu.com/packages/s/sphinx/eoan/amd64 (first failure on 2019-06-25 03:04:04 UTC).

This upstream commit appears to fix the problem: https://github.com/sphinx-doc/sphinx/commit/02fea02

This might indicate a bug in codecs.open in Python 3.7.4 RC1.

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: python3.7 3.7.4~rc1-1
ProcVersionSignature: Ubuntu 5.0.0-17.18-generic 5.0.8
Uname: Linux 5.0.0-17-generic x86_64
ApportVersion: 2.20.11-0ubuntu3
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Tue Jun 25 20:09:54 2019
InstallationDate: Installed on 2018-08-21 (308 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Alpha amd64 (20180821)
SourcePackage: python3.7
UpgradeStatus: Upgraded to eoan on 2019-06-05 (20 days ago)

Olivier Tilloy (osomon) wrote :

Note that I uploaded https://launchpad.net/ubuntu/+source/sphinx/1.8.5-1ubuntu2 with that upstream commit cherry-picked as a patch to unblock webkit2gtk.

Iain Lane (laney) wrote :

laney@raleigh> python3.7 --version
Python 3.7.3
laney@raleigh> python3.7 -c "import codecs; codecs.open('/home/laney/temp/sphinx/tests/roots/test-root/wrongenc.inc', encoding='utf-8-sig', errors='strict').read()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.7/codecs.py", line 701, in read
    return self.reader.read(size)
  File "/usr/lib/python3.7/codecs.py", line 504, in read
    newchars, decodedbytes = self.decode(data, self.errors)
  File "/usr/lib/python3.7/encodings/utf_8_sig.py", line 117, in decode
    return codecs.utf_8_decode(input, errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xdf in position 70: invalid continuation byte
laney@raleigh> lxc exec e-amd64 -- python3.7 --version
Python 3.7.4rc1
laney@raleigh> lxc exec e-amd64 -- python3.7 -c "import codecs; codecs.open('/home/laney/temp/sphinx/tests/roots/test-root/wrongenc.inc', encoding='utf-8-sig', errors='strict').read()"

Iain Lane (laney) wrote :

here's the referenced file

summary:
- sphinx autopkgtest failures with Python 3.7.4 RC1
+ codecs.open(errors='strict') doesn't fail on invalid encoding with python3.7.4 RC1
Iain Lane (laney) wrote :

It used to throw an error for this incorrect file, but as of 3.7.4 RC1 no exception is thrown.
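
For anyone without the sphinx test tree at hand, the check can be sketched in a self-contained way. This is only an illustration, not the sphinx test itself: the helper names (`strict_read_raises`, `make_sample`) and the sample bytes are made up for this example, and the real wrongenc.inc file has different contents. It assumes that a lone 0xdf byte followed by a space is invalid UTF-8, which is the same class of error shown in the 3.7.3 traceback above.

```python
import codecs
import os
import tempfile


def strict_read_raises(path, encoding="utf-8-sig"):
    """Return True if a strict read of `path` raises UnicodeDecodeError."""
    try:
        with codecs.open(path, encoding=encoding, errors="strict") as f:
            f.read()
    except UnicodeDecodeError:
        return True
    return False


def make_sample(data):
    """Write `data` (bytes) to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".inc")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# Hypothetical sample data: 0xdf is a UTF-8 lead byte, and the space after
# it is not a continuation byte, so the first file is not valid UTF-8.
bad_path = make_sample(b"caf\xdf latin-1 text\n")
good_path = make_sample("caf\u00e9 utf-8 text\n".encode("utf-8"))

bad_result = strict_read_raises(bad_path)
good_result = strict_read_raises(good_path)

os.remove(bad_path)
os.remove(good_path)

print("invalid file raises:", bad_result)
print("valid file raises:", good_result)
```

On an interpreter with the behaviour described for 3.7.3, the first result should be True and the second False. If the first result is False, the interpreter shows the regression reported here.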

Dmitry Shachnev (mitya57) wrote :

The error is caused by this change: https://github.com/python/cpython/pull/12603

This should fix it (not tested though): https://github.com/python/cpython/pull/14304

tags: added: rls-ee-incoming
Changed in python3.7 (Ubuntu):
importance: Undecided → High
Dmitry Shachnev (mitya57) wrote :

This was fixed in python3.7 3.7.4~rc2.

Changed in python3.7 (Ubuntu):
status: New → Fix Released