UnicodeDecodeError from broken package descriptions

Bug #1053749 reported by Mathias Burén on 2012-09-21
This bug affects 72 people
Affects                           Importance  Assigned to
dpkg (Ubuntu)                     High        Unassigned
  Quantal                         High        Unassigned
  Raring                          High        Unassigned
ubuntu-drivers-common (Ubuntu)    High        Martin Pitt
  Quantal                         High        Unassigned
  Raring                          High        Martin Pitt

Bug Description

Attempting to launch software-properties-gtk results in this:

$ software-properties-gtk
gpg: /tmp/tmpsw0n10/trustdb.gpg: trustdb created
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 162, in packages_for_modalias
    cache_map = packages_for_modalias.cache_maps[apt_cache_hash]
KeyError: 3989481

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/software-properties-gtk", line 103, in <module>
    app = SoftwarePropertiesGtk(datadir=options.data_dir, options=options, file=file)
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 178, in __init__
    self.init_drivers()
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 1097, in init_drivers
    self.devices = detect.system_device_drivers()
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 415, in system_device_drivers
    for pkg, pkginfo in system_driver_packages(apt_cache).items():
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 319, in system_driver_packages
    for p in packages_for_modalias(apt_cache, alias):
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 164, in packages_for_modalias
    cache_map = _apt_cache_modalias_map(apt_cache)
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 129, in _apt_cache_modalias_map
    m = package.candidate.record['Modaliases']
  File "/usr/lib/python3/dist-packages/apt/package.py", line 429, in record
    return Record(self._records.record)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xeb in position 114: invalid continuation byte

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: software-properties-gtk 0.92.6
ProcVersionSignature: Ubuntu 3.5.0-15.22-generic 3.5.4
Uname: Linux 3.5.0-15-generic x86_64
ApportVersion: 2.5.2-0ubuntu4
Architecture: amd64
Date: Fri Sep 21 08:54:17 2012
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Alpha amd64 (20120905.2)
PackageArchitecture: all
SourcePackage: software-properties
UpgradeStatus: No upgrade log present (probably fresh install)
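
A note on reading the traceback: the KeyError at the top is most likely just a cache miss, not the bug itself. The frames suggest that packages_for_modalias looks up a per-cache map and builds it on a miss; when the build step then fails, Python 3 prints both exceptions. A minimal sketch of that pattern follows. The names lookup_map, build_map and broken_build are hypothetical and this is not the actual detect.py code.

```python
def lookup_map(cache_hash, build_map):
    """Return the cached map for cache_hash, building it on a miss."""
    try:
        return lookup_map.cache_maps[cache_hash]
    except KeyError:
        # A miss is expected on first use. If build_map() raises here,
        # Python 3 attaches the KeyError as the context of the new error.
        result = lookup_map.cache_maps[cache_hash] = build_map()
        return result


lookup_map.cache_maps = {}


def broken_build():
    # Hypothetical stand-in for parsing a record with a non-UTF-8 byte.
    return b"Some field: \xebl and more text".decode("utf-8")


def demo():
    """Return the name of the chained exception and the decode reason."""
    try:
        lookup_map(3989481, broken_build)
    except UnicodeDecodeError as exc:
        return type(exc.__context__).__name__, exc.reason


print(demo())
```

The exception that actually stops the program is the last one printed, the UnicodeDecodeError.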

Revision history for this message
Mathias Burén (mathias-buren) wrote :

/etc/apt/sources.list.d$ for I in *.list;do echo $I;cat $I;echo;done
google-chrome.list
### THIS FILE IS AUTOMATICALLY CONFIGURED ###
# You may comment out this entry, but any other modifications may be lost.
deb http://dl.google.com/linux/chrome/deb/ stable main

mozillateam-firefox-next-quantal.list
deb http://ppa.launchpad.net/mozillateam/firefox-next/ubuntu quantal main

virtualbox.list
deb http://download.virtualbox.org/virtualbox/debian quantal contrib

webupd8team-java-quantal.list
deb http://ppa.launchpad.net/webupd8team/java/ubuntu quantal main
deb-src http://ppa.launchpad.net/webupd8team/java/ubuntu quantal main

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in software-properties (Ubuntu):
status: New → Confirmed
Revision history for this message
Jan Henke (jhe) wrote :

Push: this bug is a serious breaker. It needs to be fixed in Quantal ASAP!

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

Moving and rebuilding cache did not work for me.

/var/cache/apt/pkgcache.bin is still the same after rebuild.

The error still looks like this when starting the app:

http://paste.ubuntu.com/1290050/

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ software-properties-gtk
gpg: /tmp/tmp1_cuxf/trustdb.gpg: trustdb created
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 162, in packages_for_modalias
    cache_map = packages_for_modalias.cache_maps[apt_cache_hash]
KeyError: 3315533

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/software-properties-gtk", line 103, in <module>
    app = SoftwarePropertiesGtk(datadir=options.data_dir, options=options, file=file)
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 178, in __init__
    self.init_drivers()
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 1097, in init_drivers
    self.devices = detect.system_device_drivers()
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 415, in system_device_drivers
    for pkg, pkginfo in system_driver_packages(apt_cache).items():
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 319, in system_driver_packages
    for p in packages_for_modalias(apt_cache, alias):
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 164, in packages_for_modalias
    cache_map = _apt_cache_modalias_map(apt_cache)
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 129, in _apt_cache_modalias_map
    m = package.candidate.record['Modaliases']
  File "/usr/lib/python3/dist-packages/apt/package.py", line 429, in record
    return Record(self._records.record)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xeb in position 114: invalid continuation byte

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ grep-available -r . | iconv -f utf-8 -t ucs-2le > /dev/null; echo $?
iconv: illegal input sequence at position 226269
1

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ for F in /var/lib/apt/lists/*Packages; do iconv -f utf-8 -t ucs-2le $F > /dev/null || echo $F; done
$ echo $?
0
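
The same kind of check can be sketched in Python, which also reports the line number of each offending line. This is only an illustration: the function name invalid_utf8_lines and the sample record are made up, and the commented-out path is simply one of the files discussed in this report.

```python
from pathlib import Path


def invalid_utf8_lines(data: bytes):
    """Yield (line_number, raw_line) for every line that is not valid UTF-8."""
    # UTF-8 multi-byte sequences never contain a newline byte, so checking
    # line by line finds the same problems as checking the whole file.
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            yield number, line


# Hypothetical sample: the second line holds a Latin-1 encoded "é" (0xe9).
sample = b"Package: example\nMaintainer: Jos\xe9 Example\nSection: misc\n"
for number, line in invalid_utf8_lines(sample):
    print(number, line)

# To scan a real file, read it in binary mode, for example:
# data = Path("/var/lib/dpkg/status").read_bytes()
```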

Revision history for this message
Steve Langasek (vorlon) wrote :

This issue has been reported on IRC today. The problem seems to trace back to a locally-installed package with a non-utf8 maintainer field:

Package: davmail
Maintainer: Micka�l Guessant <email address hidden>

Of course, this package fails to comply with Debian policy, but that clearly didn't stop the user from being able to install it - which means the data is in the system and we need to be able to cope with it.

I'm not sure if this needs to be fixed in ubuntu-drivers or in python-apt. Reassigning to ubuntu-drivers for the moment.
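
To illustrate why byte 0xeb triggers "invalid continuation byte": in UTF-8, 0xeb announces a three-byte sequence, so the decoder expects a continuation byte next and finds the plain ASCII "l" instead. The sketch below assumes the field was written in Latin-1 (ISO-8859-1), where 0xeb is "ë"; that encoding is a guess consistent with the maintainer's name, not something the record itself states. The helper try_utf8 is hypothetical.

```python
raw = b"Maintainer: Micka\xebl Guessant"


def try_utf8(data: bytes) -> str:
    """Decode as UTF-8, or describe the first decoding problem."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"byte {data[exc.start]:#04x} at offset {exc.start}: {exc.reason}"


print(try_utf8(raw))               # describes the failure
print(raw.decode("latin-1"))       # readable under the Latin-1 assumption
print("Mickaël".encode("utf-8"))   # what valid UTF-8 bytes would look like
```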

Changed in software-properties (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
affects: software-properties (Ubuntu) → ubuntu-drivers-common (Ubuntu)
Changed in ubuntu-drivers-common (Ubuntu Quantal):
status: New → Triaged
importance: Undecided → High
Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ grep-available -r . | iconv -f utf-8 -t ucs-2le 1> /dev/null
iconv: illegal input sequence at position 226269

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ grep-available -r . | head -c 226300 | tail -n 1
Maintainer: Micka�l Guessant <email address hidden>

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ grep-available -r . | head -c 226300 | tail -n 6

Package: davmail
Priority: extra
Section: mail
Installed-Size: 5401
Maintainer: Micka�l Guessant <email address hidden>

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ aptitude show davmail
Package: davmail
State: installed
Automatically installed: no
Version: 3.9.9-1976-1
Priority: extra
Section: mail
Maintainer: Micka?l Guessant <email address hidden>
Architecture: all
Uncompressed Size: 5,531 k
Depends: openjdk-7-jre | openjdk-6-jre | sun-java6-jre, libswt-gtk-3-java | libswt-gtk-3.6-java | libswt-gtk-3.5-java | libswt-gtk-3.4-java
Description: DavMail POP/IMAP/SMTP/Caldav/Carddav/LDAP Exchange Gateway
 Ever wanted to get rid of Outlook ? DavMail is a POP/IMAP/SMTP/Caldav/Carddav/LDAP exchange gateway allowing users to use any mail/calendar client (e.g. Thunderbird
 with Lightning or Apple iCal) with an Exchange server, even from the internet or behind a firewall through Outlook Web Access. DavMail now includes an LDAP gateway to
 Exchange global address book and user personal contacts to allow recipient address completion in mail compose window and full calendar support with attendees free/busy
 display. The main goal of DavMail is to provide standard compliant protocols in front of proprietary Exchange. This means LDAP for global address book, SMTP to send
 messages, IMAP to browse messages on the server in any folder, POP to retrieve inbox messages only, Caldav for calendar support and Carddav for personal contacts sync.
 Thus any standard compliant client can be used with Microsoft Exchange. DavMail gateway is implemented in java and should run on any platform. Releases are tested on
 Windows, Linux (Ubuntu) and Mac OSX. Tested successfully with the Iphone (gateway running on a server).

 http://davmail.sourceforge.net

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

Editing /var/lib/dpkg/available to remove the odd character did not fix the problem.

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

$ grep-status -r . | iconv -f utf-8 -t ucs-2le 1> /dev/null; echo $?
iconv: illegal input sequence at position 223829
1

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :

Removing the odd character from /var/lib/dpkg/status as well FIXES THE ISSUE. Huge thanks to slangasek in #ubuntu-devel for the troubleshooting.

Revision history for this message
xtsbdu3reyrbrmroezob (xtsbdu3reyrbrmroezob) wrote :
Revision history for this message
mondhs (mondhs) wrote :

I confirm that cleaning up /var/lib/dpkg/status fixed the issue.

As I do not know the right way to do so, I opened the file and saved it with gedit:

sudo gedit /var/lib/dpkg/status

gedit warned me a couple of times that I could corrupt the file, but it actually fixed software-properties-gtk for me.

Thanks to Kristian for the hint.
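
For anyone who prefers a script to an editor, here is a hedged sketch of one way to re-encode only the broken lines. It assumes the stray bytes are Latin-1, which may not hold for every package, and the function name reencode_as_utf8 is hypothetical. It is not necessarily what gedit does on save. Run it on a copy and inspect the result before replacing anything under /var/lib/dpkg.

```python
def reencode_as_utf8(data: bytes, fallback: str = "latin-1") -> bytes:
    """Return data with every non-UTF-8 line re-encoded from `fallback`."""
    fixed = []
    for line in data.splitlines(keepends=True):
        try:
            line.decode("utf-8")
            fixed.append(line)  # already valid, keep the bytes untouched
        except UnicodeDecodeError:
            # Assumption: the line was written in the fallback encoding.
            fixed.append(line.decode(fallback).encode("utf-8"))
    return b"".join(fixed)


broken = b"Package: davmail\nMaintainer: Micka\xebl Guessant\n"
print(reencode_as_utf8(broken).decode("utf-8"))
```

Unlike dropping the offending bytes, this keeps the maintainer's name readable, at the cost of guessing the original encoding.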

Revision history for this message
akanewsted (akanewsted) wrote :

Opened it with gedit and saved; that also worked for me.

thank you mondhs

Martin Pitt (pitti) on 2012-10-21
Changed in ubuntu-drivers-common (Ubuntu Quantal):
status: Triaged → Invalid
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Triaged → Invalid
Revision history for this message
teranex (teranex) wrote :

I had this problem with the eid-mw and eid-viewer packages. These are provided by the Belgian government for our electronic passports, so I could have guessed that those would be the problem... Anyway, I edited both files, changed the 'ë' to 'e', and it fixed the problem.

Revision history for this message
rod singleton (rod40cool) wrote :

Confirmed fixed for me too, using gedit as per #19. Davmail was the culprit for me.

Thanks Kristian & mondhs

Revision history for this message
xyloman (xyloman) wrote :

Davmail was also the issue for me. Opening /var/lib/dpkg/status in gedit and saving it resolved the issue with opening software-properties-gtk.

Revision history for this message
Jan-Åke Larsson (jalar) wrote :

Davmail was the issue for me too. Has anyone cared to tell Mr Guessant?

Martin Pitt (pitti) on 2012-10-23
Changed in ubuntu-drivers-common (Ubuntu Quantal):
status: Invalid → Confirmed
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Invalid → Confirmed
Martin Pitt (pitti) on 2012-10-23
affects: ubuntu-drivers-common (Ubuntu Raring) → dpkg (Ubuntu Raring)
summary: - software-properties-gtk cannot launch
+ installing davmail breaks /var/lib/dpkg/available
Revision history for this message
Steve Langasek (vorlon) wrote : Re: installing davmail breaks /var/lib/dpkg/available

Arguably dpkg could be enforcing the policy requirement that all package fields be UTF8-encoded. However, that doesn't help users who have already installed this package - dpkg isn't going to scrub this data for already-seen packages. The consumers of this data really need to cope with the wrong encodings.

Changed in ubuntu-drivers-common (Ubuntu Quantal):
status: New → Confirmed
importance: Undecided → High
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: New → Confirmed
importance: Undecided → High
Martin Pitt (pitti) on 2012-10-23
Changed in dpkg (Ubuntu Quantal):
status: Confirmed → Invalid
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Confirmed → Triaged
Martin Pitt (pitti) on 2012-10-23
Changed in ubuntu-drivers-common (Ubuntu Raring):
assignee: nobody → Martin Pitt (pitti)
Changed in ubuntu-drivers-common (Ubuntu Quantal):
status: Confirmed → Triaged
Changed in dpkg (Ubuntu Raring):
status: Confirmed → Invalid
Martin Pitt (pitti) on 2012-10-23
summary: - installing davmail breaks /var/lib/dpkg/available
+ UnicodeDecodeError from broken package descriptions
Revision history for this message
Tobias Leich (cppege430-e079f-9ei9nyjpw) wrote :

I dropped him (Mickaël Guessant, davmail) a note.

Revision history for this message
Mickaël Guessant (mguessan) wrote :

Confirmed: it's a bug in ant-deb-task, which does not force the file encoding, so the encoding of the target control file depends on the build platform's encoding.

Revision history for this message
Mickaël Guessant (mguessan) wrote :

Fixed for next release: force UTF-8 file.encoding at build time
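
The same pitfall exists outside Java. As a rough Python analogue (not the ant-deb-task fix itself, which forces Java's file.encoding), a writer that names its encoding explicitly produces the same bytes on every build machine, whereas relying on the platform default does not. The helper write_control and the demo function are hypothetical.

```python
import os
import tempfile


def write_control(path, fields):
    """Write 'Key: value' lines as UTF-8, regardless of the platform locale."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for key, value in fields.items():
            handle.write(f"{key}: {value}\n")


def demo():
    """Write a small control file to a temporary directory and return its bytes."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "control")
        write_control(path, {"Package": "davmail", "Maintainer": "Mickaël Guessant"})
        with open(path, "rb") as handle:
            return handle.read()


print(demo())
```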

Revision history for this message
pabroome@gmail.com (pabroome) wrote :

This worked for me too. I just had to edit the status file and save it as UTF-8. Many, many thanks!

Paul

Revision history for this message
Rofko (lukejtmason) wrote :

I have had this problem twice in a couple of days: first with davmail, which I just removed using apt-get, and then with another programme, Scrivener (beta for Linux, but very reputable), which placed illegal characters in the same way. I removed them and saved, which resolved the problem.
Rfk

Martin Pitt (pitti) on 2012-11-07
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Triaged → Fix Committed
dir schneid (d-schneid) on 2012-11-08
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Fix Committed → Fix Released
Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Fix Released → Fix Committed
Revision history for this message
Martin Pitt (pitti) wrote :

I forgot to close the bug in the changelog:

ubuntu-drivers-common (1:0.2.72) raring; urgency=low

  [ Matthias Klose ]
  * Build-depend on python3-all.

  [ Dmitrijs Ledkovs ]
  * Use /usr/bin/python3 shebang.

  [ Martin Pitt ]
  * debian/tests/system: Fix duplicate output of error message for test
    failures.
  * tests/ubuntu_drivers.py, test_devices_detect_plugins(): Fix failure if
    special.py occurs first in the output. This bug was triggered by Python
    3.3's new hash randomization behaviour. (LP: #1071997)
  * UbuntuDrivers/detect.py: Fix UnicodeDecodeError crash when encountering a
    package with invalid UTF-8 encoding. Just skip those packages instead. Add
    test to tests/ubuntu_drivers.py.

 -- Martin Pitt <email address hidden> Wed, 07 Nov 2012 15:47:19 +0100

Changed in ubuntu-drivers-common (Ubuntu Raring):
status: Fix Committed → Fix Released
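
The "just skip those packages" approach from the changelog can be sketched roughly as below. This is not the actual detect.py change: python-apt objects are replaced by plain byte strings, and the function name modalias_map and the sample records are made up for illustration.

```python
def modalias_map(raw_records):
    """Map package name to its Modaliases value, skipping undecodable records."""
    result = {}
    for name, raw in raw_records.items():
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Skip packages whose record is not valid UTF-8 instead of crashing.
            continue
        for line in text.splitlines():
            if line.startswith("Modaliases:"):
                result[name] = line.split(":", 1)[1].strip()
    return result


# Hypothetical records: one valid driver package, one with a Latin-1 byte.
records = {
    "nvidia-example": b"Package: nvidia-example\nModaliases: nv(pci:v000010DEd*)\n",
    "davmail": b"Package: davmail\nMaintainer: Micka\xebl Guessant\n",
}
print(modalias_map(records))
```

Skipping is safe for driver detection because a package with a broken record simply is not offered as a driver; the rest of the cache is still processed.
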
Revision history for this message
John (john-e-francis) wrote :
Revision history for this message
cyd (cyd) wrote :

Removed davmail and working

Revision history for this message
adamski99 (adamsomerville) wrote :

So I don't have any reference to davmail in /var/lib/dpkg/source or /var/lib/dpkg/available here on 12.10. Can someone suggest a way to find the offending character?

cheers

kenan (kenan23) on 2012-12-22
Changed in ubuntu-drivers-common (Ubuntu Quantal):
status: Triaged → Fix Released
Revision history for this message
Derek (bugs-m8y) wrote :

adamski99 - you can use the various iconv commands in the comments to locate the problematic line in your file, then edit it. You'd probably want to mention the package here, too.

Personally, it was davmail. Thanks for the fix.

Revision history for this message
Xavier Claessens (zdra) wrote :

For Belgians: this bug happens if you install the beid packages provided by the government, because the maintainer's name is not valid UTF-8.

Revision history for this message
Schlomo Schapiro (sschapiro) wrote :

Extremely annoying. I can imagine that most "users" actually have no chance of fixing this!

My problem is that the error remains even after fixing the bad davmail packager. Any ideas what else to check?

Revision history for this message
Dustin Falgout (lots0logs) wrote :

I am also experiencing this issue on Linux Mint 14 Cinnamon. I tried finding invalid characters in the files listed, but there were none. I saved each file with gedit, which did not help. I did have davmail installed for a day, but I uninstalled it before this error started.

software-properties-gtk --debug
Fontconfig warning: "/etc/fonts/conf.d/50-user.conf", line 9: reading configurations from ~/.fonts.conf is deprecated.
gpg: /tmp/tmpoqfkhc/trustdb.gpg: trustdb created
ENABLED COMPS: {'import', 'main', 'backport', 'upstream'}
INTERNET COMPS: {'import', 'main', 'backport', 'upstream'}
MAIN SOURCES
 URI: http://packages.linuxmint.com/
 Comps: ['main', 'upstream', 'import', 'backport']
 Enabled: True
 Valid: True
 MatchURI: packages.linuxmint.com
 BaseURI: http://packages.linuxmint.com/

CHILD SOURCES
CDROM SOURCES
SOURCE CODE SOURCES
DISABLED SOURCES
ISV
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 162, in packages_for_modalias
    cache_map = packages_for_modalias.cache_maps[apt_cache_hash]
KeyError: 3953453

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/software-properties-gtk", line 103, in <module>
    app = SoftwarePropertiesGtk(datadir=options.data_dir, options=options, file=file)
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 178, in __init__
    self.init_drivers()
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 1097, in init_drivers
    self.devices = detect.system_device_drivers()
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 415, in system_device_drivers
    for pkg, pkginfo in system_driver_packages(apt_cache).items():
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 319, in system_driver_packages
    for p in packages_for_modalias(apt_cache, alias):
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 164, in packages_for_modalias
    cache_map = _apt_cache_modalias_map(apt_cache)
  File "/usr/lib/python3/dist-packages/UbuntuDrivers/detect.py", line 129, in _apt_cache_modalias_map
    m = package.candidate.record['Modaliases']
  File "/usr/lib/python3/dist-packages/apt/package.py", line 429, in record
    return Record(self._records.record)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xeb in position 74: invalid continuation byte

Revision history for this message
serpass (serpass) wrote :

I had this problem caused by "zygrib" in Xubuntu (Voyager) 12.10.

Revision history for this message
Paul Anderson (paulimach) wrote :

I have a slightly different bug: instead of position 114 it is position 796, like Dustin above, who has position 74. That must have something to do with it?

Revision history for this message
Brandon Raabe (brandocorp) wrote :

Wanted to post and confirm that I had the same problem, and the fix in posts 16 and 17 fixed this for me. Wanted to say thanks!

Revision history for this message
Peter De Maeyer (peter-de-maeyer) wrote :

Using some of the above suggestions, I am still unable to identify the problematic character.

grep-status -r . | iconv -f utf-8 -t ucs-2le 1> /dev/null; echo $?
iconv: illegal input sequence at position 1907586
1

Sooo... What file do I need to investigate?

I don't have /var/lib/dpkg/source, and /var/lib/dpkg/available doesn't have a line number 1907586.
Possibly 1907586 is a character number rather than a line number, but I don't know how to seek to character number.
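
A position reported by iconv is a byte offset into whatever it read, here the grep-status output, which need not match offsets in the file on disk. One way around that is to compute the offset from the file itself and then print the surrounding stanza. The sketch below does that; the helper names (first_invalid_offset, stanza_at, line_number_at) and the sample data are hypothetical.

```python
def first_invalid_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def stanza_at(data: bytes, offset: int) -> bytes:
    """Return the blank-line-delimited stanza that contains byte `offset`."""
    start = data.rfind(b"\n\n", 0, offset)
    start = 0 if start == -1 else start + 2
    end = data.find(b"\n\n", offset)
    end = len(data) if end == -1 else end
    return data[start:end]


def line_number_at(data: bytes, offset: int) -> int:
    """Return the 1-based line number that contains byte `offset`."""
    return data.count(b"\n", 0, offset) + 1


# Hypothetical sample with three stanzas; the second has a bad byte.
sample = (
    b"Package: a\nMaintainer: A\n\n"
    b"Package: b\nMaintainer: B\xeb\n\n"
    b"Package: c\n"
)
offset = first_invalid_offset(sample)
print(offset, line_number_at(sample, offset))
print(stanza_at(sample, offset))
```

For a real check, sample would be replaced by the bytes of the file under suspicion, read in binary mode.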

Revision history for this message
Peter De Maeyer (peter-de-maeyer) wrote :

It seems the odd characters do in fact have a valid UTF-8 encoding, but for some reason they have been encoded incorrectly. I was able to fix them as follows:

cat /var/lib/dpkg/status | iconv -c -f utf-8 -t utf-8 > /tmp/status.fixed
cat /var/lib/dpkg/available | iconv -c -f utf-8 -t utf-8 > /tmp/available.fixed

Now you still have to replace the originals with the fixed copies. In my case, there were about 100 offending packages:

hwdata ("Noël Köthe" -> "Noël Köthe")
shared-mime-info ("Sebastian Dröge" -> "Sebastian Dröge")
glines
...

I have the impression there is a structural root cause for this, it's not just about a rare and obscure package with a rogue character.

Revision history for this message
Neil Danziger (dnzgr) wrote :

I was also affected by this bug, caused by the third-party package Scrivener (see comment #30 above by Rofko (lukejtmason)), and resolved it by removing the improperly encoded characters from /var/lib/dpkg/status.

Revision history for this message
g.bruno (g-bruno) wrote :

After upgrading from Ubuntu 12.04 LTS to 14.04.1 LTS I have the same error, slightly different:

root@amd8:/home/helmut# software-properties-gtk
Traceback (most recent call last):
  File "/usr/bin/software-properties-gtk", line 101, in <module>
    app = SoftwarePropertiesGtk(datadir=options.data_dir, options=options, file=file)
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 169, in __init__
    self.show_keys()
  File "/usr/lib/python3/dist-packages/softwareproperties/gtk/SoftwarePropertiesGtk.py", line 846, in show_keys
    for key in self.apt_key.list():
  File "/usr/lib/python3/dist-packages/softwareproperties/AptAuth.py", line 75, in list
    for line in p:
  File "/usr/lib/python3.4/codecs.py", line 313, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 1440: invalid start byte

I tried the methods described above with gedit and "cat /var/lib/dpkg/available | iconv -c -f utf-8 -t utf-8 > /tmp/available.fixed", but the error is still present. I did not install davmail etc.

Can anyone help me? Ubuntu 14.04.1 is quite new, so perhaps there are other reasons.
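
This traceback goes through AptAuth.py and the key listing rather than the dpkg status file, so cleaning /var/lib/dpkg/* would not be expected to help. The code there appears to read text output line by line and hit a byte (0xfc, which is "ü" in Latin-1) that is not valid UTF-8, possibly in a key owner's name. A generic way for a consumer to tolerate such output is to decode with errors="replace". The sketch below is not the software-properties code: read_lines_tolerant is a hypothetical helper and the child process is a stand-in that merely emits one Latin-1 line.

```python
import subprocess
import sys


def read_lines_tolerant(command):
    """Run command and return its output lines, never raising on bad bytes."""
    completed = subprocess.run(command, stdout=subprocess.PIPE, check=True)
    # Invalid bytes become U+FFFD instead of raising UnicodeDecodeError.
    return completed.stdout.decode("utf-8", errors="replace").splitlines()


# Stand-in command that writes one Latin-1 encoded line to stdout.
demo_command = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'uid M\\xfcller\\n')",
]
print(read_lines_tolerant(demo_command))
```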
