UnicodeDecodeError in fast-export when author name contains non-ascii characters

Bug #1647101 reported by Emily Klassen on 2016-12-03
This bug affects 5 people
Affects | Status | Importance | Assigned to | Milestone
python-fastimport | Fix Committed | Medium | Jelmer Vernooij |
python-fastimport (Ubuntu) | Status tracked in Artful | | |
  Zesty | | High | Mattia Rizzolo |
  Artful | | Undecided | Unassigned |

Bug Description

[ Impact ]
* Dealing with objects that have a non-ASCII character in their name simply fails.

[ Test case ]
* Try to export a bzr branch that contains a non-ASCII character, for example the ubuntu-dev-tools bzr repository.

[ Regression potential ]
* The change has been in stretch and artful for a while, with no regressions reported.
* The change itself is small and easily auditable.

[ Original description ]
I was able to decipher that it failed when it got to the part of history where the commit author's name was "Raúl Núñez". (git-bzr is http://github.com/termie/git-bzr-ng)

$ git bzr clone lp:sakura
You have not informed bzr of your Launchpad ID, and you must do this to
write to Launchpad or access private data. See "bzr help launchpad-login".
Branched 562 revisions.
14:42:10 Calculating the revisions to include ...
14:42:10 Starting export of 743 revisions ...
bzr: ERROR: exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 930, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 1121, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 673, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 697, in run
    return self._operation.run_simple(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/cleanup.py", line 136, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/cleanup.py", line 166, in _do_with_cleanups
    result = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/cmds.py", line 720, in run
    return exporter.run()
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/exporter.py", line 240, in run
    self.emit_commit(revid, self.ref)
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/exporter.py", line 358, in emit_commit
    self.print_cmd(self._get_commit_command(ref, mark, revobj, file_cmds))
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/exporter.py", line 287, in print_cmd
    self.outf.write("%r\n" % cmd)
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 74, in __repr__
    return self.__bytes__()
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 186, in __bytes__
    return self.to_string(include_file_contents=True)
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 201, in to_string
    author_section = b'\nauthor ' + format_who_when(self.author)
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 504, in format_who_when
    name = utf8_bytes_string(name)
  File "/usr/lib/python2.7/site-packages/fastimport/helpers.py", line 104, in utf8_bytes_string
    return s.encode('utf8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

bzr 2.7.0 on python 2.7.12 (Linux-4.8.8-2-ARCH-x86_64-with-glibc2.2.5)
arguments: ['/usr/sbin/bzr', 'fast-export', '--plain', '--export-
    marks=/home/forivall/code/repos/sakura/.git/bzr/map/master-bzr', '--git-
    branch=bzr/master',
    '/home/forivall/code/repos/sakura/.git/bzr/repo/master']
plugins: bash_completion[2.7.0], changelog_merge[2.7.0],
    fastimport[0.14.0dev], grep[2.7.0], launchpad[2.7.0],
    netrc_credential_store[2.7.0], news_merge[2.7.0], po_merge[2.7.0],
    weave_fmt[2.7.0]
encoding: 'utf-8', fsenc: 'UTF-8', lang: 'en_CA.UTF-8'

*** Bazaar has encountered an internal error. This probably indicates a
    bug in Bazaar. You can help us fix it by filing a bug report at
        https://bugs.launchpad.net/bzr/+filebug
    including this traceback and a description of the problem.
ERROR:root:bzr export failed

I fixed it by changing `return s.encode('utf8')` to `return unicode(s, 'utf8').encode('utf8')`, but that should probably be applied only when `sys.getdefaultencoding()` returns 'ascii'.
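
To illustrate why the original call fails: in Python 2, calling `.encode('utf8')` on a byte string first decodes it implicitly with the default codec (ASCII here), and that implicit decode is what raises the error. The sketch below emulates that step in Python 3 and shows one type-aware alternative. The names `implicit_ascii_decode_error` and `to_utf8_bytes` are hypothetical, and this is not the upstream python-fastimport patch.

```python
# Sketch only: emulates in Python 3 the implicit ASCII decode that Python 2
# performed when .encode() was called on a byte string. The helper names are
# hypothetical and do not come from python-fastimport.


def implicit_ascii_decode_error(raw):
    """Return the error message from decoding `raw` as ASCII, or None on success."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


def to_utf8_bytes(s):
    """Return `s` as UTF-8 bytes; byte strings are assumed to be UTF-8 already."""
    if isinstance(s, bytes):
        return s  # already encoded, so no decode/encode round trip is needed
    return s.encode("utf-8")


author = "Raúl Núñez"
raw = author.encode("utf-8")  # what the exporter effectively hands over

# 'ú' is encoded as 0xc3 0xba and starts at byte offset 2, matching the traceback.
print(implicit_ascii_decode_error(raw))

# Text and already-encoded input both end up as the same bytes.
print(to_utf8_bytes(raw) == to_utf8_bytes(author))
```

Checking the input type avoids depending on `sys.getdefaultencoding()` at all, at the cost of trusting that incoming byte strings are valid UTF-8.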

Vincent Ladeuil (vila) on 2016-12-04
affects: bzr → bzr-fastimport
Unit 193 (unit193) on 2016-12-06
affects: bzr-fastimport → python-fastimport
Jelmer Vernooij (jelmer) on 2016-12-06
affects: python-fastimport → bzr-fastimport
affects: python-fastimport (Ubuntu) → bzr-fastimport (Ubuntu)
affects: bzr-fastimport → python-fastimport
affects: bzr-fastimport (Ubuntu) → python-fastimport (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python-fastimport (Ubuntu):
status: New → Confirmed
Jelmer Vernooij (jelmer) on 2016-12-06
Changed in python-fastimport:
assignee: nobody → Jelmer Vernooij (jelmer)
importance: Undecided → Medium
status: New → Fix Committed
Unit 193 (unit193) wrote :

Been using this at least since Jan.

Mattia Rizzolo (mapreri) wrote :

Uploaded to Debian instead, as version 0.9.6-3.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-fastimport - 0.9.6-3

---------------
python-fastimport (0.9.6-3) unstable; urgency=medium

  * Team upload.
  * Add patch from upstream to fix a crash with UTF-8 encoded author names.
    LP: #1647101

 -- Mattia Rizzolo <email address hidden> Sun, 23 Apr 2017 08:40:09 +0200

Changed in python-fastimport (Ubuntu):
status: Confirmed → Fix Released

Please SRU this to zesty.

Mattia Rizzolo (mapreri) wrote :

Sure.

description: updated
Changed in python-fastimport (Ubuntu Zesty):
status: New → In Progress
assignee: nobody → Mattia Rizzolo (mapreri)
importance: Undecided → High

Hello Jordan, or anyone else affected,

Accepted python-fastimport into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-fastimport/0.9.6-2ubuntu17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-fastimport (Ubuntu Zesty):
status: In Progress → Fix Committed
tags: added: verification-needed

As part of a recent change in the Stable Release Update verification policy, we would like to inform you that for a bug to be considered verified for a given release, a verification-done-$RELEASE tag needs to be added to the bug, where $RELEASE is the name of the series in which the package was tested (e.g. verification-done-xenial). Please note that the global 'verification-done' tag can no longer be used for this purpose.

Thank you!
