UnicodeDecodeError in fast-export when author name contains non-ascii characters

Bug #1647101 reported by Emily Klassen on 2016-12-03
This bug affects 5 people
Affects                      Status         Importance  Assigned to      Milestone
python-fastimport            Fix Committed  Medium      Jelmer Vernooij
python-fastimport (Ubuntu)                  Undecided   Unassigned
  Zesty                                     High        Mattia Rizzolo
  Artful                                    Undecided   Unassigned

Bug Description

[ Impact ]
* Dealing with objects that have a non-ASCII character in their name just fails.

[ Test case ]
* Try to export a bzr branch whose history contains a non-ASCII character, for example the ubuntu-dev-tools bzr repository.

[ Regression potential ]
* The change has been in stretch and artful for a while, with no regressions reported
* The change itself is also small and easily auditable.

[ Original description ]
I was able to decipher that it failed when it got to the part of history where the commit author's name was "Raúl Núñez". (git-bzr is http://github.com/termie/git-bzr-ng)

$ git bzr clone lp:sakura
You have not informed bzr of your Launchpad ID, and you must do this to
write to Launchpad or access private data. See "bzr help launchpad-login".
Branched 562 revisions.
14:42:10 Calculating the revisions to include ...
14:42:10 Starting export of 743 revisions ...
bzr: ERROR: exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 930, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 1121, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 673, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.7/site-packages/bzrlib/commands.py", line 697, in run
    return self._operation.run_simple(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/cleanup.py", line 136, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/cleanup.py", line 166, in _do_with_cleanups
    result = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/cmds.py", line 720, in run
    return exporter.run()
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/exporter.py", line 240, in run
    self.emit_commit(revid, self.ref)
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/exporter.py", line 358, in emit_commit
    self.print_cmd(self._get_commit_command(ref, mark, revobj, file_cmds))
  File "/usr/lib/python2.7/site-packages/bzrlib/plugins/fastimport/exporter.py", line 287, in print_cmd
    self.outf.write("%r\n" % cmd)
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 74, in __repr__
    return self.__bytes__()
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 186, in __bytes__
    return self.to_string(include_file_contents=True)
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 201, in to_string
    author_section = b'\nauthor ' + format_who_when(self.author)
  File "/usr/lib/python2.7/site-packages/fastimport/commands.py", line 504, in format_who_when
    name = utf8_bytes_string(name)
  File "/usr/lib/python2.7/site-packages/fastimport/helpers.py", line 104, in utf8_bytes_string
    return s.encode('utf8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)

bzr 2.7.0 on python 2.7.12 (Linux-4.8.8-2-ARCH-x86_64-with-glibc2.2.5)
arguments: ['/usr/sbin/bzr', 'fast-export', '--plain', '--export-
    marks=/home/forivall/code/repos/sakura/.git/bzr/map/master-bzr', '--git-
    branch=bzr/master',
    '/home/forivall/code/repos/sakura/.git/bzr/repo/master']
plugins: bash_completion[2.7.0], changelog_merge[2.7.0],
    fastimport[0.14.0dev], grep[2.7.0], launchpad[2.7.0],
    netrc_credential_store[2.7.0], news_merge[2.7.0], po_merge[2.7.0],
    weave_fmt[2.7.0]
encoding: 'utf-8', fsenc: 'UTF-8', lang: 'en_CA.UTF-8'

*** Bazaar has encountered an internal error. This probably indicates a
    bug in Bazaar. You can help us fix it by filing a bug report at
        https://bugs.launchpad.net/bzr/+filebug
    including this traceback and a description of the problem.
ERROR:root:bzr export failed

I fixed it by changing `return s.encode('utf8')` to `return unicode(s, 'utf8').encode('utf8')`, but that should probably be limited to cases where `sys.getdefaultencoding()` returns 'ascii'.
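
For illustration, here is a minimal Python sketch of the failure and of one way to make the helper accept both input types. It is not the upstream patch: the function body and the sample author value are assumptions, and only the name utf8_bytes_string comes from the traceback above. On Python 2, calling .encode() on a byte string first decodes it with the default 'ascii' codec, and that implicit decode is the step that fails on byte 0xc3.

```python
def utf8_bytes_string(s):
    """Return s as UTF-8 bytes (hypothetical rewrite, not the upstream code)."""
    if isinstance(s, bytes):
        # Already a byte string (str on Python 2): assume it is UTF-8 and
        # return it unchanged instead of re-encoding it.
        return s
    # Text (unicode on Python 2, str on Python 3): encode explicitly.
    return s.encode('utf8')


# Sample author name as UTF-8 bytes (assumed input for this sketch).
author = u"Ra\u00fal N\u00fa\u00f1ez".encode('utf8')

# Reproduce the implicit ASCII decode that Python 2 performs inside
# str.encode(); it fails on the first byte of the encoded u-acute.
try:
    author.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
print(utf8_bytes_string(author) == author)
print(utf8_bytes_string(u"Ra\u00fal") == b"Ra\xc3\xbal")
```

Returning byte strings unchanged assumes they are already valid UTF-8. A stricter variant could decode and re-encode them to validate the input, at the cost of raising on malformed names.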

Vincent Ladeuil (vila) on 2016-12-04
affects: bzr → bzr-fastimport
Unit 193 (unit193) on 2016-12-06
affects: bzr-fastimport → python-fastimport
Jelmer Vernooij (jelmer) on 2016-12-06
affects: python-fastimport → bzr-fastimport
affects: python-fastimport (Ubuntu) → bzr-fastimport (Ubuntu)
affects: bzr-fastimport → python-fastimport
affects: bzr-fastimport (Ubuntu) → python-fastimport (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python-fastimport (Ubuntu):
status: New → Confirmed
Jelmer Vernooij (jelmer) on 2016-12-06
Changed in python-fastimport:
assignee: nobody → Jelmer Vernooij (jelmer)
importance: Undecided → Medium
status: New → Fix Committed
Unit 193 (unit193) wrote :

Been using this at least since Jan.

Mattia Rizzolo (mapreri) wrote :

Uploaded to Debian instead, as version 0.9.6-3.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-fastimport - 0.9.6-3

---------------
python-fastimport (0.9.6-3) unstable; urgency=medium

  * Team upload.
  * Add patch from upstream to fix a crash with UTF-8 encoded author names.
    LP: #1647101

 -- Mattia Rizzolo <email address hidden> Sun, 23 Apr 2017 08:40:09 +0200

Changed in python-fastimport (Ubuntu):
status: Confirmed → Fix Released

Please SRU this to zesty.

Mattia Rizzolo (mapreri) wrote :

Sure.

description: updated
Changed in python-fastimport (Ubuntu Zesty):
status: New → In Progress
assignee: nobody → Mattia Rizzolo (mapreri)
importance: Undecided → High

Hello Jordan, or anyone else affected,

Accepted python-fastimport into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-fastimport/0.9.6-2ubuntu17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-fastimport (Ubuntu Zesty):
status: In Progress → Fix Committed
tags: added: verification-needed

As part of a recent change in the Stable Release Update verification policy, we would like to inform you that, for a bug to be considered verified for a given release, a verification-done-$RELEASE tag needs to be added to the bug, where $RELEASE is the name of the series on which the package was tested (e.g. verification-done-xenial). Please note that the global 'verification-done' tag can no longer be used for this purpose.

Thank you!

The fix for this bug has been awaiting testing feedback in the -proposed repository for zesty for more than 90 days. Please test this fix and update the bug appropriately with the results. In the event that the fix for this bug is still not verified 15 days from now, the package will be removed from the -proposed repository.

tags: added: removal-candidate

The version of python-fastimport in the proposed pocket of Zesty that was purported to fix this bug report has been removed because the bugs that were to be fixed by the upload were not verified in a timely (105 days) fashion.

Changed in python-fastimport (Ubuntu Zesty):
status: Fix Committed → Won't Fix