UDD changelog importer gets confused by unicode characters

Bug #1159246 reported by Matt Fischer
This bug affects 2 people
Affects: Ubuntu Distributed Development
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

When trying to import xsane, the importer fails with a UnicodeEncodeError on the Unicode character u'\xf6' (ö).

Line 181 of the changelog has this submitter, which seems to be the cause:
  -- Reinhard Fössmeier "Aisano" <email address hidden> Thu, 20 May 2010>

Here is the full log of the failure:
http://package-import.ubuntu.com/status/xsane.html#2012-07-04 14:37:44.048044

Here is the traceback portion:

Traceback (most recent call last):
  File "/srv/package-import.canonical.com/new/scripts/bin/import-package", line 5, in <module>
    sys.exit(main(sys.argv))
  File "/srv/package-import.canonical.com/new/scripts/udd/scripts/import_package.py", line 1174, in main
    only_before=options.only_before)
  File "/srv/package-import.canonical.com/new/scripts/udd/scripts/import_package.py", line 1081, in _import_package
    bstore, push, possible_transports=possible_transports)
  File "/srv/package-import.canonical.com/new/scripts/udd/scripts/import_package.py", line 640, in import_package
    use_time_from_changelog=True)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1218, in import_package
    cl = self.get_changelog_from_source(extractor.extracted_debianised)
  File "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py", line 1086, in get_changelog_from_source
    cl.parse_changelog(content, strict=False, max_blocks=max_blocks)
  File "/usr/lib/python2.7/dist-packages/debian/changelog.py", line 403, in parse_changelog
    "for %s: %s" % (state, line), strict)
  File "/usr/lib/python2.7/dist-packages/debian/changelog.py", line 238, in _parse_error
    warnings.warn(message)
  File "/usr/lib/python2.7/warnings.py", line 29, in _show_warning
    file.write(formatwarning(message, category, filename, lineno, line))
  File "/usr/lib/python2.7/warnings.py", line 38, in formatwarning
    s = "%s:%s: %s: %s\n" % (filename, lineno, category.__name__, message)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 77: ordinal not in range(128)
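
The last frame of the traceback is the string formatting in warnings.formatwarning, which suggests the failure happens when the warning message (a unicode string containing ö) is implicitly converted to a byte string with the default ASCII codec. The following is a minimal sketch of that conversion, written so that it also runs on Python 3, where the encoding step has to be spelled out. The message text and the helper name to_ascii_bytes are hypothetical and are not taken from python-debian or the importer.

```python
# -*- coding: utf-8 -*-

def to_ascii_bytes(message):
    """Encode text as ASCII, roughly what Python 2 does implicitly
    when a unicode message is coerced to a byte string."""
    return message.encode("ascii")


# Hypothetical warning text; the real message was built by python-debian.
message = u"Unexpected line: -- Reinhard F\xf6ssmeier"

try:
    to_ascii_bytes(message)
    failed_char = None
except UnicodeEncodeError as exc:
    # exc.object is the input text and exc.start indexes the
    # first character the codec could not encode.
    failed_char = exc.object[exc.start]

# An explicit encoding that can represent the character avoids the error.
safe_bytes = message.encode("utf-8")
```

If this reading is right, the importer itself may be handling the changelog correctly, and the crash would come from emitting the parse warning rather than from parsing the name.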

Matt Fischer (mfisch) wrote :

I've been digging into this more. Although that character is the one it complains about, there is a Cyrillic name a full hundred lines above it that should also have failed, so I'm no longer sure about my analysis.
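
One reading that would be consistent with the traceback, offered as an assumption rather than a confirmed diagnosis: the error is raised while formatting a parse warning, so only lines the parser rejects ever reach the ASCII conversion. A non-ASCII name on a line that parses cleanly would then never trigger it. The sketch below illustrates the idea with a simplified, hypothetical trailer pattern (TRAILER is not the regex python-debian uses) and an ASCII-only stream standing in for stderr; the Cyrillic line and its address are made up.

```python
# -*- coding: utf-8 -*-
import io
import re

# Hypothetical, simplified trailer pattern; NOT python-debian's regex.
TRAILER = re.compile(r"^ -- .+ <[^>]*>  \S.*\d$")


def check_lines(lines, stream):
    """Write a warning to `stream` for each line that does not match TRAILER."""
    for line in lines:
        if TRAILER.match(line) is None:
            stream.write(u"Unexpected line: %s\n" % line)


def ascii_stream():
    """Text stream that can only encode ASCII, standing in for stderr."""
    return io.TextIOWrapper(io.BytesIO(), encoding="ascii")


def fails(line):
    """Return True if warning about `line` raises UnicodeEncodeError."""
    try:
        check_lines([line], ascii_stream())
    except UnicodeEncodeError:
        return True
    return False


# Well-formed trailer with a Cyrillic name: no warning, so no encoding step.
well_formed = u" -- \u0418\u0432\u0430\u043d <ivan@example.org>  Mon, 01 Feb 2010"
# The line quoted in the report: it does not match, so a warning is written.
malformed = u'  -- Reinhard F\xf6ssmeier "Aisano" <email address hidden> Thu, 20 May 2010>'
```

Here fails(well_formed) is False and fails(malformed) is True. Whether the real changelog line was actually rejected by the parser for a reason like this would need to be checked against the xsane changelog itself.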

Sam Hanes (elemecca) wrote :