Changelog notes do not preserve non-UTF8 characters from original changelog entries

Bug #1888254 reported by Robie Basak
This bug affects 1 person
Affects: git-ubuntu
Status: Triaged
Importance: Low
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Some old changelog entries contain non-UTF-8 characters, for example in these package versions:

    autoconf2.13 2.13-55
    autogen 1:5.8.3-2
    dbconfig-common 1.8.17
    dpatch 2.0.10
    dput 0.9.2.16ubuntu1
    emacs-goodies-el 24.9-2
    evolution 2.10.1-0ubuntu2
    gmp 2:4.2.2+dfsg-3ubuntu1
    gnome-session 2.17.92-0ubuntu2
    gnupg 1.4.3-2ubuntu1
    iptables 1.3.5.0debian1-1ubuntu1
    jadetex 3.13-2.1ubuntu2
    llvm-toolchain-3.9 1:3.9.1-4ubuntu3~14.04.2

git-ubuntu will soon start importing such changelog entries into changelog notes successfully, but at a loss of fidelity: the original characters will be lost. This is unfortunate, but pygit2 currently has no way to "pass through" bytes that are not valid UTF-8. For example, we can't use "errors=surrogateescape", since pygit2.Repository.create_note() requires a str. I filed https://github.com/libgit2/pygit2/issues/1021 to track this upstream. It would be possible to make progress there by writing a patch, but that doesn't seem worth doing right now given the limited benefit to git-ubuntu.
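
To illustrate the kind of loss involved, here is a minimal sketch using only the Python standard library. It does not call pygit2 or reproduce git-ubuntu's actual import code; the sample changelog line is hypothetical, and the use of errors="replace" for the lossy path is an assumption chosen for illustration.

```python
# A hypothetical changelog line encoded as Latin-1, so it is not valid UTF-8.
raw = "  * Fix typo in café docs".encode("latin-1")

# Lossy path (assumed for illustration): the undecodable byte becomes U+FFFD,
# so the original byte cannot be recovered from the resulting str.
lossy = raw.decode("utf-8", errors="replace")
lossy_roundtrip = lossy.encode("utf-8")

# Lossless path: surrogateescape maps the bad byte to a lone surrogate,
# and encoding with the same handler restores the original bytes.
escaped = raw.decode("utf-8", errors="surrogateescape")
escaped_roundtrip = escaped.encode("utf-8", errors="surrogateescape")

# However, such a str cannot be encoded as strict UTF-8, so an API that only
# accepts a str and encodes it strictly would reject it.
try:
    escaped.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

In this sketch only the surrogateescape round trip preserves the original bytes, and that round trip depends on the consumer encoding with the same error handler. That is presumably why a str-only API cannot carry the data through unchanged.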

This bug tracks the loss in fidelity in git-ubuntu changelog notes.
