Emoji that should be wide, according to Unicode 9, display incorrectly in the terminal

Bug #1665140 reported by Rob Speer
This bug affects 2 people
Affects: gnome-terminal (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

When editing Unicode in the terminal, it is important for the terminal and applications to agree on the width of characters. Otherwise, display glitches will occur.

In prior versions of Unicode, the character width of emoji was "ambiguous". As of Unicode 9, they are supposed to be double-wide and take up two character cells. gnome-terminal displays them as overlapping, single-wide characters, as it has for a long time. This is a problem because many applications I use on Ubuntu now treat emoji as double-wide. The behavior of gnome-terminal conflicts with the cursor positioning in vim, for example.

I've long considered gnome-terminal to have the least buggy Unicode implementation among terminal applications, and this may no longer be the case.

The *desired behavior* of emoji is that they are double-wide: they take up two character cells, advance the cursor by two cells when displayed, and text-editing applications move the cursor by two cells when moving past them. The *actual behavior* is inconsistent: typically, the emoji advance the cursor only by one character cell, even though the text-editing cursor moves by two character cells when moving past them, causing the cursor to be misaligned with the text.
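
To make the misalignment concrete, here is a minimal Python sketch. It assumes a simplified width model (wide or fullwidth characters take two cells, combining marks zero, everything else one) and uses a hypothetical `EMOJI_RANGE` as a stand-in for the emoji whose width changed. It is not how gnome-terminal or vim actually computes widths, and it relies on a Python whose `unicodedata` module ships Unicode 9 or later.

```python
import unicodedata

# Assumption for illustration only: treat this code point range as "the emoji
# whose width changed". The real set is defined by the Unicode data files.
EMOJI_RANGE = range(0x1F300, 0x1FB00)


def width_unicode9(ch):
    """Cell width under a simplified model: wide/fullwidth characters take 2 cells."""
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def width_pre_unicode9(ch):
    """Same model, but emoji in EMOJI_RANGE stay single-width (the older behavior)."""
    return 1 if ord(ch) in EMOJI_RANGE else width_unicode9(ch)


def column_after(text, width_fn):
    """Column reached after writing text, according to the given width function."""
    return sum(width_fn(ch) for ch in text)


line = "ok \U0001F600 done"
app_col = column_after(line, width_unicode9)        # where the editor places its cursor
term_col = column_after(line, width_pre_unicode9)   # where the terminal stopped drawing
print(f"application column: {app_col}, terminal column: {term_col}, "
      f"drift: {app_col - term_col}")
```

In this model the drift grows by one cell for every emoji to the left of the cursor, which matches the misaligned cursor described above.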

I hope this bug reporting interface lets me attach screenshots so I can show the example with vim.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: gnome-terminal 3.18.3-1ubuntu1
ProcVersionSignature: Ubuntu 4.4.0-57.78-generic 4.4.35
Uname: Linux 4.4.0-57-generic x86_64
NonfreeKernelModules: nvidia_uvm nvidia_drm nvidia_modeset nvidia
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: amd64
CurrentDesktop: Unity
Date: Wed Feb 15 16:13:31 2017
InstallationDate: Installed on 2016-08-09 (190 days ago)
InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64 (20160719)
JournalErrors:
 Error: command ['journalctl', '-b', '--priority=warning', '--lines=1000'] failed with exit code 1: Hint: You are currently not seeing messages from other users and the system.
       Users in the 'systemd-journal' group can see all messages. Pass -q to
       turn off this notice.
 No journal files were opened due to insufficient permissions.
SourcePackage: gnome-terminal
UpgradeStatus: No upgrade log present (probably fresh install)

Egmont Koblinger (egmont-gmail) wrote :

You've reported this bug against Ubuntu 16.04, which was released in April 2016, whereas Unicode 9.0 was released in June 2016. The expected behavior in that distro is hence that of Unicode 8.0.

gnome-terminal takes the character width from glib (package: libglib2.0-0), which surprisingly upgraded to Unicode 9.0 in a micro release, namely 2.50.1. Equally surprisingly, Yakkety shipped this change as an update (2.50.2) rather than in the original yakkety package (2.50.0), which arguably shouldn't have happened either.

Most terminal emulators and apps take the character width from glibc (package: libc6), which has not updated to Unicode 9.0 yet (as of the just released version 2.25). At least Fedora, however, has already patched it forward: https://fedoraproject.org/wiki/Changes/Unicode_9.0. Maybe Ubuntu has too, or is planning to; I don't know.

Vim, as far as I know, has its own separate built-in database.
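
One way to see whether two of these databases disagree on a given system is to ask each of them directly. The Python sketch below is only an illustration: it calls the C library's `wcwidth()` through `ctypes` and compares the result with Python's own bundled `unicodedata` tables, which here stand in for "an application with its own database". The helper names `libc_wcwidth` and `python_width` are hypothetical, and the sketch says nothing about what glib or vim report.

```python
import ctypes
import ctypes.util
import locale
import unicodedata

# wcwidth() consults the current locale, so adopt the user's (ideally UTF-8) one.
try:
    locale.setlocale(locale.LC_CTYPE, "")
except locale.Error:
    pass  # keep the default locale; wcwidth() may then return -1 for non-ASCII


def libc_wcwidth(ch):
    """Width of one character per the C library, or None if it cannot be queried."""
    path = ctypes.util.find_library("c")
    if path is None:
        return None
    try:
        wcwidth = ctypes.CDLL(path).wcwidth
    except (OSError, AttributeError):
        return None
    wcwidth.argtypes = [ctypes.c_wchar]
    wcwidth.restype = ctypes.c_int
    return wcwidth(ch)


def python_width(ch):
    """Width per Python's bundled Unicode tables (simplified: W/F count as 2)."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


for ch in ("A", "\u4e2d", "\U0001F600"):
    print(f"U+{ord(ch):04X} libc={libc_wcwidth(ch)} python={python_width(ch)}")
```

A row where the two columns differ for an emoji would indicate the kind of version mismatch discussed here; a value of -1 from libc usually just means the locale is not a UTF-8 one.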

gnome-terminal being different from most of the apps is already causing quite a bit of a headache, see e.g.:

- https://bugzilla.gnome.org/show_bug.cgi?id=772812 (and follow the links from there as well to see some of the problems it causes),

- https://bugzilla.gnome.org/show_bug.cgi?id=772890 (a recommendation for gnome-terminal to use glibc's database instead - no work has been done yet).

vim using its own database adds yet another twist to the story. I don't know what version it ships.

The bug you reported is indeed valid and, presumably especially in yakkety (with updates) and zesty, quite serious. Strictly speaking, though, I don't think it's a gnome-terminal bug. I think it's a much higher level issue that should be brought up in some core developer mailing list.

I believe it's the distro's responsibility to ship glib, glibc and vim so that all use the same Unicode version.

(Or even better, but it takes more time and a heavy refactoring: There should be a single core Unicode library that glibc, glib, vim etc. all depend on. Unfortunately I find it unlikely to get implemented.)

Egmont Koblinger (egmont-gmail) wrote :

Also note that in gnome-terminal's Profile Preferences, under the Compatibility tab, you can choose whether you want ambiguous width characters to be narrow or wide.

If, as it appears, your system has Unicode 8.0 (where these characters are still ambiguous) and vim handles them as wide, changing gnome-terminal to also treat them as wide might be a sensible workaround for you. (This, however, goes against glibc's database, which is in turn used by ncurses etc.)
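
As a rough illustration of what such a narrow/wide switch affects, the Python sketch below models it with a hypothetical `ambiguous_wide` flag applied to the East Asian Width property from Python's `unicodedata` module. This is an assumption-laden model, not gnome-terminal's implementation, and the sample characters are chosen only because their property value is "A" (ambiguous).

```python
import unicodedata


def width(ch, ambiguous_wide=False):
    """Cell width in a simplified model with a switch for ambiguous characters."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A":
        # Only characters classified as ambiguous follow the preference.
        return 2 if ambiguous_wide else 1
    return 1


# Plus-minus sign, Greek alpha, a box-drawing line, and an emoji for comparison.
samples = ["\u00b1", "\u03b1", "\u2500", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X} class={unicodedata.east_asian_width(ch)} "
          f"narrow-setting={width(ch)} wide-setting={width(ch, ambiguous_wide=True)}")
```

The switch changes every character in the ambiguous class at once, including symbols and box-drawing characters, so it is a much broader setting than "make emoji wide".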

Egmont Koblinger (egmont-gmail) wrote :

Just for the record:

glibc has just upgraded to Unicode 9.0 in git (forthcoming 2.26 release):
https://sourceware.org/bugzilla/show_bug.cgi?id=20313

Rob Speer (rob-speer) wrote :

> I believe it's the distro's responsibility to ship glib, glibc and vim so that all use the same Unicode version.

Sounds right, and it's good to hear that there's progress on this front. I was afraid the situation was stagnant and it would be up to applications to fix it.

> Or even better, but it takes more time and a heavy refactoring: There should be a single core Unicode library that glibc, glib, vim etc. all depend on. Unfortunately I find it unlikely to get implemented.

As someone who works a lot with Unicode, I'd point out that the number of things a Unicode library could do is enormous, and no application needs all of them. Many developers consider linking to, for example, ICU to be unacceptable bloat, and I don't blame them.

> Also note that in gnome-terminal's Profile Preferences, under the Compatibility tab, you can choose whether you want ambiguous width characters to be narrow or wide.

I wouldn't want to set *all* ambiguous characters to be wide, like a Japanese OS -- that would break many more things. It's only the emoji that changed.

Thanks for the responses.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in gnome-terminal (Ubuntu):
status: New → Confirmed
Lino Mastrodomenico (l-mastrodomenico) wrote :

FYI, Unicode 9 only adds support for some characters (e.g. U+1F920, 🤠, is correctly marked as wide starting from Unicode 9) but others require an update to Unicode 10 (e.g. U+1F92F, 🤯) to be marked as wide.
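
A quick way to check which Unicode version a given database implements, and whether it already marks these two characters as wide, is sketched below in Python. It inspects only Python's own bundled tables via `unicodedata.unidata_version`; the helper names `parse_version` and `is_wide` are hypothetical, and the result says nothing about glib, glibc, or vim on the same machine.

```python
import unicodedata


def parse_version(text):
    """Turn a version string such as '9.0.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_wide(ch):
    """True if this Python's Unicode tables classify ch as wide or fullwidth."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


db_version = parse_version(unicodedata.unidata_version)
print("Unicode database version:", unicodedata.unidata_version)

# (character, Unicode version from which it is expected to be wide, per the comment above)
cases = [("\U0001F920", (9, 0, 0)), ("\U0001F92F", (10, 0, 0))]
for ch, since in cases:
    expected = db_version >= since
    print(f"U+{ord(ch):X}: wide={is_wide(ch)}, expected wide with this database: {expected}")
```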
