system-services: non-ascii layout/encoding problems at "login" line

Bug #273189 reported by Gökdeniz Karadağ on 2008-09-22
Affects Status Importance Assigned to Milestone
finish-install (Ubuntu)
upstart (Ubuntu)
util-linux (Ubuntu)

Bug Description

Binary package hint: console-setup

The default console font was problematic with Turkish language characters (latin5, iso-8859-9).
I changed the console-setup settings file to the one attached to this bug, then I ran "update-initramfs -u". I also did the same using "dpkg-reconfigure console-setup".

edit: Colin Watson points out that this happens in all non-ascii layouts/encodings

When I log into root account on console, Turkish characters show up correctly, in the font I chose.
But in the "$HOSTNAME login: " line of the console some characters are problematic, specifically these two chars "ğş".

The problem is strange, so I will describe it in detail. When I press the problematic keys, nothing is displayed the first time; if I press a second time or press another character, a diamond is displayed. After that, the backspace and delete keys do not work: backspace produces three diamonds and delete produces the escape code "[3~".
If I press enter and fail the login, the next login prompt displays those characters fine, but this time I can delete the login prompt text itself. For example, pressing 'ğ' three times and then pressing backspace six times gives me "$HOSTNAME logi". After this deletion I can type a username and password, and the login succeeds. The prompt then reads like "$HOSTNAME logiUSERNAME".

When I cause a `maximum number of tries exceeded' error, the process starts from the beginning: the problematic keys display diamonds again.

This seems like a corner case, but I want to help solve this problem. If any more info or testing is needed, please ask.

Ubuntu version: Hardy 8.04.1
console-setup version: 1.21ubuntu8

Gökdeniz Karadağ (gokdeniz) wrote :
Colin Watson (cjwatson) wrote :

Yes, I've been meaning to look into this for a while. It's not Turkish-specific at all; anything with non-ASCII characters will have the same effect (for instance in the British layout I use it happens with the £ sign).

Changed in console-setup:
importance: Undecided → High
status: New → Confirmed
description: updated
description: updated

I'm seeing the same thing in bash, and it really seems to be at the application level. Here's an strace of it trying to read the £ sign:

read(0, "\302", 1) = 1
write(2, "\302", 1) = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
read(0, "\243", 1) = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0

In other words, the console is feeding it the right characters, but it's dropping the second of the two bytes in the UTF-8 representation.

Or ğ:

read(0, "\304", 1) = 1
write(2, "\302", 1) = 1
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
read(0, "\237", 1) = 1
write(2, "\304\237", 2) = 2
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0

Now, this is actually in X, not at the console, but bash at the console produced the exact same trace with the £ sign. dash doesn't exhibit quite the same effects, although it does fail to correctly backspace over UTF-8 characters.

I know this isn't quite your original problem, but bash is a lot easier to debug interactively than login. I'm hoping that if I can see the systemic problem this way then login will be an easier target.

Colin Watson (cjwatson) wrote :

Ah! bash was a red herring and entirely my fault. I had this in .inputrc, left over from a previous era:

  Meta-#: "\C-v£"

This mapped byte 0xA3 to C-v £, which caused great confusion. Removing it fixed both £ and ğ (I think the latter was because I'd got bash's internal state in a muddle).

Back to the drawing board regarding login ...

Colin Watson (cjwatson) wrote :

OK, I've tracked this down to a combination of a bug in getty, which is shipped by util-linux, and a bug in the way we start getty. Here's the relevant code in get_logname:

            if (op->eightbits) {
                ascval = c;
            } else if (c != (ascval = (c & 0177))) { /* "parity" bit on */
                for (bits = 1, mask = 1; mask & 0177; mask <<= 1)
                    if (mask & ascval)
                        bits++; /* count "1" bits */
                cp->parity |= ((bits & 1) ? 1 : 2);
            }
            /* Do erase, kill and end-of-line processing. */

            switch (ascval) {
            case CR:
            case NL:
                *bp = 0; /* terminate logname */
                cp->eol = ascval; /* set end-of-line char */
                break;
            case BS:
            case DEL:
            case '#':
                cp->erase = ascval; /* set erase character */
                if (bp > logname) {
                    (void) write(1, erase[cp->parity], 3);
                    bp--;
                }
                break;
            case CTL('U'):
            case '@':
                cp->kill = ascval; /* set kill character */
                while (bp > logname) {
                    (void) write(1, erase[cp->parity], 3);
                    bp--;
                }
                break;
            case CTL('D'):
                exit(0);
            default:
                if (!isascii(ascval) || !isprint(ascval)) {
                     /* ignore garbage characters */ ;
                } else if (bp - logname >= sizeof(logname) - 1) {
                    error(_("%s: input overrun"), op->tty);
                } else {
                    (void) write(1, &c, 1); /* echo the character */
                    *bp++ = ascval; /* and store it */
                }
                break;
            }

There are two main problems here. Firstly, we aren't running getty in eight-bit-clean mode on the Linux console, and I think we should be. This is the responsibility of system-services; finish-install will then have to be careful to undo this for serial consoles. Outside eight-bit-clean mode, getty interprets the second byte of my test character (the £ sign) as # with the parity bit set, and thus emits an erase sequence (which gets broken because it's using the wrong parity, I think ...).

Secondly, even in eight-bit-clean mode, getty rejects anything that isn't matched by isascii() or isprint(); it's doing this in the C locale so characters such as ğ and ş aren't counted as printable. I'm not sure what the correct resolution for this is, but it would be nice to permit non-ASCII usernames. However, this will likely require fixes elsewhere (for example, the installer rejects usernames that don't match /^[a-z][-a-z0-9]*$/, and I imagine that most software that processes usernames makes no attempt to convert them between encodings), so I'm marking this part of the bug as wishlist. If the first part of this bug is fixed, then at least typing a non-ASCII c...


Changed in console-setup:
importance: High → Wishlist
status: Confirmed → Triaged
Changed in upstart:
importance: Undecided → Medium
status: New → Triaged
Changed in finish-install:
importance: Undecided → Medium
status: New → Triaged

Upstart isn't setting the system console correctly?

It runs this on /dev/console when it first starts to attempt to set it up in the usual way.

static void
reset_console (void)
{
        struct termios tty;

        tcgetattr (0, &tty);

        tty.c_cflag &= (CBAUD | CBAUDEX | CSIZE | CSTOPB | PARENB | PARODD);
        tty.c_cflag |= (HUPCL | CLOCAL | CREAD);

        /* Set up usual keys */
        tty.c_cc[VINTR] = 3; /* ^C */
        tty.c_cc[VQUIT] = 28; /* ^\ */
        tty.c_cc[VERASE] = 127;
        tty.c_cc[VKILL] = 24; /* ^X */
        tty.c_cc[VEOF] = 4; /* ^D */
        tty.c_cc[VTIME] = 0;
        tty.c_cc[VMIN] = 1;
        tty.c_cc[VSTART] = 17; /* ^Q */
        tty.c_cc[VSTOP] = 19; /* ^S */
        tty.c_cc[VSUSP] = 26; /* ^Z */

        /* Pre and post processing */
        tty.c_iflag = (IGNPAR | ICRNL | IXON | IXANY);
        tty.c_oflag = (OPOST | ONLCR);
        tty.c_lflag = (ISIG | ICANON | ECHO | ECHOCTL | ECHOPRT | ECHOKE);

        /* Set the terminal line and flush it */
        tcsetattr (0, TCSANOW, &tty);
        tcflush (0, TCIOFLUSH);
}

Could something else along the way be resetting it? sysvinit certainly attempts to reset the console at just about every available opportunity; Upstart doesn't do that because it crashes X! (I've no idea why sysvinit doesn't.)

Changed in upstart:
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package finish-install - 2.18ubuntu1

finish-install (2.18ubuntu1) intrepid; urgency=low

  * Remove -8 (if present) from getty options for serial terminals
    (LP: #273189).

 -- Colin Watson <email address hidden> Thu, 25 Sep 2008 13:59:33 +0100

Changed in finish-install:
status: Triaged → Fix Released
Gökdeniz Karadağ (gokdeniz) wrote :

I added "-8" manually to /etc/event.d/ttyX and it ignored non-ascii characters I typed, so the fix is OK.

But I only managed to find the following diff for this fix, and it "removes" the -8 for serial consoles. I think the real fix is elsewhere and I couldn't find it. Does it get applied at install time only?

Making getty aware of the system locale (maybe by reading /etc/default/locale?) may be a solution to the root of the problem. But a solution like that should take into account this problem:

Colin Watson (cjwatson) wrote :

Scott, unless getty is run with -8, it'll use the top bit for parity detection regardless of how upstart set up the console before it. There's a task on upstart because system-services owns /etc/event.d/tty* which governs how getty is started.

Gökdeniz, the bug is not yet fixed so you're not going to have a whole lot of luck trying to find the fix. All that my finish-install change did was to prepare the ground so that the change I propose to upstart wouldn't break serial console handling.

When did we start to need -8? I don't remember seeing that in inittab

Colin Watson (cjwatson) wrote :

I'm not convinced that we got this right in the sysvinit era either!

Uwe Geuder (ubuntulp-ugeuder) wrote :

getty and locales might be one problem. I have not looked into that.

But another problem is non-ASCII characters being corrupted on the console. I believe you might be seeing a combination of both problems. The character corruption problem has been reported as

Ah, that makes sense then

Changed in upstart (Ubuntu):
status: Incomplete → Triaged
summary: - non-ascii layout/encoding problems at "login" line
+ event.d: non-ascii layout/encoding problems at "login" line
summary: - event.d: non-ascii layout/encoding problems at "login" line
+ system-services: non-ascii layout/encoding problems at "login" line
Changed in upstart (Ubuntu):
status: Triaged → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 0.3.10-2

upstart (0.3.10-2) karmic; urgency=low

  * debian/upstart.postinst: Use telinit u to re-exec, rather than
    kill just in case it's not Upstart that's running. LP: #92177.
  * debian/event.d/system-services/tty*: Run getty in 8-bit clean
    mode. LP: #273189.
  * debian/event.d/upstart-compat-sysv/rc-default:
    - Don't use grep -w, instead split on $IFS and iterate. LP: #385911.
    - Check for any valid runlevel, not just S. LP: #85014.
    - Make console owner, since it may spawn sulogin.
  * debian/event.d/upstart-compat-sysv/rcS:
    - Spawn sulogin if given -b or "emergency". LP: #193810.
  * debian/event.d/upstart-compat-sysv/rcS:
    - Make console owner. LP: #211402.
  * debian/event.d/upstart-compat-sysv/rcS-sulogin:
    - Place the telinit code in post-stop, checking $UPSTART_EVENT first so
      we don't change the runlevel if we were stopped due to a runlevel
      change. LP: #66002.

 -- Scott James Remnant <email address hidden> Thu, 18 Jun 2009 16:19:34 +0100

Changed in upstart (Ubuntu):
status: In Progress → Fix Released