Comment 5 for bug 9026

Debian Bug Importer (debzilla) wrote:

Message-Id: <20040818055735.77AC7431@mctpc71>
Date: Wed, 18 Aug 2004 14:57:35 +0900
From: Miles Bader <email address hidden>
To: Debian Bug Tracking System <email address hidden>
Subject: gawk: Odd regexp matching problem if LANG=ja_JP

Package: gawk
Version: 1:3.1.4-1
Severity: normal

Executing the following line in a shell:

   echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=ja_JP gawk '/[Cc]hangeLog/ { print }'

prints only the first line instead of the expected two:

   --- orig/lisp/ChangeLog

With LANG=C instead, it works as expected (other locales such as "de"
seem to work too):

   echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=C gawk '/[Cc]hangeLog/ { print }'

yields:

   --- orig/lisp/ChangeLog
   +++ mod/lisp/ChangeLog

I'm not sure if the actual encoding has any impact -- ja_JP, ja_JP.utf8,
and ja_JP.eucjp all exhibit the same problem.
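To compare locales side by side, something like the following sketch might help. It assumes the listed locales are installed (an unavailable locale normally falls back to C behaviour) and that LC_ALL is unset, since LC_ALL would override LANG. The count_matches helper and the AWK variable are hypothetical names used only for this illustration; printf stands in for echo -e to avoid shell differences.

```shell
# count_matches LOCALE
# Feed the two sample lines to awk under the given LANG and print how many
# of them match /[Cc]hangeLog/. Both lines contain "ChangeLog", so 2 is the
# expected count. AWK is a hypothetical override; it defaults to gawk.
count_matches() {
    printf '%s\n' '--- orig/lisp/ChangeLog' '+++ mod/lisp/ChangeLog' |
        LANG="$1" "${AWK:-gawk}" '/[Cc]hangeLog/ { n++ } END { print n + 0 }'
}

# Locales mentioned in this report; adjust to whatever `locale -a` lists.
for loc in C de ja_JP ja_JP.utf8 ja_JP.eucjp; do
    printf '%s: %s\n' "$loc" "$(count_matches "$loc")"
done
```

A count of 1 for a locale would correspond to the misbehaviour described above, where only the "---" line is printed.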

Thanks,

-Miles

-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable'), (101, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.8.1
Locale: LANG=ja_JP.UTF-8, LC_CTYPE=ja_JP.UTF-8

Versions of packages gawk depends on:
ii libc6 2.3.2.ds1-16 GNU C Library: Shared libraries an

-- no debconf information