Message-Id: <20040818055735.77AC7431@mctpc71>
Date: Wed, 18 Aug 2004 14:57:35 +0900
From: Miles Bader <email address hidden>
To: Debian Bug Tracking System <email address hidden>
Subject: gawk: Odd regexp matching problem if LANG=ja_JP
Package: gawk
Version: 1:3.1.4-1
Severity: normal
Executing the following line in a shell:
echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=ja_JP gawk '/[Cc]hangeLog/ { print }'
yields not the expected two lines of output, but instead only the first one:
--- orig/lisp/ChangeLog
If the LANG-setting portion is changed to use C, then it works as expected (others such as "de" seem to work too):
echo -e '--- orig/lisp/ChangeLog\n+++ mod/lisp/ChangeLog' | LANG=C gawk '/[Cc]hangeLog/ { print }'
yields:
--- orig/lisp/ChangeLog
+++ mod/lisp/ChangeLog
I'm not sure if the actual encoding has any impact -- ja_JP, ja_JP.utf8, and ja_JP.eucjp all exhibit the same problem.
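For convenience, here is a minimal sketch that runs the same check across the locales mentioned above and reports how many of the two sample lines match. The helper name count_matches and the AWK variable are hypothetical, introduced only for this illustration; printf stands in for `echo -e`, and the script falls back to plain awk if gawk is not installed. Note that LC_ALL, if set, takes precedence over LANG, and that a locale which is not installed will most likely behave like C, so a count of 2 does not by itself rule out the problem.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): count how many of
# the two sample lines match /[Cc]hangeLog/ under a given LANG value.
# AWK defaults to gawk and falls back to awk if gawk is not installed.
AWK=${AWK:-gawk}
command -v "$AWK" >/dev/null 2>&1 || AWK=awk

count_matches() {
    # $1 is the LANG value to test; both sample lines should match.
    printf '%s\n' '--- orig/lisp/ChangeLog' '+++ mod/lisp/ChangeLog' |
        LANG="$1" "$AWK" '/[Cc]hangeLog/ { n++ } END { print n + 0 }'
}

for loc in C de ja_JP ja_JP.utf8 ja_JP.eucjp; do
    printf '%s: %s of 2 lines matched\n' "$loc" "$(count_matches "$loc")"
done
```

Any locale that reports fewer than 2 matched lines shows the behaviour described above.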
Thanks,
-Miles
-- System Information:
Debian Release: 3.1
APT prefers unstable
APT policy: (500, 'unstable'), (101, 'experimental')
Architecture: i386 (i686)
Kernel: Linux 2.6.8.1
Locale: LANG=ja_JP.UTF-8, LC_CTYPE=ja_JP.UTF-8
Versions of packages gawk depends on:
ii libc6 2.3.2.ds1-16 GNU C Library: Shared libraries an
-- no debconf information