Comment 20 for bug 120687

sordna (sordna) wrote :

Looks like this affects programs such as GNU grep and egrep. Note that I'm quoting the A-Z character class to avoid any shell interference:
$ echo hello | grep '[A-Z]'
hello

The above behavior is COMPLETELY WRONG AND UNACCEPTABLE. I am utterly shocked that I have to change my default environment to do a simple task such as identifying upper case characters with grep. Note I'm using en_US.UTF-8 (not en_GB).

LC_COLLATE should default to C under all circumstances, unless the user explicitly wants grep and other programs to behave in a totally weird and unexpected way. Even better, perhaps libc should treat an undefined LC_COLLATE the same as C.
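
In the meantime, here is a rough sketch of a per-command workaround. It assumes GNU grep and a POSIX shell. I use LC_ALL rather than LC_COLLATE because LC_ALL, if it happens to be set in your environment, takes precedence over LC_COLLATE. The variable names `lower` and `upper` are just for illustration.

```shell
# Force the C locale for a single command so that [A-Z] means the
# byte range 'A'..'Z' rather than a locale-dependent collation range.
# grep -c prints the number of matching lines; "|| true" hides grep's
# non-zero exit status when nothing matches.
lower=$(echo hello | LC_ALL=C grep -c '[A-Z]' || true)
upper=$(echo Hello | LC_ALL=C grep -c '[A-Z]' || true)

echo "all-lowercase input:  $lower matching line(s)"
echo "input with a capital: $upper matching line(s)"

# Alternative that leaves the locale alone: the POSIX [:upper:] class
# is defined by character type, not by collation order.
echo hello | grep -c '[[:upper:]]' || true
```

Either form should report no match for "hello", but having to remember this for every invocation is exactly the kind of hoop I am complaining about.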

Either way, regular expressions should be honored in a sane linux / unix operating system. Users should not have to jump through hoops to make a freshly installed system behave in a normal, unsurprising way.