find regex does not work properly
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
findutils (Ubuntu) | Invalid | Undecided | Unassigned |
Bug Description
Binary package hint: findutils
It would appear to be a similar bug to #58883
Some have said this isn't a bug at all but here goes.
When in an empty directory using gnome-terminal:
user@ubuntukarm
user@ubuntukarm
user@ubuntukarm
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
user@ubuntukarm
.
./a
user@ubuntukarm
LANG=en_US.UTF-8
GDM_LANG=
user@ubuntukarm
user@ubuntukarm
.
./a
./b
./c
./d
./e
./f
./g
./h
./i
./j
./k
./l
./m
./n
./o
./p
./q
./r
./s
./t
./u
./v
./w
./x
./y
./z
user@ubuntukarm
find (GNU findutils) 4.4.2
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Built using GNU gnulib version e5573b1bad88bfa
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS() CBO(level=0)
This is further discussed at http://
Hello,
1) I cannot reproduce this on current Debian.
2) The respective code is not located in find; re_match() is part of libc.
3) The fact that regexes are locale-dependent is expected behavior, e.g. in the Estonian alphabet Z is not the last letter, and therefore e.g. Y is not in 'A-Z'.
4) To match upper case letters you should use the respective character class ([[:upper:]] instead of [A-Z]) or reset LC_COLLATE to C.
Given all that, afaict from Google, it looks like in some versions of libc '[A-Z]' includes lower case letters in the en_US.UTF-8 locale, while in others it does not. See also #120687
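For illustration, the two workarounds from point 4 could be tried roughly like this. This is only a sketch: the scratch directory and the six file names are made up for the example, and it assumes GNU find on a case-sensitive filesystem. The `-regextype posix-extended` option is used for the character-class variant because that syntax is known to support `[[:upper:]]`; whether find's default regex syntax accepts character classes may depend on the findutils version.

```shell
# Scratch directory with a few lower- and upper-case single-letter files
# (hypothetical names, chosen only for this example).
tmpdir=$(mktemp -d)
touch "$tmpdir/a" "$tmpdir/b" "$tmpdir/c" "$tmpdir/A" "$tmpdir/B" "$tmpdir/C"

# Workaround 1: use the POSIX character class instead of the range [A-Z].
# -regextype must come before -regex on the command line.
upper_class=$(cd "$tmpdir" && find . -regextype posix-extended -regex '\./[[:upper:]]' | LC_ALL=C sort)

# Workaround 2: keep [A-Z] but run find in the C locale, where the range
# covers only the ASCII upper-case letters.
upper_c=$(cd "$tmpdir" && LC_ALL=C find . -regex '\./[A-Z]' | LC_ALL=C sort)

printf 'character class:\n%s\n' "$upper_class"
printf 'C locale:\n%s\n' "$upper_c"

rm -rf "$tmpdir"
```

With either variant only the upper-case names should be listed, regardless of how the libc in use collates `[A-Z]` under en_US.UTF-8.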