huge performance hit for -i with UTF-8 locales
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
grep | Unknown | Unknown | |
grep (Ubuntu) | Incomplete | Undecided | Unassigned |
Bug Description
On a source tree with 28MB of .c and .h files (Mesa), grep is slow with -i and fast without it with the default Ubuntu locale settings (LANG=en_US.UTF-8, no LC_ variables set). Actually, even some [Vv] style patterns are much faster with LANG=C, so this is even more like
https:/
My box is a core 2 duo (2.4GHz), which makes a beast like gnome feel almost as snappy as fluxbox :) Everything is in the disk cache, so I/O isn't a factor. Neither is memory bandwidth. The machine was otherwise idle. I'm running AMD64 Edgy.
peter@tesla:
LANG=en_US.UTF-8
LC_CTYPE=
... (all the same)
(times are measured for the second run in a row, so the CPU core it runs on is at full clock speed the whole time.)
time find -name '*.[ch]' | xargs grep -i 'volatile_s3tc'
real 0m3.498s; user 0m3.483s; sys 0m0.023s
time find -name '*.[ch]' | xargs grep 'volatile.*s3tc'
real 0m0.076s; user 0m0.050s; sys 0m0.023s
In non-UTF-8 locales, grep -i is just as fast as grep without -i:
time find -name '*.[ch]' | LANG=C xargs grep -i 'volatile.*s3tc'
real 0m0.083s; user 0m0.067s; sys 0m0.020s
time find -name '*.[ch]' | LANG=en_CA xargs grep -i 'volatile.*s3tc'
real 0m0.079s; user 0m0.050s; sys 0m0.027s
Writing the case-insensitive pattern out by hand takes more time, but is not really slow. However, it probably doesn't match everything that grep -i would on input that isn't all 7-bit ASCII:
time find -name '*.[ch]' | xargs grep '[Vv][Oo]
real 0m0.340s; user 0m0.313s; sys 0m0.027s
It is affected by locale settings, too.
time find -name '*.[ch]' | LANG=C xargs grep '[Vv][Oo]
real 0m0.096s; user 0m0.080s; sys 0m0.027s
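The caveat about hand-written [Vv]-style patterns can be illustrated with a small sketch. It uses Python's re module rather than grep, so it only shows the general idea, not grep's actual matching behaviour; bracket_expand is a hypothetical helper and the sample word is made up for the example.

```python
import re

def bracket_expand(word):
    """Build a [Vv][Oo]...-style pattern, expanding ASCII letters only."""
    parts = []
    for ch in word:
        if ch.isascii() and ch.isalpha():
            parts.append(f"[{ch.upper()}{ch.lower()}]")
        else:
            # Non-ASCII letters and punctuation are left as literals.
            parts.append(re.escape(ch))
    return "".join(parts)

word = "caf\u00e9"   # lowercase, ends in e-acute
text = "CAF\u00c9"   # same word in uppercase

hand = re.compile(bracket_expand(word))
ci = re.compile(re.escape(word), re.IGNORECASE)

hand_matches = hand.search(text) is not None   # False: e-acute was not expanded
ci_matches = ci.search(text) is not None       # True: Unicode-aware case folding

print(hand.pattern, hand_matches, ci_matches)
```

For pure ASCII words such as the one in the timings above, both forms match the same lines; the difference only shows up once the pattern or the input contains non-ASCII letters.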
I was going to log a new bug, but this one looks very similar. Here are the details I found.
Peter, try setting LANG to something that matches a directory under /usr/lib/locale (probably en_CA.utf8) and rerun your test to see if we have the same bug.
------- ------- ------- ------- ------- ------- ------- ------- ------- ------- ------
I've noticed a small performance hit arising from locale settings. Ubuntu (gnome-language-selector) sets LANG in a format like:
LANG=en_AU.UTF-8
LANGUAGE=en_AU:en
LC_CTYPE="en_AU.UTF-8"
LC_NUMERIC="en_AU.UTF-8"
LC_TIME="en_AU.UTF-8"
LC_COLLATE="en_AU.UTF-8"
LC_MONETARY="en_AU.UTF-8"
LC_MESSAGES="en_AU.UTF-8"
LC_PAPER="en_AU.UTF-8"
LC_NAME="en_AU.UTF-8"
LC_ADDRESS="en_AU.UTF-8"
LC_TELEPHONE="en_AU.UTF-8"
LC_MEASUREMENT="en_AU.UTF-8"
LC_IDENTIFICATION="en_AU.UTF-8"
When binaries are executed, the LANG is looked up in /usr/lib/locale, which on my system looks like:
xxx@xxx:/usr/lib/locale$ ls
en_AU.utf8 en_DK.utf8 en_IE.utf8 en_PH.utf8 en_ZA.utf8
en_BW.utf8 en_GB.utf8 en_IN en_SG.utf8 en_ZW.utf8
en_CA.utf8 en_HK.utf8 en_NZ.utf8 en_US.utf8
They don't match up: en_AU.utf8 vs. en_AU.UTF-8.
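A rough sketch of how such a mismatch can turn into extra lookups: assuming the C library first tries the LANG value literally and then retries with a normalized codeset spelling (letters and digits only, lowercased), each lookup of en_AU.UTF-8 costs one failed attempt before the directory is found. The function names here are hypothetical and this is not glibc's actual code.

```python
import re

def normalize_codeset(codeset):
    # Keep only letters and digits, then lowercase: "UTF-8" -> "utf8".
    return re.sub(r"[^A-Za-z0-9]", "", codeset).lower()

def candidate_locale_dirs(lang, root="/usr/lib/locale"):
    """Directories to try, in order: literal spelling, then normalized codeset."""
    base, at, modifier = lang.partition("@")
    name, dot, codeset = base.partition(".")
    candidates = [f"{root}/{lang}"]
    if dot:
        normalized = f"{root}/{name}.{normalize_codeset(codeset)}{at}{modifier}"
        if normalized not in candidates:
            candidates.append(normalized)
    return candidates

for path in candidate_locale_dirs("en_AU.UTF-8"):
    print(path)
```

Under that assumption, LANG=en_AU.utf8 yields a single candidate that exists on disk, while LANG=en_AU.UTF-8 yields two, the first of which is missing.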
With the default LANG, running strace across various binaries ('ls', for example) gives many messages such as:
open("/usr/lib/locale/en_AU.UTF-8/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
With LANG=en_AU.UTF-8
strace -c ls
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000062 4 14 read
0.00 0.000000 0 1 write
0.00 0.000000 0 105 77 open
0.00 0.000000 0 29 close
0.00 0.000000 0 1 execve
0.00 0.000000 0 12 12 access
0.00 0.000000 0 3 brk
0.00 0.000000 0 2 2 ioctl
0.00 0.000000 0 6 munmap
0.00 0.000000 0 1 mprotect
0.00 0.000000 0 1 _sysctl
0.00 0.000000 0 2 rt_sigaction
0.00 0.000000 0 1 rt_sigprocmask
0.00 0.000000 0 1 getrlimit
0.00 0.000000 0 42 mmap2
0.00 0.000000 0 29 fstat64
0.00 0.000000 0 2 getdents64
0.00 0.000000 0 1 fcntl64
0.00 0.000000 0 2 futex
0.00 0.000000 0 1 set_thread_area
0.00 0.000000 0 1 set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00 0.000062 257 91 total
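To put a number on the failed calls in the summary above, a few of its rows can be parsed and the error share computed. This is a minimal sketch that assumes the whitespace-separated `strace -c` layout shown here, where the errors column is simply absent when a syscall had none; parse_row is a hypothetical helper, and only a subset of rows is copied in.

```python
# A few rows copied from the summary above (time, seconds, usecs/call, calls, [errors], syscall).
SUMMARY = """\
100.00 0.000062 4 14 read
0.00 0.000000 0 105 77 open
0.00 0.000000 0 12 12 access
0.00 0.000000 0 2 2 ioctl
0.00 0.000000 0 42 mmap2
"""

def parse_row(line):
    fields = line.split()
    calls = int(fields[3])
    # Six fields means the optional errors column is present.
    errors = int(fields[4]) if len(fields) == 6 else 0
    return fields[-1], calls, errors

stats = {name: (calls, errors)
         for name, calls, errors in map(parse_row, SUMMARY.splitlines())}

open_calls, open_errors = stats["open"]
print(f"open: {open_errors}/{open_calls} failed ({open_errors / open_calls:.0%})")
# prints "open: 77/105 failed (73%)"
```

The figure covers every failed open, whatever its cause; the summary alone does not say how many of them are locale lookups.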
With LANG=en_AU.utf8, to match the directory in /usr/lib/locale
strace -c ls
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
nan 0.000000 0 14 read
nan 0.000000 0 3 write
nan 0.000000 ...