some extended Korn shell globs are really slow

Bug #625164 reported by Thorsten Glaser on 2010-08-27

Bug Description

Attaching a testcase.

The first glob is decently fast; the second one is noticeably slow (about a second) on a 3 GHz Athlon.
The third glob needs to be killed with SIGKILL, of all things.

Also, in another shell script of mine, replacing
[[ $foo = *@(x)* ]] with [[ $foo = *'x'* ]] and
[[ $foo = @(1|2………),* ]] with [[ $foo = 1,* || $foo = 2………,* ]]
made it noticeably faster.

The probable culprits are gmatchx, do_gmatch, and friends, mostly in misc.c.

This bug serves as documentation for now, because I have no idea how to tackle it. If someone wants to take it up and submit patches, be my guest.

Thorsten Glaser (mirabilos) wrote :

Partial fix committed:

${foo/bar/baz} turns the pattern bar into *bar*, which becomes very slow if bar = *@(foo).
mksh now partially optimises extglobs:

Pass 1 ⇒ replace all @(foo) with foo if foo doesn’t contain a pattern separator (‘|’)

Pass 2 ⇒ collapse all adjacent asterisk wildcards (‘*’)

This speeds up a lot of things, to the point of preventing apparent freezes even on multi-gigahertz machines. There is still a lot to do, though: caching optimised regexps (especially for the ${foo/bar/baz} case) and optimising the cases reported above.
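
The two passes described above can be sketched roughly as follows. This is only an illustration in portable shell, not mksh's actual C implementation: the function name simplify_glob is hypothetical, and the sketch deliberately leaves nested groups alone and ignores backslash escapes and bracket expressions.

```shell
# Hypothetical helper (not part of mksh): simplify an extglob pattern string.
# The body runs in a subshell, so its variables stay local.
simplify_glob() (
    pat=$1
    out=
    # Pass 1: replace @(foo) with foo when foo contains no '|'.
    # Groups whose body contains '(' are kept as-is (nesting is not handled).
    while :; do
        case $pat in
        *'@('*')'*) ;;
        *) break ;;
        esac
        head=${pat%%'@('*}      # text before the first "@("
        rest=${pat#*'@('}       # text after the first "@("
        body=${rest%%')'*}      # up to the next ")"
        tail=${rest#*')'}       # everything after that ")"
        case $body in
        *'|'*|*'('*) out=$out$head'@('$body')' ;;
        *) out=$out$head$body ;;
        esac
        pat=$tail
    done
    pat=$out$pat
    # Pass 2: collapse runs of adjacent '*' into a single '*'.
    while :; do
        case $pat in
        *'**'*) pat=${pat%%'**'*}'*'${pat#*'**'} ;;
        *) break ;;
        esac
    done
    printf '%s\n' "$pat"
)
```

For example, simplify_glob '**@(foo)*' prints *foo*, while simplify_glob '@(1|2),*' prints the pattern unchanged because the group contains a separator.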

Changed in mksh:
status: New → In Progress
Thorsten Glaser (mirabilos) wrote :

mksh’s globbing really sucks. A way out: parse the patterns as a special kind of regex (an NFA, possibly with $KSH_MATCH becoming an array).



