wanted: FIND improvements with strings

Bug #410122 reported by Karol Swietlicki on 2009-08-07
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Wishlist
Nikodemus Siivola

Bug Description

Now that SLOT-VALUE works well, I have a second item on my wishlist that I would like to see improved.
This should be pretty easy, I think. Of course I may be misunderstanding something too, even though I spent a big part of the day reading the hyperspec.

We don't do a very good job with this:

(declaim (optimize (speed 3) (space 0) (debug 0) (safety 0) (compilation-speed 0)))
(defun slow-find (str char)
  (declare (type simple-string str))
  (find char str))

(this is sbcl 1.0.30)

The array element type is unknown at compile time, we cannot inline specialized find code for strings. However, we probably can do better than what we currently do. I have played around a little bit and if my measurements are correct, we can get a nice speedup by doing type dispatch at runtime, like this:

(defun %simple-base-string-find (str char)
  (declare (type simple-base-string str))
  (find char str))

(defun %character-simple-string-find (str char)
  (declare (type (simple-array character) str))
  (find char str))

(defun fast-find (str char)
  (declare (type simple-string str))
    (if (eq (array-element-type str) 'base-char) ; There are only two possibilities here, character and base-char.
 (%simple-base-string-find str char)
 (%character-simple-string-find str char)))

In my testing (and to my surprise) I find that by doing this the runtime decreases rather nicely. Also, probably the same applies to POSITION as well, although I did not do any testing on that one.

CL-USER> (time (dotimes (x 1000000) (slow-find "That is not dead which can eternal lie..." #\X)))
Evaluation took:
  2.002 seconds of real time
  1.772111 seconds of total run time (1.724108 user, 0.048003 system)
  88.51% CPU
  3,402,739,956 processor cycles
  0 bytes consed

CL-USER> (time (dotimes (x 1000000) (fast-find "That is not dead which can eternal lie..." #\X)))
Evaluation took:
  0.232 seconds of real time
  0.148009 seconds of total run time (0.148009 user, 0.000000 system)
  63.79% CPU
  394,403,128 processor cycles
  0 bytes consed

Of course there is more to that task than what I wrote above, we have to handle all the keyword arguments too, this can complicate things a bit.

I may give this a stab, but don't hold your breath, I'm not so good at this. For all I know, I may be pretty wrong on one of my points above, making this all invalid.

Karol Swietlicki

Changed in sbcl:
status: New → Confirmed
importance: Undecided → Wishlist
tags: added: easy library optimization
Karol Swietlicki (abcdef123456) wrote :

I think I have something good, and it even passes the tests. It was surprisingly easy, so I'm worried that it might be buggy. It does seem to run well though.

CL-USER> (time (dotimes (x 1000000) (slow-find "That is not dead which can eternal lie..." #\X)))
Evaluation took:
  0.182 seconds of real time

Not sure how it looks from the cache perspective, it ends up inlining both variants of the find function, one for the simple base string version and one for a normal simple string. I'm guessing that it still would be a win in most cases. It seems to be a huge win in my code.

Should I describe the changes like I did for SLOT-VALUE, or should I send in a patch? Keep in mind, I know nothing of the style, the patch might need major rework.

Karol Swietlicki

Karol Swietlicki (abcdef123456) wrote :

This is the code that solves this problem for me.

Karol Swietlicki

Changed in sbcl:
assignee: nobody → Nikodemus Siivola (nikodemus)
Nikodemus Siivola (nikodemus) wrote :

in 1.0.31.7.

Changed in sbcl:
status: Confirmed → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Patches