Perf: sbcl vs ccl/lispworks on particular (short) problem

Bug #1712130 reported by il71 on 2017-08-21
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Undecided
Unassigned

Bug Description

#|

NOT A BUG! Perf issue.

You may find it interesting. It shows HUGE diff in perf: SBCL vs CCL/Lispworks on this particular problem.

Old code, I don't want to revisit it, at that time I was learning CL.

|#

;;;;;;;;;

#|
This is the solution to projecteuler #172
It's not the solution that's interesting but the
time difference between SBCL, CCL, Lispworks.
|#

;;;;------------ Standard stuff for memoizing
(defun memo (fn test)
  (let ((cache (make-hash-table :test test)))
    #'(lambda (&rest args)
        (multiple-value-bind (val win) (gethash args cache)
          (if win
            val
            (setf (gethash args cache)
                  (apply fn args)))))))

(defun memoize (fn &key (test #'equal))
  (setf (symbol-function fn)
        (memo (symbol-function fn) test)))

(defmacro defun-memo (fn args &body body)
  `(memoize (defun ,fn ,args (declare (notinline ,fn)) ,@body)))
;;;;------------

(defun-memo f (n list)
  (if (= n 1)
      (count-if #'plusp list)
      (let ((res 0))
        (loop for i from 0 to 9 do
          (when (plusp (nth i list))
            (let ((c (copy-list list)))
              (decf (nth i c))
              (incf res (f (1- n) c)))))
        res)))

(defun ff (n)
  (decf n)
; 0 1 2 3 4 5 6 7 8 9
  (+ (f n (list 3 2 3 3 3 3 3 3 3 3))
     (f n (list 3 3 2 3 3 3 3 3 3 3))
     (f n (list 3 3 3 2 3 3 3 3 3 3))
     (f n (list 3 3 3 3 2 3 3 3 3 3))
     (f n (list 3 3 3 3 3 2 3 3 3 3))
     (f n (list 3 3 3 3 3 3 2 3 3 3))
     (f n (list 3 3 3 3 3 3 3 2 3 3))
     (f n (list 3 3 3 3 3 3 3 3 2 3))
     (f n (list 3 3 3 3 3 3 3 3 3 2))))

;; answer is 227485267000992000
(time (print (ff 18)))

#|
ccl is MUCH faster here. Looks like it has super quick equal hash.

sbcl 1.1.7
227485267000992000
Evaluation took:
  1703.445 seconds of real time
  1694.794864 seconds of total run time (1694.264461 user, 0.530403 system)
  [ Run times consist of 0.950 seconds GC time, and 1693.845 seconds non-GC time. ]
  99.49% CPU
  4,784,492,656,389 processor cycles
  1,165,585,392 bytes consed

ccl 1.10.0-dev -- wow! How can it be that fast?
227485267000992000
(P (FF 18))
took 15,112,000 microseconds (15.112000 seconds) to run.
      4,217,525 microseconds ( 4.217525 seconds, 27.91%) of which was spent in GC.
During that period, and with 8 available CPU cores,
     14,960,496 microseconds (14.960496 seconds) were spent in user mode
         46,801 microseconds ( 0.046801 seconds) were spent in system mode
 1,139,141,600 bytes of memory allocated.

LispWorks7.0 prof 64bit, even faster:
User time = 12.402
System time = 0.124
Elapsed time = 12.564
Allocation = 1129611808 bytes
0 Page faults
|#

il71 (il71) wrote :

>Old code, I don't want to revisit it

Yeah, the code is probably dump, but time diff sbcl vs "competitors" still holds (:)

il71 (il71) wrote :

s/dump/dumb

Whatever.

Jan Moringen (scymtym) wrote :

This is caused by hash collisions because the keys are long-ish lists and EQUAL-HASH (via SXHASH) only traverses CONS trees up to depth 4. CCL seems to cut off at depth 6.

Changing SBCL's +MAX-HASH-DEPTHOID+ to 10 makes the computation finish in ~ 4 seconds on my machine.

il71 (il71) wrote :

Thanks for explanation, Jan!

Do you guys have some perf suite, so that you can change the defaults (there are probably many) and measure?

Douglas Katzman (dougk) wrote :

+MAX-HASH-DEPTHOID+ isn't a "default". Bad things would happen if it were changed in a running system.

As a practical matter, hash keys that are lists that are the same up to 10 items are probably bad hash keys. Nevertheless you could rebuild from source with a changed value.

But you can do better than that. For example - and I'm not suggesting that you use system-internal functions, I'm just showing what is possible, given this:
(defun myhashfn (x)
  (sb-int:named-let h ((x x))
   (typecase x
    (fixnum x)
    (list
     (if x
        (sb-int:mix (the fixnum (h (car x)))
                    (the fixnum (h (cdr x))))
        0)))))

(let ((cache (make-hash-table :test test :hash-function 'myhashfn))) ; do this
and then: (time(ff 18))
Evaluation took:
  3.414 seconds of real time

Two other things I'll mention:
- There are somewhere between dozens and hundreds of build-time parameters most of which should not have to be touched; I don't think the right place to start touching them is observing that some lists hash badly under the default SXHASH, and question what else you can tweak.
- I kind of agree that in general 4 as the cutoff is probably a value that hasn't been adjusted in 20 years, and makes the wrong trade-off in time to compute the hash versus time to compare keys.

But as you can see, implementing an abstract memoize function isn't as simple as just using a builtin thing unspecific to your problem.

Changed in sbcl:
status: New → Invalid
il71 (il71) wrote :

I see, thanks Douglas!

This is helpfull. Of course, there is no ultimate hash function for all cases.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers