SBCL

Perf: sbcl vs ccl/lispworks on particular (short) problem

Bug #1712130 reported by il71 on 2017-08-21

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	SBCL	Invalid	Undecided	Unassigned

Bug Description

NOT A BUG! Perf issue.

You may find it interesting. It shows HUGE diff in perf: SBCL vs CCL/Lispworks on this particular problem.

Old code, I don't want to revisit it, at that time I was learning CL.

;;;;;;;;;

#|
This is the solution to projecteuler #172
It's not the solution that's interesting but the
time difference between SBCL, CCL, Lispworks.
|#

;;;;------------ Standard stuff for memoizing
(defun memo (fn test)
  (let ((cache (make-hash-table :test test)))
    #'(lambda (&rest args)
        (multiple-value-bind (val win) (gethash args cache)
          (if win
            val
            (setf (gethash args cache)
                  (apply fn args)))))))

(defun memoize (fn &key (test #'equal))
(setf (symbol-function fn)
(memo (symbol-function fn) test)))

(defmacro defun-memo (fn args &body body)
`(memoize (defun ,fn ,args (declare (notinline ,fn)) ,@body)))
;;;;------------

(defun-memo f (n list)
  (if (= n 1)
      (count-if #'plusp list)
      (let ((res 0))
        (loop for i from 0 to 9 do
          (when (plusp (nth i list))
            (let ((c (copy-list list)))
              (decf (nth i c))
              (incf res (f (1- n) c)))))
        res)))

(defun ff (n)
  (decf n)
; 0 1 2 3 4 5 6 7 8 9
  (+ (f n (list 3 2 3 3 3 3 3 3 3 3))
     (f n (list 3 3 2 3 3 3 3 3 3 3))
     (f n (list 3 3 3 2 3 3 3 3 3 3))
     (f n (list 3 3 3 3 2 3 3 3 3 3))
     (f n (list 3 3 3 3 3 2 3 3 3 3))
     (f n (list 3 3 3 3 3 3 2 3 3 3))
     (f n (list 3 3 3 3 3 3 3 2 3 3))
     (f n (list 3 3 3 3 3 3 3 3 2 3))
     (f n (list 3 3 3 3 3 3 3 3 3 2))))

;; answer is 227485267000992000
(time (print (ff 18)))

#|
ccl is MUCH faster here. Looks like it has super quick equal hash.

sbcl 1.1.7
227485267000992000
Evaluation took:
  1703.445 seconds of real time
  1694.794864 seconds of total run time (1694.264461 user, 0.530403 system)
  [ Run times consist of 0.950 seconds GC time, and 1693.845 seconds non-GC time. ]
  99.49% CPU
  4,784,492,656,389 processor cycles
  1,165,585,392 bytes consed

ccl 1.10.0-dev -- wow! How can it be that fast?
227485267000992000
(P (FF 18))
took 15,112,000 microseconds (15.112000 seconds) to run.
      4,217,525 microseconds ( 4.217525 seconds, 27.91%) of which was spent in GC.
During that period, and with 8 available CPU cores,
     14,960,496 microseconds (14.960496 seconds) were spent in user mode
         46,801 microseconds ( 0.046801 seconds) were spent in system mode
1,139,141,600 bytes of memory allocated.

LispWorks7.0 prof 64bit, even faster:
User time = 12.402
System time = 0.124
Elapsed time = 12.564
Allocation = 1129611808 bytes
0 Page faults
|#

Revision history for this message

il71 (il71) wrote on 2017-08-21:

>Old code, I don't want to revisit it

Yeah, the code is probably dump, but time diff sbcl vs "competitors" still holds (:)

Revision history for this message

il71 (il71) wrote on 2017-08-21:

s/dump/dumb

Whatever.

Revision history for this message

Jan Moringen (scymtym) wrote on 2017-08-21:

This is caused by hash collisions because the keys are long-ish lists and EQUAL-HASH (via SXHASH) only traverses CONS trees up to depth 4. CCL seems to cut off at depth 6.

Changing SBCL's +MAX-HASH-DEPTHOID+ to 10 makes the computation finish in ~ 4 seconds on my machine.

Revision history for this message

il71 (il71) wrote on 2017-08-21:

Thanks for explanation, Jan!

Do you guys have some perf suite, so that you can change the defaults (there are probably many) and measure?

Revision history for this message

Douglas Katzman (dougk) wrote on 2017-08-22:

+MAX-HASH-DEPTHOID+ isn't a "default". Bad things would happen if it were changed in a running system.

As a practical matter, hash keys that are lists that are the same up to 10 items are probably bad hash keys. Nevertheless you could rebuild from source with a changed value.

But you can do better than that. For example - and I'm not suggesting that you use system-internal functions, I'm just showing what is possible, given this:
(defun myhashfn (x)
  (sb-int:named-let h ((x x))
   (typecase x
    (fixnum x)
    (list
     (if x
        (sb-int:mix (the fixnum (h (car x)))
                    (the fixnum (h (cdr x))))
        0)))))

(let ((cache (make-hash-table :test test :hash-function 'myhashfn))) ; do this
and then: (time(ff 18))
Evaluation took:
3.414 seconds of real time

Two other things I'll mention:
- There are somewhere between dozens and hundreds of build-time parameters most of which should not have to be touched; I don't think the right place to start touching them is observing that some lists hash badly under the default SXHASH, and question what else you can tweak.
- I kind of agree that in general 4 as the cutoff is probably a value that hasn't been adjusted in 20 years, and makes the wrong trade-off in time to compute the hash versus time to compare keys.

But as you can see, implementing an abstract memoize function isn't as simple as just using a builtin thing unspecific to your problem.

Changed in sbcl:
status:	New → Invalid

Revision history for this message

il71 (il71) wrote on 2017-08-22:

I see, thanks Douglas!

This is helpfull. Of course, there is no ultimate hash function for all cases.

Report a bug

This report contains Public information

Everyone can see this information.

You are

Subscribing...

Edit bug mail

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.