Comment 3 for bug 309443

Gustavo (gugamilare) wrote :

I question the validity of this bug. The code in the description is not a good detector of hash collisions because it uses only small fixnums. For instance, the following code gives no collisions at all.

(let ((hits (make-hash-table)))
  (dotimes (i 2500)
    (dotimes (j 2500)
      (let* ((ij (cons (random most-positive-fixnum)
                       (random most-positive-fixnum)))
             (newlist (push ij (gethash (sxhash ij) hits))))
        (when (cdr newlist)
          (format t "~&collision: ~S~%" newlist))))))

What causes the amount of collisions in the example is that SBCL's implementation of SXHASH for fixnums has the obvious drawback that the lower bits of the fixnums only affect the lower bits of the result and MIX doesn't do a good job in changing that.

Of course, if you want to make a better case for small numbers (which are probably more used in the code), then what needs to be done is to bring the lower bits higher in the hash. For instance, the following solves the "problem" completely:

(433494437 is a prime number, I got it from Wikipedia)

--- a/src/code/target-sxhash.lisp
+++ b/src/code/target-sxhash.lisp
@@ -67,7 +67,7 @@
   ;; (These are vaguely like the ideas used in many cryptographic
   ;; algorithms, but we're not pushing them hard enough here for them
   ;; to be cryptographically strong.)
- (let* ((xy (+ (* x 3) y)))
+ (let* ((xy (+ (logand most-positive-fixnum (* x 433494437)) y)))
     (logand most-positive-fixnum
             (logxor 441516657
                     xy