optimization: to avoid &rest allocation for :TEST and :KEY parameters

Bug #310005 reported by Tobias C. Rittweiler
4
Affects Status Importance Assigned to Milestone
SBCL
Fix Released
Low
Unassigned

Bug Description

Passing functions that take a &REST list as :TEST or :KEY parameters
results in consing, even though we know that a &rest list does not
have to be allocated for such cases.

The compiler could transform something like

  (find #\Space seq :test #'char=)

into

  (find #\Space seq :test #'(lambda (x y)
                              (declare (inline char=))
         (char= x y)))

for all functions that are either /known/ or /inlineable/, and that
take a &REST list.

The same applies to predicate arguments for FOO-IF, FOO-IF-NOT, and
also SORT, and STABLE-SORT.

Test case:

  (declaim (optimize speed))

  (defun make-randomized-vector (n)
    (declare (fixnum n))
    (let ((vector (make-array n)))
      (loop for i from 0 below n do (setf (aref vector i) (random n)))
      vector))

  (defun sort-vector-1 (vector)
    (sort vector #'<))

  (defun sort-vector-2 (vector)
    (sort vector #'(lambda (x y) (< x y))))

You'll see that SORT-VECTOR-2 is twice as fast as SORT-VECTOR-1:

  CL-USER> (let ((v (make-randomized-vector 1000000)))
      (time (sort-vector-1 v))
      nil)
  Evaluation took:
    2.292 seconds of real time
    2.292143 seconds of total run time (2.288143 user, 0.004000 system)
    100.00% CPU
    4,266,494,958 processor cycles
    11,006,752 bytes consed

  NIL

  CL-USER> (let ((v (make-randomized-vector 1000000)))
      (time (sort-vector-2 v))
      nil)
  Evaluation took:
    1.056 seconds of real time
    1.048065 seconds of total run time (1.048065 user, 0.000000 system)
    99.24% CPU
    1,966,816,029 processor cycles
    228,656 bytes consed

  NIL

description: updated
Revision history for this message
Nikodemus Siivola (nikodemus) wrote :

I cannot replicate the consing on 1.0.23.51 on x86/darwin, but the performance effect is similar (~70% speedup.)

An alternative is to transform to global two-arg versions for such functions:

(defun two-arg-< (x y)
  (< x y))

(defun sort-vector-3 (vector)
  (sort vector #'two-arg-<))

Performs just as well as the one with a lambda, but is takes less space in large applications.

Finally, if we do take the inline lambda transform, whatever is known about the sequence argument can be used to drive optimization of :TEST and :KEY. In

 (sort (the (simple-array single-float (*)) vector) :test #'<)

#'< can be replaced with (lambda (x y) (declare (single-float x y)) (< x y)) -- which is likely to be a significant win over the generic version.

Changed in sbcl:
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Tobias C. Rittweiler (tcr) wrote :

I wouldn't have though that the consing could possibly be platform-dependent,
but the test output in my original report came from SBCL 1.0.23.21 on Linux x86-32.

  -T.

Revision history for this message
Nikodemus Siivola (nikodemus) wrote :

Stepping down to LOW importance as local workarounds are possible.

Changed in sbcl:
importance: Medium → Low
Revision history for this message
Paul Khuong (pvk) wrote :

Not so recent optimisations in &rest lists have made this quicker, fwiw. I now see no additional consing and a ~30% slowdown.

Revision history for this message
Stas Boukarev (stassats) wrote :

&rest no longer conses and things like char= or > are transformed to their two-arg variants now.

Changed in sbcl:
status: Confirmed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.