optimizers for LDB result in pessimal code
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
SBCL | New | Undecided | Unassigned | |
Bug Description
(defun foo ()
  (let ((x (random (ash 1 128))))
    (loop repeat 1000
          sum (loop for i below 128 by 8
                    sum (ldb (byte 8 i) x)))))
(defun bar ()
  (declare (notinline sb-kernel:%ldb)) ; add this line
  (let ((x (random (ash 1 128))))
    (loop repeat 1000
          sum (loop for i below 128 by 8
                    sum (ldb (byte 8 i) x)))))
BAR does not cons and, in the timings below, takes about 40% fewer processor cycles than FOO (roughly 1.65x faster).
* (time(foo))
Evaluation took:
0.001 seconds of real time
0.001063 seconds of total run time (0.001063 user, 0.000000 system)
100.00% CPU
1,280,780 processor cycles
360,432 bytes consed
* (time(bar))
Evaluation took:
0.001 seconds of real time
0.000000 seconds of total run time (0.000000 user, 0.000000 system)
0.00% CPU
774,976 processor cycles
0 bytes consed
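This happens because the optimizer turns the inline LDB into ASH and related operations, which operate on bignums here: X is a 128-bit integer, so the intermediate results can themselves be bignums, which accounts for the consing. The full call to %LDB, by contrast, has a special case that produces a fixnum when possible. Declaring LDB itself notinline is not a workaround: that leaves you in worse shape, because you then cons a BYTE spec.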
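
To make the mechanism concrete, here is a minimal portable sketch of the shift-and-mask form that an inline LDB amounts to. The names LDB-VIA-ASH and SUM-BYTES are hypothetical and not part of SBCL, and the code the actual transform emits may well differ; the sketch only illustrates that the shifted intermediate can be a bignum even though the final 8-bit result is a small fixnum.

```lisp
;; Hypothetical helper, not SBCL's actual transform: computes the same
;; value as (ldb (byte size posn) integer) by shifting and masking.
(defun ldb-via-ash (size posn integer)
  (logand (ash integer (- posn))   ; intermediate; a bignum for wide INTEGER
          (1- (ash 1 size))))      ; mask with the SIZE low bits set

;; Hypothetical helper mirroring the inner loop of FOO and BAR.
(defun sum-bytes (x)
  "Sum the sixteen 8-bit fields of a 128-bit integer X."
  (loop for i below 128 by 8
        sum (ldb-via-ash 8 i x)))

(let ((x (ash 1 127)))
  ;; The shift-and-mask form agrees with LDB.
  (assert (= (ldb-via-ash 8 120 x) (ldb (byte 8 120) x)))
  ;; The intermediate shift result is far outside the fixnum range ...
  (assert (not (typep (ash x -8) 'fixnum)))
  ;; ... while the extracted byte is a fixnum.
  (assert (typep (ldb-via-ash 8 120 x) 'fixnum)))
```

Comparing the allocation of a loop written this way against one that calls %LDB out of line, for example with TIME as above, would be one way to check whether the intermediate shift is where the consing comes from.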