parallel assignment optimization
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
SBCL |
Confirmed
|
Wishlist
|
Unassigned |
Bug Description
SBCL is missing some optimizations when parallel assignments involve swapping or rotating values.
IMO, the following three functions should compile the same (as in test2); the explicit temporary shouldn't be needed.
(defun test1 (x y)
(declare (type fixnum x y))
(setf (values x y)
(values y x))
(list x y))
(defun test2 (x y)
(declare (type fixnum x y))
(let ((tmp y))
(setf x y
y tmp))
(list x y))
(defun test3 (x y)
(declare (type fixnum x y))
(psetf x y ;; or psetq
y x)
(list x y))
(prog ()
(declare (optimize (speed 3)))
(compile 'test1)
(compile 'test2)
(compile 'test3))
;; disassembly truncated to only show the region of interest
(disassemble #'test1)
; disassembly for TEST1
; 04077487: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
; 48A: 498BF0 MOV RSI, R8
; 48D: 4C8BC1 MOV R8, RCX
; 490: 488BD6 MOV RDX, RSI
; 493: 498BF8 MOV RDI, R8
(disassemble #'test2)
; disassembly for TEST2
; 04077357: 4C8BC6 MOV R8, RSI ; no-arg-parsing entry point
; 5A: 498BD0 MOV RDX, R8
; 5D: 488BFE MOV RDI, RSI
(disassemble #'test3)
; disassembly for TEST3
; 04179BE7: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
; BEA: 498BF0 MOV RSI, R8
; BED: 4C8BC1 MOV R8, RCX
; BF0: 488BD6 MOV RDX, RSI
; BF3: 498BF8 MOV RDI, R8
Changed in sbcl: | |
status: | New → Confirmed |
importance: | Undecided → Low |
tags: | added: code-generation compiler-ir2 |
Changed in sbcl: | |
importance: | Low → Wishlist |
If the (let ((tmp y)) ...) in test2 is changed to (let ((tmp x)) ...)
to more closely parallel test1 and test3 then the compiler generates
identical code for all three functions.
That said, this does appear non-optimal, and I can think of two
possible improvements. The first is that a three-MOV sequence for
swapping two registers by way of a temporary should be quite possible.
The second is that all five MOVs could be eliminated by swapping the
variable names at compile time instead of their contents at runtime.
Unfortunately, while I can see what effect would help, I haven't the
slightest idea how to make the compiler do that.
On Sun, Nov 8, 2009 at 1:03 AM, Daniel Herring /bugs.launchpad .net/bugs/ 478073
<email address hidden> wrote:
> Public bug reported:
>
> SBCL is missing some optimizations when parallel assignments involve
> swapping or rotating values.
>
> IMO, the following three functions should compile the same (as in
> test2); the explicit temporary shouldn't be needed.
>
> (defun test1 (x y)
> (declare (type fixnum x y))
> (setf (values x y)
> (values y x))
> (list x y))
> (defun test2 (x y)
> (declare (type fixnum x y))
> (let ((tmp y))
> (setf x y
> y tmp))
> (list x y))
> (defun test3 (x y)
> (declare (type fixnum x y))
> (psetf x y ;; or psetq
> y x)
> (list x y))
>
> (prog ()
> (declare (optimize (speed 3)))
> (compile 'test1)
> (compile 'test2)
> (compile 'test3))
>
> ;; disassembly truncated to only show the region of interest
> (disassemble #'test1)
> ; disassembly for TEST1
> ; 04077487: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
> ; 48A: 498BF0 MOV RSI, R8
> ; 48D: 4C8BC1 MOV R8, RCX
> ; 490: 488BD6 MOV RDX, RSI
> ; 493: 498BF8 MOV RDI, R8
> (disassemble #'test2)
> ; disassembly for TEST2
> ; 04077357: 4C8BC6 MOV R8, RSI ; no-arg-parsing entry point
> ; 5A: 498BD0 MOV RDX, R8
> ; 5D: 488BFE MOV RDI, RSI
> (disassemble #'test3)
> ; disassembly for TEST3
> ; 04179BE7: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
> ; BEA: 498BF0 MOV RSI, R8
> ; BED: 4C8BC1 MOV R8, RCX
> ; BF0: 488BD6 MOV RDX, RSI
> ; BF3: 498BF8 MOV RDI, R8
>
> ** Affects: sbcl
> Importance: Undecided
> Status: New
>
> --
> parallel assignment optimization
> https:/
> You received this bug notification because you are a member of SBCL
> hackers, which is subscribed to SBCL.
>
> Status in Steel Bank Common Lisp: New
>
> Bug description:
> SBCL is missing some optimizations when parallel assignments involve swapping or rotating values.
>
> IMO, the following three functions should compile the same (as in test2); the explicit temporary shouldn't be needed.
>
> (defun test1 (x y)
> (declare (type fixnum x y))
> (setf (values x y)
> (values y x))
> (list x y))
> (defun test2 (x y)
> (declare (type fixnum x y))
> (let ((tmp y))
> (setf x y
> y t...