parallel assignment optimization

Bug #478073 reported by Daniel Herring
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Confirmed
Wishlist
Unassigned

Bug Description

SBCL is missing some optimizations when parallel assignments involve swapping or rotating values.

IMO, the following three functions should compile the same (as in test2); the explicit temporary shouldn't be needed.

(defun test1 (x y)
   (declare (type fixnum x y))
   (setf (values x y)
         (values y x))
   (list x y))
(defun test2 (x y)
   (declare (type fixnum x y))
   (let ((tmp y))
     (setf x y
          y tmp))
   (list x y))
(defun test3 (x y)
  (declare (type fixnum x y))
  (psetf x y ;; or psetq
         y x)
  (list x y))

(prog ()
   (declare (optimize (speed 3)))
   (compile 'test1)
   (compile 'test2)
   (compile 'test3))

;; disassembly truncated to only show the region of interest
(disassemble #'test1)
; disassembly for TEST1
; 04077487: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
; 48A: 498BF0 MOV RSI, R8
; 48D: 4C8BC1 MOV R8, RCX
; 490: 488BD6 MOV RDX, RSI
; 493: 498BF8 MOV RDI, R8
(disassemble #'test2)
; disassembly for TEST2
; 04077357: 4C8BC6 MOV R8, RSI ; no-arg-parsing entry point
; 5A: 498BD0 MOV RDX, R8
; 5D: 488BFE MOV RDI, RSI
(disassemble #'test3)
; disassembly for TEST3
; 04179BE7: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
; BEA: 498BF0 MOV RSI, R8
; BED: 4C8BC1 MOV R8, RCX
; BF0: 488BD6 MOV RDX, RSI
; BF3: 498BF8 MOV RDI, R8

Revision history for this message
Alastair Bridgewater (alastair-bridgewater) wrote : Re: [Bug 478073] [NEW] parallel assignment optimization
Download full text (4.3 KiB)

If the (let ((tmp y)) ...) in test2 is changed to (let ((tmp x)) ...)
to more closely parallel test1 and test3 then the compiler generates
identical code for all three functions.

That said, this does appear non-optimal, and I can think of two
possible improvements. The first is that a three-MOV sequence for
swapping two registers by way of a temporary should be quite possible.
 The second is that all five MOVs could be eliminated by swapping the
variable names at compile time instead of their contents at runtime.
Unfortunately, while I can see what effect would help, I haven't the
slightest idea how to make the compiler do that.

On Sun, Nov 8, 2009 at 1:03 AM, Daniel Herring
<email address hidden> wrote:
> Public bug reported:
>
> SBCL is missing some optimizations when parallel assignments involve
> swapping or rotating values.
>
> IMO, the following three functions should compile the same (as in
> test2); the explicit temporary shouldn't be needed.
>
> (defun test1 (x y)
> (declare (type fixnum x y))
> (setf (values x y)
> (values y x))
> (list x y))
> (defun test2 (x y)
> (declare (type fixnum x y))
> (let ((tmp y))
> (setf x y
> y tmp))
> (list x y))
> (defun test3 (x y)
> (declare (type fixnum x y))
> (psetf x y ;; or psetq
> y x)
> (list x y))
>
> (prog ()
> (declare (optimize (speed 3)))
> (compile 'test1)
> (compile 'test2)
> (compile 'test3))
>
> ;; disassembly truncated to only show the region of interest
> (disassemble #'test1)
> ; disassembly for TEST1
> ; 04077487: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
> ; 48A: 498BF0 MOV RSI, R8
> ; 48D: 4C8BC1 MOV R8, RCX
> ; 490: 488BD6 MOV RDX, RSI
> ; 493: 498BF8 MOV RDI, R8
> (disassemble #'test2)
> ; disassembly for TEST2
> ; 04077357: 4C8BC6 MOV R8, RSI ; no-arg-parsing entry point
> ; 5A: 498BD0 MOV RDX, R8
> ; 5D: 488BFE MOV RDI, RSI
> (disassemble #'test3)
> ; disassembly for TEST3
> ; 04179BE7: 488BCE MOV RCX, RSI ; no-arg-parsing entry point
> ; BEA: 498BF0 MOV RSI, R8
> ; BED: 4C8BC1 MOV R8, RCX
> ; BF0: 488BD6 MOV RDX, RSI
> ; BF3: 498BF8 MOV RDI, R8
>
> ** Affects: sbcl
> Importance: Undecided
> Status: New
>
> --
> parallel assignment optimization
> https://bugs.launchpad.net/bugs/478073
> You received this bug notification because you are a member of SBCL
> hackers, which is subscribed to SBCL.
>
> Status in Steel Bank Common Lisp: New
>
> Bug description:
> SBCL is missing some optimizations when parallel assignments involve swapping or rotating values.
>
> IMO, the following three functions should compile the same (as in test2); the explicit temporary shouldn't be needed.
>
> (defun test1 (x y)
> (declare (type fixnum x y))
> (setf (values x y)
> (values y x))
> (list x y))
> (defun test2 (x y)
> (declare (type fixnum x y))
> (let ((tmp y))
> (setf x y
> y t...

Read more...

Revision history for this message
Paul Khuong (pvk) wrote :

On 2009-11-08, at 8:27 AM, Alastair Bridgewater wrote:
> That said, this does appear non-optimal, and I can think of two
> possible improvements. The first is that a three-MOV sequence for
> swapping two registers by way of a temporary should be quite possible.
> The second is that all five MOVs could be eliminated by swapping the
> variable names at compile time instead of their contents at runtime.
> Unfortunately, while I can see what effect would help, I haven't the
> slightest idea how to make the compiler do that.

We'd want something like an SSAisation pass, and a good phi function
generator.

Paul Khuong

Revision history for this message
Daniel Herring (dherring) wrote : Re: [Bug 478073] parallel assignment optimization 4 of 20)

On Sun, 8 Nov 2009, Alastair Bridgewater - <email address hidden> wrote:

> If the (let ((tmp y)) ...) in test2 is changed to (let ((tmp x)) ...)
> to more closely parallel test1 and test3 then the compiler generates
> identical code for all three functions.

Ouch. Bad typo on my part. Look at the assembly again; there's a good
chance I cried wolf.

> That said, this does appear non-optimal, and I can think of two
> possible improvements. The first is that a three-MOV sequence for
> swapping two registers by way of a temporary should be quite possible.
> The second is that all five MOVs could be eliminated by swapping the
> variable names at compile time instead of their contents at runtime.
> Unfortunately, while I can see what effect would help, I haven't the
> slightest idea how to make the compiler do that.

Ok. These tests were supposed to check a variable rotation inside some
numerical loops. If the swap were done at compile time, it would also
require a loop unroll.

Thanks for looking,
Daniel

Revision history for this message
Daniel Herring (dherring) wrote : Re: [Bug 478073] parallel assignment optimization

On Sun, 8 Nov 2009, Alastair Bridgewater - <email address hidden> wrote:

> If the (let ((tmp y)) ...) in test2 is changed to (let ((tmp x)) ...)
> to more closely parallel test1 and test3 then the compiler generates
> identical code for all three functions.

Ouch. Bad typo on my part. Look at the assembly again; there's a good chance
I cried wolf.

> That said, this does appear non-optimal, and I can think of two
> possible improvements. The first is that a three-MOV sequence for
> swapping two registers by way of a temporary should be quite possible.
> The second is that all five MOVs could be eliminated by swapping the
> variable names at compile time instead of their contents at runtime.
> Unfortunately, while I can see what effect would help, I haven't the
> slightest idea how to make the compiler do that.

Ok. These tests were supposed to check a variable rotation inside some
numerical loops. If the swap were done at compile time, it would also require
a loop unroll.

Thanks for looking,
Daniel

Changed in sbcl:
status: New → Confirmed
importance: Undecided → Low
tags: added: code-generation compiler-ir2
Douglas Katzman (dougk)
Changed in sbcl:
importance: Low → Wishlist
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.