Steel Bank Common Lisp

suboptimal x86-64 single-float-bits

Reported by Nikodemus Siivola on 2010-04-04
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Low
Unassigned

Bug Description

As noted by Lutz Euler on sbcl-devel:

"SINGLE-FLOAT-BITS may be problematic as in it reads 64 bits from the
single-stack which AMD advises against if only 32 bits have been written
to the same address beforehand (this is a "narrow-to-wide store-to-load
forwarding restriction"). I have not yet evaluated how often this case
occurs and whether it poses a performance problem.

The code SINGLE-FLOAT-BITS generates could be improved anyway.
For example, when the source is on the single-stack it generates

 MOV RDX, [RBP-8]
 SHL RDX, 32
 SAR RDX, 32

which is a funny way to sign-extend a value and where

 MOVSXD RDX, DWORD PTR [RBP-8]

would be better -- a change that happens to shorten the read from the
single-stack from 64 to 32 bits, too, thus addressing the issue at hand."

Paul Khuong (pvk) wrote :

I think this addresses most of my complaints about single-float-bits on x86-64.

(define-vop (single-float-bits)
  (:args (float :scs (single-reg descriptor-reg)
                :load-if (not (sc-is float single-stack))))
  (:results (bits :scs (signed-reg any-reg)))
  (:arg-types single-float)
  (:result-types signed-num)
  (:translate single-float-bits)
  (:policy :fast-safe)
  (:vop-var vop)
  (:generator 4
    (sc-case float
      (single-reg
       (inst movd bits float)
       (inst movsxd bits (reg-in-size bits :dword))
       (when (sc-is bits any-reg)
         (inst shl bits n-fixnum-tag-bits)))
      (single-stack
       (inst movsxd bits (make-ea :dword ; c.f. ea-for-sf-stack
                                  :base rbp-tn
                                  :disp (frame-byte-offset (tn-offset float))))
       (when (sc-is bits any-reg)
         (inst shl bits n-fixnum-tag-bits)))
      (descriptor-reg
       (move bits float)
       (cond ((sc-is bits any-reg)
              (inst sar bits (- 32 n-fixnum-tag-bits))
              (inst and bits (lognot fixnum-tag-mask)))
             (t
              (inst sar bits 32)))))))

Nathan Froyd (froydnj) wrote :

I'm not enthused about the ANY-REG testing; let representation selection insert the necessary move VOPs instead of handling fixnumifying here. But the rest of the VOP looks fine.

Paul Khuong (pvk) wrote :

SAR followed by SHL is a pet peeve of mine.

Paul Khuong (pvk) wrote :

commit 9f5c04e93072a77a73af8fbf8a96d8a127db3d83
Author: Paul Khuong <email address hidden>
Date: Sat Jun 11 00:09:11 2011 -0400

   Improve SINGLE-FLOAT-BITS on x86-64

    Avoid narrow-store-to-wide-load hazards, and generally emit sane
   MOVSXD for sign extension.

    Fixes lp#555201.

Changed in sbcl:
status: Confirmed → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers