Steel Bank Common Lisp

Bug in LOGAND (AMD64)

Reported by Stas Boukarev on 2012-07-19
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
High
Unassigned

Bug Description

By Eric Marsden from sbcl-devel@

More bugs from random-integer testing.

This is SBCL 1.0.57.59-97527a4-dirty, an implementation of ANSI Common Lisp.
* (defun bug (x)
    (declare (optimize (space 3))
             (type (integer 12417236377505266230 12417274239874990070) x))
    (logand 8459622733968096971 x))
* (bug 12417237222845306758)
11836165733894624898 ;; <- incorrect, should be 2612793697039849090

Stas Boukarev (stassats) on 2012-07-19
Changed in sbcl:
importance: Undecided → High
status: New → Triaged

bisect result (have not investigated further, but certainly seems plausible):

52b1041d3a14eaa4e45f6d8edfbdc0dec4292239 is the first bad commit
commit 52b1041d3a14eaa4e45f6d8edfbdc0dec4292239
Author: Christophe Rhodes <email address hidden>
Date: Thu Apr 5 19:55:05 2012 +0100

    Fix bug in unsigned modular arithmetic using a signed implementation

    If we aim to be clever by implementing an unsigned modular arithmetic
    computation using signed arithmetic, we need to make sure that we
    don't accidentally contaminate the computation with any extraneous
    high bits. This means that we must be sure to cut constants to the
    appropriate width, as well as computations, so do so; this fixes
    bug #974406 from Paul Dietz. (In addition the change from cutting
    to the requested width to the implementation width fixes #903821,
    so Go Team!)

    Test cases. Minimally horrible test case for #903821; far worse
    suggestions were made on #sbcl IRC...

:100644 100644 509ca7c5043b162d723f6a688119439e774bd065 328934f43ae6875bab69a443ce7567722a07114f M NEWS
:040000 040000 b57dbe6c0b63fc1dd48a5840188001d35f048e0f a0fbdf585fe5f186ae594a8730c34b9ebadf25da M src
:040000 040000 c9d018018bdd0753fe721f2575e5dbec4f0de6e0 a196a935833031531cb583eb9297499ec65f7b8f M tests
bisect run success

David Lichteblau <email address hidden> writes:

> bisect result (have not investigated further, but certainly seems
> plausible):

Yes, completely.

The problem is that my oh-so-clever bug fix for the previous bugs
involving using signed modular arithmetic to implement unsigned
arithmetic is incomplete. The transformation that needs to be done is
something like

   (logand <e1> <e2>)
=> (logand (mask-signed-field 63 <e1>) (mask-signed-field 63 <e2>))

with precondition that <e1> and <e2> are known positive quantities, and
that the logand itself is known to return a positive fixnum.

Before the patch identified, this transformation was done for <e1>, <e2>
being known modular-type expressions (i.e. involving mostly bitwise
logical and arithmetic operators). After the patch identified, this
transformation was also done for constants. However, simple variables
such as x don't have this applied, which causes the bug. An example of
something that is now correctly optimized which wasn't previously is
  (defun not-bug (x)
    (declare (optimize (space 3))
             (type (integer 12417236377505266230 12417274239874990070)
             x))
    (logand 8459622733968096971 (logior 1 x)))

Now, it's fairly easy to wrap x in the correct (mask-signed-field ...)
in the compiler, using (I think) filter-lvar. What isn't easy is to
tell the compiler not to recurse infinitely in modular arithmetic
optimization, optimizing away infinitely; I think this or something like
it is the "right" solution, however. Options that might fix the problem
but would be a long way from the right solution are:

  1. disable using signed modular arithmetic to implement unsigned
     computations
     (a) and call it a day;
     (b) and also implement positive-fixnum modular arithmetic variants;
  2. having some kind of non-modular-optimizeable
     %mask-signed-field-but-dont-recurse to mark some generated forms as
     being "finished with": though judging by my comments regarding
     previous work in src/compiler/srctran.lisp around the area, maybe
     that might be right after all if all the details are sorted.

I'm not going to get to any of these options for at least three weeks.

Christophe

Eric Marsden (eric-marsden) wrote :

Hi,

May I suggest disabling this optimization temporarily? It would seem preferable to silently returning incorrect results, which is still happening.

Eric

Eric Marsden <email address hidden> writes:

> May I suggest disabling this optimization temporarily? It would seem
> preferable to silently returning incorrect results, which is still
> happening.

I think you're probably right :-/ I just worry about the "temporarily"
:-(

C.

Paul Khuong (pvk) wrote :

I think the fix below works.

1. mask-signed-field that casts an unsigned word into a fixnum is ir2-converted into a word-to-fixnum move
2. in cut-to-width, only cut a reference to a variable to width if the destination isn't an operation that cuts to width itself. This way, we avoid inserting m-s-f between m-s-f and a lambda var, or logand between logand and a lambda-var.

At first, I thought 2. might be subsumed by flattening chains of logand/m-s-f, but that doesn't seem likely (not that flattening wouldn't be a nice improvement).

Paul Khuong (pvk) on 2013-05-18
Changed in sbcl:
status: Triaged → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers