Bug in LOGAND (AMD64)

Bug #1026634 reported by Stas Boukarev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Fix Released
High
Unassigned

Bug Description

By Eric Marsden from sbcl-devel@

More bugs from random-integer testing.

This is SBCL 1.0.57.59-97527a4-dirty, an implementation of ANSI Common Lisp.
* (defun bug (x)
    (declare (optimize (space 3))
             (type (integer 12417236377505266230 12417274239874990070) x))
    (logand 8459622733968096971 x))
* (bug 12417237222845306758)
11836165733894624898 ;; <- incorrect, should be 2612793697039849090

Stas Boukarev (stassats)
Changed in sbcl:
importance: Undecided → High
status: New → Triaged
Revision history for this message
David Lichteblau (david-lichteblau) wrote :

bisect result (have not investigated further, but certainly seems plausible):

52b1041d3a14eaa4e45f6d8edfbdc0dec4292239 is the first bad commit
commit 52b1041d3a14eaa4e45f6d8edfbdc0dec4292239
Author: Christophe Rhodes <email address hidden>
Date: Thu Apr 5 19:55:05 2012 +0100

    Fix bug in unsigned modular arithmetic using a signed implementation

    If we aim to be clever by implementing an unsigned modular arithmetic
    computation using signed arithmetic, we need to make sure that we
    don't accidentally contaminate the computation with any extraneous
    high bits. This means that we must be sure to cut constants to the
    appropriate width, as well as computations, so do so; this fixes
    bug #974406 from Paul Dietz. (In addition the change from cutting
    to the requested width to the implementation width fixes #903821,
    so Go Team!)

    Test cases. Minimally horrible test case for #903821; far worse
    suggestions were made on #sbcl IRC...

:100644 100644 509ca7c5043b162d723f6a688119439e774bd065 328934f43ae6875bab69a443ce7567722a07114f M NEWS
:040000 040000 b57dbe6c0b63fc1dd48a5840188001d35f048e0f a0fbdf585fe5f186ae594a8730c34b9ebadf25da M src
:040000 040000 c9d018018bdd0753fe721f2575e5dbec4f0de6e0 a196a935833031531cb583eb9297499ec65f7b8f M tests
bisect run success

Revision history for this message
Christophe Rhodes (csr21-cantab) wrote : Re: [Bug 1026634] Re: Bug in LOGAND (AMD64)

David Lichteblau <email address hidden> writes:

> bisect result (have not investigated further, but certainly seems
> plausible):

Yes, completely.

The problem is that my oh-so-clever bug fix for the previous bugs
involving using signed modular arithmetic to implement unsigned
arithmetic is incomplete. The transformation that needs to be done is
something like

   (logand <e1> <e2>)
=> (logand (mask-signed-field 63 <e1>) (mask-signed-field 63 <e2>))

with precondition that <e1> and <e2> are known positive quantities, and
that the logand itself is known to return a positive fixnum.

Before the patch identified, this transformation was done for <e1>, <e2>
being known modular-type expressions (i.e. involving mostly bitwise
logical and arithmetic operators). After the patch identified, this
transformation was also done for constants. However, simple variables
such as x don't have this applied, which causes the bug. An example of
something that is now correctly optimized which wasn't previously is
  (defun not-bug (x)
    (declare (optimize (space 3))
             (type (integer 12417236377505266230 12417274239874990070)
             x))
    (logand 8459622733968096971 (logior 1 x)))

Now, it's fairly easy to wrap x in the correct (mask-signed-field ...)
in the compiler, using (I think) filter-lvar. What isn't easy is to
tell the compiler not to recurse infinitely in modular arithmetic
optimization, optimizing away infinitely; I think this or something like
it is the "right" solution, however. Options that might fix the problem
but would be a long way from the right solution are:

  1. disable using signed modular arithmetic to implement unsigned
     computations
     (a) and call it a day;
     (b) and also implement positive-fixnum modular arithmetic variants;
  2. having some kind of non-modular-optimizeable
     %mask-signed-field-but-dont-recurse to mark some generated forms as
     being "finished with": though judging by my comments regarding
     previous work in src/compiler/srctran.lisp around the area, maybe
     that might be right after all if all the details are sorted.

I'm not going to get to any of these options for at least three weeks.

Christophe

Revision history for this message
Eric Marsden (eric-marsden) wrote :

Hi,

May I suggest disabling this optimization temporarily? It would seem preferable to silently returning incorrect results, which is still happening.

Eric

Revision history for this message
Christophe Rhodes (csr21-cantab) wrote :

Eric Marsden <email address hidden> writes:

> May I suggest disabling this optimization temporarily? It would seem
> preferable to silently returning incorrect results, which is still
> happening.

I think you're probably right :-/ I just worry about the "temporarily"
:-(

C.

Revision history for this message
Paul Khuong (pvk) wrote :

I think the fix below works.

1. mask-signed-field that casts an unsigned word into a fixnum is ir2-converted into a word-to-fixnum move
2. in cut-to-width, only cut a reference to a variable to width if the destination isn't an operation that cuts to width itself. This way, we avoid inserting m-s-f between m-s-f and a lambda var, or logand between logand and a lambda-var.

At first, I thought 2. might be subsumed by flattening chains of logand/m-s-f, but that doesn't seem likely (not that flattening wouldn't be a nice improvement).

Paul Khuong (pvk)
Changed in sbcl:
status: Triaged → Fix Committed
Changed in sbcl:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.