libatlas3gf-sse2 zgemv function gives wrong result

Bug #406520 reported by FP
This bug affects 2 people
Affects          Status         Importance   Assigned to         Milestone
atlas (Debian)   Fix Released   Unknown
atlas (Ubuntu)   Fix Released   High         Morten Kjeldgaard

Bug Description

Binary package hint: libatlas3gf-sse2

The BLAS complex matrix-vector product (zgemv) is completely broken in the SSE2 build, even for very small matrices.

The consequences for LAPACK, Python numpy.linalg functions, etc. are catastrophic.

Here is a very simple Fortran program, t.f, showing the bug:

      program sse2_bug
      complex*16 cmone
      complex*16 a(2,2)
      complex*16 x(2)
      complex*16 y(2)
      cmone = (1.,0.)
      a(1,1) = (1.,0.)
      a(2,1) = (0.,0.)
      a(1,2) = (0.,0.)
      a(2,2) = (1.,0.)
      x(1) = (0.,3.)
      x(2) = (1.,0.)
      y(1) = (0.,0.)
      y(2) = (1.,0.)
      call zgemv('n',2,1,cmone,a,2,x,1,cmone,y,1)
      write(*,*) y(1)," == 3j"
      end
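
For reference, here is a minimal sketch of what the zgemv call above should compute, written in plain Python so it does not depend on any BLAS library. The function name zgemv_n and its argument list are illustrative only (not a real API); it assumes trans='n' and unit strides, and reads the matrix in Fortran column-major order.

```python
def zgemv_n(m, n, alpha, a, lda, x, beta, y):
    """Reference y := alpha*A*x + beta*y for trans='n', unit strides.

    `a` is a flat list in Fortran column-major order, so A(i, j) with
    zero-based i, j is stored at a[i + j*lda].
    """
    result = []
    for i in range(m):
        acc = sum(a[i + j * lda] * x[j] for j in range(n))
        result.append(alpha * acc + beta * y[i])
    return result


# Same data as the Fortran program: a 2x2 identity stored column by column.
a = [1 + 0j, 0j, 0j, 1 + 0j]
x = [3j, 1 + 0j]
y = [0j, 1 + 0j]

# m=2, n=1: only the first column of A and x(1) take part in the product.
expected = zgemv_n(2, 1, 1 + 0j, a, 2, x, 1 + 0j, y)
print(expected)
```

With m=2 and n=1 the first element is 1*3j + 1*0 = 3j, which is the value the Fortran program prints next to "== 3j".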

Compilation :

  gfortran -o t t.f -lblas

Buggy lib:

  ldd ./t | grep libblas.so
 libblas.so.3gf => /usr/lib/sse2/atlas/libblas.so.3gf (0xb7a8a000)
  ./t
   ( 0.0000000000000000 ,-1.25759005142687227E-038) == 3j

Non-sse2 libs give correct result:

  (export LD_LIBRARY_PATH=/usr/lib/sse/atlas ; ./t)
   ( 0.0000000000000000 , 3.0000000000000000 ) == 3j
  (export LD_LIBRARY_PATH=/usr/lib/atlas ; ./t)
   ( 0.0000000000000000 , 3.0000000000000000 ) == 3j
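
The manual comparison above could be automated along these lines. This is only a sketch: the helper names (parse_gfortran_complex, is_correct, check_all) are made up for illustration, the library directories are the three quoted in this report, and the parser assumes the "( re , im )" layout shown in the gfortran output above.

```python
import os
import re
import subprocess

# Directories taken from this report; "./t" is the compiled test program.
LIB_DIRS = ["/usr/lib/sse2/atlas", "/usr/lib/sse/atlas", "/usr/lib/atlas"]

# Matches "( re , im )" with arbitrary spacing around the numbers.
_PAIR = re.compile(r"\(\s*([^\s,]+)\s*,\s*([^\s)]+)\s*\)")


def parse_gfortran_complex(line):
    """Turn '( re , im ) ...' as printed by write(*,*) into a complex."""
    match = _PAIR.search(line)
    if match is None:
        raise ValueError("no complex value found in: %r" % line)
    return complex(float(match.group(1)), float(match.group(2)))


def is_correct(line, expected=3j, tol=1e-12):
    """True if the printed value is within tol of the expected result."""
    return abs(parse_gfortran_complex(line) - expected) <= tol


def check_all(program="./t"):
    """Run the program once per library directory (hypothetical helper)."""
    results = {}
    for lib_dir in LIB_DIRS:
        env = dict(os.environ, LD_LIBRARY_PATH=lib_dir)
        out = subprocess.run([program], env=env, capture_output=True,
                             text=True, check=True).stdout
        results[lib_dir] = is_correct(out)
    return results
```

Calling check_all() on an affected machine should flag only the sse2 directory as incorrect, matching the output pasted above.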

Description: Ubuntu 9.04
Release: 9.04

libatlas3gf-sse2:
  Installed: 3.6.0-22ubuntu2
  Candidate: 3.6.0-22ubuntu2
  Version table:
 *** 3.6.0-22ubuntu2 0
        500 http://ftp.free.org jaunty/universe Packages
        100 /var/lib/dpkg/status

FP (fabrice-pardo) wrote :

A simpler program, same output:

      program sse2_bug
      complex*16 cmone
      complex*16 a(2)
      complex*16 x
      complex*16 y
      cmone = (1.,0.)
      a(1) = (1.,0.)
      a(2) = (0.,0.)
      x = (0.,3.)
      y = (0.,0.)
      call zgemv('n',1,1,cmone,a,1,x,1,cmone,y,1)
      write(*,*) y," == 3j"
      end

FP (fabrice-pardo) wrote :

1) Same bug on these 3 processors:

model name : Intel(R) Core(TM)2 CPU T5500 @ 1.66GHz
model name : Intel(R) Pentium(R) D CPU 3.00GHz
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz

2) The wrong, near-zero result is random.

FP (fabrice-pardo) wrote :

Binaries from intrepid and debian/lenny are OK:
  libatlas3gf-sse2_3.6.0-22_i386
  libatlas3gf-sse2_3.6.0-22ubuntu1_i386

Binaries from jaunty and debian/squeeze are WRONG:
  libatlas3gf-sse2_3.6.0-24_i386
  libatlas3gf-sse2_3.6.0-22ubuntu2_i386

Fernando Perez (fdo.perez) wrote :

This is indeed a really serious problem for numerical use of Ubuntu.

Note: I suspect this bug is a dupe of the older:

https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510

Morten Kjeldgaard (mok0)
Changed in atlas (Ubuntu):
assignee: nobody → Morten Kjeldgaard (mok0)
Scott Howard (showard314) wrote :

Hey - found this upstream bug related to this:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=517826

Changed in atlas (Debian):
status: Unknown → New
Dimitrios Symeonidis (azimout) wrote :

Ubuntu includes a very old version of libatlas (3.6.0 from Dec 2003, as opposed to 3.9.16 from Oct 2009)

Has anyone tried the latest upstream version?

Changed in atlas (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Dimitrios Symeonidis (azimout) wrote :

confirming, importance high

Pauli Virtanen (pauli-virtanen) wrote :

Are these fixed in David C's PPA Atlas packages? IIRC, he managed to fix some SSE2-specific errors:

https://launchpad.net/~david-ar/+archive/ppa/+packages

Xavier Gnata (xavier-gnata-gmail) wrote :

Please remove this package from Ubuntu. It is buggy: it produces wrong numerical results.
Wrong results without any warning are the worst thing you can get when you use ATLAS (old and supposedly robust).
http://mail.scipy.org/pipermail/numpy-discussion/2010-January/047766.html

You can remove this package because you also provide a non-SSE version, which is OK (maybe slower, but bug-free).

Scott Howard (showard314) wrote :

Since Ubuntu gets this package from Debian, and comment #3 suggests that Debian has the same bug, has anyone discussed this with the Debian maintainers [1]? We have already merged 3.6.0-24 into Ubuntu, and want to keep this bug from propagating any further.

Binaries from intrepid and debian/lenny are OK:
  libatlas3gf-sse2_3.6.0-22_i386
  libatlas3gf-sse2_3.6.0-22ubuntu1_i386

Binaries from jaunty and debian/squeeze are WRONG:
  libatlas3gf-sse2_3.6.0-24_i386
  libatlas3gf-sse2_3.6.0-22ubuntu2_i386

[1] <email address hidden>

Sylvestre Ledru (sylvestre) wrote :

FYI, this bug is fixed with Debian Experimental ATLAS packages.

Changed in atlas (Debian):
status: New → Fix Released
Pauli Virtanen (pauli-virtanen) wrote :

Seems to be finally fixed in the 3.8.3-22ubuntu2 packages in Maverick.

Pauli Virtanen (pauli-virtanen) wrote :

Getting a fix also in the Lucid LTS would be nice, however.

Changed in atlas (Ubuntu):
status: Confirmed → Fix Released