libatlas3gf-sse2 zgemv function gives wrong result

Reported by FP on 2009-07-29
This bug affects 2 people
Affects          Status         Importance   Assigned to         Milestone
atlas (Debian)   Fix Released   Unknown
atlas (Ubuntu)   Fix Released   High         Morten Kjeldgaard

Bug Description

Binary package hint: libatlas3gf-sse2

The BLAS complex matrix-vector product (zgemv) is completely broken when using the sse2 build, even for very small matrices.

The consequences for lapack, python numpy.linalg functions, etc. are catastrophic.

Here is a very simple Fortran program, t.f, showing the bug:

      program sse2_bug
      complex*16 cmone
      complex*16 a(2,2)
      complex*16 x(2)
      complex*16 y(2)
      cmone = (1.,0.)
      a(1,1) = (1.,0.)
      a(2,1) = (0.,0.)
      a(1,2) = (0.,0.)
      a(2,2) = (1.,0.)
      x(1) = (0.,3.)
      x(2) = (1.,0.)
      y(1) = (0.,0.)
      y(2) = (1.,0.)
      call zgemv('n',2,1,cmone,a,2,x,1,cmone,y,1)
      write(*,*) y(1)," == 3j"
      end

Compilation :

  gfortran -o t t.f -lblas

Buggy lib:

  ldd ./t | grep libblas.so
 libblas.so.3gf => /usr/lib/sse2/atlas/libblas.so.3gf (0xb7a8a000)
  ./t
   ( 0.0000000000000000 ,-1.25759005142687227E-038) == 3j

Non-sse2 libs give correct result:

  (export LD_LIBRARY_PATH=/usr/lib/sse/atlas ; ./t)
   ( 0.0000000000000000 , 3.0000000000000000 ) == 3j
  (export LD_LIBRARY_PATH=/usr/lib/atlas ; ./t)
   ( 0.0000000000000000 , 3.0000000000000000 ) == 3j
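
For reference, the expected value can be checked without linking against any BLAS library. The following is a minimal pure-Python sketch of what ZGEMV computes in the 'N' case, y := alpha*A*x + beta*y, with A stored column-major as in Fortran. The function name zgemv_n_reference and the flat-list layout are illustrative assumptions, not part of ATLAS or numpy.

```python
def zgemv_n_reference(m, n, alpha, a, lda, x, beta, y):
    """Return alpha*A*x + beta*y for an m-by-n complex matrix A.

    `a` is a flat list in Fortran column-major order with leading
    dimension `lda`, so element A(i, j) (0-based) is a[i + j * lda].
    Unit strides (incx = incy = 1) are assumed.
    """
    result = []
    for i in range(m):
        acc = 0j
        for j in range(n):
            acc += a[i + j * lda] * x[j]
        result.append(alpha * acc + beta * y[i])
    return result


# Same data as t.f: A is the 2x2 identity, x = (3j, 1), y = (0, 1).
a = [1 + 0j, 0j, 0j, 1 + 0j]
x = [3j, 1 + 0j]
y = [0j, 1 + 0j]

# Same call shape as the report: m=2, n=1, alpha=beta=1, lda=2.
y_out = zgemv_n_reference(2, 1, 1 + 0j, a, 2, x, 1 + 0j, y)
print(y_out[0])
```

With m=2 and n=1, only the first column of A and x(1) enter the product, so y(1) should be exactly 3j and y(2) should keep its initial value of 1. This matches the output of the non-sse2 libraries.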

Description: Ubuntu 9.04
Release: 9.04

libatlas3gf-sse2:
  Installed: 3.6.0-22ubuntu2
  Candidate: 3.6.0-22ubuntu2
  Version table:
 *** 3.6.0-22ubuntu2 0
        500 http://ftp.free.org jaunty/universe Packages
        100 /var/lib/dpkg/status

FP (fabrice-pardo) wrote :

A simpler program, same output:

      program sse2_bug
      complex*16 cmone
      complex*16 a(2)
      complex*16 x
      complex*16 y
      cmone = (1.,0.)
      a(1) = (1.,0.)
      a(2) = (0.,0.)
      x = (0.,3.)
      y = (0.,0.)
      call zgemv('n',1,1,cmone,a,1,x,1,cmone,y,1)
      write(*,*) y," == 3j"
      end

FP (fabrice-pardo) wrote :

1) Same bug on these 3 processors:

model name : Intel(R) Core(TM)2 CPU T5500 @ 1.66GHz
model name : Intel(R) Pentium(R) D CPU 3.00GHz
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz

2) The wrong near zero result is random.
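
Since the wrong value is reported as random, an automated check cannot match a fixed bad output; it has to compare the printed value against the expected 3j within a tolerance. A minimal Python sketch, assuming the gfortran list-directed output format shown in this report. The helper names parse_fortran_complex and is_correct, and the tolerance, are illustrative choices:

```python
import re

# Matches gfortran list-directed complex output such as
# "( 0.0000000000000000 ,-1.25759005142687227E-038)".
_COMPLEX_RE = re.compile(r"\(\s*([^,\s]+)\s*,\s*([^)\s]+)\s*\)")


def parse_fortran_complex(line):
    """Extract the first "(re, im)" pair from a line as a Python complex."""
    match = _COMPLEX_RE.search(line)
    if match is None:
        raise ValueError("no complex value found in: %r" % line)
    return complex(float(match.group(1)), float(match.group(2)))


def is_correct(line, expected=3j, tol=1e-12):
    """True if the printed value is within `tol` of the expected result."""
    return abs(parse_fortran_complex(line) - expected) <= tol


# Output lines copied from the report (non-sse2 and sse2 libraries).
good = "   ( 0.0000000000000000 , 3.0000000000000000 ) == 3j"
bad = "   ( 0.0000000000000000 ,-1.25759005142687227E-038) == 3j"
print(is_correct(good), is_correct(bad))
```

The same check could be applied to the output of ./t under each LD_LIBRARY_PATH setting to test every library variant on a given machine.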

FP (fabrice-pardo) wrote :

Binaries from intrepid and debian/lenny are OK:
  libatlas3gf-sse2_3.6.0-22_i386
  libatlas3gf-sse2_3.6.0-22ubuntu1_i386

Binaries from jaunty and debian/squeeze are WRONG:
  libatlas3gf-sse2_3.6.0-24_i386
  libatlas3gf-sse2_3.6.0-22ubuntu2_i386

Fernando Perez (fdo.perez) wrote :

This is indeed a really serious problem for numerical use of Ubuntu.

Note: I suspect this bug is a dupe of the older:

https://bugs.launchpad.net/ubuntu/+source/atlas/+bug/363510

Morten Kjeldgaard (mok0) on 2009-10-06
Changed in atlas (Ubuntu):
assignee: nobody → Morten Kjeldgaard (mok0)

Scott Howard (showard314) wrote :

Hey - found this upstream bug related to this:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=517826

Changed in atlas (Debian):
status: Unknown → New

Dimitrios Symeonidis (azimout) wrote :

Ubuntu includes a very old version of libatlas (3.6.0 from Dec 2003, as opposed to 3.9.16 from Oct 2009)

Has anyone tried the latest upstream version?

Changed in atlas (Ubuntu):
status: New → Confirmed
importance: Undecided → High

Dimitrios Symeonidis (azimout) wrote :

confirming, importance high

Pauli Virtanen (pauli-virtanen) wrote :

Are these fixed in David C's PPA Atlas packages? IIRC, he managed to fix some SSE2-specific errors:

https://launchpad.net/~david-ar/+archive/ppa/+packages

Please remove this package from Ubuntu. It is buggy: it produces wrong numerical results.
Wrong results without a warning are the worst thing you can get when you use ATLAS (old and supposedly robust).
http://mail.scipy.org/pipermail/numpy-discussion/2010-January/047766.html

You can remove this package because you also provide a non-SSE version which is OK (maybe slower, but bug free).

Scott Howard (showard314) wrote :

Ubuntu gets this package from Debian, and from comment #3 it appears Debian has the same bug. Has someone discussed this with the Debian maintainers [1]? We have already merged 3.6.0-24 into Ubuntu, and want to avoid this bug propagating any further.

Binaries from intrepid and debian/lenny are OK:
  libatlas3gf-sse2_3.6.0-22_i386
  libatlas3gf-sse2_3.6.0-22ubuntu1_i386

Binaries from jaunty and debian/squeeze are WRONG:
  libatlas3gf-sse2_3.6.0-24_i386
  libatlas3gf-sse2_3.6.0-22ubuntu2_i386

[1] <email address hidden>

Sylvestre Ledru (sylvestre) wrote :

FYI, this bug is fixed with Debian Experimental ATLAS packages.

Changed in atlas (Debian):
status: New → Fix Released

Seems to be finally fixed in the 3.8.3-22ubuntu2 packages in Maverick.

Getting a fix also in the Lucid LTS would be nice, however.

Changed in atlas (Ubuntu):
status: Confirmed → Fix Released