matrix multiplication with libatlas gives wrong result for big matrices
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
atlas (Ubuntu) | Incomplete | High | Morten Kjeldgaard | | |
Bug Description
Binary package hint: libatlas3gf-sse2
Complex matrix multiplication via zgemm gives a wrong result when using /usr/lib/
sample code:
--------- atlas-test.cpp -------
/* compile:
g++ -Wall -o atlas-test atlas-test.cpp -lblitz -llapack
run:
LD_LIBRARY_
or
LD_LIBRARY_
LD_PRELOAD=
with
./atlas-test
*/
#include <iostream>
#include <blitz/array.h>
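typedef std::complex<double> Complex;      // double precision, as zgemm requires
typedef blitz::Array<Complex, 2> cMatrix;  // rank-2 array of Complex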
#define F77NAME(x) x ## _
/* SUBROUTINE ZGEMM ( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB,
BETA, C, LDC ) */
extern "C" void F77NAME(zgemm)
(const char*,const char*,const int&,const int&,const int&,const Complex&,
const Complex*,const int&,const Complex*,const int&,const Complex&,
Complex*,const int&);  /* C is written by zgemm, so it must not be const */
int main() {
blitz::firstIndex ii;
blitz::secondIndex jj;
const int M = 139, N = 128;
cMatrix prod( M, N );
prod = 0.1* ii + 0.4* jj;
cMatrix C( M, M );
char A_op = 'T', B_op = 'N';
int M1 = M, N1 = N;
F77NAME(zgemm)( &A_op, &B_op, M1, M1, N1, 1.0, prod.data(), N, prod.data(), N, 0.0, C.data(), M );
std::cout << C( 0, 0 ) << " correct result = " << 0.4*0.4*
return 0;
}
----------------- end of atlas-test.cpp -------------
With M1 = 72 the program gives the correct result; with M1 > 72 it does not. I also tested the other flavors of libatlas3gf (-base, -sse): the behavior is similar, but the wrong results differ.
Ubuntu 8.04.2, package versions:
libatlas3gf-
libblitz0ldbl_
g++-4.2_
HP 6510b, cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
stepping : 6
cpu MHz : 800.000
cache size : 6144 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm ida
bogomips : 4992.42
clflush size : 64
processor : 1 - the same as processor 0
Changed in atlas (Ubuntu):
assignee: nobody → Morten Kjeldgaard (mok0)
On Karmic the results of the two experiments are the same:
PATH=/usr/lib;LD_PRELOAD=/usr/lib/liblapack.so.3gf; ./atlas-test
PATH=/usr/lib/sse2/atlas;LD_PRELOAD=/usr/lib/sse2/atlas/liblapack.so.3gf; ./atlas-test
LD_LIBRARY_
gives me
(8968.96,26706.4) correct result = 110541
just like
LD_LIBRARY_
Is this what you see, too?