Cholesky factorization error with ifort 15.0.0

Bug #1790481 reported by Andrés Aguado
Affects: Siesta
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Dear SIESTA team,

I have found a problem with the trunk-723 version of SIESTA. I compiled it on a cluster with version 15.0.0 of the ifort compiler. On this machine the code compiles without problems, but it always gives the following error at run time:

rdiag: Error in Cholesky factorisation
Stopping Program from Node: 0

just before starting the SCF cycle. I have tried with many different input files and the problem occurs for all of them.

As I was surprised at first, I tried compiling the serial version with the gfortran.make and intel.make files provided with the SIESTA distribution. With the gfortran option the program runs without error, but with intel.make the error is there! (So there seems to be no relation to parallel libraries, etc.) In fact, I understand (correct me if I'm wrong) that when using intel.make as arch.make I am not using any libraries other than those provided with SIESTA (libsiestaLAPACK and libsiestaBLAS).

I am even more surprised now because I guess that these arch.make example files are well tested, but there seems to be a problem with the 15.0.0 version of ifort.

I'm attaching an example calculation (a Zn cluster, but it does not matter, the problem is general) with the pseudopotential and the output file, so that you can see the versions of SIESTA, the compiler, etc., and the error. I can add that exactly the same code with the same input files runs perfectly on an older cluster with the 11.0 compiler. So please, any help?

Thanks a lot in advance.

Andres Aguado

Nick Papior (nickpapior) wrote :

Could you try adding -fp-model source to your flags?

Andrés Aguado (aaguado) wrote : Re: [Bug 1790481] Re: Cholesky factorization error with ifort 15.0.0
  • salida (23.6 KiB, TEXT/PLAIN; charset=US-ASCII; name=salida)

I took the intel.make and added "-fp-model source" to FFLAGS. The
serial code compiled, but the error is still there (see attached output
file; the input and other files are the same as before).

Thanks for your help,

Andrés


Nick Papior (nickpapior) wrote :

From the output you have attached, it does not seem like you have added the flag.

Could you please run make clean and then make before proceeding?

See FLAGS at the top of the output: the flag is not present.
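
A quick way to confirm that a rebuilt binary actually picked up the flag is to scan the top of the output file for it. The following is only a rough sketch in Python: the helper name `flag_in_header`, the 40-line window and the sample header text are assumptions made for illustration, and the real layout of the SIESTA output header may differ.

```python
def flag_in_header(output_text, flag, header_lines=40):
    """Return True if `flag` appears on a 'flags' line near the top of the output."""
    wanted = flag.split()
    for line in output_text.splitlines()[:header_lines]:
        if "flags" not in line.lower():
            continue
        tokens = line.split()
        # Look for the flag's tokens as a contiguous run on this line.
        for i in range(len(tokens) - len(wanted) + 1):
            if tokens[i:i + len(wanted)] == wanted:
                return True
    return False


# Hypothetical header lines; the real SIESTA output may be laid out differently.
stale = "Compiler flags: ifort -O2\nrest of output"
fresh = "Compiler flags: ifort -O2 -fp-model source\nrest of output"
print(flag_in_header(stale, "-fp-model source"))
print(flag_in_header(fresh, "-fp-model source"))
```

If the check comes back negative after editing arch.make, the binary is most likely a stale build, which is what the make clean step is meant to rule out.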

Andrés Aguado (aaguado) wrote :
  • salida (23.7 KiB, TEXT/PLAIN; charset=US-ASCII; name=salida)

Oops, sorry, my fault.

I attached the wrong file. I'm now attaching the correct one. The problem
persists.


Nick Papior (nickpapior) wrote :

Tested Intel versions:
intel/2015.0.090 - fails (your compiler version)
intel/2015.1.133 - fails
intel/2015.3.187 - works
intel/2016.1.0.423501 - works

Conclusion: upgrade the Intel compiler. I suspect that all compilers later than the latest one tested will work.

PS. I guess the reason is that your system has some very similar orbitals, in which case the Cholesky factorisation is a bit more fragile. That is, I have run tests on other systems with all Intel compilers (back to 2013) and they all work. But, again, this depends heavily on the basis sets.
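
To illustrate why nearly identical orbitals make the factorisation fragile, here is a minimal sketch in Python. It is a textbook Cholesky on a 2x2 overlap matrix, not SIESTA's rdiag or the LAPACK routine, and the names `cholesky` and `overlap` as well as the numerical values are assumptions chosen for the example.

```python
import math


def cholesky(a):
    """Return the lower-triangular factor L with a = L * L^T, or raise ValueError."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError(f"matrix not positive definite at column {j}")
        low[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (a[i][j] - partial) / low[j][j]
    return low


def overlap(s, noise=0.0):
    """2x2 overlap matrix of two normalized orbitals with mutual overlap s."""
    return [[1.0, s], [s, 1.0 + noise]]


s = 1.0 - 1e-12                      # two almost identical orbitals
print(cholesky(overlap(0.5))[1][1])  # well separated orbitals: comfortable pivot
print(cholesky(overlap(s))[1][1])    # still factorizes, but the last entry is tiny
try:
    cholesky(overlap(s, noise=-3e-12))   # a perturbation of 3e-12 on the diagonal
except ValueError as err:
    print("failed:", err)
```

With overlap s, the last pivot is 1 - s^2, so for s close to 1 a very small change in the matrix elements is enough to push it below zero. That would be consistent with the same input working with one compiler and failing with another, although the thread does not establish that this is the actual mechanism.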

Nick Papior (nickpapior)
Changed in siesta:
status: New → Confirmed
Nick Papior (nickpapior)
Changed in siesta:
status: Confirmed → Fix Released
Andrés Aguado (aaguado) wrote :
  • Au.fdf (2.1 KiB, TEXT/PLAIN; charset=US-ASCII; name=Au.fdf)
  • salida (27.4 KiB, TEXT/PLAIN; charset=US-ASCII; name=salida)

Dear Nick,

Thanks for your message. Of course I was enthusiastic about it and
tried it immediately. Unfortunately, the error is still there (see attached
files).

Thanks for your help,

Andres

On Fri, 21 Sep 2018, Nick Papior wrote:

> Dear Andres,
>
> The bug is that Intel MKL (later versions) does not obey the documented
> work-array size.
>
> However, Siesta has an option that enables you to circumvent this:
>
> Simply input:
>
> Diag.Memory 2.
>
> In your fdf file and it will work.
> Note, smaller values (but higher than 1) may work as well.
>
> I will consider this as solved.
>
> ** Changed in: siesta
> Status: Confirmed => Fix Released

Nick Papior (nickpapior) wrote :

My mistake. Sorry!

I had written in the wrong bug report. I'll hide my comment to avoid confusion!

Changed in siesta:
status: Fix Released → Confirmed
Nick Papior (nickpapior) wrote :

Dear Andres,
Since this can *only* be reproduced with certain Intel MKL versions, I regard this as a bug in those MKL versions. It can't be reproduced with other libraries and/or with debugging options, which further points to a bug in MKL.

As such I will mark this as an "Invalid" bug.

Thanks for letting us know of the problem!

Changed in siesta:
status: Confirmed → Invalid