Multiple versions of the Fortran standard in the source code
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
libGridXC | Fix Committed | High | Nick Papior | | |
Bug Description
In connection with the heterogeneity of Fortran file extensions, LibGridXC 0.7.3 contains legacy Fortran 77 code. The following files use a severely outdated fixed-form file format:
cellsubs.f
ggaxc.F
ldaxc.F
m_fft_gpfa.F
m_io.f
minvec.f
radfft.f
sorting.f
This has a negative impact on the library at several levels:
- performance: legacy fixed-form Fortran 77 code is not optimized as much as modern free-form code by most compilers, the difference possibly spanning one or two orders of magnitude in terms of vectorization;
- reliability: Fortran 77 being underspecified and ill-designed for modern computer architectures, most compilers disable consistency checks that would otherwise cause many compile-time errors;
- usability: modern Fortran relies on modules and strictly defined calling interfaces, which is not possible with legacy Fortran 77;
- portability: object code produced by modern compilers from legacy Fortran 77 is subject to random variations;
- debuggability: legacy Fortran 77 cannot be tested, instrumented and profiled as completely and as automatically as modern Fortran;
- attractiveness: young researchers openly hate Fortran, and legacy Fortran 77 even more so, which has clearly become a severe obstacle to attracting contributions and contributors.
Fortran 77 has been considered technical debt in a steadily growing number of projects and communities for around 10 years. It is important to address this issue and remedy the associated code smells before software entropy becomes unmanageable.
The current Fortran standard version fully implemented in all compilers is now Fortran 2003 (apart from a couple of exotic language features that should not be used anyway). It greatly improves over Fortran 77 and allows a significant code refactoring, in addition to providing extended interoperability with other programming languages. Most electronic structure codes have been switching to Fortran 2003 since 2015. The Fortran interfaces of the Electronic Structure Library – in particular those of LibXC – are also guaranteed to be Fortran 2003-compliant.
It is strongly recommended that all source code of LibGridXC be made compliant with Fortran 2003 and that all legacy Fortran 77 code be fully refactored. Most of this task can be performed automatically with a few sed commands.
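As an illustration of what such an automated pass involves, here is a minimal sketch of a layout-only conversion from fixed form to free form. It is written in Python rather than sed for readability. The function name `fixed_to_free` and the sample routine are hypothetical and are not taken from LibGridXC. The sketch assumes space-indented sources with no tab characters, no trailing `!` comments on continued statements, and no text beyond column 72.

```python
def fixed_to_free(lines):
    """Convert fixed-form Fortran lines to free-form layout (layout only)."""
    out = []
    for raw in lines:
        line = raw.rstrip("\n")
        # In fixed form, 'C', 'c', '*' or '!' in column 1 marks a comment line.
        if line[:1] in ("C", "c", "*", "!"):
            out.append("!" + line[1:])
            continue
        # Blank lines and preprocessor directives pass through unchanged.
        if not line.strip() or line.startswith("#"):
            out.append(line)
            continue
        # A character other than blank or '0' in column 6 marks a continuation.
        if len(line) > 5 and line[5] not in (" ", "0") and not line[:5].strip():
            # Free form puts '&' at the end of the statement being continued,
            # so walk back to the most recent statement line and append it.
            for i in range(len(out) - 1, -1, -1):
                stripped = out[i].lstrip()
                if stripped and not stripped.startswith(("!", "#")):
                    out[i] = out[i] + " &"
                    break
            # Blank out the continuation marker itself.
            out.append(line[:5] + " " + line[6:])
            continue
        out.append(line)
    return out


# Hypothetical fixed-form input, for illustration only.
sample = [
    "C     Compute a sum\n",
    "      subroutine add(a, b, c)\n",
    "      real a, b, c\n",
    "      c = a\n",
    "     &    + b\n",
    "      end\n",
]
converted = fixed_to_free(sample)
print("\n".join(converted))
```

A pass like this only changes the source layout. It does not add modules, explicit interfaces, or `implicit none`, so any such refactoring would still have to be done separately.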
Related branches
- Yann Pouillon (community): Approve
Diff: 8917 lines (+4289/-4316)
14 files modified
docs/deprecated_routines/test_bessph.F90 (+26/-0)
src/bessph.F90 (+83/-0)
src/bessph.f (+0/-66)
src/cellsubs.F90 (+61/-55)
src/chkgmx.F90 (+101/-113)
src/ggaxc.F90 (+2642/-2701)
src/ldaxc.F90 (+706/-709)
src/m_io.F90 (+181/-182)
src/minvec.F90 (+131/-132)
src/precision.F90 (+25/-25)
src/radfft.F90 (+118/-121)
src/sorting.F90 (+186/-180)
src/sys.F90 (+28/-28)
version.info (+1/-4)
Changed in libgridxc: | |
importance: | Undecided → High |
assignee: | nobody → Yann Pouillon (pouillon) |
milestone: | none → 0.9 |
Changed in libgridxc: | |
status: | New → Confirmed |
Changed in libgridxc: | |
assignee: | Yann Pouillon (pouillon) → Nick Papior (nickpapior) |
Changed in libgridxc: | |
status: | Confirmed → Fix Committed |
These are just comments and questions to try to frame and improve my understanding of the issue.
I broadly agree with you on the need to update the code.
I agree that f2003 should be the target standard for new code. Procedure pointers, allocatable strings, interoperability with C, allocatable array extensions, etc., are quite useful and let you think more freely about algorithms and data structures.
While I agree that the fixed-form file format is prone to errors, note that what you call "f77" is a subset of f2003, and hence valid (and, I would say, fully supported) Fortran. Only deprecated and truly dangerous features such as arithmetic if, equivalence, etc., should be strictly avoided.
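To make the distinction between format and features concrete, a rough way to locate the constructs named above is to scan the fixed-form sources for them. The sketch below is a heuristic only, assuming fixed-form input where column 1 marks comment lines. The function name `find_legacy_constructs` and the sample lines are hypothetical. The regular expressions will miss statements split across continuation lines and can be confused by `!` inside string literals.

```python
import re

# Arithmetic IF: "if (expr) label1, label2, label3", optionally preceded by a label.
ARITHMETIC_IF = re.compile(
    r"^\s*(\d+\s+)?if\s*\(.*\)\s*\d+\s*,\s*\d+\s*,\s*\d+\s*$", re.IGNORECASE
)
# EQUIVALENCE statement: "equivalence (a, b)".
EQUIVALENCE = re.compile(r"^\s*equivalence\s*\(", re.IGNORECASE)


def find_legacy_constructs(lines):
    """Return (line_number, construct_name) pairs for flagged statements."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Skip fixed-form comment lines (marker in column 1).
        if line[:1] in ("C", "c", "*", "!"):
            continue
        # Drop any trailing '!' comment before matching (heuristic).
        code = line.split("!", 1)[0].rstrip()
        if ARITHMETIC_IF.match(code):
            hits.append((number, "arithmetic if"))
        elif EQUIVALENCE.match(code):
            hits.append((number, "equivalence"))
    return hits


# Hypothetical fixed-form input, for illustration only.
sample = [
    "C     legacy example",
    "      equivalence (a, b)",
    "      if (x - y) 10, 20, 30",
    "      if (x .gt. y) z = 1",
    "   10 continue",
]
hits = find_legacy_constructs(sample)
print(hits)
```

Files that produce no hits may still need modules or interface blocks, but they would not require changes for these two constructs.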
How do you plan to convert from 'f77' to 'f2003'? If you are just going to use 'sed' scripts, it seems that you are just changing the format. Note that a lot of code (in Siesta, for example) is fixed-form >f90. I guess you also plan to put things in modules, or create interface blocks.
What you say about the performance penalty for fixed-form (and f77) is completely new to me. Unless some compiler vendor has gone crazy and let a huge legacy of f77 code underperform, I cannot really see what the reason would be for the performance penalty. If anything, it used to be that "new" f90 compilers would have trouble optimizing code because of the new pointer facility, absent in f77.
Do you have references about this?