[J3] HPC rant against the standard committees

Anton Shterenlikht mexas at bristol.ac.uk
Fri Apr 20 05:36:01 EDT 2018


I got a copy of the just-published (2018) book
"Programming for Hybrid Multi/Manycore MPP Systems"
by John Levesque and Aaron Vose (both Cray),
CRC Press, ISBN 978-1-4398-7371-7.

The conflict between performance and productivity
has been addressed often enough in similar publications,
but the language of this book is unusually strong.

I wanted to share some particularly relevant quotes with this list.

2.7 Productivity and performance portability

"The movement to C++ over the past 10 to 15 years
has created applications that achieve a lower
and lower percentage of peak performance on today's
supercomputers. Recent extensions to both C++ and Fortran
to address productivity have significantly contributed
to this movement."

and later...

"Much of the blame of this productivity movement has
to be placed on the language standards committees that
introduces [sic] semantics that make the application
developer more productive without thinking about compilation
issues of the efficiencies of executing the compiled
code on the target system. Considering the recent additions,
both Fortran and C++, it seems that the committee
really does not care about how difficult it might be
to optimize the language extensions and that their
principal goal is to make programmers more productive.
When users see an interesting new feature in the language,
they assume that the feature will be efficient; after all,
why would the language committee put the feature in the
language if it wouldn't run efficiently on the target systems?"

Seems Cray should really send somebody to the Fortran committee...

Then under

4.8 Fortran 2003 and inefficiencies

"With the development of Fortran 90, 95, 2003 and now 2008,
new semantics have been introduced into the language that
are difficult or even impossible to compile efficiently.
As a result, programmers are frequently disappointed
with the poor performance when these features are used.

"Following are a few of the newer Fortran features that
will cause most compilers to generate inefficient code
and should be avoided:

1. Array syntax.
2. Calling standard Fortran functions not linked to optimized libraries.
3. Passing array sections.
4. Using modules for local data.
5. Derived types: struct of arrays versus array of structs."

Ok, maybe for 2-5, but array syntax? Really?
Also, it is a bit weird in a 2018 book to call Fortran 90
array syntax a "newer Fortran feature".
But the authors insist:

4.8.1 Array Syntax

"Array syntax was first designed for the legacy memory-to-memory
vector processors like the Star 100 from Control Data Corporation.
The intent of the syntax was to give the compiler a form they
could convert directly into a memory-to-memory vector operation.
When these vector machines retired, array syntax was kept alive
by Thinking Machine's Connection Machine. Here the compiler
could generate SIMD parallel machine instruction that would
be executed by all the processors in a lock-step parallel fashion.

"Unfortunately, array syntax is still with us and while many
programmers feel it is easier to use than the standard old
Fortran DO loop, the real issue is that most Fortran compilers
cannot generate good cache efficient code from a series
of array assignment statements."
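
For illustration only (this sketch is not from the book, and the
program and variable names are made up), the kind of code the quote
seems to have in mind is a series of array assignments that each sweep
over memory, compared with a single hand-fused DO loop. Whether the
array-syntax version is actually slower depends on the compiler: some
compilers fuse such statements themselves, so this shows the pattern,
not a measured result.

```fortran
program array_syntax_vs_loop
  implicit none
  integer, parameter :: n = 1000000   ! assumed size, larger than typical caches
  real, allocatable :: a(:), b(:), c(:), d(:), c2(:), d2(:)
  integer :: i

  allocate(a(n), b(n), c(n), d(n), c2(n), d2(n))
  call random_number(a)
  call random_number(b)

  ! Array syntax: two separate array assignment statements.
  ! If the compiler does not fuse them, each one is a full pass
  ! over memory, so a and the freshly written c are loaded twice.
  c = a + b
  d = a * c

  ! Hand-fused DO loop: a single pass, in which a(i) and c2(i)
  ! can be reused while they are still in registers or cache.
  do i = 1, n
     c2(i) = a(i) + b(i)
     d2(i) = a(i) * c2(i)
  end do

  ! The two forms compute the same values; compare with a small
  ! tolerance rather than relying on bit-for-bit equality.
  if (maxval(abs(c - c2)) <= 1.0e-5 .and. &
      maxval(abs(d - d2)) <= 1.0e-5) then
     print *, 'OK'
  else
     print *, 'MISMATCH'
  end if
end program array_syntax_vs_loop
```

Timing the two halves separately (e.g. with system_clock) at different
optimisation levels would show whether a given compiler fuses the
array statements or not.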

The bulk of the book is dedicated to loop optimisations
to enable vectorisation, efficient cache use, etc.
There is a lot of good advice.

The main emphasis of the book is on Fortran,
with very few C/C++ examples.
All Fortran examples use Fortran 77 syntax, and indeed,
the authors never use array syntax, though to be fair,
most examples are from production codes.

Anton


