[J3] Performance Portability and Fortran: Making Fortran cool again

Bill Long longb at cray.com
Wed Jan 16 12:24:56 EST 2019


Hi Ondrej,

This sort of insight is very valuable. Thanks for posting it. 

There seems to be a lot of focus on using GPUs. (Maybe that’s why they asked Gary, who works for NVIDIA, to participate?)

I would point out that a DO CONCURRENT construct has semantics that are quite compatible with execution on a GPU. Typically, DO CONCURRENT constructs are threaded, using the same underlying infrastructure as OpenMP. I’ve mentioned adding GPU support to our compiler developers, but the chicken-and-egg problem is that “no customer is asking for this”. If customers, especially ones as large and visible as LANL, ask, you might get it. If the standard needs tweaks to better enable GPU execution of DO CONCURRENT, that is something we should look into.
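To make the point about DO CONCURRENT concrete, here is a minimal sketch of the kind of loop in question. The module and subroutine names (axpy_demo, axpy_concurrent) are made up for illustration, and the routine assumes x and y have the same size. Because each iteration reads and writes only element i, the iterations have no ordering dependence, which is what lets a compiler map them to threads or, in principle, to a GPU.

```fortran
! Illustrative sketch only: axpy_demo and axpy_concurrent are
! hypothetical names, not taken from any existing code base.
module axpy_demo
  implicit none
contains
  ! Computes y = a*x + y element by element.
  ! Assumes size(x) == size(y); this is not checked here.
  subroutine axpy_concurrent(a, x, y)
    real, intent(in)    :: a
    real, intent(in)    :: x(:)
    real, intent(inout) :: y(:)
    integer :: i

    ! Each iteration touches only element i, so the iterations may
    ! execute in any order, which is what DO CONCURRENT asserts.
    do concurrent (i = 1:size(x))
      y(i) = a * x(i) + y(i)
    end do
  end subroutine axpy_concurrent
end module axpy_demo
```

A caller would `use axpy_demo` and then `call axpy_concurrent(2.0, x, y)`. Whether the loop actually runs serially, threaded, or offloaded is left to the compiler; nothing in the source ties it to a particular device.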

> On Jan 15, 2019, at 11:54 PM, Ondřej Čertík via J3 <j3 at mailman.j3-fortran.org> wrote:
> 
> 
> Probably not what you want to hear, but many people at my Lab are moving away from Fortran to C++/Kokkos because, I am afraid, Fortran currently doesn't have a clear path forward. That's one reason I decided to be active here.
> 
> Kokkos allows the same code/loop to run in parallel on a CPU and a GPU, and lets you switch the array memory layout accordingly.
> 
> The closest that Fortran has is OpenACC/OpenMP, which allows the same loop to run on both CPU and GPU, but it doesn't seem to have a mechanism to switch the memory layout like Kokkos does.
> 
> Another problem is that only a few Fortran compilers support OpenACC currently, while Kokkos runs on most major C++ compilers.

I think OpenACC is being replaced by the GPU features in the newer OpenMP specs. The relevant metric here is OpenMP support. Users should not be using OpenACC for new code.

> 
> So I would say that currently there is no equivalent of Kokkos in Fortran, so we can't do performance portability in the Kokkos sense.
> 
> However, if you give up on having the same code base run on both CPU and GPU, then Fortran has CUDA Fortran, which I think very naturally extends Fortran with a few keywords and constructs to run on a GPU. I think the resulting code is simpler than CUDA C, and I would argue simpler than Kokkos. It only runs on a GPU, which some people don't like. But I also know people who think that's the solution: structure the code so that only a minor part has to be targeted at a GPU specifically, using CUDA Fortran.
> 
> So you can mention CUDA Fortran in your slide.

CUDA tends not to be portable. Will it work with the GPUs from AMD or Intel?

Cheers,
Bill


> 
> Ondrej

Bill Long                                                                       longb at cray.com
Principal Engineer, Fortran Technical Support &   voice:  651-605-9024
Bioinformatics Software Development                      fax:  651-605-9143
Cray Inc./ 2131 Lindau Lane/  Suite 1000/  Bloomington, MN  55425



