(j3.2006) Lessons to be learned from HPF

Bill Long longb
Tue Jul 24 13:33:27 EDT 2007

Keith Bierman wrote:
>> 7. Single level of data parallelism too restrictive?
>>    [I'm not sure what this means but I assume co-arrays will suffer
>> the same problem.  If it means a flat image hierarchy, then co-arrays
>> do have that problem.  We are already seeing machines at LANL with
>> deeper memory and processor hierarchies (RoadRunner).]
> I suspect that what this means is something like mixing MPI and OpenMP 
> .... little as some (myself included) like it, MPI has become the de 
> facto way of doing work on "clusters" of commodity processors with 
> (hopefully) fast links (e.g. Myrinet or InfiniBand).

As well as on systems with multiple non-commodity processors and 
(hopefully) links faster than Myrinet and InfiniBand.

> Now that commodity processors are on a path to being nontrivial SMP 
> machines in their own right (let's say the doubling of cores continued 
> at the Moore's Law growth rate, assisted by packaging tricks like 
> Intel's two dies per chip ;>) using a model well suited to the shared 
> memory, etc. of the SMP enables good performance per "node" and 
> managing large ensembles of nodes is more tractable if one doesn't 
> confuse the on-chip shared resources with the off-chip ones.

Such parallelism can go another layer deeper, with multiple hardware 
threads for each core.

> I can hear the screaming from here, it's more complicated. It's too 
> low level, etc.
> But I'm pretty sure it's what a class of consumer will strongly desire.

Yes, the market does seem to have adopted a model along the lines of 
OpenMP within the node and MPI between nodes, where a 'node' is 
increasingly a single multi-core chip.  It is a very easy transition to 
map node -> image, use co-arrays to effect references between images, 
and run OpenMP within an image.  Fortran 2008 also proposes DO 
CONCURRENT, which can easily map to parallelism within an image, as well 
as vectorization of array operations within an image.  Compilers can 
also automatically detect local parallelism similar to what OpenMP 
provides.  Alternatively, you could map an image to a core for a 
single-chip system. 
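
A minimal sketch of the node -> image mapping described above, assuming 
a compiler that supports the Fortran 2008 co-array and DO CONCURRENT 
features.  The program name, the problem size n, and the variable names 
are hypothetical, chosen only for illustration.  Each image does its own 
work with DO CONCURRENT (an OpenMP loop could stand in its place), and 
image 1 then reads the per-image results through co-array references.

```fortran
program hybrid_sketch
  implicit none
  integer, parameter :: n = 1000   ! hypothetical per-image problem size
  real :: a(n), b(n)               ! ordinary arrays, private to each image
  real :: partial[*]               ! co-array: one copy exists on every image
  real :: total
  integer :: i, img

  ! Give each image different local data.
  b = real(this_image())

  ! Local parallelism within an image: the iterations are independent,
  ! so the compiler is free to thread or vectorize them.
  do concurrent (i = 1:n)
    a(i) = 2.0 * b(i)
  end do

  ! Whole-array operation within the image.
  partial = sum(a)

  ! Make sure every image has stored its partial result.
  sync all

  ! Cross-image references: image 1 reads partial from each image.
  if (this_image() == 1) then
    total = 0.0
    do img = 1, num_images()
      total = total + partial[img]
    end do
    print *, 'total over', num_images(), 'images =', total
  end if
end program hybrid_sketch
```

How images are placed (one per node, or one per core on a single-chip 
system) is left to the launch environment in this sketch and is not 
expressed in the source.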

The combination of images and local parallelism enabled by Fortran 2008 
constructs, with the possible addition of OpenMP, provides a powerful 
set of tools to take advantage of the capabilities of the systems that 
will be relevant in the next few years.   This sort of adaptation is why 
Fortran has survived for so long.


Bill Long                                   longb at cray.com
Fortran Technical Support    &              voice: 651-605-9024
Bioinformatics Software Development         fax:   651-605-9142
Cray Inc., 1340 Mendota Heights Rd., Mendota Heights, MN, 55120

