(j3.2006) Integration of co-arrays with the intrinsic shift functions

Bill Long longb
Mon Jul 23 15:09:47 EDT 2007



Aleksandar Donev wrote:
>
>> How can you say that data parallel is not the place of Fortran?  It  
>> is already there!  Consider FORALL, WHERE, CSHIFT, EOSHIFT (perhaps  
>> even MATMUL), array notation, ...
>>     
> Yes, and it is still there on each image (which might have lots of internal 
> fine-grained parallelism). Data parallelism is great when it is easy to 
> express and when compilers can figure it out. The reason we are adding 
> *explicit* parallelism, with explicit data distribution and flow control, is 
> so as to allow much more than that.  Also, IMHO, most of the above constructs 
> and intrinsics, other than array notation, are:
> 1) Not used heavily in real kernels due to differences in optimization among 
> compilers
> 2) Source of frustration for the vendors trying to optimize benchmarks (matmul 
> being a famous monster), resulting in a waste of resources
> 3) Badly designed to do anything other than vanilla data-parallelism (this is 
> why we added DO CONCURRENT even though we have FORALL).
> 4) Used most often as shortcuts for simple and short loops (this is a good 
> thing, but not a justification for a major change to co-arrays at this date).
>
>
>   
To expand on what Aleks is illuminating above, the basic problem with 
the simple data parallel programming model is that most problems are not 
easily adapted to this structure.  That's the fundamental reason why the 
CM-5 failed and one of the two major reasons HPF failed.  (The other was 
that no one could ever figure out what happened to a distributed array 
at a call site.)   People were just not willing to morph their problem 
into one of processing global arrays.  Even in cases (large rectangular 
data grids) where you might think global arrays are the clear answer, 
the programming is still simpler with the current co-array model.  
Craig's trivial example hides how ugly the reliance on co_shift would be 
for a REAL problem.  As an example, I've attached the kernel for the 
Himeno benchmark modified for co-arrays. (The conversion was 
straightforward once I figured out what all the original MPI calls 
did.)  Unlike the data parallel model, it uses shadow cells, a common 
technique proven to improve performance. I would contend that rewriting 
this using co_shift would result in something less readable, harder to 
maintain, and most likely worse performing.  None of those outcomes is 
near the top of my goals list.

Cheers,
Bill




-- 
Bill Long                                   longb at cray.com
Fortran Technical Support    &              voice: 651-605-9024
Bioinformatics Software Development         fax:   651-605-9142
Cray Inc., 1340 Mendota Heights Rd., Mendota Heights, MN, 55120

            

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: jacobi.f90
Url: http://j3-fortran.org/pipermail/j3/attachments/20070723/06e7fddb/attachment.pl 


