(j3.2006) Integration of co-arrays with the intrinsic shift functions

Jim Xia jimxia
Mon Jul 16 15:42:49 EDT 2007


This piece of code seems odd.


subroutine update_caf(T, Tnew)
   real :: T(:)[*], Tnew(:)[*]
   integer :: i, il, ir, nmax

   nmax = co_ubound(T,1)                !<-- shouldn't nmax be ubound(T,1)?

   if (this_image() == 1) then
     il = T(nmax)[num_images()] !<-- shouldn't il = num_images()?
   else
     il = this_image() - 1
   end if

   if (this_image == num_images()) then
     ir = T(1)[1]                               !<-- ir should be 1
   else
     ir = this_image() + 1
   end if

   sync all
   Tnew(1) = (T(nmax)[il] + T(1) + T(2))/3.
   Tnew(nmax) = (T(nmax-1) + T(nmax) + T(1)[ir])/3.
   sync all

   do i = 2, nmax-1
     Tnew(i) = (T(i-1) + T(i) + T(i+1))/3.
   end do

end subroutine update ! 22 lines


Although this code seems to support Craig's point, I still agree with Bill 
that the language should be neutral with regard to the memory model 
underneath the co-array.  It should not explicitly adopt one model over 
another.  Allowing intrinsics like co_cshift may suggest to some readers 
that the language implicitly states which memory model it supports.
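
With the three corrections marked in the comments above applied (ubound for nmax, and image indices rather than array values for il and ir), the routine might look like the sketch below.  It is only a sketch: it assumes the co-array syntax as later standardized in Fortran 2008, adds the parentheses missing from this_image in the second test, moves the interior loop ahead of the second sync all, and assumes at least two elements per image.  The module name and the driver program are illustrative, not part of Craig's example.

```fortran
module stencil_caf
  implicit none
contains
  subroutine update_caf(T, Tnew)
    real, intent(in)    :: T(:)[*]
    real, intent(inout) :: Tnew(:)[*]
    integer :: i, il, ir, nmax

    nmax = ubound(T, 1)          ! local extent, not a co-bound

    ! il and ir are image indices of the periodic left/right neighbours
    il = this_image() - 1
    if (this_image() == 1) il = num_images()
    ir = this_image() + 1
    if (this_image() == num_images()) ir = 1

    sync all                     ! T is defined everywhere before remote reads
    Tnew(1)    = (T(nmax)[il] + T(1) + T(2)) / 3.
    Tnew(nmax) = (T(nmax-1) + T(nmax) + T(1)[ir]) / 3.
    do i = 2, nmax - 1           ! interior points need only local data
      Tnew(i) = (T(i-1) + T(i) + T(i+1)) / 3.
    end do
    sync all                     ! all remote reads done before T may change
  end subroutine update_caf
end module stencil_caf

program demo_update_caf
  use stencil_caf
  implicit none
  real, allocatable :: T(:)[:], Tnew(:)[:]

  allocate(T(4)[*], Tnew(4)[*])
  T    = real(this_image())      ! every local element holds the image index
  Tnew = 0.
  call update_caf(T, Tnew)

  ! interior points average three equal local values
  if (any(abs(Tnew(2:3) - real(this_image())) > 1.e-6)) error stop 'interior mismatch'
  if (this_image() == 1) print *, Tnew
end program demo_update_caf
```

Run on a single image, il and ir both resolve to image 1, so the periodic wrap-around reads the image's own end points.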

Cheers,

Jim Xia

XL Fortran Compiler Testing
IBM Toronto Lab at 8200 Warden Ave.
Phone (905) 413-3444  Tie-line 969-3444
D2/NAH/8200 /MKM



Craig Rasmussen <crasmussen at lanl.gov> 
Sent by: j3-bounces at j3-fortran.org
07/16/2007 02:06 PM
Please respond to
fortran standards email list for J3 <j3 at j3-fortran.org>


To
fortran standards email list for J3 <j3 at j3-fortran.org>
cc

Subject
Re: (j3.2006) Integration of co-arrays with the intrinsic shift functions







On Jul 13, 2007, at 8:59 AM, Bill Long wrote:
>>
> Co-arrays basically provide two facilities:  a simple and efficient 
> way to access data on a different image, and ways to enforce 
> execution order between images.  That's about it.  There is no 
> prescription about what the data objects on different images mean 
> or are part of.  If Craig wants to partition a conceptually global 
> array across images ala a data parallel programming model, that's 
> fine. If Aleks wants to think of the co-dimensions as additional 
> planes in a higher dimension array, that's also fine.  The 
> important point is that co-arrays prescribes neither view, it just 
> provides a means to implement either.  I've written code using 
> Craig's model, and it worked quite well for that problem.  Most of 
> the time I employ a third approach.  This is the underlying power 
> of co-arrays.  Because it is fundamentally low level, it is 
> flexible enough to be used for a wide range of problems and 
> programming models.
>
> Given the intent and design of co-arrays, I think that Craig's 
> proposed intrinsics are not a good idea. (Sorry, Craig).   They are 
> really only useful in the context of a particular usage of 
> co-arrays, namely this HPF style view of data distribution.  That 
> sort of thing is a great idea for a separate library, and these 
> functions could be pretty easily written using the existing 
> co-array capabilities.  Things like this should not be enshrined in 
> the standard.
>

It WAS fine to say that one can conceptually view co-array 
distributions across images in any way one wants.  But I have 
identified at least one place in the standard that requires us to 
define precisely how one is to view a co-array distribution.  In most 
instances an agnostic view is fine, as co-arrays are "fundamentally 
low level" (as you say) and provide for programming at a "level 
close to what assembly programming was for sequential 
languages" [Diaconescu and Zima].  However, the spirit of Fortran is 
not assembly language, and to imply that the co-array spec is complete 
when it breaks existing Fortran (from the 90 standard) is just plain 
wrong in my opinion.

How can we say we have finished integrating a new type (co-arrays) 
into the standard, when it won't work properly with current features?

You mention that this sort of thing is "a great idea for a separate 
library, and these functions could be pretty easily written using the 
existing co-array capabilities."  This is true, but we must keep two 
things in mind:
      1. CSHIFT and EOSHIFT are already intrinsic functions.
      2. For performance reasons, it is critical that these functions 
are in the language in order for the compiler to optimize the 
operations.  For example (see my code example below), the compiler 
could inline these functions in a loop body and get rid of a 
temporary array copy.  The compiler could also use two-phase 
communication to interleave communication (prefetch "halo" cells) 
with computation (compute on interior of loop).  These optimizations 
would not be possible with libraries.
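
For reference, a library-level version written with the existing co-array capabilities might look like the sketch below.  The name co_cshift1 is hypothetical (it is not the proposed intrinsic), it handles only shifts of -1 and +1 on a rank-one co-array with a single co-dimension, and it assumes the co-array syntax as later standardized in Fortran 2008.  The function result S is the kind of temporary array copy referred to in point 2.

```fortran
module co_shift_lib
  implicit none
contains
  ! Hypothetical library stand-in for a circular shift across images.
  ! Only shift = -1 or shift = +1 is supported in this sketch.
  function co_cshift1(T, shift) result(S)
    real, intent(in)    :: T(:)[*]
    integer, intent(in) :: shift
    real :: S(size(T))           ! temporary holding the shifted copy
    integer :: n, il, ir

    n  = size(T)
    ! image indices of the periodic left/right neighbours
    il = merge(num_images(), this_image() - 1, this_image() == 1)
    ir = merge(1, this_image() + 1, this_image() == num_images())

    select case (shift)
    case (1)                     ! S(i) = T(i+1), as with cshift(T, 1)
      S(1:n-1) = T(2:n)
      S(n)     = T(1)[ir]        ! first element of the right neighbour
    case (-1)                    ! S(i) = T(i-1), as with cshift(T, -1)
      S(2:n) = T(1:n-1)
      S(1)   = T(n)[il]          ! last element of the left neighbour
    case default
      error stop 'co_cshift1: only shift = -1 or +1 is supported'
    end select
  end function co_cshift1
end module co_shift_lib

program demo_co_cshift1
  use co_shift_lib
  implicit none
  real, allocatable :: T(:)[:], Tnew(:)[:]

  allocate(T(4)[*], Tnew(4)[*])
  T = real(this_image())         ! every local element holds the image index

  sync all                       ! T is defined everywhere before remote reads
  Tnew = (co_cshift1(T, -1) + T + co_cshift1(T, 1)) / 3.
  sync all                       ! all remote reads done before T may change

  ! interior points average three equal local values
  if (any(abs(Tnew(2:3) - real(this_image())) > 1.e-6)) error stop 'interior mismatch'
  if (this_image() == 1) print *, Tnew
end program demo_co_cshift1
```

Each call returns a full-size copy even though only one element comes from another image; whether a compiler could remove that copy for a library routine is exactly the question raised above.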

So that everyone knows the programming models that Bill and I are 
referring to, I've included an example of a routine that updates by 
averaging over a 3 cell stencil (local cell plus 2 1D neighbors).  It 
is fine for a programmer to use either model, but I claim the 
data-parallel model provides the following advantages:
     1. Less code (30% reduction in code size in real code, much more 
in my simple example).
     2. Less complex and error prone (again as found in converting 
real LANL codes).  Consider how long it takes you to verify that 
Bill's example is correct.
     3. The data-parallel code is easier to move to heterogeneous 
processing units like GPUs.  Microsoft has obtained speed 
improvements of up to 17 times by moving code (written in 
data-parallel style) off to a GPU.  LANL's new "advanced architecture" 
machine has heterogeneous processing units and we see heterogeneous 
architectures as ubiquitous in the future.

Regards,
Craig

----------------------  Data Parallel Code -----------------------

subroutine update_dp(T, Tnew)
   real :: T(:)[*], Tnew(:)[*]

   sync all
   Tnew = (co_cshift(T,-1) + T + co_cshift(T,+1))/3.
   sync all

end subroutine update_dp ! 6 lines

---------------------- Standard CAF code (is there an error in code?) ----------------------

subroutine update_caf(T, Tnew)
   real :: T(:)[*], Tnew(:)[*]
   integer :: i, il, ir, nmax

   nmax = co_ubound(T,1)

   if (this_image() == 1) then
     il = T(nmax)[num_images()]
   else
     il = this_image() - 1
   end if

   if (this_image == num_images()) then
     ir = T(1)[1]
   else
     ir = this_image() + 1
   end if

   sync all
   Tnew(1) = (T(nmax)[il] + T(1) + T(2))/3.
   Tnew(nmax) = (T(nmax-1) + T(nmax) + T(1)[ir])/3.
   sync all

   do i = 2, nmax-1
     Tnew(i) = (T(i-1) + T(i) + T(i+1))/3.
   end do

end subroutine update ! 22 lines




