(j3.2006) (SC22WG5.3877) [MPI3 Fortran] MPI non-blocking transfers
Wed Jan 21 18:36:20 EST 2009
On Jan 21 2009, Bill Long wrote:
>> Firstly, it can happen in plain code, and at least some compilers do (or
>> used to). For example, consider vectorisable loops on possibly
>> non-contiguous or vector indexed arrays. A compiler is perfectly
>> entitled to copy more than it needs to contiguous workspace and copy
>> both the updated and untouched locations back, if the array is not
>> marked ASYNCHRONOUS and is not otherwise used. The reason is typically
>> alignment (as in many vector systems, SSE etc.)
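To make the hazard concrete, here is a minimal sketch that simulates by hand what such an optimiser could do. No real compiler output or MPI call is involved: the names array and workspace and the value 99 are purely illustrative, and the "pending transfer" is faked with an ordinary assignment.

```fortran
program overcopy_hazard
  implicit none
  integer, parameter :: n = 8
  integer :: array(n), workspace(n)
  integer :: i

  array = 0

  ! Step 1: the "optimiser" copies the whole aligned block to contiguous
  ! workspace, although the loop below only updates the even elements.
  workspace = array

  ! Step 2: meanwhile a pending transfer (simulated by a plain assignment)
  ! delivers data into the odd elements of the real array.
  array(1:n:2) = 99

  ! Step 3: the vectorised loop updates only the even elements.
  do i = 2, n, 2
     workspace(i) = i
  end do

  ! Step 4: copying back the updated AND the untouched locations
  ! overwrites the data that arrived in step 2 with stale zeros.
  array = workspace

  print '(8i4)', array
end program overcopy_hazard
```

The odd elements end up as 0 rather than 99, although the user's loop never assigned to them.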
>I assume the memory involved is the MPI buffer array.
Why? It is perfectly legal to pass sections like array(1:N:2) and
array(2:N:2) as two separate arguments, and use one for the MPI buffer
and the other for update.
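A small sketch of that calling pattern, with no MPI involved. The routine use_both and its dummy names buffer and work are hypothetical; the odd elements merely stand in for the MPI buffer.

```fortran
program interleaved_sections
  implicit none
  integer, parameter :: n = 8
  integer :: array(n)
  integer :: i

  array = [(i, i = 1, n)]

  ! The two sections do not overlap, so passing them as separate
  ! arguments and updating one of them is legal.
  call use_both(array(1:n:2), array(2:n:2))

  print '(8i4)', array

contains

  subroutine use_both(buffer, work)
    integer, intent(in)    :: buffer(:)   ! stands in for the MPI buffer
    integer, intent(inout) :: work(:)     ! updated by ordinary code
    work = work + 100 * buffer
  end subroutine use_both

end program interleaved_sections
```

Only the even elements change here, but both sections share the same cache lines and vector-aligned blocks, which is what makes over-copying by the optimiser relevant.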
> If the user has
>vectorizable code that references the array, and the processor hardware
>cannot handle non-stride-1 vectors, then it might make a copy. (More
>likely it would just load the upper and lower halves of the vector
>register with separate instructions, but for the sake of argument,
>suppose not.) But it certainly would not do a copy back, since that
>data was never modified. If the code is storing into the buffer array,
>which would require the copy back, then this is certainly a user error.
>The user is not allowed to write code that clobbers the buffer between
>initiation and completion of the MPI operation.
That is not true.
I said that it was the OPTIMISER that did it, and one reason that it
might copy more than it needs to (BOTH ways) is because of alignment and
the fact that the array is otherwise unused. Even if you have never seen
such generated code (I have), there is nothing in the standard that
forbids it, and you need to forbid it in order to make a non-attribute
>> Secondly, merely passing an argument does not count as a modification,
>> even if it passes an array section to an assumed-size dummy (which
>> forces copy-in/copy-out).
>Right, it does not count as a modification. That is exactly why this is
>the problem that we have to solve. It is not covered by the 'thou shall
>not clobber' rule imposed on the user.
Yes. And that is what the ASYNCHRONOUS attribute does at present, with
the proviso of an issue that I have in for interpretation.
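As a sketch of what that looks like in source, assuming my reading of Fortran 2003 is right: when the dummy argument has the ASYNCHRONOUS attribute and the actual argument is an array section, the dummy must be assumed-shape, so no contiguous temporary is needed. The module and routine names are hypothetical, and start_transfer is only a stand-in for a routine that would call MPI.

```fortran
module transfer_layers
  implicit none
contains

  ! Hypothetical intermediate level between the owner of the buffer
  ! and the routine that would start the transfer.
  subroutine middle(buf)
    integer, asynchronous, intent(inout) :: buf(:)   ! assumed-shape
    call start_transfer(buf)
  end subroutine middle

  ! Stand-in for the routine that would call MPI; no MPI call is made.
  subroutine start_transfer(buf)
    integer, asynchronous, intent(inout) :: buf(:)
    buf = -1
  end subroutine start_transfer

end module transfer_layers

program async_attribute
  use transfer_layers
  implicit none
  integer, parameter :: n = 8
  integer, asynchronous :: array(n)

  array = 0
  ! With assumed-shape ASYNCHRONOUS dummies at every level, the section
  ! can be passed without copy-in/copy-out.
  call middle(array(1:n:2))
  print '(8i4)', array
end program async_attribute
```

The attribute has to appear at every level the buffer passes through, which is the point about intermediate calls made further down.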
>> This can apply both if the call is an
>> extraneous one, and when the call is an intermediate level between
>> where the buffer is defined and the MPI call.
>Agreed. But this is also the case for asynchronous variables in
>Fortran. So we do have some guidance on how to prescribe coding
>requirements on the user. (Assuming it is allowed to force changes to
Precisely. And that is why an attribute is needed.
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679