[J3] [EXTERNAL] Questions about DO CONCURRENT and locality

Clune, Thomas L. (GSFC-6101) thomas.l.clune at nasa.gov
Mon Jul 6 12:15:05 EDT 2020


Hi Ondrej,

If I understand correctly, merely “guaranteeing” that the loop can be parallelized is insufficient. The compiler needs to “know” whether to treat any given variable as SHARED or LOCAL. Small modifications of the example here can change which of the two is required for correct parallelization.

In this example the user could specify SHARED(A,B,T,K,L) which would enable parallelization.  (But would do unspeakable things if provided the data from your cases 4 & 5.)
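As a rough sketch of what that could look like, here is the FOO routine quoted further down in this message with explicit locality added. It assumes a compiler that implements the Fortran 2018 locality specifiers. The names FOO_SHARED_MOD and FOO_SHARED are made up for illustration and do not come from paper 19-134.

```fortran
MODULE FOO_SHARED_MOD
  IMPLICIT NONE
CONTAINS
  ! Hypothetical variant of FOO with explicit locality on the loop.
  SUBROUTINE FOO_SHARED(N, A, B, T, K, L)
    INTEGER, INTENT(IN) :: N
    INTEGER, INTENT(IN) :: K(N), L(N)
    REAL, INTENT(IN) :: A(N)
    REAL, INTENT(OUT) :: B(N)
    REAL, INTENT(INOUT) :: T(N)
    INTEGER :: J
    ! DEFAULT(NONE) makes the compiler reject any variable used in the
    ! loop body whose locality is not listed explicitly.
    ! SHARED says every iteration sees the same A, B, T, K and L, so the
    ! caller is still responsible for K and L not making two iterations
    ! touch the same element of T (cases 4 and 5 would violate that).
    DO CONCURRENT (J=1:N) DEFAULT(NONE) SHARED(A, B, T, K, L)
      T(K(J)) = A(J)
      B(J) = T(L(J))
    END DO
  END SUBROUTINE FOO_SHARED
END MODULE FOO_SHARED_MOD
```

The index variable J is a construct entity of the loop, so it is not listed in a locality specifier.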

I’m not sure what more you can be asking for.   The compilers cannot determine the proper locality at compile time due to insufficient information, and cannot do so _efficiently_ at run time.    Additional information must be provided by the programmer.

- Tom



On Jul 6, 2020, at 11:48 AM, Ondřej Čertík via J3 <j3 at mailman.j3-fortran.org> wrote:

Hi Bill and others,

On Mon, Jul 6, 2020, at 7:09 AM, Bill Long via J3 wrote:
DEFAULT(NONE) is a safety / programming style  issue, sort of like
IMPLICIT NONE. It helps catch typo issues in the loop, but I don’t
think it is ever required for program correctness or making it possible
for the compiler to figure out what is going on.   The idea of
specifically declaring “problem” variables in a SHARED, LOCAL, or
LOCAL_INIT specifier should be sufficient.

Thanks for discussing this.

There still seems to be a problem. Take the original code from the link in Steve's email (https://j3-fortran.org/doc/year/19/19-134.txt), which I am going to copy here for clarity:

SUBROUTINE FOO(N, A, B, T, K, L)
 IMPLICIT NONE
 INTEGER, INTENT(IN) :: N, K(N), L(N)
 REAL, INTENT(IN) :: A(N)
 REAL, INTENT(OUT) :: B(N)
 REAL, INTENT(INOUT) :: T(N)
 INTEGER :: J
 DO CONCURRENT (J=1:N)
   T(K(J)) = A(J)
   B(J) = T(L(J))
 END DO
END SUBROUTINE FOO


This can be parallelized efficiently for any of the following cases:

1. K=[1,2,3]; L=[4,5,6]
2. K=[1,2,3]; L=[1,5,6]
3. K=[1,2,3]; L=[1,2,3]

However, parallel execution will break the logic of the code for the following cases, which should therefore be treated as invalid. (The loop can be executed in any order, but if it is executed in parallel, the same element of T will be overwritten concurrently with different values, giving an incorrect answer.)

4. K=[1,1,2]; L=[1,1,2]
5. K=[1,1,1]; L=[1,1,1]


You cannot currently do that with the shared, local, ... specifiers, because above you need to specify conditions on the elements of the arrays, not arrays as a whole.
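To make the element-wise condition concrete, here is a minimal sketch of a run-time check on K and L. It is only an illustration: the module LOCALITY_CHECK and the function INDICES_ARE_INDEPENDENT are hypothetical names, not part of the standard or of any proposal. It accepts cases 1 to 3 above and rejects cases 4 and 5.

```fortran
MODULE LOCALITY_CHECK
  IMPLICIT NONE
CONTAINS
  ! Hypothetical helper: returns .TRUE. when no element of T is written
  ! by one iteration and written or read by a different iteration.
  PURE LOGICAL FUNCTION INDICES_ARE_INDEPENDENT(K, L) RESULT(OK)
    INTEGER, INTENT(IN) :: K(:), L(:)
    INTEGER :: I, J
    OK = SIZE(K) == SIZE(L)
    IF (.NOT. OK) RETURN
    DO I = 1, SIZE(K)
      DO J = 1, SIZE(K)
        IF (I == J) CYCLE
        ! Iterations I and J write the same element of T, or iteration J
        ! reads the element that iteration I writes.
        IF (K(I) == K(J) .OR. K(I) == L(J)) THEN
          OK = .FALSE.
          RETURN
        END IF
      END DO
    END DO
  END FUNCTION INDICES_ARE_INDEPENDENT
END MODULE LOCALITY_CHECK
```

This naive version compares every pair of iterations, so its cost grows quadratically with the size of K, which is one reason such a check is unattractive for a compiler to insert at run time.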

I propose that users expect that if they write "do concurrent", they are telling the compiler that they are guaranteeing from the nature of the application problem that it can actually be executed in parallel (concurrently). Specifically in the case above they are guaranteeing that the content of the arrays K and L will be of the form 1., 2., or 3., and never of the form 4. or 5. Users can provide this guarantee, but compilers cannot figure this out.

See this issue for more discussion and background: https://github.com/j3-fortran/fortran_proposals/issues/62

Ondrej
