(j3.2006) (SC22WG5.5422) Preliminary result of the WG5 straw ballot on N2040

John Reid John.Reid
Sat Jan 17 06:55:43 EST 2015


WG5,

Here is a new version with Bob Corbett's vote included.

John.

John Reid wrote:
> WG5,
>
> Here is the preliminary result of the WG5 straw ballot on N2040. Several
> people who usually vote have not done so this time. If any of you do
> so very soon, I will include your vote.
>
> I have not decided what the result should be. There seem to be two
> alternatives:
>
> 1. Make a few more changes, checked by the email group, to address the
> comments and send the result to SC22.
>
> 2. Ask J3 to make a new version at its Feb. meeting and have a fresh
> informal WG5 ballot on the result.
>
> I favour the first alternative. I think the draft would be good enough
> for a first SC22 ballot. We need to know ASAP if any major changes are
> needed.
>
> Please tell me what you think.
>
> John.
-------------- next part --------------
                                         ISO/IEC JTC1/SC22/WG5 N2045-2

             Result of the WG5 straw ballot on N2040

                         John Reid

N2044 asked this question:

Please answer the following question "Is N2040 ready for forwarding to
SC22 as the DTS?" in one of these ways.

1) Yes.
2) Yes, but I recommend the following changes. 
3) No, for the following reasons.
4) Abstain.

The numbers of answers in each category were:
3 for 1) Yes (Long, Reid, Whitlock)
2 for 2) Yes, but I recommend the following changes
         (Bader, Nagle)
2 for 3) No, for the following reasons
         (Corbett, Maclaren)
0 for 4) Abstain

Here are the comments and reasons. I have included an edited version 
of the comment of Anton Shterenlikht that appeared in comp-f90 because
he has noticed errors in A.3.1. 

Reinhold Bader

2) Yes, but I recommend the following changes.

[1:7] After "examples" add " that illustrate the semantics described"

Reason: Many of these examples are in the Annex.

[14:30] Replace "constuct" by "construct".

[14:30+] Add the following text: 
"Deallocation of coarrays is delayed until the statement that
 performs the deallocation on all active images of the current
 team has synchronized these images."

Reason: Avoid a race condition for definitions/references to
 such coarrays on the stalled image (cf. [31:31-34], [36:11-15]).
 It may be appropriate to also add a note that such a statement
 must be a DEALLOCATE or an invocation of MOVE_ALLOC, either of
 which must have STAT= specified.

[17:7] Delete superfluous space after "ISO_FORTRAN_ENV".

[33:16+] Add missing bullet
"* extensions of image selector syntax and semantics provide the 
   capability to access coarray data across team boundaries;"

[33:19-20] Replace "provide low-level primitives ... computation;" by
" provide the ability to perform non-trivial operations across image
  boundaries on scalars of some intrinsic types in unordered segments;"

Reason: The text should describe what the atomics do, beyond the 
already existing ones.

[35:24-25] Replace by
"{In 4.5.6.2 The finalization process, replace the text of NOTE 4.48}

An implementation might need to ensure that when more than one coarray 
must be deallocated by execution of a single statement, they are 
deallocated in the same order on all images in the current team."

Reason: The term "event" now has a defined meaning that has nothing to
do with the NOTE's scenario.

[36:35-36] Consider the statement 

SYNC MEMORY

executed by all active images of the current team, one image of which 
has failed. According to the semantics defined here and in [38:25-26] 
error termination must be initiated on each executing image of the 
current team; in particular this involves cross-image activity 
that was not required by Fortran 2008. Was this intended? If not, is 
it sufficient to make the following edit to [36:35]:

Replace "If" by "Except in a SYNC MEMORY statement, if" ?

[37:14] Before "FORM TEAM", insert "\uwave{EVENT POST, EVENT WAIT,}".

Reason: Similar to locks, events only impose one-way segment ordering, 
and this ordering is already defined in [18:21-24], so a SYNC MEMORY
appears unnecessary. See 09-193r2 for the reasoning for LOCK/UNLOCK. 

[37:18+] Add the following text
"{In 8.5.2 Segments, edit the first sentence of NOTE 8.34 as follows}

The model upon which the interpretation of a program is based is that 
there is a permanent memory location for each coarray and that all 
images \uwave{on which it is established} can access it."

[38:13] Delete "on all images"

Reason: For each statement it is clear on which images it is 
executed; this may be a subset of all images.

[42:22] Replace "in the current team when the coarray was established"  
by "in the most remotely removed current or ancestor team in which 
the coarray is established."

Reason: The problem with the present wording is that the set of 
images on which a coarray is established may change throughout 
execution time (and also across images). To avoid ambiguity, I 
suggest looking at the establishment at the point (and the image) 
where the intrinsic is executed. This also seems appropriate 
for assuring composability of the coarray team concept - a huge
UCOBOUND that cannot be addressed by any means in the local 
context would not seem to make sense.

[44:15], [44:17] Replace "subcauses" by "subclauses", twice.
_____________________________________________________________________

Robert Corbett

My vote is 3) No, for the following reasons.

I am still concerned about the features described in Clause 5.9.
I understand that allowing stalled images to resume execution
is a desired feature.  I am not convinced that the feature as
described in the DTS can be implemented without imposing a
severe performance penalty.  I understand that the ability to
resume stalled images is an optional feature.  I think that
even an optional feature should be required to be implementable.

I would change my vote if a description of how the feature could
be implemented is provided, assuming that the proposed
implementation is reasonable.  (Implementation via an interpreter,
for example, would not satisfy me.)  I would like the proposed
implementation to be based on hardware and systems software that
is commonly available.  A proposal for an implementation for
x86/x64 Linux would be fine.  The description need not be part of
the TS.  A separate paper, not subject to approval, would suffice.

One implementation proposal I shall not accept is that the
implementation should be the same as whatever the GCC
implementation of C++ does for exception handling.  I spoke
with a member of Oracle's C++ team, and he said that Oracle's
implementation of C++ exception handling could not do
everything I told him the DTS requires.

The DTS imposes some implicit requirements on processors.  For
example, some Fortran features require an implementation to
perform synchronization.  An implementation of a CRITICAL
construct, a SYNC ALL statement, a parallel reduction, or
input/output is likely to involve synchronization.  If an
image stalls on a data reference during the execution of a
CRITICAL construct within the scope of execution of a CHANGE
TEAM construct, I assume that the DTS assumes that a lock
held by the image as part of the synchronization done for
the CRITICAL construct must be released before execution of
the stalled image resumes.

The DTS does not appear to impose a requirement that storage
allocated during execution of a stalled image be released before
execution of the stalled image resumes.  Is the possible memory
leak permitted?

Fortran processors often acquire system resources during execution.
For example, some operating systems allow a process to use at most
a fixed number of locks and events.  To avoid running out of the
system resources, the process must release resources it acquired
when it no longer needs them.  Is it intended that the DTS require
that a process release such resources as are no longer needed when
an associated stalled image resumes execution, or is it a quality
of implementation issue?
____________________________________________________________________

Nick Maclaren

3) No, for the reasons given in N2038, N2013 and other votes.  I need to
reiterate that neither response in N2039 even addresses my comments.  I
believe that incorporating the TS into the main standard will cause
serious harm to Fortran, because the (semantic) difficulties cannot be
resolved (let alone specified unambiguously) in the time available.
Indeed, it is not clear even that they ARE soluble, because this TS is
specifying a feature that is beyond the state of the art, and has been
for half a century.  I would be prepared to change my vote to abstain if
the decision to incorporate it were reversed.
___________________________________________________________________

Dan Nagle

2) Yes, with comment.

Comments

In N2040,
change "functions" to "subroutines" at [31:3]
change "function" to "subroutine" at [31:17]
__________________________________________________

Anton Shterenlikht

A.3.1

In the first example, why are x and y defined as coarray variables?
This fact seems to be completely unused.

Also, is it not possible for image P to read x_dot_y (line 8) from 
image Q, before this variable has been defined on image Q in line 7?
Is this what Note 7.4 is saying?

In the second example, line 17, j_max is undefined. I think what 
was meant is:

16	integer :: j_max, j_max_location
	j_max = j
17	call co_max(j_max)
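
The point about j_max can be illustrated with a minimal, self-contained
sketch. It assumes a processor that supports the collective subroutine
CO_MAX; the program name and the per-image value assigned to j are
illustrative assumptions and are not taken from A.3.1 of N2040. Each
image defines j_max before the call, and because no RESULT_IMAGE
argument is given, every image should hold the maximum on return.

```fortran
program co_max_sketch
  implicit none
  integer :: j, j_max

  ! Illustrative per-image value (assumption: not from N2040 A.3.1).
  j = this_image()

  ! Define j_max on every image before the collective call.
  ! The argument of CO_MAX is INTENT(INOUT), so it must be defined on entry.
  j_max = j

  ! No RESULT_IMAGE argument: every image receives the maximum
  ! over the images of the current team.
  call co_max(j_max)

  if (this_image() == 1) print *, 'maximum of j over all images:', j_max
end program co_max_sketch
```

Without the assignment j_max = j, the call would pass an undefined
variable to the collective, which is the error noted above.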


