[Coarray-ts] Teams

Bill Long longb at cray.com
Wed May 30 10:58:15 EDT 2012



On 5/30/12 9:33 AM, N.M. Maclaren wrote:
> On May 29 2012, Bill Long wrote:
>>
>> I've been "shopping" these new team ideas with users and got strong
>> feedback that two, different, concepts are desirable in different
>> situations. Perhaps we need to support both.
>>
>> 1) A system for partitioning the current set of all images into
>> subsets that are similar to the scheme currently under discussion. All
>> cosubscripts would be relative to the new partition containing the
>> local image, etc. This is particularly attractive for writing a
>> library routine that can be called collectively by the images in the
>> partition.
>
> I.e. just like MPI subset communicators. Yes.
>
>> It also addresses the "climate model" problem. The one irritant is how
>> to (or whether to allow) access to images in a different partition.
>
> Put those the other way round! How would such accesses be synchronised?
> We need to decide a synchronisation model as part of deciding whether
> to allow such accesses.

John's original longer post did bring this issue up. I agree that it is 
the hard part of this feature, and why I included the "or whether to 
allow" bit.


>
>> 2) A system for creating a team similar to the original proposal. This
>> creates teams with names, but keeps the current cosubscripts and image
>> set. The teams are used for SYNC TEAM and as arguments to collectives.
>> The intent is that these teams are fairly small, such as the images
>> in the local SMP domain, or the list of images along boundaries of an
>> array. The main goal of these teams is better performance.
>
> Right, but the consequence of this is that synchronisation of any
> image outside the team - and any memory on any of those images - or
> any memory on those images accessed from outside the team - is
> synchronised only at the next global SYNC ALL. Not nice.
>

Synchronization of images in the team with images outside the team would 
be accomplished with SYNC ALL or SYNC IMAGES.  The goal here is to 
provide a fast sync of the images within the team.  Faster, and more 
scalable, than SYNC IMAGES.  Note that this is much simpler than the 
partition case since there is no renumbering of the images.
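
To make the difference between the two schemes concrete, here is a small
Python model of the image numbering only. It is a sketch, not Fortran and
not proposed syntax: the names partition_images, local_index and
named_team, and the idea of splitting by a "color" value, are assumptions
made up for the illustration.

```python
def partition_images(num_images, color_of):
    """Type 1: split global images 1..num_images into partitions by color.

    Returns {color: [global image indices]}. An image's renumbered index
    inside its partition is its 1-based position in that list.
    """
    partitions = {}
    for image in range(1, num_images + 1):
        partitions.setdefault(color_of(image), []).append(image)
    return partitions


def local_index(partitions, color, global_image):
    """Renumbered (1-based) index of global_image inside its partition."""
    return partitions[color].index(global_image) + 1


def named_team(members):
    """Type 2: a team is only a set of global image indices; no renumbering."""
    return frozenset(members)


# 8 images split into two partitions of 4 (odd and even global index).
parts = partition_images(8, lambda image: image % 2)

# A small named team, e.g. the images along one boundary of an array.
boundary = named_team([1, 4, 5, 8])
```

In the partition case, global image 5 becomes image 3 of the odd
partition, so every cosubscript has to be reinterpreted. In the named
team case, image 5 is still image 5; the team only says who takes part
in the sync or the collective.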


>> On 5/29/12 10:20 AM, John Reid wrote:
>>
>>> There is an issue with allocatable arrays that are allocated within a
>>> team. Can these be supported in symmetric memory or are they like
>>> allocatable components of a coarray? They can be supported in memory
>>> that is symmetric for the team provided they are all deallocated when
>>> execution reverts to the parent team. So we have two alternative
>>> requirements
>>>
>>> T7a. It should be possible to support allocatable arrays that are
>>> allocated within a team in memory that is symmetric for the team.
>>> T7b. It need not be possible to support allocatable arrays that are
>>> allocated within a team in memory that is symmetric for the team.
>>
>> This is a non-issue for type 2 "teams".
>
> I don't follow. If you can allocate within such a team, you have
> even worse synchronisation issues than for data accesses, because
> the allocation status can become inconsistent across the program.


The type 2 teams would not permit ALLOCATE over the team.  That's why it 
is a non-issue for that case.   ALLOCATE of a coarray would still 
require participation of all images 1 .. num_images().   Only in the 
partition case would this involve fewer images than for the whole program.



>
>> For type 1 "partitions", I think we have to keep T7a for the model to
>> make sense. A subprogram in a stand alone program with no partitions
>> looks the same as one called from a partition to the user. The
>> requirement needed for this to work is that if the partition set is
>> dissolved and you revert to the parent, then any allocations made by
>> each of the partitions need to be deallocated at the point where the
>> partition is dissolved. That way the symmetric heap returns to the
>> state it had when the previous partition occurred, and should still be
>> symmetric.
>
> I think that we still have the problem. This isn't easy.

There are added details.  For example, all of the partitions need to be 
terminated before the parent would be able to allocate a coarray.  Also, 
when allocating within a partition, an unprotected allocate (one with no 
STAT= specifier) that failed would need to kill the whole program, not 
just the local partition. 
There are plenty of worms in this can.
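
As a rough illustration of the bookkeeping those rules imply, here is a
toy Python model. It is a sketch under stated assumptions, not a
description of any implementation: the class ParentImageSet and its
method names are hypothetical, and it tracks only which coarrays exist,
not real memory.

```python
class ParentImageSet:
    """Toy model of coarray allocation rules for a partitioned image set."""

    def __init__(self):
        self.coarrays = []   # coarrays allocated by the parent image set
        self.active = {}     # partition name -> coarrays that partition allocated

    def start_partition(self, name):
        self.active[name] = []

    def allocate(self, coarray, partition=None):
        if partition is None:
            # The parent may allocate only once every partition has terminated;
            # otherwise its symmetric heap could differ across images.
            if self.active:
                raise RuntimeError("parent cannot allocate while partitions are active")
            self.coarrays.append(coarray)
        else:
            self.active[partition].append(coarray)

    def end_partition(self, name):
        # Everything the partition allocated is deallocated when it is dissolved,
        # so the parent's heap returns to its state at the partitioning point.
        return self.active.pop(name)
```

For example, with partitions "ocean" and "atmos" active, a parent-level
allocate is rejected; after both are ended, anything they allocated is
gone and the parent can allocate again.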

Cheers,
Bill


>
>
> Regards,
> Nick.
>

-- 
Bill Long                                           longb at cray.com
Fortran Technical Support    &                 voice: 651-605-9024
Bioinformatics Software Development            fax:   651-605-9142
Cray Inc./Cray Plaza, Suite 210/380 Jackson St./St. Paul, MN 55101



