(j3.2006) Integration of co-arrays with the intrinsic shift functions
dick.hendrickson at att.net
Wed Jul 18 15:25:52 EDT 2007
No, I don't think we can or should move the sync into the functions. On
heterogeneous processors the order of evaluation can be different on
different images. So, Craig's example expression
x = co_shift(A,+1) + co_shift(A,-1) ! or whatever
might be doing a sync for the first shift on some images and the
second on others. Isn't this a recipe for deadlock?
And Craig responded
> >I don't see any deadlock issues?
And Michael responded
> Here's some pseudo-code ... that is,
> I'm not sure whether it's Fortran 2008 or not ...
> I'm guessing that an implementation that relies on the phase of the moon
> to decide which function to call first might have some deadlock problems.
> Seems like if it's a problem for the language to solve then it might already
> be a problem without changing the intrinsics?
I think the problem is that deep in our hearts, we all think there is only
one .exe or .o file and that it executes on every image. But that
isn't guaranteed or implied by the standard. It's reasonable to think
about a collection of DEC, IBM, Cray, Intel, SUN ... processors hooked up
into a network, all executing program images. For this to work, the
various compilers will all have to do the right thing.
Fortran doesn't specify the order of evaluation for functions in an
expression (and, as far as I know, neither does C for the operands
of +). So, different Fortran compilers could do either the +1 or
-1 shift first. If the co_shift function has an implied sync, then
different processors will be doing different syncs and I think that is
a potential deadlock.
It's worse if the expression is something like
X = f(b) + g(c)
and the syncs happen in f and g. If every image is compiled with the same
compiler, it doesn't matter because everything will be in the same (unspecified)
order. If different images are compiled by different compilers, then they'll
have to agree on expression evaluation order. I think that's surprising.
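
To make the f(b) + g(c) case concrete, here is a small simulation. It is
not Fortran and not co-array code: Python threads stand in for images, and
a threading.Barrier stands in for the sync implied inside each function.
The assumptions are mine: f and g each use their own distinct sync, and a
timeout is used so that a would-be deadlock is reported as "stuck" instead
of hanging. The name run_images is made up for this sketch.

```python
import threading


def run_images(orders, timeout=0.5):
    """Simulate one thread per image.

    orders[i] is the order in which image i evaluates the functions,
    e.g. ("f", "g"). Each function name has its own barrier, standing in
    for a sync hidden inside that function (an assumption of this sketch).
    Returns a list with "ok" or "stuck" for each image.
    """
    n = len(orders)
    barriers = {"f": threading.Barrier(n), "g": threading.Barrier(n)}
    status = ["ok"] * n

    def image(me):
        try:
            for name in orders[me]:
                # Wait for every image to reach the same function's sync.
                barriers[name].wait(timeout=timeout)
        except threading.BrokenBarrierError:
            # The timeout expired: in a real run this image would hang.
            status[me] = "stuck"

    threads = [threading.Thread(target=image, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return status


if __name__ == "__main__":
    # Same compiler everywhere: same (unspecified) order on every image.
    print(run_images([("f", "g"), ("f", "g")]))
    # Different compilers: image 0 calls f first, image 1 calls g first.
    print(run_images([("f", "g"), ("g", "f")]))
```

With the same order on both images each sync is matched and both finish.
With opposite orders, image 0 waits in f's sync while image 1 waits in g's,
and neither is ever released.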