(j3.2006) (SC22WG5.3880) [ukfortran] [MPI3 Fortran] MPI non-blocking transfers
Wed Jan 21 21:46:36 EST 2009
On Wed, 2009-01-21 at 16:17 -0800, Aleksandar Donev wrote:
> On Wednesday 21 January 2009 16:04, Van Snyder wrote:
> > or is there another cause for it?
> YES!!! Why do we have the ASYNCHRONOUS attribute instead of just having
> WAIT?!? The compiler must know that a variable can change behind its
> back, must know not to perform copy in/out and "optimizations", and the
> standard has several restrictions on ASYNCHRONOUS dummies (see chapter
> 12). These are all "other causes" and have no connection to WAIT
> whatsoever.
The restrictions on ASYNCHRONOUS dummies frequently exist to protect the
associated actual argument, but in any case they can't apply to dummies
that aren't there. Isn't that the problem with MPI wait?
> And no, this is not a defect with MPI's interface. It is simply another
> way to do interfaces. Plenty of libraries save pointers and do not ask
> you to pass the same 100 arguments both for the "start" and the "end"
> of an operation. It is perfectly sensible design, especially in the C
It's sensible in the C world because C's pointer semantics paralyze the
optimizer. Code motion is very limited compared to the possibilities in
Fortran. It would be sensible in the Fortran world, too, because of the
rule that says the base object of an asynchronously referenced component
either gets the ASYNCHRONOUS attribute by magic or has to have it
explicitly declared (ASYNCHRONOUS is not in <component-attr-spec>). But
the Fortran compiler doesn't understand the guts of the opaque C struct
that MPI uses for its handle, so it can't know enough not to do code
motions involving stuff that might be referenced by the handle.
Fortran knows enough not to move accesses to variables with the
ASYNCHRONOUS attribute across a WAIT statement because it understands
the semantics of the WAIT statement. It also knows enough not to move
such accesses across a reference to a procedure where they appear as
actual arguments and the associated dummy argument has the ASYNCHRONOUS
attribute, because the called procedure might execute a WAIT statement.
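
As a minimal sketch of the native case just described, here is Fortran's
own asynchronous I/O with a WAIT statement. The program name, the
variable names, and the use of a scratch file are illustrative
assumptions, not something taken from this thread.

```fortran
program async_wait_sketch
  implicit none
  integer :: u, id, i
  ! ASYNCHRONOUS tells the compiler that buf may be involved in a
  ! pending data transfer between the WRITE and the WAIT.
  real, asynchronous :: buf(1000)

  buf = [(real(i), i = 1, size(buf))]

  open (newunit=u, status='scratch', form='unformatted', &
        asynchronous='yes')

  ! Start the transfer; it may still be pending when WRITE returns.
  write (u, asynchronous='yes', id=id) buf

  ! Work that does not touch buf could overlap with the transfer here.

  ! The compiler understands WAIT, so the assignment below is not
  ! moved above this statement.
  wait (u, id=id)

  buf(1) = -1.0   ! the transfer identified by id has completed
  print *, buf(1)
  close (u)
end program async_wait_sketch
```

A processor is free to perform the transfer synchronously, so this only
shows where the ordering guarantee comes from, not any actual overlap.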
When Fortran calls C and a variable isn't mentioned, the compiler can't
know enough not to move accesses to the unmentioned variable across the
procedure reference. If the variable were to have the ASYNCHRONOUS
attribute, and were to appear as an actual argument associated with a
dummy argument having the ASYNCHRONOUS attribute in a reference to a
procedure, the compiler wouldn't need to understand the semantics of the
procedure, and wouldn't need any extra magic to avoid moving accesses to
the variable across the reference.
I don't think Fortran needs a new/different attribute, or to understand
the semantics of the MPI wait procedure, or an annotation for the MPI
wait procedure to tell the compiler its semantics. Just write an
interface layer that includes a buffer dummy argument with the
ASYNCHRONOUS attribute, and within that interface layer call the MPI
wait routine and ignore the buffer. The interface layer probably has to
be written in C to avoid the possibility that a clever compiler would
inline it, optimize away the do-nothing layer, and then notice that the
buffer isn't passed to the MPI wait routine -- so it can move code
across the reference!
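
The Fortran side of such an interface layer might look like the sketch
below. Everything named here is hypothetical: the module name, the C
routine wait_with_buffer, and its argument list are assumptions made for
illustration. It also assumes the request handle and status are passed
as C ints and that the buffer is a double precision array; a real layer
would need some way to cover other buffer types.

```fortran
module mpi_wait_layer
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  implicit none

  interface
    ! Hypothetical C routine. It is assumed to ignore buf and simply
    ! call the MPI wait routine on request.
    subroutine wait_with_buffer(buf, request, status, ierror) &
        bind(C, name='wait_with_buffer')
      import :: c_int, c_double
      ! The buffer is listed only so that the compiler sees an
      ! ASYNCHRONOUS dummy associated with the caller's buffer.
      real(c_double), asynchronous :: buf(*)
      integer(c_int), intent(inout) :: request
      integer(c_int) :: status(*)
      integer(c_int), intent(out) :: ierror
    end subroutine wait_with_buffer
  end interface
end module mpi_wait_layer
```

The caller would declare its buffer with the ASYNCHRONOUS attribute and
write call wait_with_buffer(buf, request, status, ierror) where it
would otherwise call the MPI wait routine directly. Because the dummy is
assumed-size and ASYNCHRONOUS, the actual argument should be a whole
array rather than an array section.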
Maybe add text (maybe in a note) that explains when asynchronous objects
have to appear as actual arguments associated with asynchronous dummy
arguments -- some blah blah blah about appearing anywhere as an actual
argument to a procedure with BIND(C) if the procedure or one of its
lackeys might take a pointer to it, or in C_LOC or..., because once C
gets the address of it, Fortran can't otherwise know when to suppress
code motion. The alternative is that the optimizer NEVER moves accesses
to ANY interoperable variable with the ASYNCHRONOUS attribute across ANY

If the handle in the MPI wait reference were not opaque, and it had a
pointer TKR-compatible with the buffer, and the buffer were either a
target or pointer and had the ASYNCHRONOUS attribute, and the handle had
the ASYNCHRONOUS attribute, the compiler might know enough not to move
references to the buffer across the reference. But the handle is
opaque, and a C struct to boot, right?
Which raises a Fortran-only question, having nothing to do with MPI: If
a variable X has a pointer ultimate component Y, and the ASYNCHRONOUS
attribute, and appears as an actual argument in a procedure reference in
which the corresponding dummy argument has the ASYNCHRONOUS attribute,
would an optimizer refrain from moving accesses to ANY variable that has
the ASYNCHRONOUS attribute, and either the POINTER or TARGET attribute,
and is either TKR-compatible with X%...Y or has an ultimate component
that is TKR-compatible with X%...Y, across the reference? It probably
should, but it probably need not worry about other variables.
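
To make the question concrete, here is a sketch of the situation. The
type handle_t, the external procedure finish, and all variable names are
hypothetical, and the comments restate the tentative answer given above
rather than anything the standard is known to require.

```fortran
module handle_scenario
  implicit none

  type :: handle_t
    real, pointer :: y(:) => null()   ! the pointer ultimate component Y
  end type handle_t

  interface
    ! Hypothetical external procedure with an ASYNCHRONOUS dummy.
    subroutine finish(h)
      import :: handle_t
      type(handle_t), asynchronous :: h
    end subroutine finish
  end interface

contains

  subroutine scenario()
    type(handle_t), asynchronous :: x
    real,    target, asynchronous :: a(100)  ! TKR-compatible with x%y
    real,    target, asynchronous :: b(100)  ! TKR-compatible, never pointed at
    integer, target, asynchronous :: k(100)  ! not TKR-compatible with x%y
    real,    target               :: c(100)  ! TKR-compatible, not ASYNCHRONOUS

    a = 0.0
    b = 0.0
    c = 0.0
    k = 0
    x%y => a

    call finish(x)

    ! Which of these stores may be moved above the call?
    a(1) = 1.0   ! suggested answer: keep below the call
    b(1) = 1.0   ! suggested answer: keep below, judged by attributes and TKR alone
    k(1) = 1     ! suggested answer: no need to worry
    c(1) = 1.0   ! suggested answer: no need to worry
  end subroutine scenario
end module handle_scenario
```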