(j3.2006) further thoughts on Aleks' proposal
Sun Jun 22 18:23:39 EDT 2008
I'll try to respond to the various comments on my
sketch of a proposal for a PERSISTENT attribute
for MPI-like subroutines. I think some of the
concerns were based on a misreading of what I was
trying to say.
I'm sorry I don't have much more time to spend
on this. It's not a complete proposal. I think
it covers everything, but the wording is not up
to normal standardese.
1) Is there a problem to be solved?
I think Bill's comment, that he's never seen a
problem, is the hardest one to deal with. If there
is no problem to be solved, then there is no problem
to be solved. I was mostly reacting to a long
series of e-mails a few weeks ago and then to Aleks'
proposal; not to a problem I know about. As Van
often says, I don't have a dog in this horse race;
so, if there is no need or desire to do this, it's
OK with me. I've never written an MPI program in
my life and plan to continue to do so. However,
just because MPI can be made to work with existing
mechanisms doesn't mean MPII or whatever comes next
2) It's the subroutine call that causes the problems.
Contrary to Malcolm's claim that this is a hack, I
think the non-Fortran subroutine is the natural thing
to identify. The two problems (in my mind) are that
the compiler must use pass-by-address, not copy-in/out,
and must not do any optimizations or code motions
across the call. These are properties of specific
subroutines. Labeling the subroutine as PERSISTENT
is something that the module writer needs to do once.
Putting in explicit sync statements is something the
user must do for every call. I think labeling
subroutines as being non-traditional-Fortran in the
way they treat their arguments is less error prone
and, arguably, clearer.
We require the user to label an asynchronous I/O
statement, we don't ask the compiler to deduce
something from the I/O list attributes. To me,
requiring the user to label the unusual subroutine
There is (or was) a related problem of calling a
subroutine with a VOID* argument; essentially some
sort of typeless or no-TKR-checking mechanism.
Making that work will require tinkering with the
interface; why not attach all the attributes at
the same time and place?
3) What is the user's responsibility?
By labeling the subroutine, and constraining the
compiler's actions, I think we reduce the problem
to one similar to asynchronous I/O. Look at
9.6.4 on page 216:
"5 For asynchronous output, a pending input/output
storage sequence affector (18.104.22.168) shall not be
redefined, become undefined, or have its pointer
association status changed.
6 For asynchronous input, a pending input/output
storage sequence affector shall not be referenced,
become defined, become undefined, become associated
with a dummy argument that has the VALUE attribute,
or have its pointer association status changed."
That's essentially what the user currently does for
an actual argument that will be modified by an
existing MPI routine. He needs to know which arguments
are potentially modified later on and not use them
or use them carefully, perhaps with locks or syncs,
or something. You might want to modify the rules
somewhat; but "don't look, don't touch" is the basic
rule and it's not enforceable by the compiler.
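
For comparison, here is a minimal sketch of the asynchronous I/O
pattern those two paragraphs govern. The file name, unit number, and
array size are made up for illustration; the point is only the
"don't look, don't touch" window between the READ and the WAIT.

```fortran
program async_read_sketch
  implicit none
  integer, parameter :: n = 1000
  real, asynchronous :: buf(n)   ! the user labels the affected variable
  integer :: id

  ! Create a scratch file to read back (hypothetical name and unit).
  buf = 1.0
  open(unit=10, file='async_sketch.dat', form='unformatted', &
       access='stream', status='replace', asynchronous='yes')
  write(10) buf                  ! ordinary synchronous write
  buf = 0.0

  ! Start the asynchronous input; buf is now a pending I/O
  ! storage sequence affector.
  read(10, pos=1, asynchronous='yes', id=id) buf

  ! Between here and the WAIT, buf must not be referenced or
  ! defined. The compiler cannot check this; it is the user's job.

  wait(10, id=id)                ! the transfer is complete after this

  print *, buf(1)                ! safe to look at buf again
  close(10, status='delete')
end program async_read_sketch
```

A processor is allowed to perform the transfer synchronously, so this
runs correctly even where no real overlap takes place.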
If a user calls a non-PERSISTENT routine, he'll have
to know whether or not that routine passes on its
arguments to a PERSISTENT routine and understand how
to deal with that. Most likely, he'll label the
routine PERSISTENT, but other options are possible.
4) What is the compiler's responsibility?
The compiler needs to do very little that it isn't
already doing.
It needs to learn a new keyword and a bunch of
interface-related constraints. The constraints
I proposed are intentionally simple and heavy-handed.
Basically, all arrays must be compile time contiguous,
no polymorphic stuff, no type bound procedures, and
no finalizers. All rules designed to make it easy
for the compiler to use simple F77 style pass by
address.
The compiler must use pass-by-address as the calling
mechanism. No copy in/out, no dope vectors. But,
it already has to do this for calling implicit
interface (old F77) routines. The only difference
is that here, instead of doing copy in/out for an
array section, it will issue a compile time error.
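
To make the copy in/out hazard concrete, here is a sketch using
today's rules and the standard nonblocking receive MPI_Irecv. It
assumes an MPI library that supplies the "mpi" module; the subroutine
name and argument list are invented for the example.

```fortran
subroutine post_receives(a, b, n, src, comm, requests)
  use mpi
  implicit none
  integer, intent(in)    :: n, src, comm
  real,    intent(inout) :: a(n), b(2*n)
  integer, intent(out)   :: requests(2)
  integer :: ierr

  ! Contiguous actual argument: the address of a(1) is passed,
  ! and MPI can keep it until the receive completes.
  call MPI_Irecv(a, n, MPI_REAL, src, 0, comm, requests(1), ierr)

  ! Strided section: a compiler may copy the section into a
  ! temporary, pass the temporary's address, and copy back on
  ! return. MPI would then deliver the data into storage that no
  ! longer corresponds to b. This is the call that would become a
  ! compile time error for a routine labeled PERSISTENT.
  call MPI_Irecv(b(1:2*n:2), n, MPI_REAL, src, 1, comm, requests(2), ierr)
end subroutine post_receives
```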
The compiler must not do any code motion past a
call to a PERSISTENT routine, nor can it keep
copies of variables in registers (which is a form
of code motion) across the call. It must effectively
dump all variables back to memory, reload the values
after the call, and re-evaluate any common
sub-expressions it might otherwise have kept in
registers. But it already does this for calls
to implicit interface routines. Any variable
in the argument list, in common, or in a module
must be dumped from registers to memory and
reloaded after the call. All a PERSISTENT
subroutine does is increase the list of "touched"
variables to include everything.
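
A sketch of the code motion hazard under today's rules, again
assuming an MPI library that supplies the "mpi" module (the
subroutine and its arguments are invented for illustration). The
receive buffer is not in the argument list of MPI_Wait, so nothing
tells the compiler that the wait "touches" it.

```fortran
subroutine recv_then_sum(buf, n, src, comm, total)
  use mpi
  implicit none
  integer, intent(in)    :: n, src, comm
  real,    intent(inout) :: buf(n)
  real,    intent(out)   :: total
  integer :: request, ierr
  integer :: status(MPI_STATUS_SIZE)

  ! MPI remembers the address of buf after this call returns.
  call MPI_Irecv(buf, n, MPI_REAL, src, 0, comm, request, ierr)

  ! buf does not appear here, so a compiler may assume the call
  ! cannot change it and may load buf before the wait or keep
  ! stale values in registers across it.
  call MPI_Wait(request, status, ierr)

  ! Only correct if the loads of buf happen after MPI_Wait returns.
  total = sum(buf)
end subroutine recv_then_sum
```

If MPI_Wait were treated as touching everything, the loads feeding
the sum could not be hoisted above the call.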
Otherwise, the compiler is free to optimize the
dickens out of the code. If it sees a variable
used in a call to a PERSISTENT routine and later
on sees that variable used in expressions, it can
do whatever optimizations it wants -- except move
things across another PERSISTENT call. The
compiler knows that the user is correctly doing
the sync stuff and using PERSISTENT calls as
barriers. Just as the compiler can optimize
use of ASYNCHRONOUS variables. It knows the
user is doing the right thing, provided the
compiler doesn't move code across a WAIT statement.
5) What does "shoehorn" mean?
In my cultural heritage, it's not particularly
pejorative. I've used "shoehorn", "file to fit,
hammer into place", etc., as a way to describe
what needs to be done to make things work in a
non-perfect world. It's close to a badge of
honor to be able to shoehorn.