(j3.2006) (SC22WG5.3646) [ukfortran] A comment on John Wallin's comments on Nick MacLaren's comments
N.M. Maclaren
nmm1
Fri Nov 7 05:18:29 EST 2008
Obviously, I agree with Aleksandar on this, but here are a few minor
details.
On Nov 7 2008, Aleksandar Donev wrote:
>
> This kind of thing plagued and still plagues array syntax: Nice syntax is
> great but if it cannot be intuitively mapped onto performance there is a
> problem.
Indeed, yes! One of my colleagues was VERY unhappy at the code generated by
every compiler he had access to - it was the amount of clearly unnecessary
argument copying and the dire performance of MATMUL that were the main
issues. Things have improved, but I still spend some time on this when I
give even an elementary course.
At least, with array syntax, I can explain in fairly simple terms when the
code will almost certainly be efficient, when it almost certainly won't be,
and when it will depend on the compiler and options, and I can back much of
that up with references to the standard.
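As an illustration of those three cases, here is a minimal sketch of the
argument-copying issue. It is not taken from the original discussion: the
names copy_demo, old_style and new_style are hypothetical, and whether a
temporary is actually created in any given call depends on the compiler and
options.

```fortran
program copy_demo
  implicit none
  real :: a(100, 100)

  a = 1.0

  ! Contiguous section (one column): an explicit-shape dummy can
  ! usually be associated with it directly, so no copy is expected.
  call old_style(a(:, 1), 100)

  ! Strided section (one row): the explicit-shape dummy needs
  ! contiguous storage, so a compiler will typically copy the row
  ! into a temporary before the call and copy it back afterwards.
  call old_style(a(1, :), 100)

  ! Same strided section passed to an assumed-shape dummy: a
  ! descriptor carrying the stride can be passed instead, so a copy
  ! is normally unnecessary (but this is still up to the compiler).
  call new_style(a(1, :))

  print *, a(1, 1), a(1, 2), a(2, 1), a(2, 2)

contains

  ! Explicit-shape dummy argument, Fortran 77 style.
  subroutine old_style(x, n)
    integer, intent(in) :: n
    real, intent(inout) :: x(n)
    x = 2.0 * x
  end subroutine old_style

  ! Assumed-shape dummy argument; needs an explicit interface, which
  ! an internal procedure provides automatically.
  subroutine new_style(x)
    real, intent(inout) :: x(:)
    x = 2.0 * x
  end subroutine new_style

end program copy_demo
```

The first call is the "almost certainly efficient" case, the second is the
"almost certainly not" case, and the third is one where the result depends
on the implementation.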
> However, Nick's
> example (I paste it from e-mail below since the WG5 site is not working
> for me), is not that clear cut. I think mixing in the example about the
> spin loop in this discussion is a sidetrack---the example and explanation
> pasted below is IMO more illustrative.
But, to reiterate the statement, nobody is claiming that the restrictive
implementation I refer to is likely - the question in that example is only
whether such a processor is conforming. And I can guarantee that few (if
any) users will think that it is, but I can't find any wording to say that
it isn't.
> What Nick and I seemed to agree on is listed below as our "intent".
> Please consider this carefully instead of endlessly arguing about
> job/thread schedulers.
Yes. I have to accept blame for diverting the issue. Sorry.
> As a worst-case scenario, consider running n_images=2 as two threads on a
> single CPU with a single core. It seems reasonable users may want to do
> this for debugging, unless we warn them against it. Should we?
The way that schedulers come into it is that it is fairly common for a
program on a shared multi-core system to be run in that mode, either
because of affinity issues or because the other cores are tied up with
higher priority tasks. So it ISN'T just a problem for old machines.
Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679