[J3] Proposals for new intrinsic procedures
Jeff Hammond
jehammond at nvidia.com
Fri Jan 6 07:14:38 UTC 2023
A common use case is computing offsets into an array containing blocks of non-constant size. LLNL physics devs mentioned they use it regularly.
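As a rough illustration of that offsets use case, here is a minimal sketch in Python. It is not a proposed Fortran interface; the name `block_offsets` and the sample sizes are hypothetical. The starting offset of each block is the exclusive prefix sum of the block sizes.

```python
from itertools import accumulate


def block_offsets(block_sizes):
    """Return the start offset of each block in a packed array.

    This is an exclusive prefix sum: offsets[i] = sum(block_sizes[:i]).
    `block_offsets` is a hypothetical name used only for illustration.
    """
    # accumulate(..., initial=0) yields 0, s0, s0+s1, ..., total (Python 3.8+).
    # Dropping the last element (the grand total) leaves the exclusive scan.
    return list(accumulate(block_sizes, initial=0))[:-1]


sizes = [3, 1, 4, 2]            # made-up block sizes
offsets = block_offsets(sizes)  # block i occupies [offsets[i], offsets[i] + sizes[i])
```

This serial version only shows what is being computed; the parallel case is where an intrinsic would be expected to matter.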
An intrinsic is definitely faster than what most users can write. Parallelizing scan is nontrivial (e.g., http://www.mgarland.org/papers/2016/scan/).
You can also ask the OpenMP folks why they added it to OpenMP 5. It has been in the C++ STL and MPI for decades at this point.
Jeff
On 6. Jan 2023, at 0.08, Long, Bill F <william.long at hpe.com> wrote:
Do we still have the requirement that a new feature have a "use case"? Other than being part of sending HPF to extinction, I'm not aware of any use for prefix_sum. (And would an intrinsic be faster than a roll-your-own function?)
Cheers,
Bill
________________________________
From: J3 <j3-bounces at mailman.j3-fortran.org> on behalf of Clune, Thomas L. (GSFC-6101) via J3 <j3 at mailman.j3-fortran.org>
Sent: Thursday, January 5, 2023 2:54 PM
To: Jeff Hammond <jehammond at nvidia.com>; General J3 interest list <j3 at mailman.j3-fortran.org>
Cc: Clune, Thomas L. (GSFC-6101) <thomas.l.clune at nasa.gov>
Subject: Re: [J3] [EXTERNAL] Proposals for new intrinsic procedures
I’m misremembering the terminology. (Only needed this capability on one project.) A scan can be inclusive or exclusive, with the former apparently being the more common usage. And one can be derived from the other rather trivially. Not immediately finding what these are called for the special case of sums …
Given that MPI has not seen a need to implement both, it seems plausible that just one is sufficient.
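To make the "derived from the other rather trivially" point concrete, here is a minimal sketch in Python for the special case of sums. All function names are hypothetical and chosen only for illustration. The exclusive scan is the inclusive scan shifted right by one and seeded with zero, and the inclusive scan is the exclusive scan plus the input, element by element.

```python
from itertools import accumulate


def inclusive_sum(xs):
    # inc[i] = xs[0] + ... + xs[i]
    return list(accumulate(xs))


def exclusive_from_inclusive(inc):
    # Shift right by one and seed with the additive identity 0,
    # keeping the same length as the input (empty stays empty).
    return ([0] + inc)[:len(inc)]


def inclusive_from_exclusive(exc, xs):
    # Add each input element back onto its exclusive prefix.
    return [e + x for e, x in zip(exc, xs)]


xs = [3, 1, 4, 2]                     # made-up input
inc = inclusive_sum(xs)
exc = exclusive_from_inclusive(inc)
round_trip = inclusive_from_exclusive(exc, xs)
```

Note that the second direction needs the original input as well as the exclusive result, while the first direction needs only the inclusive result and the identity element.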
* Tom
From: Jeff Hammond <jehammond at nvidia.com>
Date: Thursday, January 5, 2023 at 3:30 PM
To: j3 <j3 at mailman.j3-fortran.org>
Cc: "Clune, Thomas L. (GSFC-6101)" <thomas.l.clune at nasa.gov>
Subject: Re: [J3] [EXTERNAL] Proposals for new intrinsic procedures
Parallel prefix_sum / exclusive and inclusive scan are available in many programming models, including MPI, OpenMP 5, and C++17. There is clear evidence that these are useful and implementable in parallel in shared (both CPU and GPU) and distributed memory.
I am aware of no prior art for postfix. I’ve never even heard this term before today.
Can you please share the applications currently implementing their own postfix_sum in parallel that would benefit from this intrinsic?
Jeff
On 5. Jan 2023, at 22.06, Clune, Thomas L. (GSFC-6101) via J3 <j3 at mailman.j3-fortran.org> wrote:
Brad,
If we do `prefix_sum()` then, IMO, we should also do `postfix_sum()`.
And then what about `prefix_cosum()` and `postfix_cosum()`? Slippery slopes and all that …
I believe such intrinsics go to JOR, but am far from certain. For the IEEE-754 functions, I would hope the initial request could just be one paper, as it should focus on the “why”. Edits for the individual (new) IEEE functions might be separate papers for more pragmatic reasons.
A separate paper for the (misnamed) scan procedures.
Cheers,
* Tom
From: J3 <j3-bounces at mailman.j3-fortran.org> on behalf of j3 <j3 at mailman.j3-fortran.org>
Reply-To: j3 <j3 at mailman.j3-fortran.org>
Date: Thursday, January 5, 2023 at 2:35 PM
To: j3 <j3 at mailman.j3-fortran.org>
Cc: Brad Richardson <everythingfunctional at protonmail.com>
Subject: [EXTERNAL] [J3] Proposals for new intrinsic procedures
Hi all,
Would proposals for new intrinsic functions be acceptable to submit for
the upcoming meeting? Specifically I've been asked to write up the
papers for:
* prefix sum (commonly known as scan, but of course that name's already
taken), both inclusive and exclusive
* All functions recommended by IEEE-754, most of which Fortran already
has, but the whole list is reproduced below
exp
expm1
exp2
exp2m1
exp10
exp10m1
log
log2
log10
logp1
log2p1
log10p1
hypot
rSqrt
compound
rootn
pown
sin
cos
tan
sinPi
cosPi
tanPi
asin
acos
atan
atan2
sinh
cosh
tanh
acosh
atanh
Some points of order questions also. Which subgroup is in charge of the
intrinsic procedures so that I can coordinate with them? Would it be
best to write a separate paper for each new procedure, put them all in
a single paper, or some scheme for grouping certain subsets? And whilst
we don't have a final version of F2023 to draft the edits against, would
it be worthwhile to go ahead and detail out what they will look like?
Thanks in advance for any feedback.
Regards,
Brad