[J3] Questions about sqrt -- mostly for vendors
Van Snyder
Van.Snyder at jpl.nasa.gov
Tue Jan 29 17:01:50 EST 2019
Many applications need the reciprocal square root: for example,
celestial mechanics, electrodynamics, Coulomb calculations, molecular
dynamics, ...
I have been told that many processors offer a reciprocal square root
instruction, because it's faster to compute it than to compute the
square root.
Rolf Strebel's PhD thesis points out that there are ten steps in the
Goldschmidt algorithm, but there are bubbles in the pipeline. He shows
how to compute a vector of 1/sqrt(x) in such a way as to fill the
pipeline.
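
For readers unfamiliar with the iteration, here is a minimal sketch of one
common formulation of a Goldschmidt-style coupled iteration, written in
Python only for illustration. It is not taken from the thesis and is not
claimed to match its ten-step schedule. The function name, the linear seed,
and the iteration count are all assumptions made for this sketch. Each
pass updates x and h from the same correction r, and each pass depends on
the previous one; that serial dependence is the kind of chain that leaves
pipeline slots idle when only one argument is processed at a time.

```python
import math

def goldschmidt_sqrt_rsqrt(s, iterations=6):
    """Return (sqrt(s), 1/sqrt(s)) for finite s > 0 (illustrative sketch)."""
    if not (s > 0.0) or math.isinf(s):
        raise ValueError("s must be positive and finite")

    # Split s = m * 2**e with e even and m in [0.5, 2).
    m, e = math.frexp(s)          # frexp gives 0.5 <= m < 1
    if e % 2:                     # make the exponent even
        m *= 2.0
        e -= 1

    # Crude linear seed for 1/sqrt(m) (an assumption of this sketch;
    # hardware would typically use a table or an estimate instruction).
    y = 1.5 - 0.5 * m

    x = m * y                     # converges to sqrt(m)
    h = 0.5 * y                   # converges to 1 / (2 * sqrt(m))
    for _ in range(iterations):
        r = 0.5 - x * h           # shared correction term
        x = x + x * r
        h = h + h * r

    # Undo the scaling: sqrt(s) = sqrt(m) * 2**(e/2), and the reciprocal.
    return math.ldexp(x, e // 2), math.ldexp(2.0 * h, -(e // 2))
```

Note that the same loop delivers both sqrt(s) and 1/sqrt(s); nothing in it
requires a division.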
Do Fortran processors exploit instructions to compute 1/sqrt(x) when
sqrt(x) appears in the denominator of a fraction?
Does elemental square root arrange interleaved calculations to fill
bubbles in the pipeline?
Would the standard benefit from an RSQRT intrinsic function, or can most
current optimizers use the RSQRT instruction directly, instead of
multiplying its result by x to get sqrt(x), and then dividing by that
sqrt(x)?
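
To make the source-level pattern concrete, here is a small sketch, in
Python purely for illustration, of the two formulations in question. The
helper rsqrt is a hypothetical stand-in for a hardware instruction or a
possible intrinsic; it is not an existing function in any standard. The
Coulomb-style example uses units in which the Coulomb constant is 1, and
all names are assumptions made for this sketch.

```python
import math

def rsqrt(x):
    # Hypothetical stand-in for a reciprocal-square-root instruction.
    # Here it is emulated with a division, which real hardware would avoid.
    return 1.0 / math.sqrt(x)

def potential_with_division(q1, q2, dx, dy, dz):
    # What the programmer typically writes: sqrt in the denominator.
    r2 = dx * dx + dy * dy + dz * dz
    return q1 * q2 / math.sqrt(r2)

def potential_with_rsqrt(q1, q2, dx, dy, dz):
    # What an optimizer could emit instead: multiply by 1/sqrt(r2).
    r2 = dx * dx + dy * dy + dz * dz
    return q1 * q2 * rsqrt(r2)

def sqrt_via_rsqrt(x):
    # The identity sqrt(x) = x * (1/sqrt(x)) for x > 0.
    return x * rsqrt(x)
```

The two potential functions agree to within rounding, so the substitution
is a question of whether the processor recognizes the pattern, and of
whether the possible last-bit difference is acceptable.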