(j3.2006) Fast accurate NORM2
Bill Long
longb
Tue Dec 15 16:46:09 EST 2015
On Dec 15, 2015, at 3:26 PM, Van Snyder <Van.Snyder at jpl.nasa.gov> wrote:
> There's a bit more of interest in the article on the faithfully rounded
> l2 norm of n-vectors.
>
> The error of the netlib norm routine depends upon the vector length; the
> error of the reported routine does not.
>
> The reported routine is three times faster than the netlib routine,
> except for very short vectors.
>
> Since it's faster and more accurate than what many consider to be the
> best contemporary l2-norm routine, is it unreasonable to hope that the
> reported routine will be used for the intrinsic NORM2 function?
Alternatively, why has this not replaced the current NetLib routine? If it did, the benefits to NORM2 might be automatic.
Cheers,
Bill
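
For readers following the thread, here is a minimal sketch of why a norm routine needs more care than sqrt(sum(x**2)). It is not the algorithm from the article and not the actual netlib source. It only illustrates the classic scaled sum-of-squares loop, similar in spirit to older reference BLAS norm routines. The function names norm2_naive and norm2_scaled are made up for this illustration.

```python
import math

def norm2_naive(x):
    # Plain sqrt of the sum of squares. The intermediate squares can
    # overflow to inf or underflow to 0 even when the norm itself is
    # representable.
    return math.sqrt(sum(v * v for v in x))

def norm2_scaled(x):
    # One-pass scaled sum of squares: maintain scale = max |x_i| seen so
    # far and ssq = sum((x_i / scale)**2), so no intermediate exceeds the
    # range of the inputs. (Assumption: illustrative only; NaN and inf
    # inputs are not handled.)
    scale = 0.0
    ssq = 1.0
    for v in x:
        a = abs(v)
        if a == 0.0:
            continue
        if scale < a:
            r = scale / a
            ssq = 1.0 + ssq * r * r
            scale = a
        else:
            r = a / scale
            ssq += r * r
    return scale * math.sqrt(ssq)

if __name__ == "__main__":
    big = [3e200, 4e200]
    print(norm2_naive(big))   # squares overflow, so this is inf
    print(norm2_scaled(big))  # close to 5e200
```

Note that this loop does one division per element and accumulates one rounding error per element, which is consistent with the remarks above that the error of such a routine grows with vector length and that a division-free method can be faster.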
>
> On Mon, 2015-12-14 at 13:25 -0800, Van Snyder wrote:
>> On Mon, 2015-12-14 at 13:57 -0700, Keith Bierman wrote:
>>>
>>> On Mon, Dec 14, 2015 at 1:49 PM, Van Snyder <Van.Snyder at jpl.nasa.gov>
>>> wrote:
>>> Efficient Calculations of Faithfully Rounded l2-Norms
>>> of n-Vectors."
>>>
>>> Sounds nice. Is there a copy of the sw online?
>>
>> From page 24:16 of the article:
>>
>> "The complete set of codes, together with testing and performance
>> measurement auxiliary sources, is available at
>>
>> http://www.christoph-lauter.org/faithfulnorm.tgz
>>
>> under an open source license.
>> We implemented and tested our faithfully-rounded division-free l2-norm
>> with faithful reporting of overflow and underflow....
>> We used IEEE754 binary64 as working precision and restricted ourselves
>> to an SIMD environment, targeting in particular Intel SSE/AVX units,
>> with or without the IEEE754 FMA instruction...."
>>
>> Incidentally, Jim Demmel's students have implemented Kulisch's method to
>> compute exact dot products. Their implementation runs six times faster
>> than a floating-point dot product, let alone a correctly-rounded one
>> that doesn't overflow or underflow.
>>
>>> A quick google peek turned up some slides which suggest that
>>> intermediate computations with twice the word size are required (but
>>> I'm not certain it's the same work).
>>>
>>>
>>> If so, worked details for double-double with one and two roundings
>>> might be of interest for folks with platforms (e.g. POWER) that
>>> support it (usually much cheaper than "quad" precision).
>>>
>>> Keith Bierman
>>> khbkhb at gmail.com
>>> kbiermank AIM
>>> 303 997 2749
>>
>>
>> _______________________________________________
>> J3 mailing list
>> J3 at mailman.j3-fortran.org
>> http://mailman.j3-fortran.org/mailman/listinfo/j3
Bill Long longb at cray.com
Fortran Technical Support & voice: 651-605-9024
Bioinformatics Software Development fax: 651-605-9142
Cray Inc./ Cray Plaza, Suite 210/ 380 Jackson St./ St. Paul, MN 55101