(j3.2006) IEEE modules and NORM2

Van Snyder Van.Snyder
Thu Oct 9 13:51:36 EDT 2008


Dick Hanson sent me the following interesting result.  It confirms the
usefulness of the IEEE modules, and points out that when vendors get
around to implementing NORM2 they shouldn't just use level 1 reference
BLAS.

Van

        As an illustration for using the IEEE modules, I coded up DNRM2 of the
        BLAS.  The basic idea is simple:  Do the easy loop first and then
        check for exceptions and fix things up.  For that I used Jim Blue's
        1978 TOMS algorithm.  It prompted the question about the static
        constants.
        
        The basic conclusions are that this IEEE version is always more
        accurate.   It pulls away from Hammarling's LAPACK version at about
        n=40 and gets steadily faster, perhaps by factors of 30 or more.  This
        is on the IBM PowerStation.
        
        So this is good news for the IEEE module supporters, especially John
        Reid.  The example he gave in M, R and C for the planar length is
        always going to be a lot slower than simply scaling.  This is because
        there is overhead in the calls to get and set flags.  For DNRM2 that
        overhead time gets swamped out by compute time at about n=40.




