(j3.2006) Loop fusion vs. vectorization
Van Snyder
Van.Snyder
Mon Oct 5 16:29:48 EDT 2009
I have a routine for bilinear interpolation of a 2-d array zOld onto a
1-d path z through the (x,y) basis for zOld, coded thus. Assume proper
declarations, where sizes of xNew, yNew, z, hx, hy are the same, and the
shape of zOld is (size(xOld),size(yOld)).
! Hunt for xNew(:), yNew(:) in xOld(:), yOld(:)
call hunt ( xOld, xNew, hx )
call hunt ( yOld, yNew, hy )
do i = 1, size(z)
ix = hx(i)
iy = hy(i)
dx = ( xNew(i) - xOld(ix) ) / ( xOld(ix+1) - xOld(ix) )
dy = ( yNew(i) - yOld(iy) ) / ( yOld(iy+1) - yOld(iy) )
z = &
& ( 1.0 - dy ) * (( 1.0 - dx ) * zOld(ix,iy) + dx * zOld(ix+1,iy)) + &
& dy * (( 1.0 - dx ) * zOld(ix,iy+1) + dx * zOld(ix+1,iy+1) )
end do
Again assuming properly dimensioned variables, and assuming a proposal I
made to allow subscripting a rank-k array zOld with a rank r array H,
where the extent of the first dimension of H is k, giving a rank r-1
object having the shape of the last r-1 dimensions of H, I could write
this as
integer :: H(2,size(z))
call hunt ( xOld, xNew, h(1,:) )
call hunt ( yOld, yNew, h(2,:) )
dx = ( xNew - xOld(h(1,:)) ) / ( xOld(h(1,:)+1) - xOld(h(1,:)) )
dy = ( yNew - yOld(h(2,:)) ) / ( yOld(h(2,:)+1) - yOld(h(2,:)) )
z = &
& ( 1.0 - dy ) * (( 1.0 - dx ) * zOld(h) + &
& dx * zOld(reshape((/h(1,:)+1,h(2,:)/),(/2,size(z)/))) + &
& dy * (( 1.0 - dx ) * zOld(reshape((/h(1,:),h(2,:)+1/),(/2,size(z)/))) + &
dx * zOld(reshape((/h(1,:)+1,h(2,:)+1/),(/2,size(z)/))) )
I write it the first way for two reasons
(1) because I have to
(2) because in the old days, caches were small enough and loop overhead
was large enough to recommend fusing the loops.
Again assuming the proposal, would version (2) vectorize sufficiently
well to make it preferable to version (1), or would I still be better
off fusing the loops as in version (1)?
--
Van Snyder | What fraction of Americans believe
Van.Snyder at jpl.nasa.gov | Wrestling is real and NASA is fake?
Any alleged opinions are my own and have not been approved or
disapproved by JPL, CalTech, NASA, the President, or anybody else.
More information about the J3
mailing list