[J3] Problem with contiguous

Bill Long longb at cray.com
Tue Dec 11 02:57:29 EST 2018


> On Dec 10, 2018, at 4:49 PM, Van Snyder via J3 <j3 at mailman.j3-fortran.org> wrote:
> 
> I have a routine that's more than 1000 lines.

OK

> 
> It needs more than a hundred work arrays.

Yikes.

> 
> The routine is invoked for many problems of many different sizes, but I
> know the maximum size (at run time, not compile time).
> 

If the max size is known at the beginning of the run, one option is to create a module with declaring the arrays as allocatable, with a contained INIT routine that is called before your routine is called the first time.  The INIT routine allocates the arrays.   At least you pay the allocation cost only once. 

> Originally, the arrays were pointers, local to that routine, explicitly
> allocated, for each problem.  This was before pointers could be declared
> to be contiguous.
> 
> Then, they were allocatable, local to that routine, explicitly
> allocated, for each problem.
> 
> Then, they were automatic, local to that routine.
> 

Local automatic arrays are typically “allocated” on the stack, which is usually a low cost activity.  You might check compiler options to make sure you are using the stack and not the heap for the automatic arrays. 

> A profiler told us we were were spending a significant fraction of run
> time, about 8% overall and much more for that routine alone, in the
> storage allocator.
> 
> So we made the routine internal to the one that repeatedly calls it with
> different problem sizes, made the work arrays automatic in the calling
> routine, with extents dependent upon the maximum problem size, and
> accessed them by host association.  Because they're automatic, they're
> simply contiguous.  Because they're in the calling routine, outside the
> loop of calls to the internal one, the storage-allocator cost was
> dramatically reduced.

Similar concept to the module solution. 

> 
> Then, we had to make the routine stand on its own again, so we added
> hundreds of assumed-shape dummy arguments for the work arrays.

Seems ill-advised on many performance grounds.  I’d prefer the module option and get rid of all the extra assumed-shape arguments. 

> 
> The work arrays are still automatic in the calling routine (now
> routines), still declared something like
> 
>  real(rp) :: Sps_Path(max_f,n_sps)
> 
> where max_f is the maximum problem size, and n_sps is the same for every
> problem.  We need to compute column-wise dot products, so we don't want
> to exchange the dimensions, which would make the whole array contiguous
> because n_sps is the same for every problem, but this is not helpful
> because rows are never contiguous.
> 
> The routine is referenced
> 
>  call one_frequency ( ..., sps_path(:npf,:), ... )
> 
> where npf is the current problem size.  The original code, in which
> Sps_Path etc. were accessed by host association, had sps_path(:npf,:) in
> every array reference, but now they're whole-array references with a
> requirement that the actual argument is just the relevant section.  It
> would be costly and error-prone to change them back.

Seems like you have made costly changes multiple times before.

Cheers,
Bill

 
> 
> We noticed a performance degradation because the dummy arguments are not
> contiguous.
> 
> They can't be declared CONTIGUOUS -- because they're not.  Well, we
> could declare them CONTIGUOUS at the expense of copy-in (and sometimes
> copy-out too).  VALUE would make the expense more obvious, but the
> copying expense would overwhelm the cost increment incurred by
> discontiguity.
> 
> But every section with stride = 1 in the first dimension is contiguous.
> For some four-dimensional arrays, it's every such section in the first
> three dimensions that are contiguous.
> 
> I suspect others have had the same or a similar problem.
> 
> Many years ago, I proposed a different notation for partly-contiguous
> assumed-shape arrays:
> 
>  real(rp), intent(in) :: sps_path(::1, :)
> 
> meaning "it's contiguous in the first dimension, but columns are not
> necessarily contiguous one to another."
> 
> More generally:
> 
>  real, intent(...) :: A(::1, ::1, :)
> 
> meaning "it's contiguous only in the first two dimensions, but not after
> that," etc.  There was a constraint that a dimension could not be
> declared to be contiguous unless all previous dimensions are contiguous.
> For example
> 
>  real, intent(in) :: A(:,:,::1)
> 
> would be prohibited (because it cannot be correct that an array is
> discontiguous in the first two dimensions but contiguous in the third
> dimension).
> 
> We rejected that notation in favor of CONTIGUOUS. which says the entire
> array is contiguous.
> 
> PROPOSAL:
> 
> Change CONTIGUOUS to
> 
>  R802 <attr-spec> <<is>> ...
>                   <<or>> CONTIGUOUS [ ( <scalar-int-constant-expr> ) ]
> 
> with a constraint that the value of <scalar-int-constant-expr> shall not
> exceed the rank of the array.
> 
> It would mean "the array is contiguous through the first
> <scalar-int-constant-expr> dimensions.  If <scalar-int-constant-expr>
> does not appear, it means the same as we have now: contiguous through
> all dimensions.
> 
> For the case of my four-dimensional arrays known to be contiguous in the
> first three dimensions, I also know that the first two dimension's
> bounds are constants (1:2).  Another proposal (that was rejected) was to
> allow to mix explicit and assumed shape (and contiguity) in one
> declaration, so I could declare an array
> 
>  complex(rp) :: IncOptDepth(2,2,::1,:)
> 
> This array is contiguous in the first three dimensions, but in addition
> the processor knows, by looking at the declaration, that the first two
> extents are constant.  Therefore, there is some hope that when such an
> array is an argument, e.g. to MATMUL:
> 
>  b = matmul(incOptDepth(:,:,i,j), c)
> 
> that the processor would optimize it.  We noticed a performance
> difference between this and
> 
>  b = matmul(incOptDepth(1:2,1:2,i,j), c)
> 
> so some processor was optimizing it.  In the current context, to mix the
> two proposals, I would write
> 
>  complex(rp), contiguous(3) :: IncOptDepth(2,2,:,:)
> 
> More generally, it is useful to be able to specify explicit extent after
> assumed extent, e.g.,
> 
>  real(rp) :: Buf(:,2)
> 
> in which an explicit-extent dimension following a discontiguous
> assumed-extent one does not imply contiguity.
> 
> But mixing assumed and explicit extent was a different proposal,
> requiring much more surgery, included here only for reference.
> 
> 

Bill Long                                                                       longb at cray.com
Principal Engineer, Fortran Technical Support &   voice:  651-605-9024
Bioinformatics Software Development                      fax:  651-605-9143
Cray Inc./ 2131 Lindau Lane/  Suite 1000/  Bloomington, MN  55425




More information about the J3 mailing list