[J3] Problem with contiguous
John Reid
John.Reid at stfc.ac.uk
Tue Dec 11 13:05:12 EST 2018
Van,
It sounds to me as if the problem is that your vendor is providing an
inefficient allocator. Should you not be moaning to the vendor, rather
than asking for a language extension?
Cheers,
John.
Van Snyder via J3 wrote:
> I have a routine that's more than 1000 lines.
>
> It needs more than a hundred work arrays.
>
> The routine is invoked for many problems of many different sizes, but I
> know the maximum size (at run time, not compile time).
>
> Originally, the arrays were pointers, local to that routine, explicitly
> allocated, for each problem. This was before pointers could be declared
> to be contiguous.
>
> Then, they were allocatable, local to that routine, explicitly
> allocated, for each problem.
>
> Then, they were automatic, local to that routine.
>
> A profiler told us we were were spending a significant fraction of run
> time, about 8% overall and much more for that routine alone, in the
> storage allocator.
>
> So we made the routine internal to the one that repeatedly calls it with
> different problem sizes, made the work arrays automatic in the calling
> routine, with extents dependent upon the maximum problem size, and
> accessed them by host association. Because they're automatic, they're
> simply contiguous. Because they're in the calling routine, outside the
> loop of calls to the internal one, the storage-allocator cost was
> dramatically reduced.
>
> Then, we had to make the routine stand on its own again, so we added
> hundreds of assumed-shape dummy arguments for the work arrays.
>
> The work arrays are still automatic in the calling routine (now
> routines), still declared something like
>
> real(rp) :: Sps_Path(max_f,n_sps)
>
> where max_f is the maximum problem size, and n_sps is the same for every
> problem. We need to compute column-wise dot products, so we don't want
> to exchange the dimensions, which would make the whole array contiguous
> because n_sps is the same for every problem, but this is not helpful
> because rows are never contiguous.
>
> The routine is referenced
>
> call one_frequency ( ..., sps_path(:npf,:), ... )
>
> where npf is the current problem size. The original code, in which
> Sps_Path etc. were accessed by host association, had sps_path(:npf,:) in
> every array reference, but now they're whole-array references with a
> requirement that the actual argument is just the relevant section. It
> would be costly and error-prone to change them back.
>
> We noticed a performance degradation because the dummy arguments are not
> contiguous.
>
> They can't be declared CONTIGUOUS -- because they're not. Well, we
> could declare them CONTIGUOUS at the expense of copy-in (and sometimes
> copy-out too). VALUE would make the expense more obvious, but the
> copying expense would overwhelm the cost increment incurred by
> discontiguity.
>
> But every section with stride = 1 in the first dimension is contiguous.
> For some four-dimensional arrays, it's every such section in the first
> three dimensions that are contiguous.
>
> I suspect others have had the same or a similar problem.
>
> Many years ago, I proposed a different notation for partly-contiguous
> assumed-shape arrays:
>
> real(rp), intent(in) :: sps_path(::1, :)
>
> meaning "it's contiguous in the first dimension, but columns are not
> necessarily contiguous one to another."
>
> More generally:
>
> real, intent(...) :: A(::1, ::1, :)
>
> meaning "it's contiguous only in the first two dimensions, but not after
> that," etc. There was a constraint that a dimension could not be
> declared to be contiguous unless all previous dimensions are contiguous.
> For example
>
> real, intent(in) :: A(:,:,::1)
>
> would be prohibited (because it cannot be correct that an array is
> discontiguous in the first two dimensions but contiguous in the third
> dimension).
>
> We rejected that notation in favor of CONTIGUOUS. which says the entire
> array is contiguous.
>
> PROPOSAL:
>
> Change CONTIGUOUS to
>
> R802 <attr-spec> <<is>> ...
> <<or>> CONTIGUOUS [ ( <scalar-int-constant-expr> ) ]
>
> with a constraint that the value of <scalar-int-constant-expr> shall not
> exceed the rank of the array.
>
> It would mean "the array is contiguous through the first
> <scalar-int-constant-expr> dimensions. If <scalar-int-constant-expr>
> does not appear, it means the same as we have now: contiguous through
> all dimensions.
>
> For the case of my four-dimensional arrays known to be contiguous in the
> first three dimensions, I also know that the first two dimension's
> bounds are constants (1:2). Another proposal (that was rejected) was to
> allow to mix explicit and assumed shape (and contiguity) in one
> declaration, so I could declare an array
>
> complex(rp) :: IncOptDepth(2,2,::1,:)
>
> This array is contiguous in the first three dimensions, but in addition
> the processor knows, by looking at the declaration, that the first two
> extents are constant. Therefore, there is some hope that when such an
> array is an argument, e.g. to MATMUL:
>
> b = matmul(incOptDepth(:,:,i,j), c)
>
> that the processor would optimize it. We noticed a performance
> difference between this and
>
> b = matmul(incOptDepth(1:2,1:2,i,j), c)
>
> so some processor was optimizing it. In the current context, to mix the
> two proposals, I would write
>
> complex(rp), contiguous(3) :: IncOptDepth(2,2,:,:)
>
> More generally, it is useful to be able to specify explicit extent after
> assumed extent, e.g.,
>
> real(rp) :: Buf(:,2)
>
> in which an explicit-extent dimension following a discontiguous
> assumed-extent one does not imply contiguity.
>
> But mixing assumed and explicit extent was a different proposal,
> requiring much more surgery, included here only for reference.
>
>
More information about the J3
mailing list