(j3.2006) What do typical processors do?
Clune, Thomas L. GSFC-6101
thomas.l.clune
Wed Jul 19 09:31:20 EDT 2017
You might also want to benchmark, just to see how much overhead the extra temps would add in the worst case. The operations aside from the *GEMM are trivial, so unless the arrays are particularly small, the *GEMM will dominate the cost even with the extra trips to and from memory for the intermediate expressions. Of course, if you are teetering on the edge of fitting in memory, it may matter more.
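A minimal sketch of such a worst-case benchmark follows. It forces the conjugate-transpose temp to be created and times it separately from the multiply. The problem size `n`, the default complex kind, the tolerance `tol`, and the use of the `MATMUL` intrinsic in place of a tuned CGEMM are all assumptions made for illustration; none of them come from the thread.

```fortran
! Worst-case sketch: materialize the conjugate-transpose temp, then multiply.
! Assumptions not taken from the thread: n = 512, default complex kind, and
! the MATMUL intrinsic standing in for a tuned CGEMM.
program temp_overhead
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 512                 ! hypothetical problem size
  real, parameter :: tol = 1.0e-3               ! loose single-precision tolerance
  complex, allocatable :: a(:,:), b(:,:), bh(:,:), c(:,:)
  real, allocatable :: re(:,:), im(:,:)
  integer(int64) :: t0, t1, t2, rate
  real :: t_temp, t_mult, err

  allocate (a(n,n), b(n,n), bh(n,n), c(n,n), re(n,n), im(n,n))
  call random_number(re)
  call random_number(im)
  a = cmplx(re, im)
  call random_number(re)
  call random_number(im)
  b = cmplx(re, im)

  call system_clock(t0, rate)
  bh = conjg(transpose(b))      ! the extra temp: O(n**2) memory traffic
  call system_clock(t1)
  c = matmul(a, bh)             ! the multiply: O(n**3) arithmetic
  call system_clock(t2)

  t_temp = real(t1 - t0) / real(rate)
  t_mult = real(t2 - t1) / real(rate)

  ! Spot check one element: c(i,j) = sum over k of a(i,k) * conjg(b(j,k)).
  err = abs(c(2,3) - sum(a(2,:) * conjg(b(3,:)))) / abs(c(2,3))
  if (err > tol) error stop "matmul result does not match the definition"

  print '(a,es10.3,a)', 'temp creation: ', t_temp, ' s'
  print '(a,es10.3,a)', 'matmul:        ', t_mult, ' s'
end program temp_overhead
```

Varying `n` should show where the temp stops being negligible. If a BLAS library is linked, the same product can be requested without the temp through the standard interface, for example `call cgemm('N', 'C', n, n, n, (1.0,0.0), a, n, b, n, (0.0,0.0), c, n)`, and timed against the two steps above.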
> On Jul 18, 2017, at 10:10 PM, Keith Bierman <khbkhb at gmail.com> wrote:
>
>> On Jul 18, 2017, at 6:33 PM, Van Snyder <Van.Snyder at jpl.nasa.gov> wrote:
>>
>>> On Tue, 2017-07-18 at 23:30 +0000, Bill Long wrote:
>>> I .....
>>
>> CGEMM can be told to use the conjugate transpose of its first or second
>> argument. Maybe someday processors will recognize
>>
>> C = matmul ( A, conjg(transpose(B)) )
>> .....
>> Would a processor recognize that and turn it into a call to CGEMM
>> without creating temps?
>
> Get the usage into a SPEC benchmark and processors will pattern match as appropriate. Without sufficient proof that it is worth the trouble, processors will probably not. It isn't that it is hard, but there's almost always a virtually infinite list of transforms a compiler could do, and a pretty small number of people to do the work.
> _______________________________________________
> J3 mailing list
> J3 at mailman.j3-fortran.org
> http://mailman.j3-fortran.org/mailman/listinfo/j3