This might be of interest to some as a point of comparison with coarrays.

Robert van de Geijn rvdg at cs.utexas.edu
May 11, 2019
> We are excited to announce that edX has opened registration for
> "LAFF-On Programming for High Performance" [1].  This free-to-audit,
> four-week, self-paced course developed by UT-Austin faculty Robert van
> de Geijn, Maggie Myers, and Devangi Parikh starts on June 4, 2019.
> The course uses the simple but important example of matrix-matrix
> multiplication to illustrate fundamental techniques for attaining
> high performance on modern CPUs. A carefully designed sequence of
> exercises leads the learner from a naive implementation to one that
> effectively utilizes instruction-level parallelism and culminates in a
> high-performance, multithreaded implementation. Along the way, it is
> discovered that careful attention to data movement is key to efficient
> computing.  In other words, learners are exposed to techniques for
> attaining high performance through carefully scaffolded exercises that
> illustrate how the BLAS-like Library Instantiation Software (BLIS) [2]
> implements dgemm, which is itself based on Goto's algorithm [3].
> We believe this course is appropriate for a novice yet of interest to
> an expert.  It may be, for example, a great way to get a summer intern
> quite literally up to speed.  Others may want to use it as a component
> in a class they teach.  Some learners may merely come to the
> conclusion that they should be using high-performance libraries.
> Others may find they enjoy low-level optimization.
> Please help us spread the word!
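
For anyone wondering what the "naive implementation" starting point and the remark about data movement might look like in practice, here is a minimal sketch in C. It is not taken from the course materials or from BLIS. The names gemm_ijp, gemm_jpi, and ELEM are my own, and I am assuming column-major storage. Both routines compute C := A*B + C and differ only in loop order, which changes how memory is traversed in the innermost loop.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major access: entry (i, j) of a matrix with leading dimension ld.
   ELEM is a hypothetical helper for this sketch, not a BLIS macro. */
#define ELEM(X, ld, i, j) ((X)[(size_t)(j) * (ld) + (i)])

/* C := A*B + C with A m-by-k, B k-by-n, C m-by-n; loops ordered i, j, p.
   The inner loop walks along a row of A, which is strided in column-major. */
void gemm_ijp(size_t m, size_t n, size_t k,
              const double *A, size_t ldA,
              const double *B, size_t ldB,
              double *C, size_t ldC)
{
    assert(A && B && C);
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t p = 0; p < k; p++)
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * ELEM(B, ldB, p, j);
}

/* Same computation; loops ordered j, p, i.
   The inner loop walks down columns of A and C, which are contiguous. */
void gemm_jpi(size_t m, size_t n, size_t k,
              const double *A, size_t ldA,
              const double *B, size_t ldB,
              double *C, size_t ldC)
{
    assert(A && B && C);
    for (size_t j = 0; j < n; j++)
        for (size_t p = 0; p < k; p++)
            for (size_t i = 0; i < m; i++)
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * ELEM(B, ldB, p, j);
}
```

Both versions perform the same floating-point operations. On large matrices the second ordering would typically be expected to run faster because of its contiguous inner-loop accesses, though the actual gap depends on the compiler and the machine.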
