(j3.2006) Mask / Pack / Vector subscript vs. WHERE
Van Snyder
vsnyder
Wed Jun 11 22:28:41 EDT 2014
I asked a few hours ago whether a method using mask, pack and a vector
subscript, with an allocated array for the vector subscript, would be
faster or slower than a WHERE ... ELSE WHERE ... construct.
The consensus was that the WHERE construct would be faster.
I was surprised to find that WHERE was slower with two of the three
compilers I commonly use (A and B below). It was faster only with the
third (C).
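I also created an elemental version, which was fastest by far: roughly
1.4 to 2.5 times faster than the quicker of the other two, depending on
compiler and options. The other two versions have rank-1 arguments.
Since the elemental version accepts arguments of any rank, it's nice
that it's also the fastest.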
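For readers who want to see the shape of the three approaches, here is a minimal sketch. It is not the code under review: the two-branch function, the cutoff, and all names (piecewise_m, eval_where, eval_pack, eval_elemental) are made up for illustration, and it only checks that the three versions agree rather than timing them.

```fortran
! Sketch only: the two branches below are hypothetical stand-ins for a
! piecewise evaluation; this is not the Faddeyeva code discussed above.
module piecewise_m
  implicit none
  private
  public :: eval_where, eval_pack, eval_elemental
  integer, parameter, public :: rk = kind(1.0d0)
  real(rk), parameter :: cutoff = 1.0_rk   ! assumed branch point
contains

  ! Version 1: WHERE ... ELSEWHERE on a rank-1 argument.
  subroutine eval_where ( x, y )
    real(rk), intent(in) :: x(:)
    real(rk), intent(out) :: y(:)
    where ( abs(x) < cutoff )
      y = x * (1.0_rk - x*x / 6.0_rk)
    elsewhere
      y = 1.0_rk / x
    end where
  end subroutine eval_where

  ! Version 2: mask, PACK, and vector subscripts in allocated arrays.
  subroutine eval_pack ( x, y )
    real(rk), intent(in) :: x(:)
    real(rk), intent(out) :: y(:)
    logical :: mask(size(x))
    integer, allocatable :: small(:), large(:)
    integer :: i
    mask = abs(x) < cutoff
    ! Indices of the elements that belong to each branch.
    small = pack ( [ (i, i = 1, size(x)) ], mask )
    large = pack ( [ (i, i = 1, size(x)) ], .not. mask )
    y(small) = x(small) * (1.0_rk - x(small)*x(small) / 6.0_rk)
    y(large) = 1.0_rk / x(large)
  end subroutine eval_pack

  ! Version 3: elemental, so it accepts an argument of any rank.
  elemental function eval_elemental ( x ) result ( y )
    real(rk), intent(in) :: x
    real(rk) :: y
    if ( abs(x) < cutoff ) then
      y = x * (1.0_rk - x*x / 6.0_rk)
    else
      y = 1.0_rk / x
    end if
  end function eval_elemental

end module piecewise_m

program demo
  use piecewise_m
  implicit none
  integer, parameter :: n = 1000
  real(rk) :: x(n), y1(n), y2(n), y3(n)
  integer :: i
  ! Sample points on both sides of the cutoff.
  x = [ ( 0.004_rk * i - 2.0_rk, i = 1, n ) ]
  call eval_where ( x, y1 )
  call eval_pack ( x, y2 )
  y3 = eval_elemental ( x )
  if ( maxval(abs(y1-y2)) > 1.0e-12_rk .or. &
       maxval(abs(y1-y3)) > 1.0e-12_rk ) error stop "versions disagree"
  print *, "all three versions agree"
end program demo
```

The pack version relies on automatic allocation on assignment for small and large, so it needs a compiler that supports that Fortran 2003 feature. Relative speed will depend on the compiler and on how expensive each branch is, so a toy like this says nothing about which version wins for real code.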
I cannot post the code because it's still under review for publication.
If you want to try it yourself, let me know and I'll see what I can do
to let you have the codes.
Here are the timings, without attribution (smaller is faster):
Compiler A
  Faddeyeva_Rev.f90        47.7787
  Faddeyeva_Where.f90      63.6593
  Faddeyeva_Elemental.f90  27.2149
Compiler A -O
  Faddeyeva_Rev.f90        34.5308
  Faddeyeva_Where.f90      38.4112
  Faddeyeva_Elemental.f90  18.2692
Compiler B
  Faddeyeva_Rev.f90        32.0141
  Faddeyeva_Where.f90      35.6216
  Faddeyeva_Elemental.f90  13.3000
Compiler B -O
  Faddeyeva_Rev.f90        32.4971
  Faddeyeva_Where.f90      35.6776
  Faddeyeva_Elemental.f90  13.2630
Compiler C
  Faddeyeva_Rev.f90        45.2341
  Faddeyeva_Where.f90      41.8476
  Faddeyeva_Elemental.f90  30.4304
Compiler C -O
  Faddeyeva_Rev.f90        33.8579
  Faddeyeva_Where.f90      30.5444
  Faddeyeva_Elemental.f90  17.8903
The program consists of only a main program and two modules. In every
case, I compiled everything all at once, rather than compiling each file
individually and linking the .o files. I don't know whether this makes
a difference.