[J3] SYSTEM_CLOCK
Bill Long
longb at cray.com
Sat Jan 30 05:02:26 UTC 2021
> On Jan 29, 2021, at 9:20 PM, Vipul Parekh via J3 <j3 at mailman.j3-fortran.org> wrote:
>
>
>
> On Fri, Jan 29, 2021 at 11:00 AM John Reid via J3 <j3 at mailman.j3-fortran.org> wrote:
> ..
> Your idea of basing everything on a real R for the count rate on an
> image would not work for an old code using default integers on an image
> that has a fast clock. ..
>
> John,
>
> I am very much bothered by the sentence, "If an image has more than one clock, the clock is determined by the kind of the integer arguments or is processor dependent if there are no integer arguments."
As am I. Basically it says that the value returned for rate in
call system_clock (count_rate = rate)
is processor dependent. So it could be any value, like 1.0 or 0.0, or even -1.0. That certainly harms the goal of portability. Indeed, I thought the idea of adding the count_rate argument was to enable portability. It’s not that hard to “get it right” and make system_clock more useful.
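A minimal sketch of one way a program can sidestep the ambiguity today, assuming it always passes integer arguments of a single kind so that COUNT, COUNT_RATE and COUNT_MAX all describe the same clock. The module name clock_sketch and the function elapsed_seconds are hypothetical, made up for this illustration, and the wrap handling assumes the count reset at most once between the two readings:

```fortran
module clock_sketch
   use, intrinsic :: iso_fortran_env, only : int64, real64
   implicit none
   private
   public :: elapsed_seconds
contains
   ! Convert two SYSTEM_CLOCK readings into seconds.
   ! All four arguments are expected to come from SYSTEM_CLOCK calls made
   ! with integer(int64) arguments, so that they refer to the same clock.
   ! A negative result signals that no usable clock was reported (rate <= 0).
   pure function elapsed_seconds(count_start, count_end, count_rate, count_max) result(seconds)
      integer(int64), intent(in) :: count_start, count_end, count_rate, count_max
      real(real64) :: seconds
      integer(int64) :: ticks

      if (count_rate <= 0_int64) then
         seconds = -1.0_real64
         return
      end if

      ticks = count_end - count_start
      ! The count resets to zero after reaching COUNT_MAX; allow for one wrap.
      ! The parentheses keep the intermediate sum within the int64 range.
      if (ticks < 0_int64) ticks = (ticks + count_max) + 1_int64

      seconds = real(ticks, real64) / real(count_rate, real64)
   end function elapsed_seconds
end module clock_sketch
```

Typical use would be to declare t0, t1, rate and cmax as integer(int64), call system_clock( count=t0, count_rate=rate, count_max=cmax ) before the timed region and system_clock( count=t1 ) after it, and then evaluate elapsed_seconds( t0, t1, rate, cmax ). This does nothing for the real-valued count_rate case with no integer arguments, which stays processor dependent under the proposed wording.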
Cheers,
Bill
>
> All,
>
> My experience is limited to a *single* clock on an image that is generally fast but which the implementors might view as being too fast relative to 32-bit integer counts. So the processor might try to be "smart" and attempt some scaling of the tick counts to serve as a good timer for the calling program when the integer arguments are 32-bit and/or when they are of larger bit-size. Under the circumstances, it does seem the count rate becomes an arbitrary value such as 1,000 or 10,000 or 1,000,000 to simply help with the scaling. Conceptually such an arbitrary fixed value for the tick rate comes across as no different than CLOCKS_PER_SEC in <time.h> with the C processor, which usually has an arbitrary constant value such as 1,000,000 (POSIX) or 1,000 (Microsoft's C runtime). Or even the HZ constant with 'jiffies' on Linux kernel timers.
>
> Nonetheless, the processor has to ensure that the "scaled" value of COUNT from SYSTEM_CLOCK, along with the COUNT_RATE, is such that real(COUNT,kind=K)/real(COUNT_RATE,kind=K) (where kind K corresponds to a precision equal to or better than KIND(1D0)) provides an approximation of the "processor time". Or better yet, that real( COUNT2-COUNT1,kind=K)/real(COUNT_RATE,kind=K) - where COUNT2 and COUNT1 are 2 readings via consistent invocations of SYSTEM_CLOCK - provides a reasonable measure of the DELTA processor time elapsed between certain processor instructions. This appears only natural, for this supports the common use of the SYSTEM_CLOCK intrinsic toward instrumented code. This can be noticed with the code example using MATMUL below, where the program behavior using 2 different processors on Windows is shown first:
>
> --- program output: gfortran ---
> Timer                   Count Before        Count After         Count Rate
> SYSTEM_CLOCK 32-bit     425628421           425629437           1000
> SYSTEM_CLOCK 64-bit     4256544112416       4256554369340       10000000
> Windows Tick Count      425628421           425629437           1000
>
> Timer                   MATMUL CPU Usage
> SYSTEM_CLOCK 32-bit     1.0160000000000000
> SYSTEM_CLOCK 64-bit     1.0256924000000001
> Windows Tick Count      1.0160000000000000
> --- end output ---
>
> --- program output: Intel Fortran ---
> Timer                   Ticks Before        Ticks After         Rate
> SYSTEM_CLOCK 32-bit     716841390           716857990           10000
> SYSTEM_CLOCK 64-bit     1611970172139000    1611970173799000    1000000
> Windows Tick Count      425542500           425544156           1000
>
> Timer                   MATMUL CPU Usage
> SYSTEM_CLOCK 32-bit     1.660000000000000
> SYSTEM_CLOCK 64-bit     1.660000000000000
> Windows Tick Count      1.656000000000000
> --- end output ---
>
> So what will be noticeable with this example is the common use of SYSTEM_CLOCK where some instrumentation toward a timing study is attempted and ultimately the value of interest is merely the DELTA "time" in a floating-point representation (e.g., real( COUNT2-COUNT1,kind=K)/real(COUNT_RATE,kind=K) ). On this basis, the distinction between a "32-bit clock" and "64-bit clock" (or a not-so-fast or fast clock) doesn't seem to matter all that much. Perhaps it may make a difference when the DELTA "time" is quite short but then it's difficult to read much into results when the elapsed times are too short anyway.
>
> Keeping this example in mind, some questions:
>
> 1) What all are the uses of SYSTEM_CLOCK besides the one shown above that helps approximate a DELTA "time" for some processor instructions?
>
> 2) Is there really more than one clock on an image with the processors currently in use, or is it just one which returns a certain tick count?
>
> 3) If there is just one clock, whose COUNTs the Fortran processor scales relative to an arbitrary count rate, why introduce terminology in the standard about more than one clock? It will be more confusing to readers.
>
> 4) If the most common, or perhaps the only, use of SYSTEM_CLOCK is as a DELTA timer, and where the time values need to be in floating-point to be useful, why not orient the intrinsic toward the same? That is what the example in the current standard appears to attempt.
>
> 5) Can someone show an example of an existing code with default integers that does not work with more than one clock?
>
> Thanks,
> Vipul Parekh
>
> The results for SYSTEM_CLOCK shown above are based on this:
> --- begin code ---
> use, intrinsic :: iso_fortran_env, only : int32, int64, real64
>
> implicit none
>
> ! Suffix "b" = reading before MATMUL, "a" = reading after
> integer(int32) :: t0b, rate0b, t0a, rate0a   ! SYSTEM_CLOCK, 32-bit arguments
> integer(int64) :: t1b, rate1b, t1a, rate1a   ! SYSTEM_CLOCK, 64-bit arguments
> integer(int64) :: t2b, rate2b, t2a, rate2a   ! GetTickCount64 via sys_clock
>
> integer, parameter :: N = 2048
> real(real64), allocatable :: a(:,:), b(:,:), c(:,:)
> character(len=*), parameter :: fmtg = "(g0,t25,g0,t45,g0,t65,g0)"
>
> allocate( a(N,N), b(N,N), c(N,N) )
> call random_number( a )
> call random_number( b )
>
> ! Readings before the timed operation
> call system_clock( count=t0b, count_rate=rate0b )
> call system_clock( count=t1b, count_rate=rate1b )
> call sys_clock( ticks=t2b, rate=rate2b )
>
> c = matmul( a, b )
>
> ! Readings after the timed operation
> call system_clock( count=t0a, count_rate=rate0a )
> call system_clock( count=t1a, count_rate=rate1a )
> call sys_clock( ticks=t2a, rate=rate2a )
>
> print fmtg, "Timer", "Count Before", "Count After", "Count Rate"
> print fmtg, "SYSTEM_CLOCK 32-bit",t0b, t0a, rate0b
> print fmtg, "SYSTEM_CLOCK 64-bit",t1b, t1a, rate1b
> print fmtg, "Windows Tick Count", t2b, t2a, rate2b
> print *
> ! Elapsed time in seconds = (count after - count before) / count rate
> print fmtg, "Timer", "MATMUL CPU Usage"
> print fmtg, "SYSTEM_CLOCK 32-bit", real((t0a-t0b), real64)/rate0b
> print fmtg, "SYSTEM_CLOCK 64-bit", real((t1a-t1b), real64)/rate1b
> print fmtg, "Windows Tick Count", real((t2a-t2b), real64)/rate2b
>
> contains
>
>    ! Wrapper around the Windows millisecond tick counter
>    subroutine sys_clock( ticks, rate )
>
>       use, intrinsic :: iso_c_binding, only : c_long_long
>
>       interface
>          function GetTickCount64() result(ticks) bind(C, name="GetTickCount64")
>             ! Microsoft API <sysinfoapi.h>
>             ! ULONGLONG GetTickCount64();
>             import :: c_long_long
>             integer(c_long_long) :: ticks
>          end function GetTickCount64
>       end interface
>
>       integer(c_long_long), parameter :: FREQUENCY = 1000_c_long_long ! milliseconds
>
>       ! Argument list
>       integer(c_long_long), intent(out) :: ticks
>       integer(c_long_long), intent(out) :: rate
>
>       ticks = GetTickCount64()
>       rate = FREQUENCY
>
>    end subroutine
>
> end
> --- end code ---
>
> Compilation and execution:
> --- begin console output ---
> C:\Temp>gfortran -O3 t.f90 -o t.exe
>
> C:\Temp>t.exe
> Timer                   Count Before        Count After         Count Rate
> SYSTEM_CLOCK 32-bit     431543234           431544203           1000
> SYSTEM_CLOCK 64-bit     4315692302527       4315701959260       10000000
> Windows Tick Count      431543234           431544203           1000
>
> Timer                   MATMUL CPU Usage
> SYSTEM_CLOCK 32-bit     0.96899999999999997
> SYSTEM_CLOCK 64-bit     0.96567329999999996
> Windows Tick Count      0.96899999999999997
>
> C:\Temp>
> --- end console output ---
Bill Long longb at hpe.com
Engineer/Master , Fortran Technical Support & voice: 651-605-9024
Bioinformatics Software Development fax: 651-605-9143
Hewlett Packard Enterprise/ 2131 Lindau Lane/ Suite 1000/ Bloomington, MN 55425