Vipul Parekh parekhvs at gmail.com
Sat Jan 30 03:20:41 UTC 2021

On Fri, Jan 29, 2021 at 11:00 AM John Reid via J3 <j3 at mailman.j3-fortran.org> wrote:

> ..
> Your idea of basing everything on a real R for the count rate on an
> image would not work for an old code using default integers on an image
> that has a fast clock.  ..


I am very much bothered by the sentence, "If an image has more than one clock,
the clock is determined by the kind of the integer arguments or is
processor dependent if there are no integer arguments."


My experience is limited to a *single* clock on an image that is generally
fast but which the implementors might view as being too fast relative to
32-bit integer counts.  So the processor might try to be "smart" and
attempt some scaling of the tick counts to serve as a good timer for the
calling program when the integer arguments are 32-bit and/or when they are
of larger bit-size.  Under the circumstances, it does seem the count rate
becomes an arbitrary value such as 1,000 or 10,000 or 1,000,000 to simply
help with the scaling.  Conceptually such an arbitrary fixed value for the
tick rate comes across as no different than CLOCKS_PER_SEC in <time.h> with
the C processor, which is usually an arbitrary constant such as 1,000 or
1,000,000 (the latter being the value XSI-conformant POSIX systems require),
or even the HZ constant used with 'jiffies' in the Linux kernel.
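
To see what scaling a given processor applies, a minimal sketch along the
following lines can query SYSTEM_CLOCK with 32-bit and 64-bit arguments and
report how long each COUNT can run before it wraps.  The program name
clock_kinds and the variable names are illustrative only, and the values
printed are processor dependent.

```fortran
program clock_kinds

   use, intrinsic :: iso_fortran_env, only : int32, int64, real64

   implicit none

   integer(int32) :: rate32, max32
   integer(int64) :: rate64, max64
   character(len=*), parameter :: fmth = "(a,t10,a,t26,a,t50,a)"
   character(len=*), parameter :: fmtv = "(a,t10,g0,t26,g0,t50,g0)"

   ! The kind of the integer arguments may select a different rate and range
   call system_clock( count_rate=rate32, count_max=max32 )
   call system_clock( count_rate=rate64, count_max=max64 )

   print fmth, "Kind", "COUNT_RATE", "COUNT_MAX", "Seconds until COUNT wraps"

   ! COUNT_RATE is returned as zero when the processor has no clock
   if ( rate32 > 0 ) then
      print fmtv, "int32", rate32, max32, &
         real( max32, real64 ) / real( rate32, real64 )
   end if
   if ( rate64 > 0 ) then
      print fmtv, "int64", rate64, max64, &
         real( max64, real64 ) / real( rate64, real64 )
   end if

end program clock_kinds
```

As a point of reference, a 32-bit COUNT whose COUNT_MAX is HUGE(0_int32)
wraps after about 24.9 days at 1,000 ticks per second, but after only about
3.6 minutes at 10,000,000 ticks per second, which is one reason a processor
might report a lower rate for 32-bit arguments.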

Nonetheless, the processor has to ensure that the "scaled" value of COUNT from
SYSTEM_CLOCK, together with COUNT_RATE, is such that
real(COUNT,kind=K)/real(COUNT_RATE,kind=K) (where kind K corresponds to a
precision equal to or better than KIND(1D0)) provides an approximation of
the "processor time".  Or better yet, that
real(COUNT2-COUNT1,kind=K)/real(COUNT_RATE,kind=K), where COUNT2 and COUNT1
are 2 readings via consistent invocations of SYSTEM_CLOCK, provides a
reasonable measure of the DELTA processor time elapsed between certain
processor instructions.  This appears only natural, for this supports the
common use of the SYSTEM_CLOCK intrinsic toward instrumented code.  This
can be seen in the code example using MATMUL below; the program behavior
with 2 different processors on Windows is shown first:

--- program output: gfortran ---
Timer                   Count Before        Count After         Count Rate
SYSTEM_CLOCK 32-bit     425628421           425629437           1000
SYSTEM_CLOCK 64-bit     4256544112416       4256554369340       10000000
Windows Tick Count      425628421           425629437           1000

Timer                   MATMUL CPU Usage
SYSTEM_CLOCK 32-bit     1.0160000000000000
SYSTEM_CLOCK 64-bit     1.0256924000000001
Windows Tick Count      1.0160000000000000
--- end output ---

--- program output: Intel Fortran ---
Timer                   Ticks Before        Ticks After         Rate
SYSTEM_CLOCK 32-bit     716841390           716857990           10000
SYSTEM_CLOCK 64-bit     1611970172139000    1611970173799000    1000000
Windows Tick Count      425542500           425544156           1000

Timer                   MATMUL CPU Usage
SYSTEM_CLOCK 32-bit     1.660000000000000
SYSTEM_CLOCK 64-bit     1.660000000000000
Windows Tick Count      1.656000000000000
--- end output ---

So what this example shows is the common use of
SYSTEM_CLOCK, where some instrumentation toward a timing study is attempted
and ultimately the value of interest is merely the DELTA "time" in a
floating-point representation (e.g.,
real(COUNT2-COUNT1,kind=K)/real(COUNT_RATE,kind=K) ).  On this basis, the
distinction between a "32-bit clock" and "64-bit clock" (or a not-so-fast
or fast clock) doesn't seem to matter all that much.  Perhaps it may make a
difference when the DELTA "time" is quite short but then it's difficult to
read much into results when the elapsed times are too short anyway.
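
The DELTA computation described above can be sketched as a small helper.
This is only an illustration: the module name clock_delta and the function
name elapsed_seconds are hypothetical, both readings are assumed to come
from SYSTEM_CLOCK calls using the same integer kind, and at most one wrap of
COUNT is assumed between them.

```fortran
module clock_delta

   use, intrinsic :: iso_fortran_env, only : int64, real64

   implicit none

   private
   public :: elapsed_seconds

contains

   ! Seconds elapsed between two SYSTEM_CLOCK readings, count1 then count2
   pure function elapsed_seconds( count1, count2, count_rate, count_max ) &
      result( dt )

      integer(int64), intent(in) :: count1, count2
      integer(int64), intent(in) :: count_rate, count_max
      real(real64) :: dt

      integer(int64) :: ticks

      ticks = count2 - count1

      ! COUNT resets to zero after reaching COUNT_MAX; a negative difference
      ! is taken to mean exactly one such wrap.  The parentheses keep the
      ! intermediate sum in range even when count_max is HUGE(0_int64).
      if ( ticks < 0 ) ticks = ( ticks + count_max ) + 1

      dt = real( ticks, real64 ) / real( count_rate, real64 )

   end function elapsed_seconds

end module clock_delta
```

Given the 32-bit gfortran readings shown earlier (425628421 and 425629437 at
a COUNT_RATE of 1000), elapsed_seconds returns 1.016, matching the first
table.  A caller holding default-integer readings would convert them with
INT(..., int64) before the call.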

Keeping this example in mind, some questions:

1) What are all the uses of SYSTEM_CLOCK besides the one shown above that
helps approximate a DELTA "time" for some processor instructions?

2) Is there really more than one clock on an image with the processors
currently in use, or is it just one which returns a certain tick count?

3) If it's just one clock, with the Fortran processor scaling the
COUNTs relative to an arbitrary count rate, why introduce terminology in
the standard about more than one clock?  It will be more confusing to

4) If the most common, or perhaps the only, use of SYSTEM_CLOCK is as a
DELTA timer, and where the time values need to be in floating-point to be
useful, why not orient the intrinsic toward the same?  That is what the
example in the current standard appears to attempt.

5) Can someone show an example of an existing code with default integers
that does not work with more than one clock?

Vipul Parekh

The results for SYSTEM_CLOCK shown above are based on this:
--- begin code ---
program t

   use, intrinsic :: iso_fortran_env, only : int32, int64, real64

   implicit none

   ! Suffix "b" = reading before MATMUL, "a" = reading after
   integer(int32) :: t0b, rate0b, t0a, rate0a   ! SYSTEM_CLOCK, 32-bit arguments
   integer(int64) :: t1b, rate1b, t1a, rate1a   ! SYSTEM_CLOCK, 64-bit arguments
   integer(int64) :: t2b, rate2b, t2a, rate2a   ! Windows GetTickCount64

   integer, parameter :: N = 2048
   real(real64), allocatable :: a(:,:), b(:,:), c(:,:)
   character(len=*), parameter :: fmtg = "(g0,t25,g0,t45,g0,t65,g0)"

   allocate( a(N,N), b(N,N), c(N,N) )
   call random_number( a )
   call random_number( b )

   ! Readings before the timed operation
   call system_clock( count=t0b, count_rate=rate0b )
   call system_clock( count=t1b, count_rate=rate1b )
   call sys_clock( ticks=t2b, rate=rate2b )

   c = matmul( a, b )

   ! Readings after the timed operation
   call system_clock( count=t0a, count_rate=rate0a )
   call system_clock( count=t1a, count_rate=rate1a )
   call sys_clock( ticks=t2a, rate=rate2a )

   print fmtg, "Timer", "Count Before", "Count After", "Count Rate"
   print fmtg, "SYSTEM_CLOCK 32-bit",t0b, t0a, rate0b
   print fmtg, "SYSTEM_CLOCK 64-bit",t1b, t1a, rate1b
   print fmtg, "Windows Tick Count", t2b, t2a, rate2b
   print *
   print fmtg, "Timer", "MATMUL CPU Usage"
   print fmtg, "SYSTEM_CLOCK 32-bit", real((t0a-t0b), real64)/rate0b
   print fmtg, "SYSTEM_CLOCK 64-bit", real((t1a-t1b), real64)/rate1b
   print fmtg, "Windows Tick Count",  real((t2a-t2b), real64)/rate2b

contains

   ! Thin wrapper over the Windows millisecond tick counter
   subroutine sys_clock( ticks, rate )

      use, intrinsic :: iso_c_binding, only : c_long_long

      interface
         function GetTickCount64() result(ticks) bind(C, name="GetTickCount64")
         ! Microsoft API <sysinfoapi.h>
         ! ULONGLONG GetTickCount64();
            import :: c_long_long
            integer(c_long_long) :: ticks
         end function GetTickCount64
      end interface

      ! GetTickCount64 reports milliseconds, i.e. 1000 ticks per second
      integer(c_long_long), parameter :: FREQUENCY = 1000_c_long_long

      ! Argument list
      integer(c_long_long), intent(out) :: ticks
      integer(c_long_long), intent(out) :: rate

      ticks = GetTickCount64()
      rate = FREQUENCY

   end subroutine sys_clock

end program t

--- end code ---

Compilation and execution:
--- begin console output ---
C:\Temp>gfortran -O3 t.f90 -o t.exe

Timer                   Count Before        Count After         Count Rate
SYSTEM_CLOCK 32-bit     431543234           431544203           1000
SYSTEM_CLOCK 64-bit     4315692302527       4315701959260       10000000
Windows Tick Count      431543234           431544203           1000

Timer                   MATMUL CPU Usage
SYSTEM_CLOCK 32-bit     0.96899999999999997
SYSTEM_CLOCK 64-bit     0.96567329999999996
Windows Tick Count      0.96899999999999997

--- end console output ---