Intel® Fortran Compiler

/Qopenmp option with cpu_time routine for time calculation

Fortran10
Beginner

 

program time_check
  use omp_lib
  implicit none

  real (8),dimension(:,:,:),  allocatable :: A, C
  integer(4) :: n = 300, P=300, i
  real (8) :: t1, t2, t3, t4, dc_time, omp_time


  allocate ( A(n,n,P), C(n,n,P) )
  call random_number ( a )


  call cpu_time ( t1 )
  do concurrent ( i = 1:P ) 
     c(:,:,i) = MATMUL ( A(:,:,i), A(:,:,i) )
  end do
  call cpu_time ( t2 )


  call cpu_time ( t3 )
  !$OMP PARALLEL DO SHARED ( A,C,P )
  do i = 1,P
     c(:,:,i) = MATMUL ( A(:,:,i), A(:,:,i) )
  end do
  !$OMP END PARALLEL DO
  call cpu_time ( t4 )

  dc_time = t2 - t1
  omp_time= t4 - t3

  print*, "Do concurrent time = ", dc_time
  print*, "OpenMP time        = ", omp_time

End program time_check

 

I am using the cpu_time routine to time the do concurrent and OpenMP constructs. When I compile with /Qopenmp, the reported times do not seem correct, but without /Qopenmp they do.

[Attached screenshots: Qopenmp.PNG, noopenmp.PNG]

I am familiar with the OpenMP wall-clock routine, omp_get_wtime(). MPI also has its own function, MPI_Wtime().

So if call cpu_time() is for serial timing, is there a routine or function specifically for timing do concurrent as well?

1 Solution
Barbara_P_Intel
Employee

When timing an application that has parallelization, use wall clock time. Why not use omp_get_wtime() since you're using OpenMP directives?

cpu_time() sums the time for all threads. Here's the reference in the DGR (the Intel Fortran Compiler Developer Guide and Reference).
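To see the difference between the two clocks, here is a minimal sketch that times the same OpenMP loop with both cpu_time and omp_get_wtime. The program name, the loop body, and the problem size are made up for illustration and are not from the original post. It assumes compilation with OpenMP enabled (/Qopenmp or -qopenmp with the Intel compilers) and more than one thread available. How cpu_time treats multiple threads is processor-dependent in the standard, so the ratio you see may differ on other compilers.

```fortran
program cpu_vs_wall
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 20000000      ! hypothetical problem size
  real(dp) :: cpu0, cpu1, wall0, wall1, total
  integer :: i

  total = 0.0_dp

  ! Start both clocks immediately before the parallel loop.
  call cpu_time(cpu0)
  wall0 = omp_get_wtime()

  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + sqrt(real(i, dp))
  end do
  !$omp end parallel do

  ! Stop both clocks immediately after the loop.
  wall1 = omp_get_wtime()
  call cpu_time(cpu1)

  print *, "max threads         = ", omp_get_max_threads()
  print *, "cpu_time difference = ", cpu1 - cpu0
  print *, "wall-clock seconds  = ", wall1 - wall0
  ! Printing the reduction result keeps the loop from being optimized away.
  print *, "checksum            = ", total
end program cpu_vs_wall
```

If cpu_time accumulates over threads, its difference should come out at roughly the wall-clock time multiplied by the number of busy threads, which would explain why the /Qopenmp numbers look too large.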

 


8 Replies
Steve_Lionel
Honored Contributor III

CPU_TIME may add up the time spent in all threads. I'll also comment that the test program runs the risk of not measuring what you want, because the compiler could throw away much of the work since the result is never used. I'll also note that for MATMUL, at least, the compiler can call an MKL-optimized version that does its own multithreading. That may be an issue here.
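One common way to reduce the risk of the work being discarded is to make the result observable, for example by printing a checksum of C after the timed region. The following is only a sketch under that assumption; the program name and the smaller array sizes are hypothetical, and an optimizer is still free to restructure the computation.

```fortran
program keep_result
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 100, p = 50   ! hypothetical, smaller than the original
  real(real64), allocatable :: a(:,:,:), c(:,:,:)
  integer(int64) :: t0, t1, rate
  integer :: i

  allocate (a(n,n,p), c(n,n,p))
  call random_number(a)

  call system_clock(t0, rate)
  do concurrent (i = 1:p)
     c(:,:,i) = matmul(a(:,:,i), a(:,:,i))
  end do
  call system_clock(t1)

  print *, "elapsed seconds = ", real(t1 - t0, real64) / real(rate, real64)
  ! Using every element of c in output means the matmul results are needed,
  ! so the compiler cannot simply drop the loop as dead code.
  print *, "checksum        = ", sum(c)
end program keep_result
```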

JohnNichols
Valued Contributor III

You are measuring time for a process that has millions of steps. 

One would fairly assume that your answers are the same, i.e., you have two values from a random Gaussian process that probably has a wide error.

You are hunting a furphy.

JohnNichols
Valued Contributor III

[Screenshot: Screenshot 2023-10-17 174647.png]

Post was cut off

Write a Monte Carlo shell, run it a million times, and then re-pose the problem.
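A scaled-down version of that idea might look like the sketch below: repeat the timed section several times and report the mean and sample standard deviation instead of a single number. The repetition count (30), the array sizes, and all names are placeholders chosen so that it finishes quickly; they are not taken from the thread.

```fortran
program repeat_timing
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 100, p = 20, nrep = 30   ! hypothetical sizes
  real(real64), allocatable :: a(:,:,:), c(:,:,:)
  real(real64) :: t(nrep), mean, sd, checksum
  integer(int64) :: t0, t1, rate
  integer :: i, r

  allocate (a(n,n,p), c(n,n,p))
  call random_number(a)
  checksum = 0.0_real64

  do r = 1, nrep
     call system_clock(t0, rate)
     do concurrent (i = 1:p)
        c(:,:,i) = matmul(a(:,:,i), a(:,:,i))
     end do
     call system_clock(t1)
     ! Store the wall-clock time of this repetition in seconds.
     t(r) = real(t1 - t0, real64) / real(rate, real64)
     ! Touch the result so each repetition's work is actually needed.
     checksum = checksum + c(1,1,1)
  end do

  mean = sum(t) / nrep
  sd   = sqrt(sum((t - mean)**2) / (nrep - 1))   ! sample standard deviation

  print *, "repetitions  = ", nrep
  print *, "mean seconds = ", mean
  print *, "std. dev.    = ", sd
  print *, "checksum     = ", checksum
end program repeat_timing
```

Comparing two constructs then becomes a question of whether their means differ by more than the spread, rather than comparing two single measurements.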

 

JohnNichols
Valued Contributor III

[Screenshots: Screenshot 2023-10-17 175311.png, Screenshot 2023-10-17 175357.png, Screenshot 2023-10-17 175425.png]

QED. Same computer, nothing changed, nothing running that was not running 32 hours ago.

JohnNichols
Valued Contributor III

[Image: JohnNichols_0-1697584593609.png]

Skewed Gaussian after 68 with a little log and maybe some logistic

A full MC run with this code and 1000000 repetitions would require 7 weeks.

If you can prove there is a statistical difference then your math is better than mine. 

The XLSX file includes up to about 200 

Fortran10
Beginner

I had no idea that cpu_time() sums the time for all threads. The Intel page, CPU_TIME (intel.com), makes this clear; sadly, the GNU page, CPU_TIME (The GNU Fortran Compiler), does not!

 

 

TobiasK
Moderator

@Fortran10 just another note about the timers: some time ago I switched to always using system_clock, as I don't see the benefit of calling either the OMP runtime or the MPI runtime. The only drawback of system_clock is that it returns ticks, and its precision depends on the kind of the integer arguments you pass to it. So make sure you use INT64 for both count and count_rate if you need high precision.
https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-2/system-clock.html
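As a rough sketch of that approach, the program below times both loops from the original post with system_clock and INT64 arguments. It does not use omp_lib, so it should build with or without /Qopenmp (the directive is treated as a comment when OpenMP is off). The program name, variable names, and array sizes are illustrative assumptions.

```fortran
program wall_timer
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  integer, parameter :: n = 200, p = 100   ! hypothetical sizes
  real(real64), allocatable :: a(:,:,:), c(:,:,:)
  integer(int64) :: c0, c1, c2, c3, rate
  real(real64) :: check_dc, check_omp
  integer :: i

  allocate (a(n,n,p), c(n,n,p))
  call random_number(a)

  ! Ticks per second; INT64 arguments select the highest-resolution clock
  ! that the implementation offers.
  call system_clock(count_rate=rate)

  call system_clock(count=c0)
  do concurrent (i = 1:p)
     c(:,:,i) = matmul(a(:,:,i), a(:,:,i))
  end do
  call system_clock(count=c1)
  check_dc = sum(c)          ! use the result outside the timed region

  call system_clock(count=c2)
  !$omp parallel do shared(a, c)
  do i = 1, p
     c(:,:,i) = matmul(a(:,:,i), a(:,:,i))
  end do
  !$omp end parallel do
  call system_clock(count=c3)
  check_omp = sum(c)

  print *, "Do concurrent time = ", real(c1 - c0, real64) / real(rate, real64)
  print *, "OpenMP time        = ", real(c3 - c2, real64) / real(rate, real64)
  print *, "checksums          = ", check_dc, check_omp
end program wall_timer
```

Because system_clock measures elapsed wall-clock time, the same call works for serial code, do concurrent, and OpenMP regions, so no construct-specific timer is needed.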

 
