Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

/Qopenmp option with cpu_time routine for time calculation

Fortran10
Beginner
1,296 Views

 

program time_check
  use omp_lib
  implicit none

  real (8),dimension(:,:,:),  allocatable :: A, C
  integer(4) :: n = 300, P=300, i
  real (8) :: t1, t2, t3, t4, dc_time, omp_time


  allocate ( A(n,n,P), C(n,n,P) )
  call random_number ( a )


  call cpu_time ( t1 )
  do concurrent ( i = 1:P ) 
     c(:,:,i) = MATMUL ( A(:,:,i), A(:,:,i) )
  end do
  call cpu_time ( t2 )


  call cpu_time ( t3 )
  !$OMP PARALLEL DO SHARED ( A,C,P )
  do i = 1,P
     c(:,:,i) = MATMUL ( A(:,:,i), A(:,:,i) )
  end do
  !$OMP END PARALLEL DO
  call cpu_time ( t4 )

  dc_time = t2 - t1
  omp_time= t4 - t3

  print*, "Do concurrent time = ", dc_time
  print*, "OpenMP time        = ", omp_time

End program time_check

 

I am using the cpu_time routine to time the do concurrent and OpenMP constructs. When I compile with /Qopenmp, the reported times seem incorrect; without /Qopenmp, they seem correct.

[Attached images: Qopenmp.PNG, noopenmp.PNG]

I am familiar with the OpenMP wall-clock routine, omp_get_wtime(), and MPI has its own function, MPI_Wtime(). So if call cpu_time() is meant for serial timing, is there a specific routine or function for timing do concurrent as well?

1 Solution
Barbara_P_Intel
Employee
1,218 Views

When timing an application that has parallelization, use wall clock time. Why not use omp_get_wtime() since you're using OpenMP directives?

cpu_time() sums the time for all threads. Here's the reference in the Developer Guide and Reference (DGR).
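As a minimal sketch of this suggestion, the OpenMP loop from the question could be timed with omp_get_wtime(), which returns elapsed wall-clock seconds as a double-precision value. Because cpu_time() adds up the time of every thread, its result can exceed the elapsed time once several threads are running. The program name wtime_check and the reduced array sizes are illustrative assumptions, not part of the original post, and the code assumes compilation with /Qopenmp.

```fortran
program wtime_check
  use omp_lib                        ! provides omp_get_wtime()
  implicit none

  real(8), dimension(:,:,:), allocatable :: A, C
  integer :: n = 100, P = 100, i     ! sizes reduced for a quick run (assumption)
  real(8) :: t_start, t_end

  allocate ( A(n,n,P), C(n,n,P) )
  call random_number ( A )

  t_start = omp_get_wtime()          ! elapsed wall-clock time, not summed CPU time
  !$OMP PARALLEL DO SHARED ( A, C, P )
  do i = 1, P
     C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
  end do
  !$OMP END PARALLEL DO
  t_end = omp_get_wtime()

  print *, "OpenMP wall time = ", t_end - t_start
  print *, "checksum         = ", sum ( C )   ! keeps the computed result in use
end program wtime_check
```

Comparing this value against the cpu_time() figure for the same loop should show the two diverging as the thread count grows.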

 


8 Replies
Steve_Lionel
Honored Contributor III
1,264 Views

CPU_TIME may add up all the time in threads. I'll also comment that the test program runs the risk of not measuring what you want because the compiler could throw away much of the work since it is not used. I'll also note that for MATMUL, at least, the compiler can call an MKL optimized version that does its own multithreading. That may be an issue here.
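One common way to guard against the first risk is to make the result observable after the timed region, for example by printing a value derived from C. The sketch below is an illustration under assumptions (the program name keep_result_live, the reduced sizes, and the checksum are not from the original post); it keeps the cpu_time calls from the question only to show where the extra line goes.

```fortran
program keep_result_live
  implicit none

  real(8), dimension(:,:,:), allocatable :: A, C
  integer :: n = 100, P = 100, i     ! sizes reduced for a quick run (assumption)
  real(8) :: t1, t2

  allocate ( A(n,n,P), C(n,n,P) )
  call random_number ( A )

  call cpu_time ( t1 )
  do concurrent ( i = 1:P )
     C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
  end do
  call cpu_time ( t2 )

  print *, "Do concurrent time = ", t2 - t1
  ! Printing a value that depends on every element of C makes the result
  ! observable, so the loop cannot be discarded as unused work.
  print *, "checksum           = ", sum ( C )
end program keep_result_live
```

This does not address the second point: whether MATMUL is dispatched to a multithreaded library routine depends on the compiler options in use.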

JohnNichols
Valued Contributor III
1,248 Views

You are measuring time for a process that has millions of steps.

One would fairly assume that your two answers are the same, i.e., you have two values from a random Gaussian process that probably has a wide error.

You are hunting a furphy.

 
 

 

 

JohnNichols
Valued Contributor III
1,248 Views

[Attached image: Screenshot 2023-10-17 174647.png]

Post was cut off

Write a Monte Carlo shell and run it a million times and then repose the problem.  

 

JohnNichols
Valued Contributor III
1,238 Views

[Attached images: Screenshot 2023-10-17 175311.png, Screenshot 2023-10-17 175357.png, Screenshot 2023-10-17 175425.png]

QED. Same computer, nothing changed, and nothing running that was not running 32 hours ago.

JohnNichols
Valued Contributor III
1,231 Views

[Attached image: JohnNichols_0-1697584593609.png]

Skewed Gaussian after 68 with a little log and maybe some logistic

A full MC run with this code and 1,000,000 runs would require 7 weeks.

If you can prove there is a statistical difference then your math is better than mine. 

The XLSX file includes up to about 200 


 

Fortran10
Beginner
1,125 Views

I had no idea that cpu_time() sums the time for all threads. The Intel page, CPU_TIME (intel.com), makes this clear; sadly, the GNU page, CPU_TIME (The GNU Fortran Compiler), does not!

 

 

TobiasK
Moderator
1,198 Views

@Fortran10 Just another note about timers: some time ago I switched to always using system_clock, as I don't see the benefit of calling either the OMP runtime or the MPI runtime. The only drawback of system_clock is that it returns ticks, and its precision depends on the kind of the integer arguments you pass to it. So if you need high precision, make sure you use INT64 for both count and count_rate when calling system_clock.
https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-2/system-clock.html
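A minimal sketch of this approach, assuming the do concurrent loop from the question: both count and count_rate are declared as INT64 (taken from the intrinsic iso_fortran_env module), and the tick difference is divided by the rate to get seconds. The program name sysclock_check, the variable names, and the reduced sizes are illustrative choices, not from the original post.

```fortran
program sysclock_check
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none

  integer(int64) :: count_start, count_end, count_rate
  real(real64), dimension(:,:,:), allocatable :: A, C
  real(real64) :: elapsed
  integer :: n = 100, P = 100, i     ! sizes reduced for a quick run (assumption)

  allocate ( A(n,n,P), C(n,n,P) )
  call random_number ( A )

  ! INT64 arguments for both the tick count and the tick rate
  call system_clock ( count = count_start, count_rate = count_rate )
  do concurrent ( i = 1:P )
     C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
  end do
  call system_clock ( count = count_end )

  ! Convert ticks to seconds
  elapsed = real ( count_end - count_start, real64 ) / real ( count_rate, real64 )

  print *, "Elapsed wall time (s) = ", elapsed
  print *, "checksum              = ", sum ( C )   ! keeps the computed result in use
end program sysclock_check
```

Since system_clock is a standard intrinsic, the same timing code works with or without /Qopenmp and needs no OpenMP or MPI runtime.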

 
