program time_check
   use omp_lib
   implicit none
   real(8), dimension(:,:,:), allocatable :: A, C
   integer(4) :: n = 300, P = 300, i
   real(8) :: t1, t2, t3, t4, dc_time, omp_time

   allocate ( A(n,n,P), C(n,n,P) )
   call random_number ( A )

   ! Time the DO CONCURRENT version
   call cpu_time ( t1 )
   do concurrent ( i = 1:P )
      C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
   end do
   call cpu_time ( t2 )

   ! Time the OpenMP version
   call cpu_time ( t3 )
   !$OMP PARALLEL DO SHARED ( A, C, P )
   do i = 1, P
      C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
   end do
   !$OMP END PARALLEL DO
   call cpu_time ( t4 )

   dc_time  = t2 - t1
   omp_time = t4 - t3
   print *, "Do concurrent time = ", dc_time
   print *, "OpenMP time = ", omp_time
end program time_check
I am using the cpu_time routine to time the do concurrent and OpenMP constructs. When I compile with /Qopenmp, the reported times do not look correct; without /Qopenmp, they do.
I am familiar with the OpenMP wall-time routine, omp_get_wtime(), and MPI has its own function, MPI_Wtime().
So if cpu_time() is meant for serial timing, is there a specific routine or function for timing do concurrent as well?
When timing a parallelized application, use wall-clock time. Why not use omp_get_wtime(), since you're using OpenMP directives?
cpu_time() sums the time for all threads. Here's the reference in the Developer Guide and Reference (DGR).
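To make the difference concrete, here is a minimal sketch that times the same OpenMP loop with both cpu_time and omp_get_wtime(). The program name wall_vs_cpu and the array sizes are illustrative assumptions, not taken from the original post. Built with OpenMP enabled and run on several threads, the CPU time would be expected to come out larger than the wall time if cpu_time sums over threads as described above; exact numbers will vary by machine and compiler.

```fortran
program wall_vs_cpu
   use omp_lib
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: n = 200, p = 64      ! illustrative sizes (assumption)
   real(dp), allocatable :: a(:,:,:), c(:,:,:)
   real(dp) :: cpu0, cpu1, wall0, wall1
   integer :: i

   allocate ( a(n,n,p), c(n,n,p) )
   call random_number ( a )

   ! Take both clocks around the same parallel region
   call cpu_time ( cpu0 )
   wall0 = omp_get_wtime()
   !$omp parallel do shared(a, c)
   do i = 1, p
      c(:,:,i) = matmul ( a(:,:,i), a(:,:,i) )
   end do
   !$omp end parallel do
   wall1 = omp_get_wtime()
   call cpu_time ( cpu1 )

   print *, "max threads         = ", omp_get_max_threads()
   print *, "CPU time (seconds)  = ", cpu1 - cpu0
   print *, "wall time (seconds) = ", wall1 - wall0
   ! Use the result so the loop cannot be discarded as dead code
   print *, "checksum            = ", sum(c)
end program wall_vs_cpu
```

The wall time is the number to compare between the do concurrent and OpenMP versions; the CPU time mainly tells you how much total processor work was done.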
CPU_TIME may add up the time spent in all threads. I'll also comment that the test program runs the risk of not measuring what you want, because the compiler could throw away much of the work since the result is never used. I'll also note that, for MATMUL at least, the compiler can call an MKL-optimized version that does its own multithreading. That may be an issue here.
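One common way to reduce the risk of the work being optimized away is to use the result after the timed region, for example by printing a checksum. The sketch below is a minimal illustration under that assumption; the program name keep_result_live and the sizes are hypothetical, and it does not address the separate question of whether MATMUL is dispatched to a multithreaded library.

```fortran
program keep_result_live
   use, intrinsic :: iso_fortran_env, only: int64, real64
   implicit none
   integer, parameter :: n = 200, p = 64      ! illustrative sizes (assumption)
   real(real64), allocatable :: a(:,:,:), c(:,:,:)
   integer(int64) :: t0, t1, rate
   integer :: i

   allocate ( a(n,n,p), c(n,n,p) )
   call random_number ( a )

   ! Wall-clock timing of the do concurrent loop
   call system_clock ( count=t0, count_rate=rate )
   do concurrent ( i = 1:p )
      c(:,:,i) = matmul ( a(:,:,i), a(:,:,i) )
   end do
   call system_clock ( count=t1 )

   print *, "do concurrent wall time (s) = ", &
            real(t1 - t0, real64) / real(rate, real64)
   ! The checksum depends on every element of c, so the compiler
   ! has to keep the computation that produced it.
   print *, "checksum = ", sum(c)
end program keep_result_live
```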
You are measuring time for a process that has millions of steps.
One would fairly assume that your answers are the same, i.e., you have two values drawn from a random Gaussian process that probably has a wide error.
You are hunting a furphy.
Post was cut off.
Write a Monte Carlo shell, run it a million times, and then re-pose the problem.
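A scaled-down version of such a shell might look like the sketch below: it repeats the timed loop a fixed number of times and reports the mean and sample standard deviation of the wall times. The repetition count nrep = 20, the array sizes, and the program name repeat_timing are assumptions chosen so the example finishes quickly; a serious comparison would need far more repetitions.

```fortran
program repeat_timing
   use, intrinsic :: iso_fortran_env, only: int64, real64
   implicit none
   integer, parameter :: n = 100, p = 32, nrep = 20   ! illustrative (assumption)
   real(real64), allocatable :: a(:,:,:), c(:,:,:)
   real(real64) :: t(nrep), mean, sd, check
   integer(int64) :: t0, t1, rate
   integer :: i, r

   allocate ( a(n,n,p), c(n,n,p) )
   call random_number ( a )
   call system_clock ( count_rate=rate )
   check = 0.0_real64

   do r = 1, nrep
      call system_clock ( count=t0 )
      do concurrent ( i = 1:p )
         c(:,:,i) = matmul ( a(:,:,i), a(:,:,i) )
      end do
      call system_clock ( count=t1 )
      t(r) = real(t1 - t0, real64) / real(rate, real64)
      check = check + sum(c)     ! use the result so the work is kept
   end do

   ! Summary statistics over the repeated measurements
   mean = sum(t) / nrep
   sd   = sqrt( sum((t - mean)**2) / (nrep - 1) )
   print *, "runs     = ", nrep
   print *, "mean (s) = ", mean, "  sample std dev (s) = ", sd
   print *, "min (s)  = ", minval(t), "  max (s) = ", maxval(t)
   print *, "checksum = ", check
end program repeat_timing
```

Running the same shell around the OpenMP loop gives a second set of statistics, and only then can the two constructs be compared against the spread of each.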
QED. Same computer, nothing changed, and nothing running that was not running 32 hours ago.
Skewed Gaussian after 68, with a little log and maybe some logistic.
A full MC run with this code and 1000000 runs would require 7 weeks.
If you can prove there is a statistical difference, then your math is better than mine.
The XLSX file includes up to about 200.
I had no idea that cpu_time() sums the time for all threads. The Intel page, CPU_TIME (intel.com), makes this clear; sadly, the gfortran page, CPU_TIME (The GNU Fortran Compiler), does not!
@Fortran10 just another note about the timers: some time ago I switched to always using system_clock, as I don't see the benefit of calling either the OMP runtime or the MPI runtime just for timing. The only drawback of system_clock is that it returns ticks, and the precision depends on the kind of the integer you pass to it. So if you need high precision, make sure you use INT64 for both the count and the count_rate when calling system_clock.
https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-2/system-clock.html
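For reference, a minimal sketch of that pattern is shown below. The program name system_clock_timer and the dummy workload are assumptions made for illustration; the sketch also ignores the possibility of the counter wrapping around at count_max during the measurement.

```fortran
program system_clock_timer
   use, intrinsic :: iso_fortran_env, only: int64, real64
   implicit none
   integer(int64) :: count0, count1, rate
   real(real64) :: x, elapsed
   integer :: k

   ! INT64 arguments for both count and count_rate, as suggested above
   call system_clock ( count=count0, count_rate=rate )

   ! Dummy workload (assumption): something measurable to time
   x = 0.0_real64
   do k = 1, 10000000
      x = x + sqrt( real(k, real64) )
   end do

   call system_clock ( count=count1 )

   ! Convert ticks to seconds using the tick rate
   elapsed = real(count1 - count0, real64) / real(rate, real64)
   print *, "ticks per second = ", rate
   print *, "elapsed (s)      = ", elapsed
   print *, "result           = ", x     ! keeps the loop from being removed
end program system_clock_timer
```

Because system_clock is a standard intrinsic, the same timing code works unchanged around do concurrent, OpenMP, or serial loops.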
