program time_check
use omp_lib
implicit none
real (8),dimension(:,:,:), allocatable :: A, C
integer(4) :: n = 300, P=300, i
real (8) :: t1, t2, t3, t4, dc_time, omp_time
allocate ( A(n,n,P), C(n,n,P) )
call random_number ( a )
call cpu_time ( t1 )
do concurrent ( i = 1:P )
c(:,:,i) = MATMUL ( A(:,:,i), A(:,:,i) )
end do
call cpu_time ( t2 )
call cpu_time ( t3 )
!$OMP PARALLEL DO SHARED ( A,C,P )
do i = 1,P
c(:,:,i) = MATMUL ( A(:,:,i), A(:,:,i) )
end do
!$OMP END PARALLEL DO
call cpu_time ( t4 )
dc_time = t2 - t1
omp_time= t4 - t3
print*, "Do concurrent time = ", dc_time
print*, "OpenMP time = ", omp_time
End program time_check
I am using the cpu_time routine to time the do concurrent and OpenMP constructs. When I compile with /Qopenmp the reported times do not look correct, but without /Qopenmp they do.
I am familiar with the OpenMP wall-time routine, omp_get_wtime(). MPI also has its own function, MPI_Wtime().
So if call cpu_time() is meant for serial timing, is there a specific routine or function for timing do concurrent as well?
When timing an application that has parallelization, use wall clock time. Why not use omp_get_wtime() since you're using OpenMP directives?
cpu_time() sums the time for all threads. Here's the reference in the DGR (Developer Guide and Reference).
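To make the difference concrete, here is a minimal sketch that wraps the OpenMP loop from the original program in both timers. The program name and the timer variable names are made up for illustration, and it assumes the code is built with OpenMP enabled (for example /Qopenmp), since it uses omp_lib.

```fortran
program cpu_vs_wall
   use omp_lib
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   real(dp), allocatable :: A(:,:,:), C(:,:,:)
   integer :: n = 300, P = 300, i
   real(dp) :: cpu_start, cpu_end, wall_start, wall_end

   allocate ( A(n,n,P), C(n,n,P) )
   call random_number ( A )

   ! Start both timers around the same parallel region
   call cpu_time ( cpu_start )
   wall_start = omp_get_wtime()

   !$OMP PARALLEL DO SHARED ( A, C, P )
   do i = 1, P
      C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
   end do
   !$OMP END PARALLEL DO

   wall_end = omp_get_wtime()
   call cpu_time ( cpu_end )

   print *, "CPU time  = ", cpu_end - cpu_start     ! may be summed over threads
   print *, "Wall time = ", wall_end - wall_start   ! elapsed time
   print *, "Threads   = ", omp_get_max_threads()
end program cpu_vs_wall
```

If cpu_time really is accumulated over threads, the CPU figure should come out noticeably larger than the wall figure whenever more than one thread is doing work. The same omp_get_wtime() pair can be placed around the do concurrent loop as well.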
CPU_TIME may add up all the time in threads. I'll also comment that the test program runs the risk of not measuring what you want because the compiler could throw away much of the work since it is not used. I'll also note that for MATMUL, at least, the compiler can call an MKL optimized version that does its own multithreading. That may be an issue here.
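One common way to reduce the risk of the work being discarded is to use the result after the timed region, for example by printing a checksum. The sketch below is only an illustration of that idea: the program name and the choice of sum(C) as the checksum are assumptions, and it does not address whether MATMUL is dispatched to a multithreaded library.

```fortran
program keep_the_work
   use omp_lib
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   real(dp), allocatable :: A(:,:,:), C(:,:,:)
   integer :: n = 300, P = 300, i
   real(dp) :: t1, t2

   allocate ( A(n,n,P), C(n,n,P) )
   call random_number ( A )

   t1 = omp_get_wtime()
   do concurrent ( i = 1:P )
      C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
   end do
   t2 = omp_get_wtime()

   print *, "Do concurrent wall time = ", t2 - t1
   ! Printing a value derived from every element of C gives the
   ! computation an observable use, so it cannot simply be dropped.
   print *, "checksum = ", sum(C)
end program keep_the_work
```

The checksum is computed outside the timed region, so it does not affect the measurement itself.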
You are measuring time for a process that has millions of steps.
One would fairly assume that your two answers are the same, i.e. you have two values drawn from a random Gaussian process that probably has a wide error.
You are hunting a furphy.
Post was cut off
Write a Monte Carlo shell, run it a million times, and then re-pose the problem.
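A much smaller version of that idea might look like the sketch below: repeat the timed loop a number of times and report the mean and sample standard deviation of the wall times. The trial count ntrials = 10 is a hypothetical value chosen to keep the run short, far below the million suggested above, and all names are illustrative. It assumes OpenMP is enabled so that omp_get_wtime() is available.

```fortran
program timing_spread
   use omp_lib
   implicit none
   integer, parameter :: dp = kind(1.0d0)
   integer, parameter :: n = 300, P = 300
   integer, parameter :: ntrials = 10      ! hypothetical; raise for better statistics
   real(dp), allocatable :: A(:,:,:), C(:,:,:)
   real(dp) :: t(ntrials), t0, mean, sdev, checksum
   integer :: i, k

   allocate ( A(n,n,P), C(n,n,P) )
   call random_number ( A )
   checksum = 0.0_dp

   do k = 1, ntrials
      t0 = omp_get_wtime()
      do concurrent ( i = 1:P )
         C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
      end do
      t(k) = omp_get_wtime() - t0
      checksum = checksum + sum(C)          ! use the result of every trial
   end do

   mean = sum(t) / ntrials
   sdev = sqrt ( sum( (t - mean)**2 ) / (ntrials - 1) )   ! sample standard deviation

   print *, "trials             = ", ntrials
   print *, "mean wall time     = ", mean
   print *, "standard deviation = ", sdev
   print *, "checksum           = ", checksum
end program timing_spread
```

Running the same shell around the OpenMP loop gives a second mean and spread, which is what a comparison between the two constructs would need.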
QED. Same computer, nothing changed, nothing running that was not running 32 hours ago.
Skewed Gaussian after 68 with a little log and maybe some logistic
A full MC run with this code and 1000000 runs would require 7 weeks.
If you can prove there is a statistical difference then your math is better than mine.
The XLSX file includes up to about 200
I had no idea that cpu_time() sums the time for all threads. The Intel page, CPU_TIME (intel.com), makes this clear; sadly the GNU page, CPU_TIME (The GNU Fortran Compiler), does not!
@Fortran10 just another note about the timers: some time ago I switched to always using system_clock, as I don't see the benefit of calling either the OMP runtime or the MPI runtime. The only drawback of system_clock is that it returns ticks, and the precision depends on the kind of the integer you pass to it. So make sure you use INT64 for both the count and the count_rate if you need high precision.
https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-2/system-clock.html
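As a rough sketch of that suggestion, the do concurrent loop from the original program can be timed with system_clock using INT64 arguments, which needs neither the OpenMP nor the MPI runtime. The program and variable names here are illustrative only.

```fortran
program system_clock_check
   use, intrinsic :: iso_fortran_env, only: int64, real64
   implicit none
   real(real64), allocatable :: A(:,:,:), C(:,:,:)
   integer :: n = 300, P = 300, i
   integer(int64) :: count_start, count_end, count_rate

   allocate ( A(n,n,P), C(n,n,P) )
   call random_number ( A )

   ! INT64 arguments for both the count and the count_rate
   call system_clock ( count=count_start, count_rate=count_rate )
   do concurrent ( i = 1:P )
      C(:,:,i) = matmul ( A(:,:,i), A(:,:,i) )
   end do
   call system_clock ( count=count_end )

   ! Convert ticks to seconds
   print *, "Do concurrent wall time = ", &
            real ( count_end - count_start, real64 ) / real ( count_rate, real64 )
   print *, "checksum = ", sum(C)
end program system_clock_check
```

Dividing the tick difference by count_rate gives elapsed seconds; the actual resolution depends on the compiler and platform.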