Hi, I am testing autoparallelization on ifort v. 11.1 20100401. (I know this is not the latest version. Our department will not upgrade the license until it expires in the Fall.)
I've tried using the test problem on http://software.intel.com/en-us/articles/automatic-parallelization-with-intel-compilers/ :
      PROGRAM TEST
      PARAMETER (N=10000000)
      REAL A, C(N)
      DO I = 1, N
        A = 2*I - 1
        C(I) = SQRT(A)
      ENDDO
      PRINT *, N, C(1), C(N)
      END
When I compile the program with
ifort -traceback $< -o test-serial
the code runs fine. It uses 100% of one cpu and exits normally.
I then compare that to
ifort -traceback $< -o test-parallel -parallel -par-report3
I see the following output when compiling:
ifort -traceback test.f90 -o test-parallel -parallel -par-report3
procedure: test
procedure: test
test.f90(4): (col. 7) remark: LOOP WAS AUTO-PARALLELIZED.
This time, the program uses 100% of all cpus and exits normally. The problem is that the parallelized version exits in the same amount of time as the serial version!
(parallel cputime) = (num cores) * (serial cputime).
This makes me sad. I have seen the same behavior on a multi-core Mac Pro. How can I utilize autoparallelization?
- Nooj
Well, I can clear up one point quickly: your license is NOT for a version. Licenses give you access to support, and this includes access to the latest compilers. Your department need only go to https://registrationcenter.intel.com and get the latest compilers.
But getting back to your problem: how are you measuring time? Some timers add up the CPU time of all the individual cores, so it would look like zero speedup. Also, what resolution is the timer? But more to the point, I just ran this code and it's trivial: it runs in a fraction of a second. Process startup and teardown are going to dominate, as the computation is down in the noise. Plus, the non-parallelized version does not have to set up a thread pool and then tear it down. The cost of simply starting the program on the system is greater than that of the little loop: 10 million iterations on a CPU that can do 2 gigacycles per second is noise.
You will find a number of threads in this forum about timing and synthetic, homegrown 'benchmarks'. Bottom line - the only true way to test -parallel is to either use a high-precision timer like the one in IPP OR use a real application that takes more than a few seconds to run.
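For illustration only, here is a rough sketch of how the two kinds of timer can be compared on this test. It is not from the article. The repeat count NREP is a hypothetical value chosen so the run lasts long enough to measure, and adding REP inside the loop is my own change to make each pass distinct. SYSTEM_CLOCK reports elapsed (wall-clock) time, while CPU_TIME reports processor time, which on many implementations is summed over all threads. That summing is an assumption about your compiler's behavior, so check its documentation.

```fortran
program time_loop
  implicit none
  integer, parameter :: n = 10000000
  integer, parameter :: nrep = 200                  ! hypothetical repeat count, not in the original test
  integer, parameter :: i8 = selected_int_kind(18)  ! 64-bit integers for a finer clock
  real, allocatable :: c(:)
  real :: a, cpu0, cpu1, checksum
  integer(i8) :: t0, t1, rate
  integer :: i, rep

  allocate(c(n))
  checksum = 0.0

  call system_clock(t0, rate)   ! wall-clock ticks, and ticks per second
  call cpu_time(cpu0)           ! processor time; may be summed over threads

  do rep = 1, nrep
    do i = 1, n
      a = 2*i - 1 + rep         ! "+ rep" makes every pass do different work
      c(i) = sqrt(a)
    end do
    checksum = checksum + c(n)  ! use the result of each pass
  end do

  call cpu_time(cpu1)
  call system_clock(t1)

  print *, 'wall seconds:', real(t1 - t0) / real(rate)
  print *, 'cpu seconds: ', cpu1 - cpu0
  print *, n, c(1), c(n), checksum
end program time_loop
```

Build it once without and once with -parallel and compare the "wall seconds" lines. If the loop is really running in parallel, wall time should drop, while the CPU time may stay the same or even go up.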
ron