Intel® Fortran Compiler

Multithread question

dondilworth
New Contributor II
I have implemented a multithread option in my code. It runs and gives the right answer, but it is many times slower than the single-thread version. How much overhead is wasted in creating and exiting from threads? That seems to take quite a while. Are there any references I can use for guidance?
1 Solution
jimdempseyatthecove
Honored Contributor III
When the loop's iteration count is relatively small and the work per iteration is lightweight, as in

do i=1,100
a(i) = 0.0
end do

then the overhead of threading the loop exceeds the time a single thread needs to run it.

If you choose to use the auto-parallel feature, then add command-line switches and/or compiler directives to control when and where parallelization is to occur.
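A minimal sketch of what such directives might look like, assuming Intel's `!DIR$ PARALLEL` and `!DIR$ NOPARALLEL` loop directives and a build with the `/Qparallel` switch mentioned elsewhere in this thread. The program name, array names, and sizes are hypothetical, and a threshold switch such as `/Qpar-threshold` is something to confirm in the compiler documentation for your version. Other compilers treat these directive lines as ordinary comments, so the program still runs serially there.

```fortran
program autopar_hints
    implicit none
    integer, parameter :: n = 1000000        ! hypothetical size for the "large" loop
    real, allocatable :: a(:), b(:)
    integer :: i

    allocate(a(n), b(n))
    b = 1.0

    ! Tiny loop: ask the auto-parallelizer to leave it serial, because
    ! the threading overhead would exceed the work being done.
    !DIR$ NOPARALLEL
    do i = 1, 100
        a(i) = 0.0
    end do

    ! Large loop with independent iterations: hint that it is a
    ! candidate for auto-parallelization (assumption: the directive
    ! asserts that no loop-carried dependence exists).
    !DIR$ PARALLEL
    do i = 1, n
        a(i) = 2.0 * b(i)
    end do

    ! Use the result so the loops are not optimized away.
    print *, 'sum(a) =', sum(a)
end program autopar_hints
```

Building with `/Qparallel` plus `/Qpar-report` should then indicate which of the two loops the compiler actually parallelized.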

It tends to be better to add parallelization by way of OpenMP directives.
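As a rough illustration of the OpenMP route, the sketch below uses the `if` clause so that a loop only runs in parallel when its trip count is above a cutoff. The cutoff `threshold`, the routine name `zero_fill`, and the array sizes are hypothetical values chosen for the example, not measured recommendations. It assumes the code is built with OpenMP enabled (`/Qopenmp` with ifort on Windows, `-qopenmp` on Linux, or `-fopenmp` with gfortran).

```fortran
program small_loop_guard
    use omp_lib                               ! provides omp_get_wtime
    implicit none
    integer, parameter :: n_small = 100
    integer, parameter :: n_large = 20000000
    integer, parameter :: threshold = 100000  ! hypothetical cutoff; tune by measurement
    real, allocatable :: a(:)
    double precision :: t0, t1

    allocate(a(n_large))
    a = 1.0

    ! Small trip count: the if clause is false, so one thread runs the loop.
    t0 = omp_get_wtime()
    call zero_fill(a, n_small)
    t1 = omp_get_wtime()
    print *, 'small loop seconds:', t1 - t0

    ! Large trip count: the if clause is true, so the loop is shared among threads.
    t0 = omp_get_wtime()
    call zero_fill(a, n_large)
    t1 = omp_get_wtime()
    print *, 'large loop seconds:', t1 - t0

    ! Sanity check: every element should now be zero.
    print *, 'max |a| =', maxval(abs(a))

contains

    subroutine zero_fill(x, n)
        real, intent(inout) :: x(:)
        integer, intent(in) :: n
        integer :: i
        ! Parallelize only when n exceeds the cutoff.
        !$omp parallel do if(n > threshold) shared(x) private(i)
        do i = 1, n
            x(i) = 0.0
        end do
        !$omp end parallel do
    end subroutine zero_fill

end program small_loop_guard
```

Timing both calls with the `if` clause removed gives a direct estimate of how much the threading overhead costs on the small loop.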

Jim Dempsey

2 Replies
TimP
Honored Contributor III
The overhead of creating threads tends to be higher on Windows than on Linux, and many of the references you will find are written for Linux. Threading errors such as false sharing are more likely to produce symptoms like the ones you describe. Those can be difficult to diagnose when you aren't familiar with the application.
A goal of analysis tools such as Parallel Studio is to help with this kind of diagnosis. You might also turn off your own threading and see whether /Qparallel with /Qpar-report at various levels gives you any clues about where the compiler can and cannot parallelize. That option may perform some loop interchanges to accomplish its job.
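To make the false-sharing point concrete, here is a hedged sketch that accumulates the same sum two ways: once into adjacent per-thread slots of a shared array (slots that likely share a cache line, so threads may keep invalidating each other's copies), and once with an OpenMP `reduction` clause, which gives each thread a private accumulator. The program name, array names, and problem size are hypothetical, and an optimizing compiler may keep the slot in a register and hide the effect, so treat the timings as something to measure rather than a guaranteed outcome.

```fortran
program false_sharing_sketch
    use omp_lib                               ! omp_get_wtime, omp_get_thread_num
    implicit none
    integer, parameter :: n = 10000000        ! hypothetical problem size
    double precision, allocatable :: partial(:)
    double precision :: total, t0, t1
    integer :: i, tid, nthreads

    nthreads = omp_get_max_threads()
    allocate(partial(0:nthreads-1))
    partial = 0.0d0

    ! Variant 1: each thread updates its own slot, but the slots are
    ! adjacent in memory and may sit on the same cache line.
    t0 = omp_get_wtime()
    !$omp parallel private(tid, i) shared(partial)
    tid = omp_get_thread_num()
    !$omp do
    do i = 1, n
        partial(tid) = partial(tid) + 1.0d0
    end do
    !$omp end do
    !$omp end parallel
    t1 = omp_get_wtime()
    print *, 'adjacent slots: sum =', sum(partial), ' seconds =', t1 - t0

    ! Variant 2: the reduction clause gives each thread a private copy
    ! of total and combines the copies once at the end of the loop.
    total = 0.0d0
    t0 = omp_get_wtime()
    !$omp parallel do reduction(+:total) private(i)
    do i = 1, n
        total = total + 1.0d0
    end do
    !$omp end parallel do
    t1 = omp_get_wtime()
    print *, 'reduction:      sum =', total, ' seconds =', t1 - t0
end program false_sharing_sketch
```

Both variants should report the same sum. A large gap in the timings would suggest that memory contention, rather than thread creation, is what slows the threaded version down.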