Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

FC 9.1 for Pentium D

kawai
Beginner
312 Views

Hi,

I'm using FC 9.1 on a cluster with Pentium D processors (3.4 GHz) running Fedora Core 6. The cluster is used for calculations.

I've compared the calculation time with that of another cluster I built with P4 processors (3.4 GHz, with EM64T) running Fedora Core 4.

The times are nearly the same, so it seems I get no benefit from the dual-core technology.

Is something wrong? Or do I have to include a specific flag to benefit from the dual-core capability of the Pentium D processor?

Thanks!!

2 Replies
TimP
Honored Contributor III
The usual way of taking advantage of dual core nodes in a cluster would be to assign 2 processes to each node. If you are using MPI with an option to do so, you would use shared memory communication between the 2 cores. You have the option of running 2 threads per node instead, compiling with -parallel or -openmp (with OpenMP directives in your source code). Maybe you haven't explained what you mean.
Steven_L_Intel1
Employee
As Tim hints, a given application, if it is not written to take advantage of multiple threads, will not use additional cores. There are many ways of adding parallelism to your application, and I'd suggest reading the chapter on Parallel Programming in the Optimizing Applications manual for the details.

At a simple level, you can throw the -parallel switch and see if the compiler can identify parallel opportunities. For many applications, this won't do a lot, but it doesn't hurt to try. Turn on the optimization reports to see what prevented the compiler from parallelizing a loop.

Next is OpenMP, where you use directives to tell the compiler what sections of code to run in parallel and which variables should be shared across threads or made private. This takes good understanding of your application, but can yield excellent results.
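
To make the directive idea concrete, here is a minimal sketch of an OpenMP loop in free-form Fortran. It is illustrative only and not taken from the original poster's code: the program name, the array `a`, and the loop body are all made up for the example. The array is shared across threads, the loop index is private to each thread, and the running sum is combined with a reduction clause.

```fortran
! Hypothetical example: sum of squares computed in parallel with OpenMP.
! Save as a free-form source file (for example, a .f90 file).
program omp_sum_sketch
  implicit none
  integer, parameter :: n = 1000000
  integer :: i
  double precision :: total
  double precision, allocatable :: a(:)

  ! Fill the array serially; this part runs on one thread.
  allocate(a(n))
  do i = 1, n
     a(i) = 1.0d0 / dble(i)
  end do

  total = 0.0d0

  ! The directive asks the compiler to split the loop iterations across
  ! threads. "a" is shared, "i" is private to each thread, and each
  ! thread's partial sum is added into "total" when the loop finishes.
  !$omp parallel do shared(a) private(i) reduction(+:total)
  do i = 1, n
     total = total + a(i) * a(i)
  end do
  !$omp end parallel do

  print *, 'sum of squares:', total
  deallocate(a)
end program omp_sum_sketch
```

Compiled without OpenMP enabled, the `!$omp` lines are treated as ordinary comments and the program runs serially with the same result, up to floating-point rounding. Compiled with the -openmp switch Tim mentioned, the loop is divided among threads, and the standard `OMP_NUM_THREADS` environment variable sets how many are used (2 would match a dual-core node). A loop this small may not show a measurable speedup; the benefit depends on how much work each iteration does.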

MPI that Tim mentioned is another option, though it is better suited to "clusters" rather than multi-core processors.

And then you can always use pthreads calls in your code to explicitly thread.