Hi,
I've compiled FC 9.1 on a cluster with Pentium D processors (3.4 GHz) running Fedora Core 6; the cluster is used for computation.
I've compared the computation time with that of another cluster I built with P4 processors (3.4 GHz, with EM64T) running Fedora Core 4.
The times are nearly the same, so it seems I'm getting no benefit from the dual-core design.
Is something wrong, or do I have to include a specific flag to gain the benefit of the dual cores in the Pentium D processor?
Thanks!!
2 Replies
The usual way of taking advantage of dual core nodes in a cluster would be to assign 2 processes to each node. If you are using MPI with an option to do so, you would use shared memory communication between the 2 cores. You have the option of running 2 threads per node instead, compiling with -parallel or -openmp (with OpenMP directives in your source code). Maybe you haven't explained what you mean.
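As a rough illustration of the two-threads-per-node route, here is a minimal OpenMP sketch in C (the function name is my own; with Intel compilers you would build with -openmp, with gcc with -fopenmp, and without either flag the pragma is simply ignored and the loop runs on one core):

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Sum an array with an OpenMP parallel-for reduction.
   Each thread accumulates a private partial sum; the
   reduction clause combines them at the end of the loop. */
double parallel_sum(const double *a, int n) {
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

On a dual-core Pentium D node, the runtime would split the iterations across the two cores; the same source compiled without the OpenMP flag still gives the correct serial answer.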
As Tim hints, a given application, if it is not written to take advantage of multiple threads, will not use additional cores. There are many ways of adding parallelism to your application, and I'd suggest reading the chapter on Parallel Programming in the Optimizing Applications manual for the details.
At a simple level, you can throw the -parallel switch and see if the compiler can identify parallel opportunities. For many applications, this won't do a lot, but it doesn't hurt to try. Turn on the optimization reports to see what prevented the compiler from parallelizing a loop.
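To make the auto-parallelization point concrete, here is a hypothetical pair of loops in C: the first has fully independent iterations, so a compiler given -parallel can split it across cores, while the second carries a dependence on the previous iteration, which is the kind of thing the optimization report would flag as blocking parallelization:

```c
/* Independent iterations: a candidate for -parallel. */
void scale(double *a, int n, double k) {
    for (int i = 0; i < n; i++)
        a[i] *= k;
}

/* Loop-carried dependence: a[i] needs a[i-1], so the
   compiler must run this loop serially. */
void prefix(double *a, int n) {
    for (int i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```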
Next is OpenMP, where you use directives to tell the compiler what sections of code to run in parallel and which variables should be shared across threads or made private. This takes good understanding of your application, but can yield excellent results.
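As a sketch of the shared/private distinction (the names here are illustrative, not from the original post): 'hist' below is shared by all threads, so concurrent updates are protected with an atomic, while 'bin' is declared private so each thread gets its own scratch copy:

```c
/* Bin n values into nbins buckets. The caller is assumed
   to have zeroed hist[] beforehand. */
void histogram(const int *data, int n, int *hist, int nbins) {
    int bin;
    #pragma omp parallel for private(bin) shared(hist)
    for (int i = 0; i < n; i++) {
        bin = data[i] % nbins;   /* private: per-thread scratch */
        #pragma omp atomic       /* shared: updates must not race */
        hist[bin]++;
    }
}
```

Getting these clauses wrong (e.g. leaving 'bin' shared) is a classic source of race conditions, which is why this route takes a good understanding of the application.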
MPI, which Tim mentioned, is another option, though it is better suited to "clusters" rather than multi-core processors.
And then you can always use pthreads calls in your code to explicitly thread.
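A minimal sketch of the explicit-pthreads approach (helper names are mine): each worker sums half of an array into its own slice, and the caller joins the threads and combines the partial results:

```c
#include <pthread.h>

/* Per-thread work description: input array, half-open
   range [lo, hi), and a slot for the partial sum. */
struct slice { const double *a; int lo, hi; double sum; };

static void *worker(void *p) {
    struct slice *s = p;
    s->sum = 0.0;
    for (int i = s->lo; i < s->hi; i++)
        s->sum += s->a[i];
    return NULL;
}

/* Split the sum across two threads, one per core. */
double sum_two_threads(const double *a, int n) {
    pthread_t t1, t2;
    struct slice s1 = { a, 0, n / 2, 0.0 };
    struct slice s2 = { a, n / 2, n, 0.0 };
    pthread_create(&t1, NULL, worker, &s1);
    pthread_create(&t2, NULL, worker, &s2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return s1.sum + s2.sum;
}
```

This gives the most control but also the most bookkeeping, which is why the directive-based approaches above are usually tried first.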