I'm compiling a Monte Carlo simulation for deployment on other machines. I use the default parallelization settings. On my own (2-processor) machine I get a significant speedup from parallelization, and Task Manager shows close to 100% CPU utilization. If I run the executable on one of the other machines (six-core), how can I get a similar speedup? I have set the environment variables NUMBER_OF_PROCESSORS=6 and OMP_NUM_THREADS=6. Not only do I not get any speedup, but the CPU utilization hovers around 17%, though it's distributed over two processors.
Any clues here? I'm hoping I don't have to become an OpenMP expert to get some benefit out of the additional cores.
Thanks.
4 Replies
The runtime automatically picks the number of "execution units" (CPUs*cores*threads) as the number of threads to create. It may be that your application doesn't scale past two threads. Note that the "17%" is total over the CPUs, so if, say, there are 12 threads possible (6 cores with HyperThreading), then two threads would be about 17%.
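The arithmetic behind that reading of Task Manager can be sketched in a few lines of Python. This is only an illustration, assuming each busy thread fully occupies one logical processor; the function name is made up for this example.

```python
def total_cpu_percent(busy_threads: int, logical_cpus: int) -> float:
    """Overall utilization Task Manager would report, assuming each
    busy thread saturates exactly one logical processor."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    # A thread cannot occupy more than one logical CPU at a time.
    return 100.0 * min(busy_threads, logical_cpus) / logical_cpus


# Two busy threads on 12 logical CPUs (6 cores with HyperThreading).
print(f"{total_cpu_percent(2, 12):.1f}%")
# One busy thread on 6 logical CPUs (6 cores, no HyperThreading).
print(f"{total_cpu_percent(1, 6):.1f}%")
```

Both cases come out at roughly 17%, so the overall figure alone cannot tell them apart; the per-CPU graphs in Task Manager and the logical processor count are needed to decide which one applies.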
Normally, that should work on the other system. You don't even have to set the number of threads; it should be taken from the number of (logical) cores/processors. I would definitely not set the number of processors; leave that to the OS. Does it look in Task Manager as if the app is running on only one core at a time?
What processor type are your 6 cores? Does it have hyperthreading enabled? You might try setting KMP_AFFINITY=physical, though it doesn't really sound like the problem here.
Monte Carlo apps typically make calls to library random number routines, which may be serialized for thread safety. Does your app spend a lot of time in random number calls?
By default parallelization settings, do you mean "Yes (/Qparallel)"? Could you please send us your complete command line? And for completeness, the versions of Visual Studio, Windows and the Intel Compiler.
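To get a feel for how much serialized random number calls could cost, here is a rough Amdahl's-law estimate in Python. It is a sketch only: the serial fractions below are illustrative assumptions, not measurements of this application, and the model ignores lock contention and other overheads.

```python
def amdahl_speedup(serial_fraction: float, threads: int) -> float:
    """Ideal speedup when `serial_fraction` of the run time cannot be
    parallelized (for example, time spent in serialized library calls)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


# Assumed serial fractions, chosen purely for illustration.
for serial_fraction in (0.0, 0.25, 0.5):
    for threads in (2, 6):
        speedup = amdahl_speedup(serial_fraction, threads)
        print(f"serial={serial_fraction:.2f} threads={threads} "
              f"speedup={speedup:.2f}x")
```

Under this simple model, adding threads never lowers the speedup, so serialization on its own would limit the gain from six cores rather than explain a drop below what two cores deliver.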
Six-core Intel Xeon X5670. Judging by the Performance tab in Task Manager, hyperthreading is enabled on *neither* my original (2-core) machine nor the 6-core Xeon.
The app does spend a fair amount of time in random number calls, so that *could* be the issue, but....
My confusion is this: again, judging by what I see in Task Manager, the two-core machine uses both cores fully; the same executable on the six core machine uses only the equivalent of one core (that's the 17% utilization number). Not only am I not getting additional parallelization, I could be getting less.
I do mean Yes(/Qparallel).
Thanks both of you for the help. Sorry it took so long to get back.
In case it matters, here's the command line:
/nologo /Qparallel /Qip /fpp /I"C:\Program Files\Intel\Compiler\11.1\065\include" /I"C:\Program Files\Intel\Compiler\11.1\065\include\ia32" /I"c:\Program Files\Microsoft Visual Studio 8\VC\atlmfc\include" /I"c:\Program Files\Microsoft Visual Studio 8\VC\include" /I"c:\Program Files\Microsoft Visual Studio 8\VC\PlatformSDK\include" /I"c:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\include" /I"C:\Program Files\VNI\imsl\fnl600\IA32\include\static" /Qopenmp /module:"Release\" /object:"Release\" /libs:dll /threads /c
I agree: if you can keep two cores busy on your laptop, you should be able to get at least two cores' worth of work out of your other machine. Something doesn't seem right.
First comment: you would not normally want to use both /Qopenmp and /Qparallel together. Are you making use of IMSL, and is that why you have /Qopenmp? Are you calling parts of IMSL or MKL that are threaded? Does the parallelism that you see on your laptop come from /Qparallel or from /Qopenmp? (E.g., does it change when you remove /Qparallel?)
You might collect some additional information about the OpenMP environments on the two systems by setting these environment variables:
KMP_VERSION=1
KMP_SETTINGS=1
and comparing the logs.
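One way to do that comparison is sketched below in Python. It is a minimal sketch, not a prescribed procedure: the helper names, the log file names, and the labels "laptop" and "six-core" are all hypothetical, and it captures whatever the runtime prints rather than assuming any particular log format.

```python
import difflib
import os
import subprocess
import sys


def diagnostic_env(base=None):
    """Copy an environment and switch on the two diagnostic variables."""
    env = dict(os.environ if base is None else base)
    env["KMP_VERSION"] = "1"
    env["KMP_SETTINGS"] = "1"
    return env


def run_with_diagnostics(command):
    """Run `command` (a list of arguments) with the diagnostic variables
    set and return its combined stdout and stderr as text."""
    result = subprocess.run(command, env=diagnostic_env(),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    return result.stdout


def diff_logs(log_a, log_b, name_a="laptop", name_b="six-core"):
    """Return unified-diff lines between two captured logs."""
    return list(difflib.unified_diff(log_a.splitlines(), log_b.splitlines(),
                                     fromfile=name_a, tofile=name_b,
                                     lineterm=""))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_logs.py laptop.log sixcore.log
    with open(sys.argv[1]) as fa, open(sys.argv[2]) as fb:
        print("\n".join(diff_logs(fa.read(), fb.read())))
```

Each log would be captured on its own machine, for instance by saving the result of `run_with_diagnostics` to a file, and the two files compared afterwards on either system.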
Are these significantly different versions of the Windows OS?
A next step might be to run a little test program, such as a standalone parallel matrix multiply, on both your laptop and your 6 core system, and see whether that scales as expected.