Hi,
I'm using Visual Studio 2008, Intel compiler v11.1, and the MKL library that comes with it. I started my project using the Sequential option for MKL, but now I want to use the Parallel option. However, when I switch to Parallel and recompile (release version), I see no performance improvement, and the executable uses no more CPUs than the sequential version does (one). I have 8 cores (edited) and more than 5 GB of RAM, I'm running Windows 7 x64, and I'm generating an x64 executable (the fp model used is precise).
In my case, I'm generating about 800k random numbers with VSL functions and then taking the log of those numbers with another VSL function. I think that volume of computation should benefit from parallelism. What am I doing wrong?
The only thing I change is the MKL option from Sequential to Parallel.
Thanks
EDIT: Setting the variable MKL_NUM_THREADS=4 before executing my program from the command line does not change anything I described above.
Yes, it's approximately the same time (which I don't find surprising given that no more CPUs appear to be used).
I'm sorry, but that turned out not to be the case. I had been timing other parts of my program together with the VSL functions. Now that I have isolated the time the VSL routines take, I have noticed the following (all times were measured with pairs of GetTickCount() calls):
- Execution of the sequential version takes much less time (15-32 ms in several runs) than the parallel version (~1000 - ~2000 ms in several runs)
- CPU usage never goes beyond 20%, even when I set the number of threads with MKL_NUM_THREADS to the maximum number of processors (8)
I guess MKL is using several cores after all, but the computations I do (generating random numbers and taking their log) are not demanding enough to benefit from parallelism or to make any difference a person would notice.
If you have a different take, please let me know.
Thanks.
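To make the "not demanding enough" guess concrete, here is a minimal cost-model sketch. It is not a measurement and not anything MKL reports: the function names, the fixed per-call overhead, and the ideal 1/threads scaling are all assumptions made for illustration. Note also that GetTickCount() typically has a resolution of roughly 10-16 ms, so timings in the 15-32 ms range are close to what it can resolve at all.

```python
# Toy cost model (assumption): a parallel call costs a fixed overhead
# (thread start-up, synchronisation) plus perfectly divided work.

def parallel_time_ms(work_ms, threads, overhead_ms):
    """Estimated wall time when work_ms of sequential work is split across threads."""
    return overhead_ms + work_ms / threads


def break_even_work_ms(threads, overhead_ms):
    """Smallest sequential workload for which the parallel estimate matches sequential.

    Solves overhead + work / threads == work for work.
    """
    return overhead_ms * threads / (threads - 1)


if __name__ == "__main__":
    # Hypothetical numbers: 20 ms of sequential work, 8 threads, 5 ms overhead.
    print(parallel_time_ms(20, 8, 5))
    print(break_even_work_ms(8, 7))
```

Under this model, a workload that finishes in a few tens of milliseconds sequentially has little to gain, and any overhead comparable to the work itself wipes the gain out.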
Yes, I use these functions in the order specified below:
- Creating independent streams
- Splitting streams into blocks with the vslSkipAheadStream function
- Splitting streams into several disjoint subsequences with the vslLeapfrogStream function
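The difference between the last two approaches can be sketched with a toy generator. The code below is only an illustration of the block-splitting (skip-ahead) and leapfrog ideas; it uses a simple linear congruential generator, not any MKL generator, and none of the function names are MKL API.

```python
# Toy LCG for illustration only: x -> (A*x + C) mod M. NOT an MKL generator.
A, C, M = 1664525, 1013904223, 2**32


def compose(f, g):
    """Affine map equal to 'apply f, then g'; each map is (mult, add) mod M."""
    (a1, c1), (a2, c2) = f, g
    return (a2 * a1) % M, (a2 * c1 + c2) % M


def jump_map(k):
    """Affine map equal to k single steps, built by repeated squaring."""
    result, base = (1, 0), (A, C)
    while k:
        if k & 1:
            result = compose(result, base)
        base = compose(base, base)
        k >>= 1
    return result


def apply_map(fm, x):
    a, c = fm
    return (a * x + c) % M


def take(x, n, jump=(A, C)):
    """Return n values starting at state x, advancing by 'jump' after each one."""
    out = []
    for _ in range(n):
        out.append(x)
        x = apply_map(jump, x)
    return out


def skip_ahead_block(seed, block, size):
    """Block number 'block' of length 'size': elements block*size .. block*size+size-1."""
    return take(apply_map(jump_map(block * size), seed), size)


def leapfrog_stream(seed, index, nstreams, n):
    """Stream 'index' of 'nstreams': elements index, index+nstreams, index+2*nstreams, ..."""
    return take(apply_map(jump_map(index), seed), n, jump_map(nstreams))


if __name__ == "__main__":
    seed = 12345
    ref = take(seed, 12)  # the original sequence
    # Skip-ahead gives contiguous blocks; leapfrog gives interleaved subsequences.
    assert skip_ahead_block(seed, 1, 4) == ref[4:8]
    assert leapfrog_stream(seed, 1, 3, 4) == ref[1::3]
```

In both cases each thread draws from the same underlying sequence without overlap; skip-ahead hands each thread a contiguous block, while leapfrog interleaves the threads element by element.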