I'm using Visual Studio 2008, Intel compiler v11.1, and the MKL library that comes with it. I started my project using the Sequential option for MKL, but now I want to use the Parallel option. However, when I switch to Parallel and recompile (release build), I see no performance improvement, and the executable still uses only one CPU, the same as the sequential version. I have 8 cores (EDITED) and more than 5 GB of RAM, I'm running Windows 7 x64, and I'm generating an x64 executable (the floating-point model is precise).
In my case, I'm generating about 800k random numbers with VSL functions, and then taking the log of those numbers using another VSL function. I think that such a volume of computation should benefit from parallelism. What am I doing wrong?
The only thing I change is the MKL option from Sequential to Parallel.
EDIT: Setting the variable MKL_NUM_THREADS=4 before running my program from the command line does not change anything I described above.
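One thing that can be ruled out cheaply is the variable not reaching the process at all (for example, because it was set in a different console window). The sketch below only reads and validates MKL_NUM_THREADS with the C++ standard library; it does not call MKL. The helper names `parse_thread_count` and `requested_mkl_threads` and the upper bound of 1024 are made up for illustration, and it assumes a C++11 compiler, which is newer than the toolchain mentioned above. If calling into the library is an option, MKL's own `mkl_get_max_threads()` would be the more direct check.

```cpp
#include <cstdlib>

// Hypothetical helper: parse a thread-count string such as "4".
// Returns `fallback` if the text is null, empty, not a plain integer,
// or outside the (arbitrarily chosen) range 1..1024.
int parse_thread_count(const char* text, int fallback) {
    if (text == nullptr || *text == '\0') return fallback;
    char* end = nullptr;
    const long value = std::strtol(text, &end, 10);
    if (*end != '\0' || value <= 0 || value > 1024) return fallback;
    return static_cast<int>(value);
}

// Hypothetical helper: what this process sees in MKL_NUM_THREADS,
// or `fallback` if the variable is unset or malformed.
int requested_mkl_threads(int fallback) {
    return parse_thread_count(std::getenv("MKL_NUM_THREADS"), fallback);
}
```

Printing `requested_mkl_threads(-1)` at program start would show -1 if the variable did not make it into the program's environment.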
Yes, it takes approximately the same time (which I don't find surprising, given that no more CPUs appear to be used).
I'm sorry, but that's not the case. I was timing other parts of my program together with the VSL functions. Now that I have isolated the time the VSL routines take, I have noticed the following (all times were measured with pairs of GetTickCount() calls):
I guess MKL is using several cores after all, but the computations I do (generating random numbers and taking their logs) are not demanding enough for the difference to be noticeable, or to benefit much from parallelism.
If you have a different take, please let me know.
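A note on the measurement itself: GetTickCount() typically has a resolution of roughly 10 to 16 ms, so a single pair of calls around a short computation can be dominated by timer granularity. A minimal sketch of one way around that is to repeat the work many times and average, using a finer clock. The code below assumes a C++11 compiler (newer than the toolchain in the question), and `generate_and_log` is a hypothetical standard-library stand-in for the workload, not the MKL VSL routines; `time_ms` is likewise an illustrative name.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical stand-in workload: fill `out` with log(1 - u), u uniform in [0, 1).
// Plain standard-library code, NOT the MKL VSL calls from the question.
void generate_and_log(std::vector<double>& out, unsigned seed) {
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    for (double& x : out) {
        x = std::log(1.0 - dist(rng));  // 1 - u is in (0, 1], so the log is <= 0
    }
}

// Run `work` `repeats` times and return the average wall-clock time
// per call in milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F&& work, int repeats) {
    assert(repeats > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count() / repeats;
}
```

For example, with `std::vector<double> buf(800000);`, a call such as `time_ms([&] { generate_and_log(buf, 42u); }, 100)` averages over 100 runs. Swapping the lambda body for the real VSL calls, once with the Sequential and once with the Parallel build, would give numbers that are easier to compare than single GetTickCount() deltas.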