I am trying to use the parallel (threaded) version of sgemm, but no matter how many threads I request, the program runs on only one core of a 12-core Westmere machine. Can anyone help me?
I am using Intel Compiler 11.1.073, and the build line in my makefile is:
icc -o blassgemm blassgemm.c -openmp -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -liomp5 -lpthread
I set the number of threads using mkl_set_num_threads.
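A minimal sketch of how the thread count might be set and then read back, assuming the call is made before sgemm. It is not the full program. The helper name request_mkl_threads is hypothetical, and the fallback stubs exist only so the fragment compiles on a machine without MKL; they do no threading.

```c
#include <stdio.h>

/* Use the real MKL header when the compiler can find it. */
#if defined(__has_include)
#  if __has_include(<mkl.h>)
#    include <mkl.h>
#    define HAVE_MKL 1
#  endif
#endif

#ifndef HAVE_MKL
/* Stand-ins (assumption, not MKL): they only remember the requested value. */
static int stub_threads = 1;
static void mkl_set_num_threads(int n) { stub_threads = n; }
static int mkl_get_max_threads(void) { return stub_threads; }
#endif

/* Hypothetical helper: ask MKL for `requested` threads, then report the
   number of threads MKL says it may use for later calls such as sgemm. */
int request_mkl_threads(int requested)
{
    int reported;

    mkl_set_num_threads(requested);
    reported = mkl_get_max_threads();
    printf("requested %d threads, MKL reports up to %d\n", requested, reported);
    return reported;
}
```

Calling request_mkl_threads(12) just before the sgemm call would show whether the request is reaching MKL at all. If the reported value is 1, the request is probably not taking effect, which would point at the build or link setup rather than at sgemm itself.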