10-05-2011 05:25 PM
I am unable to get Intel MKL to work properly in an MPI program. In fact, merely creating and committing descriptors is enough to crash the program, even if the program does not use MKL functionality beyond the descriptors.
The program uses MPI, with each MPI process occupying a whole node (not a single core). The program has OpenMP parallel regions. I am using MKL's 1D FFT routines as well as the trig transforms.
I compile using Intel's icpc. Linking options used are as follows:
-openmp `mpiCC -showme:link` -lmkl_intel_lp64 -lmkl_sequential -lmkl_lapack -lmkl_core -pthread -lguide -lm
I want only the single thread, single core version of the FFT routines. Are these linking options OK?
I believe the Intel MKL version is 9, but it could be 10. I can check if that is important.
The program does not rely on MKL for any parallelism. Any attempt by MKL to run in parallel is highly undesirable and will break the program.
Chapter 6 of the Intel MKL User's Guide considers MPI and possible conflicts in the execution environment (see Table 6-1). I used mkl_set_num_threads() to set the number of threads to 1 as suggested, but that did not help. I can try to reproduce this problem in a simple program, though that will take considerable work.
It is difficult to figure out MKL's execution environment and thread safety from the documentation and examples.
I would greatly appreciate some insight ASAP.
10-05-2011 06:23 PM
Since you linked mkl_sequential, MKL should not use libiomp5; only the omp parallel regions compiled by icpc would invoke it (or, in much older versions, libguide). mkl_set_num_threads() would therefore have no effect. The MPI standard and the most widely used MPI implementations require that your MPI library usage begin with MPI_Init_thread(). This would also take care of usage of the mkl_thread library. mkl_sequential is widely used in MPI applications, including inside omp parallel regions. As far as I know, mkl_sequential is an MKL 10 library, and the OpenMP library for mkl_thread is libiomp5. Your icpc -openmp option automatically chooses libiomp5 or libguide, depending on the icpc version, so -lguide and any explicit pthreads link reference are redundant.
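To make the point about redundant flags concrete, here is a rough sketch of what the link step might look like with -lguide and -pthread dropped. The file names main.o and app are hypothetical, and every other flag is copied unchanged from the original post. I have not checked it against a particular MKL or icpc version, so treat it as a starting point only.

```shell
# Link-step sketch. main.o and app are hypothetical names; all flags except
# the two removed ones (-lguide, -pthread) are as given in the original post.
icpc -openmp main.o -o app \
    `mpiCC -showme:link` \
    -lmkl_intel_lp64 -lmkl_sequential -lmkl_lapack -lmkl_core -lm
```

With -openmp on the icpc command line, the compiler driver is expected to pull in its own OpenMP runtime, which is why the explicit -lguide is left out here.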