Hello everybody,
I am using the Intel MKL solver for the nonlinear least squares problem with linear (bound) constraints:
http://software.intel.com/sites/products/documentation/hpc/mkl/mklman/GUID-B6BADF1C-F90C-4D30-8B84-CF9A5F970E08.htm#GUID-B6BADF1C-F90C-4D30-8B84-CF9A5F970E08
Question: what do I need to do to run the optimizer in parallel?
A. Consider the Intel example ex_nlsqp_bc_c.c. Suppose I just call omp_set_num_threads(n) before starting the minimization loop:
omp_set_num_threads(n); // no pragmas! I just want to make sure I don't have to put any pragmas in the loop.
while (not_converged)
{
    dtrnlspbc_solve(..., &OPTION); // Intel MKL solver iteration; OPTION is the RCI request
    if (OPTION == 1) { my_function(); }             // user-supplied function
    else if (OPTION == 2) { djacobi(my_function); } // Intel MKL numerical Jacobian; does it call my_function from different threads?
}
In multithreaded mode, what is actually done in parallel - the construction of the Jacobian, or just the manipulations of the Jacobian? I hope the calls to the user-supplied function are made with different X by multiple threads...
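For intuition about what djacobi could parallelize: a central-difference Jacobian is naturally parallel over columns, with each thread perturbing a different component of X and calling the user function on a private copy. This is only a plain C/OpenMP sketch of that idea, not MKL's actual implementation; my_function here is a made-up example with a known analytic Jacobian:

```c
#include <assert.h>
#include <math.h>
#include <string.h>

/* hypothetical user function: f_i(x) = x_i^2 - i, so df_i/dx_j = 2*x_i when i == j, else 0 */
static void my_function(const double *x, double *f, int m)
{
    for (int i = 0; i < m; i++)
        f[i] = x[i] * x[i] - (double)i;
}

/* central-difference Jacobian, one column per loop iteration; each
   iteration uses only private buffers, so iterations can run on
   different threads as long as my_function is thread-safe
   (assumes m, n <= 64 to keep the sketch simple) */
static void jacobi_central(const double *x, double *jac, int m, int n, double eps)
{
    #pragma omp parallel for
    for (int j = 0; j < n; j++) {
        double xp[64], xm[64], fp[64], fm[64];   /* per-iteration private buffers */
        memcpy(xp, x, n * sizeof(double));
        memcpy(xm, x, n * sizeof(double));
        xp[j] += eps;
        xm[j] -= eps;
        my_function(xp, fp, m);    /* different threads call with different X */
        my_function(xm, fm, m);
        for (int i = 0; i < m; i++)
            jac[i * n + j] = (fp[i] - fm[i]) / (2.0 * eps);
    }
}
```

If the callback has no shared mutable state, the columns are completely independent; that is the structural reason a numerical-Jacobian routine is free to call the user function concurrently.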
B. To check this, I inserted omp_get_thread_num() in my function:
void my_function() {
    int i = omp_get_thread_num();
    printf("%i\n", i); // It prints different thread numbers - does that mean it is executed from different threads?
}
So all I need is a thread-safe function, plus setting OMP_NUM_THREADS and linking the correct libraries?
I wish there were better documentation on this issue.
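On the "thread-safe function" point: the usual pitfall is hidden shared state inside the callback. A minimal sketch (all names hypothetical, not MKL API) contrasting a version with a static scratch buffer, which races under concurrent calls, with one that touches only its arguments and locals:

```c
#include <assert.h>
#include <math.h>

/* NOT thread-safe: the static buffer is shared by every caller, so
   concurrent calls overwrite each other's intermediate results */
static double residual_unsafe(const double *x, int n)
{
    static double scratch[64];          /* shared mutable state: data race */
    double s = 0.0;
    for (int i = 0; i < n; i++) scratch[i] = x[i] * x[i];
    for (int i = 0; i < n; i++) s += scratch[i];
    return s;
}

/* thread-safe: uses only its arguments and stack locals */
static double residual_safe(const double *x, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++) s += x[i] * x[i];
    return s;
}

/* the safe version can be called from a parallel loop, each iteration
   with a distinct X, with no synchronization needed */
static double sum_over_points(double xs[][2], int npoints)
{
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (int p = 0; p < npoints; p++)
        total += residual_safe(xs[p], 2);
    return total;
}
```

So "thread-safe" here mostly means: no static or global scratch storage, no un-synchronized writes to shared data; reading shared inputs is fine.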
MKL ships two variants of its libraries. Both are multi-thread safe.
One variant creates its own OpenMP thread pool. This OpenMP-threaded version is intended for use from a single-threaded application.
The second variant does not create its own OpenMP thread pool. This is the one you would typically use from an OpenMP application.
This may seem counterintuitive until you realize that using the OpenMP-threaded MKL from an OpenMP application results in (application threads) * (MKL threads per call) threads in total. With the defaults, that is the number of logical processors squared - oversubscription.
This said, there are some cases where you might want to use both with their own OpenMP pool (two pools). But in doing so you may have to use
omp_set_num_threads(o);
mkl_set_num_threads(k); // mkl_set_num_threads controls the size of MKL's own pool
// o*k == number of logical processors
And/or only call MKL from outside parallel regions AND set the environment variable KMP_BLOCKTIME=0.
And/or use a first-level parallel region with a greatly reduced number of threads for both pools.
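The multiplication of the two pool sizes can be demonstrated with plain nested OpenMP, no MKL involved. A sketch, assuming an OpenMP-capable compiler (it degrades to a serial count of 1 otherwise):

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* serial fallback stubs so the sketch still compiles without OpenMP */
static void omp_set_dynamic(int d) { (void)d; }
static void omp_set_max_active_levels(int l) { (void)l; }
#endif

/* With nesting enabled, an outer region of o threads that each start an
   inner region of k threads runs the inner body o*k times in total -
   the same multiplication that oversubscribes a threaded MKL called
   from inside an OpenMP application. */
int total_inner_executions(int o, int k)
{
    int count = 0;
    omp_set_dynamic(0);              /* don't let the runtime shrink teams */
    omp_set_max_active_levels(2);    /* permit one level of nesting */
    #pragma omp parallel num_threads(o)
    {
        #pragma omp parallel num_threads(k)
        {
            #pragma omp atomic
            count++;
        }
    }
    return count;
}
```

With o = k = (number of logical processors), that inner body runs processors-squared times, which is exactly the oversubscription described above.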
Jim Dempsey
Hi,
I have a similar question in regard to Fortran.
I'm linking with
LIBS=-L$(CFITSIO)lib64/ -lcfitsio $(MKLROOT)/lib/intel64/libmkl_lapack95_lp64.a -Wl,--start-group $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a $(MKLROOT)/lib/intel64/libmkl_core.a $(MKLROOT)/lib/intel64/libmkl_intel_thread.a -Wl,--end-group -lpthread -lm
I use OpenMP explicitly in several regions of the code, and this is working properly.
(1) How do I ensure a LAPACK call will use the available threads? E.g.,
CALL SYEVR(COVARIANCE,EIGVAL,UPLO,Z=EIGVEC,ABSTOL=ABSTOL,INFO=INFO)
(2) Is there a way to determine the MKL version that the code is linked to at run-time?
Thanks,
-- Pete Schuck
It looks like threading inside syevr depends on there being significant work done by gemv et al. at a lower level, or better, if ?latrd could be parallelized to use multiple copies of gemv. See
http://software.intel.com/en-us/articles/intel-mkl-threaded-functions
http://software.intel.com/en-us/forums/topic/292428
(where it is suggested that threading becomes useful from size 128 upward)
I don't see a clear indication that threading is considered at a higher level than gemv. It's probably difficult on account of the varying gemv sizes.
You would either call MKL threaded functions from outside parallel regions, or use OMP_NESTED and OMP_NUM_THREADS to control how many MKL threads are in use and try to increase parallelism by calling LAPACK from multiple threads. There aren't well-developed facilities for placing the adjacent gemv threads on a single cache if you are trying to run multiple copies.
I suppose it would be interesting to get a report on which MKL version is active, other than by checking shared object search paths, but I don't see such a thing in the docs.
