I have a rather subtle use case where I'd appreciate some advice.
I have an application with a number of math libraries as shown in the illustration attached:
So my question is this: is it possible to have threaded and non-threaded versions of MKL loaded in this way at the same time? Or am I asking for trouble?
If I'm asking for trouble, would linking MKL statically to each library be an acceptable workaround? (I know this would mean a bigger footprint on disk and possibly more memory usage, but I don't expect that to be a problem)
Or will I have to settle for using a sequential version of the MKL for OpenCV as well (which is probably preferable to risking the threaded version of the MKL for the Fortran application).
What you could do is make a compromise.
Run the OpenMP with fewer than full complement of threads. .AND.
Run parallel MKL with fewer than full complement of threads.
You may have to experiment to find the best mix.
Also, when the OpenCV portion runs at different time intervals than OpenMP/MKL, consider setting the KMP_BLOCKTIME to 0 or some small number. Doing so may reduce the latency of thread wakeup between the different thread pools at a sacrifice of increased(induced) wakups between parallel regions within each thread pool.
I assume that if I used system calls or environment variables then there'd be some latency, and in reality I'll want to move back and forth between OpenCV and OpenMP quite frequently so I'd rather avoid this.
Try setting the environment variable KMP_BLOCKTIME=0, and run the multi-thread KML
IOW run a test and see what happens.
Can't argue with that!
The only thing is, I'm some way off building and running the code - I'm still trying to sort out the architecture (though the Fortran and C++ libraries already exist and have been tested separately).
The answer does seem to have come through loud and clear that I can't run dynamic instances of the MKL with different threading layers side-by-side. I'm still tempted to try the static linking option.
Thank you Jim; that does sound a bit of a compromise though; if I really must compromise somewhere, I'd rather put up with the single-threaded MKL for OpenCV, since getting maximum performance from the Fortran library is really critical.