I have a few questions about multithreading in IPP 7.0. I will try to explain them with the example below.
Let's assume there is an application that uses three dynamic libraries (DLLs loaded with LoadLibrary). Let's call them dll1, dll2 and dll3. Each of these libraries contains some threaded IPP functions.
I run this application on a PC with 8 cores (core numbers 0-7). My goal is to share the cores between the libraries.
First I load dll1, set the number of threads to 4 and the affinity (ippSetAffinity) to cores 0-3, and then run a dll1 function continuously in the background. Then I load dll2, set the number of threads to 2 and the affinity to cores 4-5, and start the processing function from dll2.
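To make the core split concrete, here is a minimal sketch of how the two core sets could be represented as bitmasks. The helper names (core_range_mask, core_count) are hypothetical and not part of IPP or OpenMP, and the sketch assumes bit i stands for logical core i. How such a mask would be passed to the actual affinity call is not shown, because that depends on the API in use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an IPP/OpenMP function): build a bitmask with
   bits first..last set, inclusive. Assumes bit i means logical core i. */
static uint64_t core_range_mask(unsigned first, unsigned last)
{
    assert(first <= last && last < 64);
    uint64_t width = (uint64_t)last - first + 1;
    /* Shifting a 64-bit value by 64 is undefined, so handle the full range. */
    uint64_t bits = (width == 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
    return bits << first;
}

/* Hypothetical helper: number of cores present in a mask. */
static unsigned core_count(uint64_t mask)
{
    unsigned n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

With this, dll1 would get core_range_mask(0, 3) and dll2 would get core_range_mask(4, 5). The two masks do not overlap, and their core counts (4 and 2) match the thread counts set for each library.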
My first question is: Is it possible to create several independent "instances" of IPP in the same system process? I found out that I can set a different number of threads (ippSetNumThreads) for dll1 and dll2 if these libraries (dll1, dll2, dll3) are statically linked with IPP through the *_t.lib libraries, so they seem to be independent. But does this hurt performance?
Now another situation: functions from dll1 and dll2 are running continuously, but I would like to use a function from dll3 that has the highest priority. So I want to "take back" a few cores from dll1 and dll2 to process the dll3 function. My second question is: Can I do this dynamically using ippSetNumThreads and ippSetAffinity? And is it safe?
Very interesting question.
The IPP internal multi-threading algorithm is managed by the Intel OpenMP library. The IPP functions you cite above (ippSetNumThreads, ippSetAffinity, etc.) to control threading within the library are interface functions to the OpenMP API. The OpenMP library offers a much broader API than that which is provided by IPP, but I'm not sure if what you want to do is supported by OpenMP. To find out more about what the OpenMP library can do, please go here:
You might also want to ask some questions on the Intel Compiler forum, where they will have better information regarding the behavior and features of the OpenMP library.
Based on the thread affinity documentation (see link above), it appears that you can dynamically change the affinity settings, so you may be able to exercise the type of control you describe above. For example, see this paragraph from that documentation:
Once an OpenMP thread has set its own affinity mask via a successful call to kmp_affinity_set_mask(), then that thread remains bound to the corresponding OS proc set until at least the end of the parallel region, unless reset via a subsequent call to kmp_affinity_set_mask().
Regarding your first question, in the ideal case you should be using the dynamic OpenMP library, regardless of whether you link with the IPP library statically or dynamically. It may be that in the static experiment you mention above you have two instances of the OpenMP library running on your system (I'm assuming you used the static version of the OpenMP library, which has been deprecated and will be discontinued in some future edition of the Intel compiler). In the case of the dynamic OpenMP library I'm not sure if you'd get the same behavior as your experiment.
If you run multiple copies of the OpenMP library you will likely run into conflicts (the reason for deprecating the static version of the OpenMP library). Each copy of the OpenMP library is trying to control a set of global resources (hardware threads on your CPU) and multiple instances of the OpenMP library will result in two masters trying to control a single set of resources.
Sorry, this doesn't completely answer your question; however, based on the documentation quote above, it does seem that you should be able to "throttle back" your multiple background threads in order to allow your critical threads to have "priority." I suspect that you might have to split this into multiple processes, but I am not sure.
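As a rough illustration of the "throttle back" idea, the sketch below only computes a new core assignment. It does not call IPP or OpenMP, and it says nothing about whether the runtime would honor such a change while the background functions are running. The names release_cores and count_cores are hypothetical, and bit i is assumed to stand for logical core i.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of cores present in a mask. */
static unsigned count_cores(uint64_t mask)
{
    unsigned n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: take the highest-numbered cores away from *donor
   until only `keep` cores remain. Returns the mask of released cores. */
static uint64_t release_cores(uint64_t *donor, unsigned keep)
{
    assert(donor != 0);
    uint64_t released = 0;
    while (count_cores(*donor) > keep) {
        uint64_t bit = UINT64_C(1) << 63;
        while (!(*donor & bit))   /* find the highest set bit */
            bit >>= 1;
        *donor &= ~bit;
        released |= bit;
    }
    return released;
}
```

For the 8-core example, starting from dll1 = 0x0F (cores 0-3), dll2 = 0x30 (cores 4-5) and two unused cores 0xC0 (cores 6-7), shrinking dll1 to 2 cores and dll2 to 1 core releases 0x0C and 0x20. dll3 could then be given 0xC0 | 0x0C | 0x20 = 0xEC, which is 5 cores. Each library would then need its thread count and affinity updated to match, through whatever interface it exposes.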
Thank you for your response. You are right, I linked OpenMP statically (libiomp5mt.lib). However, I checked the dynamic version as well: I linked libiomp5md.lib and copied the proper DLLs, and the behavior seems to be the same. I am really surprised now. I can still set a different number of threads for each of my own libraries (dll1, dll2, dll3). Anyway, maybe I should implement some real threaded functions in these libraries and check the behavior in detail, rather than only setting the number of threads.
I am aware of the risk of using several copies of the OpenMP library, but I just hope there is a mechanism inside Intel's libraries that can manage it :-) The main problem is that in the final application these DLLs (dll1, dll2, dll3) are completely independent modules, and I cannot influence their code apart from the interface for setting the number of threads and the affinity.