    @constraint(ComputingUnits="${ComputingUnits}")
    @task(returns=list)
    def createBlock(BSIZE, MKLProc, diag):
        import os
        os.environ["KMP_AFFINITY"] = "verbose"
        os.environ["MKL_NUM_THREADS"] = str(MKLProc)
        # Random BSIZE x BSIZE block of doubles
        block = np.array(np.random.random((BSIZE, BSIZE)), dtype=np.double, copy=False)
        mb = np.matrix(block, dtype=np.double, copy=False)
        # Make the block symmetric
        mb = mb + np.transpose(mb)
        if diag:
            # Strengthen the diagonal of diagonal blocks
            mb = mb + 2*BSIZE*np.eye(BSIZE)
        return mb

MKL_NUM_THREADS is set to 64 in order to take advantage of all the cores. When executing routine number 32, I obtain the following error:
    OMP: Error #34: System unable to allocate necessary resources for OMP thread:
    OMP: System error #11: Resource temporarily unavailable
    OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.
I found here https://software.intel.com/en-us/forums/intel-open-source-openmp-runtime-library/topic/622016 that threads are not destroyed, so I may be reaching the thread limit of the machine. The thing is that only one thread is running at any given time, so only 64 OpenMP threads are awake. My problem is that I'm running this code on a shared cluster, so I would rather not recompile the library with custom settings if possible. Is there a way to avoid this problem without decreasing the number of threads running on the machine? I think that using fewer threads would avoid the problem, but this is part of a bigger program and I am really interested in keeping the 64 threads.
Regards,
Ramon
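A note on the error itself: on Linux, system error #11 is EAGAIN, which thread creation returns when a limit on the number of threads or processes is reached (typically the per-user limit shown by "ulimit -u", or a system-wide cap). The sketch below is one rough way to compare that limit with the number of threads the process currently holds. It assumes a Linux node with /proc mounted, and the helper names are hypothetical.

```python
import resource


def thread_limits():
    """Return the (soft, hard) per-user limit on processes/threads (RLIMIT_NPROC)."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


def current_thread_count(status_path="/proc/self/status"):
    """Read the 'Threads:' field for this process; return None if unavailable."""
    try:
        with open(status_path) as fh:
            for line in fh:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except (IOError, OSError):
        # Not Linux, or /proc is not mounted
        pass
    return None


if __name__ == "__main__":
    soft, hard = thread_limits()
    print("RLIMIT_NPROC soft/hard:", soft, hard)
    print("threads in this process:", current_thread_count())
```

Calling current_thread_count() at the end of each task would show whether the count keeps growing by roughly 64 per call, which is what you would expect if threads from earlier calls are never destroyed.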
Hi Sergey,
Thanks for the fast response.
I've tried what you suggested and, for the "ulimit -s" call, I get "unlimited". I've also set "KMP_STACKSIZE" to "1000m". I get the same error at the same point.
I forgot to mention that I'm using MKL through NumPy with Intel Python 2.7.11. As shown in the example code, all the variables are set before entering the first parallel region. However, the module is imported before that. Could this be a problem? Thanks in advance.
Ramon
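On the import-order question: as far as I understand, MKL and the OpenMP runtime read most of their environment variables once, when they initialize, so values written to os.environ after the library is already loaded and running may be ignored. A cautious approach is to set them before NumPy is imported at all. This is only a sketch: set_threading_env is a hypothetical helper, and the environ and modules parameters exist only so it can be tried out in isolation.

```python
import os
import sys


def set_threading_env(num_threads, stacksize=None,
                      environ=os.environ, modules=sys.modules):
    """Set MKL/OpenMP variables; return True if numpy was NOT yet imported."""
    environ["MKL_NUM_THREADS"] = str(num_threads)
    if stacksize is not None:
        environ["KMP_STACKSIZE"] = stacksize
    # If numpy is already loaded, the settings above may come too late
    return "numpy" not in modules
```

Intended use is at the very top of the script, before "import numpy as np": if the function returns False, the variables were set after NumPy was loaded and may not take effect.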
If resource exhaustion occurs with increasing number of threads, decreasing omp_stacksize seems a more likely tactic. Assuming it was working at 4m with a reasonable number of threads, changes by more than a factor of 2 seem ridiculous.
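To put numbers on the stack size argument: each OpenMP worker thread gets its own stack, so the space reserved grows as threads times stack size. A small sketch, assuming values with an explicit unit suffix as used in this thread ("4m", "1000m") and binary units. The helper names are made up for illustration.

```python
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_stacksize(value):
    """Convert a string such as '4m' or '1000m' to bytes (suffix required here)."""
    value = value.strip().lower()
    if len(value) < 2:
        raise ValueError("expected <integer><b|k|m|g>, got %r" % value)
    number, unit = value[:-1], value[-1]
    if unit not in _UNITS or not number.isdigit():
        raise ValueError("expected <integer><b|k|m|g>, got %r" % value)
    return int(number) * _UNITS[unit]


def total_stack_bytes(num_threads, stacksize):
    """Stack space reserved if num_threads threads each get `stacksize`."""
    return num_threads * parse_stacksize(stacksize)


for size in ("1m", "4m", "1000m"):
    gib = total_stack_bytes(64, size) / float(1024 ** 3)
    print("64 threads x %s = %.2f GiB" % (size, gib))
```

With "1000m", 64 threads reserve 62.5 GiB of stack for a single team, and more if threads from earlier calls are never released, which is why a larger stack size is unlikely to help here.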
Hi to both and thanks for the responses,
>> If resource exhaustion occurs with increasing number of threads, decreasing omp_stacksize seems a more likely tactic.
I don't really understand why I should decrease the stack size. Nevertheless, I set "KMP_STACKSIZE" to "1m". The program crashed at exactly the same point.
>> In case of C/C++ languages I use kmp_set_defaults function, like: ... to set an environment variable(s), or set all environment variable(s) before starting your application.
So I assume it is not possible to change KMP_NUM_THREADS dynamically, depending on the call being made at each moment? I thought that the number of OpenMP threads was defined by the environment variable at the beginning of the parallel region.
>> What is a default value for BSIZE?
BSIZE is equal to 4096 in this execution. The block created takes 134 MB. Is this important?
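For reference, the 134 MB figure follows from 4096 x 4096 double-precision values at 8 bytes each. A quick check, where block_bytes is a hypothetical helper:

```python
def block_bytes(bsize, itemsize=8):
    """Bytes in a bsize x bsize block of `itemsize`-byte elements (8 for np.double)."""
    return bsize * bsize * itemsize


print(block_bytes(4096))        # 134217728 bytes
print(block_bytes(4096) / 1e6)  # about 134.2 MB
```

Note that the expressions mb + np.transpose(mb) and 2*BSIZE*np.eye(BSIZE) in createBlock each allocate temporaries of this same size, so the peak memory per call is likely several times the size of one block.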
>> It is clear that application crashed with a default value for OMP stack size.
So far I have tried leaving the value unchanged and setting it to "1000m" and "1m". The program always crashes at the same point, so I tend to think the stack size is not the direct cause.