Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.
7248 Discussions

Issue using FEAST with high-dimensional manifolds and OMP

Gagan
Beginner
11,598 Views

What's good, guys.

Problem: I am giving an N x N sparse matrix as input to FEAST, and I am trying to solve for d-dimensional manifolds.

Note that this dimensionality, d, is independent of N.

When d isn't too large, say < 3000, FEAST works completely fine.

However, if I raise d to, say, 9000, I get errors such as:

Intel MKL Extended Eigensolvers: Size subspace 9001

#Loop | #Eig  |    Trace     | Error-Trace |  Max-Residual

OMP: Error #34: System unable to allocate necessary resources for OMP thread:

OMP: System error #35: Resource temporarily unavailable

OMP: Hint: Try decreasing the value of OMP_NUM_THREADS.

Abort trap: 6

I have tried setting export OMP_STACKSIZE=1024m, and I have also tried using the -stack_size argument in clang++ to specify a fairly large stack size. Neither of these worked. I have also looked at how much memory is consumed when I run FEAST as above with either setting, and the limit was roughly the same. This suggests to me that I'm not setting OMP_STACKSIZE correctly, or maybe OS X handles this differently?

I am using clang++ with C++11 threading, and MKL for all the math. Any assistance with this would be great.
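For anyone hitting the same OMP error: "Resource temporarily unavailable" generally means the system declined to create another thread, so it can help to see which limits the process actually runs under. The following is a minimal sketch, not taken from this thread, that reads the soft limits for stack size and process count with getrlimit. The helper names (read_soft_limit, print_limit, print_thread_limits) are hypothetical, and RLIMIT_NPROC is assumed to be available (it is on Linux and OS X, but it is not part of POSIX). Whether either limit is the actual cause here is not established in the thread.

```cpp
#include <sys/resource.h>

#include <cassert>
#include <cstdint>
#include <cstdio>

// Result of one getrlimit query (hypothetical helper type).
struct Limit {
    bool ok;             // false if getrlimit failed
    bool unlimited;      // true if the soft limit is RLIM_INFINITY
    std::uint64_t soft;  // soft limit value (bytes for RLIMIT_STACK)
};

// Read the soft limit for one resource, e.g. RLIMIT_STACK.
Limit read_soft_limit(int resource) {
    Limit out{false, false, 0};
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0) {
        return out;
    }
    out.ok = true;
    out.unlimited = (rl.rlim_cur == RLIM_INFINITY);
    out.soft = static_cast<std::uint64_t>(rl.rlim_cur);
    return out;
}

// Print one limit in a readable form.
void print_limit(const char* name, const Limit& lim) {
    if (!lim.ok) {
        std::printf("%s: query failed\n", name);
    } else if (lim.unlimited) {
        std::printf("%s: unlimited\n", name);
    } else {
        std::printf("%s: %llu\n", name,
                    static_cast<unsigned long long>(lim.soft));
    }
}

// Print the limits most likely to matter when thread creation fails.
void print_thread_limits() {
    print_limit("stack soft limit (bytes)", read_soft_limit(RLIMIT_STACK));
    print_limit("process soft limit", read_soft_limit(RLIMIT_NPROC));
}
```

Calling print_thread_limits() at the start of main, before the FEAST call, shows the values the run actually starts with, which can be compared against what was requested in the shell.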

1 Solution
Ying_H_Intel
Moderator
11,579 Views

Hi Gagan,

Thanks for your explanation. Right, after several hours I got the same OMP error that you reported, so there is no problem in your code. There is a bug in the function. I have sent you a private message with the fix.

Best Regards,
Ying

24 Replies
Gagan
Beginner
1,895 Views

Hi,

Per your recommendation, I added the -DMKL_ILP64 flag, but this causes the mallocs to fail instead of csrcoo.

It seems that using -DMKL_ILP64 causes an issue with memory allocation but lets the csrcoo function run fine. I say this because malloc doesn't report an error when -DMKL_ILP64 is removed (while I am still linking with ilp64). Here is the malloc error with -DMKL_ILP64:

testCase(59103,0x7fff79900310) malloc: *** mach_vm_map(size=1125865622122496) failed (error code=3)

*** error: can't allocate region

*** set a breakpoint in malloc_error_break to debug

testCase(59103,0x7fff79900310) malloc: *** mach_vm_map(size=1125865622122496) failed (error code=3)

*** error: can't allocate region

*** set a breakpoint in malloc_error_break to debug

where the compilation line was:

icpc -g -stdlib=libc++ -std=c++11  -O3 -DMKL_ILP64  -fimf-arch-consistency=true -vec-guard-write -no-ftz -ansi-alias -fPIC -funroll-all-loops -ipo -mtune=native -o testCase test.c -L/usr/lib -L/usr/local/lib -L/opt/intel/composerxe/mkl/lib -lmkl_intel_ilp64 -lmkl_core -lpthread -lznz -lm -openmp -lmkl_intel_thread -lz
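As background on why the compile flag and the linked interface layer need to agree: -DMKL_ILP64 makes MKL_INT a 64-bit integer, and -lmkl_intel_ilp64 expects 64-bit integers. The sketch below is only an illustration, not the bug from this thread (the thread does not say what the actual bugs were). It shows what a 64-bit interface would see if it were handed an array of 32-bit indices. The function names are hypothetical, and no MKL header is used so that the block stands alone.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the first 8 bytes of a 32-bit index array as one 64-bit
// index. This mimics a library built for 64-bit integers reading data
// that the caller stored as 32-bit integers.
std::int64_t as_seen_by_64bit_interface(const std::int32_t* indices) {
    std::int64_t value = 0;
    std::memcpy(&value, indices, sizeof value);
    return value;
}

// Compile-time helper: true if the index type is 8 bytes wide.
// In real code the type to check would be MKL_INT after including mkl.h.
template <typename IndexT>
constexpr bool is_64bit_index() {
    return sizeof(IndexT) == 8;
}

// Example guards using stand-in types instead of MKL_INT.
static_assert(is_64bit_index<std::int64_t>(), "64-bit index expected");
static_assert(!is_64bit_index<std::int32_t>(), "32-bit index is too narrow");
```

With the real headers, a line such as static_assert(sizeof(MKL_INT) == 8, "...") placed after the mkl.h include would fail the build whenever -DMKL_ILP64 is missing while the ilp64 interface is linked.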

Vitaly_Lukinov
Beginner
1,895 Views

Hi Gagan,

I found several bugs in your test.c example. You can find all the changes in the attached test_SPBLAS.cpp file.

The compilation line for the Intel C++ 14.0 compiler:

icpc -std=c++0x -DMKL_ILP64 -openmp -I${MKL_ROOT}/include test_SPBLAS.cpp -L${MKL_ROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_core -lmkl_intel_thread -lpthread -lm -o test.exe

where MKL_ROOT is the path to the __release_lnx/mkl directory.

About your example with the EE solver: I am investigating it on a machine with 65 GB of available RAM, and I am seeing the EE solver fail. I am now looking for the root cause. Could you tell me how much RAM you have available?

W.B.R.

Vitaly

Gagan
Beginner
1,895 Views

Hey man,

Thanks for this code, I will look at it ASAP.

Regarding the eigensolver failure: I am now observing this instead of the failed mallocs that I pasted above. However, I think they are related, because the eigensolver is failing due to the silent, but failed, allocation of the output array.

  • I pinpointed it to the output array by printing out its indices, and noticed that I'd hit a segfault well before the index variable was near the end. Without this test code, it would reach the FEAST subroutine and then return an exit code (error 201). Is this the error you were experiencing as well?

Hope my experience helps, and thanks for this improved code!
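To make the "silent, but failed, allocation" point concrete, here is a minimal sketch of sizing and allocating the eigenvector output array with explicit checks. It assumes the output array holds N x M0 doubles, where M0 is the subspace size. The names ArrayPlan, plan_output_array and allocate_checked are hypothetical, and N is not given in this thread, so any value used with it is only an example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>

// Size information for an n x m0 array of doubles (hypothetical helper).
struct ArrayPlan {
    bool valid;           // false if a count is zero or would overflow
    bool fits_int32;      // true if the element count fits a 32-bit int
    std::uint64_t count;  // n * m0
    std::uint64_t bytes;  // count * sizeof(double)
};

// Compute element and byte counts in 64-bit arithmetic, with overflow checks.
ArrayPlan plan_output_array(std::uint64_t n, std::uint64_t m0) {
    ArrayPlan p{false, false, 0, 0};
    if (n == 0 || m0 == 0) return p;
    if (n > std::numeric_limits<std::uint64_t>::max() / m0) return p;
    p.count = n * m0;
    if (p.count > std::numeric_limits<std::size_t>::max() / sizeof(double)) {
        return p;  // byte count would not fit in size_t
    }
    p.bytes = p.count * sizeof(double);
    p.fits_int32 = p.count <= static_cast<std::uint64_t>(
                                  std::numeric_limits<std::int32_t>::max());
    p.valid = true;
    return p;
}

// Allocate the array; returns nullptr if the plan is invalid or malloc
// fails. The caller must check the result and release it with std::free.
double* allocate_checked(const ArrayPlan& p) {
    if (!p.valid) return nullptr;
    return static_cast<double*>(std::malloc(static_cast<std::size_t>(p.bytes)));
}
```

As an example with an assumed N of 1,000,000 and a subspace size of 9001, the plan reports 9,001,000,000 elements (more than a 32-bit index can address) and 72,008,000,000 bytes for this one array. Checking the returned pointer before passing it to the solver turns a later segfault into an immediate, readable failure.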

Gagan
Beginner
1,895 Views

Oh, and 128 GB of main memory. I think I have used up to 108 GB without issue, so let's say that is the limit.

Sorry about not answering the question in the first post :P
