My goal is to use an Intel MKL solver in my SMP program to replace the current SOR method, in the hope that it will be faster.
I modified the Preconditioned CG sample code to solve a linear system with i, j, k = 100 (code uploaded).
If the code is compiled with
ifort -qmkl cg_jacobi_precon.f90
it runs normally, with implicit parallelization as expected.
However, if I compile it with the -qopenmp flag (without even adding OpenMP directives to the code)
ifort -qmkl -qopenmp -mcmodel=large -shared-intel cg_jacobi_precon.f90
and run it with ./a.out, a segmentation fault occurs:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line     Source
libifcoremt.so.5   0000149902209359  for__signal_handl  Unknown  Unknown
libpthread-2.27.s  00001498F61B4980  Unknown            Unknown  Unknown
libmkl_avx512.so.  00001498F501C64C  mkl_spblas_lp64_a  Unknown  Unknown
libmkl_intel_thre  00001498FE44E595  mkl_spblas_lp64_d  Unknown  Unknown
libiomp5.so        00001498F944A893  __kmp_invoke_micr  Unknown  Unknown
libiomp5.so        00001498F93BDCB3  Unknown            Unknown  Unknown
libiomp5.so        00001498F93BEF7D  __kmp_fork_call    Unknown  Unknown
libiomp5.so        00001498F937A425  __kmpc_fork_call   Unknown  Unknown
libmkl_intel_thre  00001498FE44E11B  mkl_spblas_lp64_d  Unknown  Unknown
libmkl_intel_thre  00001498FE1CD051  mkl_spblas_lp64_m  Unknown  Unknown
a.out              0000000000401CC0  Unknown            Unknown  Unknown
a.out              0000000000400F62  Unknown            Unknown  Unknown
libc-2.27.so       00001498F5DD2C87  __libc_start_main  Unknown  Unknown
a.out              0000000000400E6A  Unknown            Unknown  Unknown
If I reduce the system size to i, j = 100, k = 10, everything seems fine.
Going back to i, j, k = 100, I tried
ulimit -s unlimited
but that did not solve the problem.
I am not familiar with Intel MKL, so please tell me if I have done anything wrong.
※By the way, 100^3 isn't really a large array. I run computational simulations with a 10x larger system in the same environment with OpenMP, and no problem occurs.
The same question is posted in the Intel Fortran Compiler section, with no solution yet.
Thanks for reaching out to us.
We were able to reproduce the issue you raised. We are working on it and will get back to you soon.
Meanwhile, could you please let us know which MKL version is being used in this case?
Thanks for the details.
Here is an update on the issue you raised.
We tested the sample code you provided and, based on our observations, the issue is not caused by the MKL calls. We modified the reproducer (test.f90) to remove all oneMKL calls and dropped "-qmkl" from the compilation line. The updated reproducer still segfaults when compiled with "-qopenmp".
We are working on this issue and will get back to you as soon as we have an update.
Hi @SamW1 ,
Thanks for your patience.
Here is the reason for the segmentation fault when the -qopenmp flag is included:
Without the OpenMP flag, the large arrays in the code are globals.
With the OpenMP flag, they are local to the MAIN function and overflow the stack, hence the segmentation fault.
If we add the SAVE attribute to the large arrays in the code, the issue goes away (modified sample code attached).
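A minimal sketch of the SAVE fix described above, not the attached code itself. The program name, array names (x, rhs, tmp) and the size n are hypothetical, chosen only to mirror the 100^3 case. It also shows ALLOCATABLE as an alternative that is assumed here to keep a large array off the stack in the same way; that alternative is not something the reply above suggests.

```fortran
program save_large_arrays
  implicit none
  ! Hypothetical size: one value per cell of a 100 x 100 x 100 grid.
  integer, parameter :: n = 100*100*100
  ! SAVE gives these arrays static storage, so they are not placed
  ! on the stack of the main program when -qopenmp is used.
  double precision, save :: x(n), rhs(n)
  ! Assumed alternative (not from the thread): heap allocation.
  double precision, allocatable :: tmp(:)
  integer :: i

  allocate(tmp(n))

  ! Each iteration touches only its own element, so the loop is safe
  ! to run in parallel; without -qopenmp the directive is a comment.
  !$omp parallel do
  do i = 1, n
     rhs(i) = 1.0d0
     x(i)   = 0.0d0
     tmp(i) = rhs(i) + x(i)
  end do
  !$omp end parallel do

  print *, 'sum =', sum(tmp)
  deallocate(tmp)
end program save_large_arrays
```

This should build with the same kind of command line as in the original post, for example ifort -qopenmp followed by the source file name, and can be used to check that the storage change alone removes the crash before touching the full solver.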
The screenshot below shows the output from our test with the sample reproducer you provided (cg_jacobi_precon.f90). As you can see, it now works both with and without the -qopenmp flag.
Please let us know if this resolves the issue, and feel free to get back to us if you have any further questions.
Hi @SamW1 ,
As we haven't heard back from you, we are closing this thread. If you need any additional assistance from Intel, please post a new question, as this thread will no longer be monitored.