I'm writing a program that calls geqp3 frequently. Because the MKL subroutine is called as geqp3(A, jpvt), I need an array A.
However, the array size may be slightly different in each call. I know that it works if I allocate memory for A and jpvt each time, call geqp3, and then deallocate A and jpvt, but this is slow. In particular, when I use OpenMP to parallelize the program, CPU usage is only 30% when the number of threads is set to 32 (my server has 16 cores and 32 threads).
Is there a simple way to solve this problem? Thank you very much!
Memory allocation and deallocation are sequential operations. If they are called from multiple threads, they may create a bottleneck.
I am not sure I understand your problem: if the size of A and jpvt is very similar from call to call, why not allocate them just once for each thread? Each thread can allocate buffers large enough for the largest problem and then reuse them, instead of allocating and freeing them on every call.
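To make the allocate-once suggestion concrete, here is a minimal C sketch under stated assumptions. The names process_all and factorize_stub are hypothetical, and the stub only stands in for the real geqp3 call so that the sketch runs without MKL. Each thread allocates its buffers once, sized for the largest problem, and treats the front of the buffer as a contiguous m-by-n matrix with leading dimension m. If the code is compiled without OpenMP, the pragmas are ignored and it runs serially.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real factorization call (e.g. geqp3).
   It only fills the buffers so that the sketch runs without MKL. */
static void factorize_stub(double *a, int *jpvt, int m, int n)
{
    for (int j = 0; j < n; ++j) {
        /* A reused jpvt still holds the previous result, so reset it
           before every call (in geqp3, 0 marks a free column). */
        jpvt[j] = 0;
        for (int i = 0; i < m; ++i)
            a[(size_t)j * m + i] = (double)(i + j);  /* column-major, lda = m */
    }
}

/* Processes `count` problems of size rows[k] x cols[k].
   Returns the number processed, or -1 if an allocation failed or a
   problem was larger than max_m x max_n. */
int process_all(const int *rows, const int *cols, int count,
                int max_m, int max_n)
{
    assert(max_m > 0 && max_n > 0);

    int done = 0;
    int failed = 0;

    #pragma omp parallel reduction(+:done) reduction(|:failed)
    {
        /* One allocation per thread, sized for the largest problem. */
        double *a = malloc((size_t)max_m * (size_t)max_n * sizeof *a);
        int *jpvt = malloc((size_t)max_n * sizeof *jpvt);

        #pragma omp for
        for (int k = 0; k < count; ++k) {
            int m = rows[k];
            int n = cols[k];
            if (a == NULL || jpvt == NULL || m > max_m || n > max_n) {
                failed = 1;
                continue;
            }
            /* Reuse the same buffers; no allocation inside the loop. */
            factorize_stub(a, jpvt, m, n);
            done += 1;
        }

        /* One deallocation per thread, after all its problems are done. */
        free(a);
        free(jpvt);
    }

    return failed ? -1 : done;
}
```

With this layout, the allocator is called once per thread rather than once per problem, so any contention in it no longer scales with the number of geqp3 calls.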
Dear Chao: Thank you for your reply! I also tried this, but it failed. One method is to use the Fortran 77 interface, but then I ran into a problem similar to the one described in the post "MKL ERROR: Parameter 1 was incorrect on entry to cgelsy." I think both may be caused by the same thing, so I am now focusing on the latter. Could you please also help me find the reason for the problem posted in "MKL ERROR: Parameter 1 was incorrect on entry to cgelsy."? Thank you!
This error could be caused by invalid input data. Please double-check the requirements for this function in the manual.
Also, do you have some test code that shows this problem? That would help to find the root cause.
Is it Windows or Linux, and is it IA-32 code or Intel 64? I just ran your code on IA-32 Windows, and it did not report any error. The libraries I linked were: mkl_intel_c.lib mkl_sequential.lib mkl_core.lib
Since you already have high-level threading, I linked with the sequential MKL libraries.
Thank you very much! I followed your steps and it works: I linked mkl_sequential.lib and mkl_core.lib, that is, the sequential MKL libraries. I have now switched to the link line generated by the Intel® Math Kernel Library Link Line Advisor, also set to the sequential MKL libraries, and that works as well.