Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Segmentation fault on declaring large array when mkl is linked

JaspervdK
Beginner
2,510 views

Hello,

 

I have a curious issue where a very simple Fortran program produces a segmentation fault when MKL is linked at compile time. The program is:

 

--------------------------------------------------------------

program main
implicit none
  integer :: N_total
  complex*16, allocatable, dimension(:,:) :: Z_matrix_c

  N_total = 254571 

  print*, 'before allocation'
  allocate(Z_matrix_c(N_total, N_total))
  print*, 'After Z_matrix_C allocation'
end program main

--------------------------------------------------------------

I know the complex*16 type declaration is not standard, but for legacy reasons I have to stick with it for now. The program allocates a very large (~1 TB) array of double-precision complex numbers. I am running this code on a cluster, where it works fine if I compile with ifort without any flags. If, however, I link the MKL library via -qmkl, the program segfaults upon entering the allocate statement. Curiously, if I increase the dimension of the array and make N_total say 270,000, the program also runs without issue.
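For reference, the size quoted above can be checked with plain shell arithmetic. The sketch below assumes a POSIX shell with 64-bit integer arithmetic and 16 bytes per complex*16 element; the variable names are illustrative and are not part of the original program.

```shell
#!/bin/sh
# Estimate the memory needed for an N x N array of complex*16 values.
# Assumption: 16 bytes per element (two 8-byte reals).
n_total=254571
bytes_per_element=16

elements=$(( $n_total * $n_total ))
bytes=$(( $elements * $bytes_per_element ))
gib=$(( $bytes / 1024 / 1024 / 1024 ))   # integer division, rounds down

echo "elements: ${elements}"
echo "bytes:    ${bytes}"
echo "GiB:      ${gib}"
```

For N_total = 254571 this gives 1,036,902,304,656 bytes, which is about 1.04 TB (965 GiB after rounding down) and matches the estimate in the post.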

 

Does anyone have an idea what may be happening here? I am using the 2021 versions of ifort and mkl.

 

Thanks!

1 Solution
VarshaS_Intel
Moderator
2,450 views

Hi,

 

Thanks for posting in Intel Communities.

 

Could you please share the details of the OS and the cluster you are using? Also, could you please provide the complete error you get when using the '-qmkl' option?

 

We also recommend trying the latest Intel MKL, 2023.1.0, and letting us know whether the issue persists.

 

>>Curiously, if I increase the dimension of the array and make N_total say 270,000, the program also runs without issue.

Could you please confirm that this larger case was also built with the '-qmkl' option and ran without any error?

 

Thanks & Regards,

Varsha

 

 

8 Replies
Gennady_F_Intel
Moderator
2,465 views

You could try linking this case against the ILP64 libraries. Please check the Intel oneMKL Link Line Advisor to see which libraries and compiler options you need to use.
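As background for this suggestion: the array in the question has more elements than a 32-bit signed integer can count, which is the case the 64-bit-integer (ILP64) interface is intended for. The shell sketch below only illustrates that comparison. It makes no claim about what the allocate statement or the OpenMP runtime does internally, and the variable names are illustrative.

```shell
#!/bin/sh
# Compare the element count of the N x N array with the largest value
# a 32-bit signed integer can hold (2^31 - 1).
n_total=254571
int32_max=2147483647

elements=$(( $n_total * $n_total ))

if [ "$elements" -gt "$int32_max" ]; then
  fits_in_int32="no"
else
  fits_in_int32="yes"
fi

echo "elements:      ${elements}"
echo "fits in int32: ${fits_in_int32}"
```

N_total itself fits comfortably in a 32-bit integer; it is the product N_total * N_total that does not.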

JaspervdK
Beginner
2,432 views

Yes, if I increase the dimension of the array, the program runs fine even with -qmkl turned on.

JaspervdK
Beginner
2,434 views

Thank you for your responses. After some digging, it turns out that the problem is not with MKL itself but with OpenMP. If the program is compiled with -mkl=sequential, the issue disappears. Removing the -mkl flag altogether and adding the -fopenmp flag gives the same error.

Valgrind reports an invalid read of size 8 in libiomp5.so as the cause of the segfault.

Unfortunately, I can only link libraries installed on the cluster, and I don't have a local machine with enough memory to test this program. I have tried linking the Intel 2022 libraries, which are the latest installed version, but the same problem persists. The cluster runs CentOS.

VarshaS_Intel
Moderator
2,315 views

Hi,

 

Thanks for your reply.

 

Could you please provide the complete commands you use to compile and run the code, and let us know whether you are using the ILP64 or LP64 libraries?

 

In addition, could you please let us know whether you can run with Intel MKL (sequential as well as parallel), or whether the issue occurs only with OpenMP?

 

Also, the link below points to the Intel Link Line Advisor, which helps you work out the link line to use when compiling your source.

https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html


Thanks & Regards,

Varsha


JaspervdK
Beginner
2,310 views

I compile the code as,

 

 

ifort code.F90 -i8 -qopenmp -o run

 

 

and then run it with ./run. In this case I get the segmentation fault. As you can see, I build for ILP64 (-i8); however, removing the -i8 flag makes no difference to the error. If I compile with MKL instead,

 

 

ifort code.F90 -i8 -qmkl -o run

 

 

I get the same segmentation fault. If I compile in sequential mode,

 

 

ifort code.F90 -i8 -qmkl=sequential -o run

 

 

then the segmentation fault disappears.

 

Edit: I just noticed in the Link Line Advisor that ILP64 requires linking with -qmkl-ilp64. Maybe this is the issue; once I have access to the cluster again, I will test this.

 

Edit 2: I have now tried compiling with

ifort code.F90 -i8 -qmkl-ilp64 -o run

but I still get the segmentation fault. 
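The comparison described in this post could be scripted roughly as follows. This is a minimal sketch, assuming a POSIX shell, a source file named code.F90 in the current directory, and the flags quoted above. The FC variable and the try_variant helper are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: build code.F90 with each flag from the post, run it, and
# report the exit status. FC and try_variant are illustrative names.
FC="${FC:-ifort}"

# Compile with the given extra flag, run the result, print a one-line summary.
try_variant() {
  local flag="$1" status
  if ! "$FC" code.F90 -i8 "$flag" -o run; then
    echo "$flag: compilation failed"
    return 1
  fi
  ./run
  status=$?
  # Shells typically report a process killed by SIGSEGV as 128 + 11 = 139.
  if [ "$status" -eq 139 ]; then
    echo "$flag: segmentation fault"
  else
    echo "$flag: exit status $status"
  fi
}

# Only run the comparison when the compiler is actually available.
if command -v "$FC" >/dev/null 2>&1; then
  for flag in -qopenmp -qmkl -qmkl=sequential -qmkl-ilp64; do
    try_variant "$flag"
  done
fi
```

Running every variant from one script makes it easier to see at a glance which flag sets trigger the fault and which do not.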

JaspervdK
Beginner
2,252 views

Hello everyone,

 

Coincidentally, the Intel compilers installed on the cluster were updated to the new 2023 versions this week. My allocation program does not run into any error when I use the updated compilers, so it seems that my problem was solved in the latest update. Thanks a lot for the help!

VarshaS_Intel
Moderator
2,203 views

Hi,


>>Thanks a lot for the help!

It’s great to know that the issue has been resolved. If you run into any other issues, please feel free to create a new thread.


Have a good day!


Thanks & Regards,

Varsha

