Intel® oneAPI Math Kernel Library

zgemm crash with signal 11

xu__mf
Beginner
188 Views

Hi,

We have a huge matrix of about 100 GB, and zgemm crashed on it. There is enough memory.

The stack trace shows that the problematic routine is mkl_blas_avx2_zgemm_zcopy_right6_ea.

We used 12 CPUs with the multi-threaded library.

What can we do to avoid such an issue?

Thanks for the help.

3 Replies
xu__mf

BTW, we tried v11.2.2 and v2018.3.222; both have the same problem.

Gennady_F_Intel
Moderator

OK, could you give us more details: which OS, 64-bit or 32-bit, how you link the case, and whether you use the ILP64 or LP64 API?

Also, could you set MKL_VERBOSE and share the output for zgemm? All this info would help us to reproduce the problem.

The fastest way would be for you to give us a reproducer.

Hans_P_Intel
Employee

I think the key is to link with ILP64; otherwise linear offsets (calculated internally) will exceed the 32-bit range. A square 100 GB matrix of double-precision complex numbers (16 bytes per element) would be approx. 80k x 80k, i.e. more than 6 billion elements, so a linear offset that is supposed to point inside such a matrix generally exceeds the signed 32-bit maximum of 2^31 - 1 (about 2.1 billion).
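
To make the arithmetic in this reply concrete, here is a minimal sketch in plain Python (it does not call MKL). It assumes 100 GB means 10^11 bytes, column-major storage with the leading dimension equal to the number of rows, and that a linear offset is held in a signed 32-bit integer as the reply suggests. The helper names are hypothetical and exist only for illustration.

```python
import math

BYTES_PER_ELEMENT = 16   # double-precision complex: two 8-byte doubles
INT32_MAX = 2**31 - 1    # largest value of a signed 32-bit integer


def square_dim(total_bytes: int) -> int:
    """Side length of the largest square complex matrix that fits in total_bytes."""
    return math.isqrt(total_bytes // BYTES_PER_ELEMENT)


def max_linear_index(rows: int, cols: int, ld: int) -> int:
    """Largest zero-based column-major element offset: (cols - 1) * ld + (rows - 1)."""
    return (cols - 1) * ld + (rows - 1)


def fits_in_int32(rows: int, cols: int, ld: int) -> bool:
    """True if every element offset of the matrix is representable in 32 bits."""
    return max_linear_index(rows, cols, ld) <= INT32_MAX


if __name__ == "__main__":
    n = square_dim(100 * 10**9)  # assumption: 100 GB = 10^11 bytes
    print("dimension:", n)
    print("largest element offset:", max_linear_index(n, n, n))
    print("fits in signed 32-bit:", fits_in_int32(n, n, n))
```

Under these assumptions the side length comes out just under 80k and the largest offset is roughly three times INT32_MAX. A square matrix stays within the 32-bit range only up to a side length of about 46k, which corresponds to roughly 34 GB of double-precision complex data.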
