Dear all,
I am running a program that has run successfully many times on a cluster.
Perhaps because the cluster has been through a software upgrade, the executable a.out now fails at run time.
Compiling and linking work without problems. The error only shows up partway through the run:
forrtl: error (65): floating invalid
Image PC Routine Line Source
libifcoremt.so.5 00002B6454D7A6D4 for__signal_handl Unknown Unknown
libpthread-2.17.s 00002B6452C20370 Unknown Unknown Unknown
libmkl_avx512_mic 00002B646F5370BE mkl_blas_avx512_m Unknown Unknown
libmkl_avx512_mic 00002B646F544B61 mkl_blas_avx512_m Unknown Unknown
libmkl_avx512_mic 00002B646F541935 mkl_blas_avx512_m Unknown Unknown
libmkl_intel_thre 00002B644F2FF714 mkl_blas_ztrsm_ho Unknown Unknown
libmkl_intel_thre 00002B644F319606 mkl_blas_ztrsm Unknown Unknown
libmkl_core.so 00002B64515C1F74 mkl_lapack_ztrtri Unknown Unknown
libmkl_core.so 00002B64514B032C mkl_lapack_zgetri Unknown Unknown
libmkl_intel_lp64 00002B644E98683D ZGETRI Unknown Unknown
We are now using intel/17.0.4 and impi/17.0.3. The failing calls are:
call ZGETRF( N_LEN_2, N_LEN_2, BQ, N_LEN_2, IPIV, INFO )
call ZGETRI( N_LEN_2, BQ, N_LEN_2, IPIV, WORK, N_LEN_2, INFO )
The first subroutine, ZGETRF, is fine. But the second one, ZGETRI, always fails with a floating invalid error.
I do not understand this, because the inputs to ZGETRI are just the outputs of ZGETRF.
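To isolate the problem, the two calls can be exercised in a small standalone program. The following is only a minimal sketch, not my production code: the 3x3 test matrix, the program name, and the residual check are hypothetical, and it assumes the LP64 interface (default 32-bit integers), which matches the libmkl_intel_lp64 entry in the traceback. It checks INFO after each call, verifies that the LU factors contain no NaN or Inf before they are passed on, and uses a workspace query (LWORK = -1) instead of a fixed LWORK.

```fortran
program zgetri_check
  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3          ! hypothetical size; the real code uses N_LEN_2
  complex(dp) :: a(n,n), a0(n,n), r(n,n), wquery(1)
  complex(dp), allocatable :: work(:)
  integer :: ipiv(n), info, lwork, i, j
  external :: zgetrf, zgetri

  ! Hypothetical test matrix, made diagonally dominant so it is safely invertible.
  do j = 1, n
     do i = 1, n
        a(i,j) = cmplx(1.0_dp/(i+j), 0.1_dp*(i-j), kind=dp)
     end do
     a(j,j) = a(j,j) + cmplx(4.0_dp, 0.0_dp, kind=dp)
  end do
  a0 = a                               ! keep a copy for the residual check

  ! LU factorization; INFO > 0 means U is exactly singular.
  call zgetrf(n, n, a, n, ipiv, info)
  print *, 'ZGETRF INFO =', info
  if (info /= 0) error stop 'ZGETRF reported an error'

  ! Make sure the factors handed to ZGETRI are finite.
  if (.not. all(ieee_is_finite(real(a)) .and. ieee_is_finite(aimag(a)))) &
     error stop 'LU factors contain NaN or Inf'

  ! Workspace query: LWORK = -1 returns the optimal size in wquery(1).
  call zgetri(n, a, n, ipiv, wquery, -1, info)
  lwork = max(n, int(real(wquery(1))))
  allocate(work(lwork))

  ! Invert in place using the LU factors.
  call zgetri(n, a, n, ipiv, work, lwork, info)
  print *, 'ZGETRI INFO =', info
  if (info /= 0) error stop 'ZGETRI reported an error'

  ! Residual A*inv(A) - I should be close to machine precision.
  r = matmul(a0, a)
  do i = 1, n
     r(i,i) = r(i,i) - cmplx(1.0_dp, 0.0_dp, kind=dp)
  end do
  print *, 'max |A*inv(A) - I| =', maxval(abs(r))
end program zgetri_check
```

With the Intel compiler this should build with something like `ifort zgetri_check.f90 -mkl`. If the production build enables floating-point exception trapping (for example with `-fpe0`), the same flag should be added here so the test runs under the same conditions. To reproduce the actual failure, the test matrix would need to be replaced with the real BQ and N_LEN_2.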
*********updates********
I found the following in the Intel® Math Kernel Library (Intel® MKL) 2017 Release Notes:
Fixed irregular division by zero and invalid floating point exceptions in {C/Z}TRSM for Intel® Xeon Phi™ processor x200 (aka KNL) and Intel® Xeon® Processor supporting Intel® Advanced Vector Extensions 512 (Intel® AVX-512) code path
This may be relevant, because my error message also involves TRSM, a floating invalid exception, and the AVX-512 code path.
********updates2********
It seems the error has something to do with the MKL library:
1. The code has been running without problems for a long time.
2. When I run the code with an older version of the MKL library, it works well.
I think something must be wrong in the current MKL library, which is 17.0.4.
Hello, thanks for the report.
1. Which version are you currently using? Could you check the mkl_version.h file?
2. >>> I run the code in a low version MKL library, it works well.
Which previous version worked well?
3. Could you give us a reproducer?
--Gennady
Hi Gennady,
I have spent the last few days debugging my code.
As I said, this code has been in use for many years.
The previous Intel MKL version was 16.0.3, and the current one is 17.0.4 (17.0.1 also works fine for this code).
Even rerunning a case that worked before now produces the error:
forrtl: error (65): floating invalid
Image PC Routine Line Source
libifcoremt.so.5 00002B6454D7A6D4 for__signal_handl Unknown Unknown
libpthread-2.17.s 00002B6452C20370 Unknown Unknown Unknown
libmkl_avx512_mic 00002B646F5370BE mkl_blas_avx512_m Unknown Unknown
libmkl_avx512_mic 00002B646F544B61 mkl_blas_avx512_m Unknown Unknown
libmkl_avx512_mic 00002B646F541935 mkl_blas_avx512_m Unknown Unknown
libmkl_intel_thre 00002B644F2FF714 mkl_blas_ztrsm_ho Unknown Unknown
libmkl_intel_thre 00002B644F319606 mkl_blas_ztrsm Unknown Unknown
libmkl_core.so 00002B64515C1F74 mkl_lapack_ztrtri Unknown Unknown
libmkl_core.so 00002B64514B032C mkl_lapack_zgetri Unknown Unknown
libmkl_intel_lp64 00002B644E98683D ZGETRI Unknown Unknown
I think 17.0.4 must have changed something in how it handles complex numbers.
Thanks!
Tai