I'm using MKL (10.0.011, ia64) in finite element simulations on an Itanium 2 system. Several times during the simulation I get the following message:
floating-point assist fault at ip 2000000000780561, isr 0000020000000008
This leads to a significant slow-down of my simulations. Using
prctl --fpemu=signal
the program stops in mkl_lapack_dtrtri or dcsrilu0. I have prepared a small example that reproduces the failure in mkl_lapack_dtrtri:
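#include <cstring>   // memcpy
#include <iostream>  // std::cout

// LAPACK prototypes declared by hand here (LP64 interface, 32-bit int);
// MKL's own LAPACK header could be included instead.
extern "C" {
void dgetrf_ (int *m, int *n, double *a, int *lda, int *ipiv, int *info);
void dgetri_ (int *n, double *a, int *lda, int *ipiv, double *work, int *lwork, int *info);
}

// Invert the dim x dim matrix mat into inv via LU factorization.
int matrix_inverse (double *mat, double *inv, int dim)
{
    memcpy (inv, mat, dim * dim * sizeof(double));
    int* ipiv = new int[dim];
    double* work = new double[dim * dim];
    int info;
    dgetrf_ (&dim, &dim, inv, &dim, ipiv, &info);        // LU factorization in place
    dgetri_ (&dim, inv, &dim, ipiv, work, &dim, &info);  // inverse from LU; lwork is passed as dim
    delete[] work;
    delete[] ipiv;
    return 0;
}

int main()
{
    for (int i = 0; i < 1000; i++)
    {
        std::cout << i << std::endl;
        double test1[9] = {1,0,0,0,1,0,0,0,1};  // 3x3 identity
        double inv[9];
        matrix_inverse (test1, inv, 3);
    }
    return 0;
}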
The program is compiled with (Compiler Version: 10.0.026):
icpc -O0 -g -ftz -o test test.cpp -I/ahome/ism/eckardt4/intel/mkl/include -L/ahome/ism/eckardt4/intel/mkl/lib/64/lib -lmkl_lapack -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -lpthread
Running in idb:
> idb ./test
Intel Debugger for applications running on IA-64, Version 10.0-29 , Build 20070405
------------------
object file name: ./test
Reading symbols from /tmp/eckardt4/test...done.
(idb) run
[New Thread 2305843009251186048 (LWP 13474)]
[New Thread 2305843009251186048 (LWP 13474)]
Starting program: /tmp/eckardt4/test
0
1
Program received signal SIGFPE
mkl_lapack_dtrtri () in /ahome/ism/eckardt4/intel/mkl_10.0.011/lib/64/libmkl_lapack.so
(idb) where
#0 0x2000000000780562 in mkl_lapack_dtrtri () in /ahome/ism/eckardt4/intel/mkl_10.0.011/lib/64/libmkl_lapack.so
#1 0x200000000046d8d0 in mkl_lapack_dgetri () in /ahome/ism/eckardt4/intel/mkl_10.0.011/lib/64/libmkl_lapack.so
#2 0x200000000120f920 in DGETRI () in /ahome/ism/eckardt4/intel/mkl_10.0.011/lib/64/libmkl_intel_lp64.so
#3 0x4000000000001440 in matrix_inverse (mat=0x607fffffff4963b0, inv=0x607fffffff496400, dim=3) at test.cpp:13
#4 0x4000000000001820 in main () at test.cpp:28
#5 0x200000000217bc20 in __libc_start_main () in /lib/libc-2.4.so
#6 0x4000000000000ec0 in _start () in /tmp/eckardt4/test
dmesg gives the following message:
test(13474): floating-point assist fault at ip 2000000000780561, isr 0000020000000008
When compiled without debug information (without -g), the program works well.
The original finite element code is compiled without debug information (-O2 -ftz), and there the message is observed not only in mkl_lapack_dtrtri but also in dcsrilu0.
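On Itanium, floating-point assist faults are commonly associated with denormal (subnormal) operands that the hardware hands over to the kernel. Whether that is the cause here is not established, but one inexpensive check is to scan the buffers passed to the library for subnormal values. The helper below is a minimal sketch; the name count_subnormals is hypothetical and not part of MKL.

```cpp
#include <cmath>    // std::fpclassify, FP_SUBNORMAL
#include <cstddef>  // std::size_t

// Hypothetical diagnostic helper (not an MKL function): counts how many
// entries of a buffer are subnormal (denormal) doubles.
std::size_t count_subnormals (const double *data, std::size_t n)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        if (std::fpclassify (data[i]) == FP_SUBNORMAL)
            ++count;
    }
    return count;
}
```

It could be called on inv (dim*dim entries) just before dgetri_ to see whether any subnormal values reach the routine.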
Thank you,
Stefan Eckardt
1 Reply
I would suggest you submit your problem report on your premier.intel.com support account.
We have seen somewhat similar situations where software not under our control sent console
messages at an unacceptable rate, so it was necessary to suppress the message. Evidently, it
would be preferable to have the source of the problem investigated.
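For reference, Linux on ia64 exposes the handling of these faults per process through prctl(2), which is what the prctl command-line tool sets. The sketch below is only an illustration under that assumption: the function name set_fp_assist_mode is hypothetical, and PR_SET_FPEMU is implemented only on ia64, so on other architectures the call is expected to fail.

```cpp
#include <sys/prctl.h>  // prctl, PR_SET_FPEMU, PR_FPEMU_NOPRINT, PR_FPEMU_SIGFPE
#include <cerrno>       // errno

// Hypothetical helper (not from MKL or this thread): selects how the kernel
// treats floating-point assist faults for the calling process.
//   quiet == true  -> PR_FPEMU_NOPRINT: emulate silently, no console message
//   quiet == false -> PR_FPEMU_SIGFPE:  do not emulate, deliver SIGFPE instead
// Returns 0 on success, otherwise the errno value reported by prctl.
int set_fp_assist_mode (bool quiet)
{
    const unsigned long mode = quiet ? PR_FPEMU_NOPRINT : PR_FPEMU_SIGFPE;
    if (prctl (PR_SET_FPEMU, mode, 0, 0, 0) != 0)
        return errno;
    return 0;
}
```

Calling set_fp_assist_mode(true) early in main() would only hide the message; the emulation cost, and therefore the slow-down, would remain.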
