Dear All,
I am compiling a Fortran code with the ifort compiler version 2017.4 and I get a strange segmentation fault. The same code works fine on an older machine running Ubuntu 16.04 with compiler version 2017.1. The code is too large to post here, but it is available on GitHub (it's ttpy on GitHub). The computer I am using runs CentOS 7.4, and I have tried several compiler versions (2017.1, 2017.4, 2017.6 and 2018.1); all of them give a segfault.
I am using the LP64 MKL interface.
I know it is hard to tell from here but I'd really appreciate any help. Below is the message I get.
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
test_ksl_cme 00000000004E2744 Unknown Unknown Unknown
libpthread-2.17.s 00002B90AA2925E0 Unknown Unknown Unknown
libiomp5.so 00002B90A9C51B57 omp_in_parallel Unknown Unknown
libmkl_intel_thre 00002B90A679588B mkl_serv_domain_g Unknown Unknown
libmkl_intel_thre 00002B90A67A8C49 mkl_blas_zcopy Unknown Unknown
libmkl_intel_lp64 00002B90A5D0C968 ZCOPY Unknown Unknown
test_ksl_cme 0000000000435B75 Unknown Unknown Unknown
test_ksl_cme 00000000004C267C Unknown Unknown Unknown
test_ksl_cme 0000000000411E6E Unknown Unknown Unknown
test_ksl_cme 00000000004058C2 Unknown Unknown Unknown
test_ksl_cme 0000000000403F9E Unknown Unknown Unknown
libc-2.17.so 00002B90AA4C0C05 __libc_start_main Unknown Unknown
test_ksl_cme 0000000000403EA9 Unknown Unknown Unknown
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
test_ksl_cme 00000000004E2A71 Unknown Unknown Unknown
libpthread-2.17.s 00002B90AA2925E0 Unknown Unknown Unknown
libiomp5.so 00002B90A9C66DA4 Unknown Unknown Unknown
ld-2.17.so 00002B90A599AB58 Unknown Unknown Unknown
libc-2.17.so 00002B90AA4D7A69 Unknown Unknown Unknown
libc-2.17.so 00002B90AA4D7AB5 Unknown Unknown Unknown
test_ksl_cme 00000000004DEAD9 Unknown Unknown Unknown
test_ksl_cme 00000000004E2744 Unknown Unknown Unknown
libpthread-2.17.s 00002B90AA2925E0 Unknown Unknown Unknown
libiomp5.so 00002B90A9C51B57 omp_in_parallel Unknown Unknown
libmkl_intel_thre 00002B90A679588B mkl_serv_domain_g Unknown Unknown
libmkl_intel_thre 00002B90A67A8C49 mkl_blas_zcopy Unknown Unknown
libmkl_intel_lp64 00002B90A5D0C968 ZCOPY Unknown Unknown
test_ksl_cme 0000000000435B75 Unknown Unknown Unknown
test_ksl_cme 00000000004C267C Unknown Unknown Unknown
test_ksl_cme 0000000000411E6E Unknown Unknown Unknown
test_ksl_cme 00000000004058C2 Unknown Unknown Unknown
test_ksl_cme 0000000000403F9E Unknown Unknown Unknown
libc-2.17.so 00002B90AA4C0C05 __libc_start_main Unknown Unknown
test_ksl_cme 0000000000403EA9 Unknown Unknown Unknown
Thanks
Hi,
thanks for the reply. Actually, zcopy seems to fail randomly and in different parts of the code. I have checked the stack size limits on my machine and they are fine (unlimited). I am trying to reproduce the problem, but so far without success.
As for why zcopy is used at all, I think the main reason is to exploit multithreaded architectures; apart from that, I don't know why the author of the code decided to use the zcopy routine. I'll try to remove the offending zcopy calls and see what happens.
Raffaele
By the way,
I just wanted to add that your assignment is the reverse of what zcopy does: zcopy(n, x, incx, y, incy) copies x into y. It should be
crU(1:nc) = zresult_core(1:nc)
It is unlikely that someone will download, build and run a large package (involving Fortran+Python+?) just to check a call to a BLAS routine. My suggestion is this: because the file test_ksl_cme.f90 in tt-fort, at https://github.com/oseledets/tt-fort/blob/65a62e3a4d7b10ffd00e55628ba1216d1dae3fd9/test_ksl_cme.f90, contains just one call to ZCOPY, i.e.,
call zcopy(sum(ru(1:d)*n(1:d)*ru(2:d+1)), zresult_core, 1, crU, 1)
replace that call with a plain array assignment, and print the element count and the array bounds first so that you can see whether the copy would run past either array:
integer nc
...
nc = sum(ru(1:d)*n(1:d)*ru(2:d+1))
print *,'nc = ',nc, ubound(zresult_core),ubound(crU)
crU(1:nc) = zresult_core(1:nc) ! I had earlier, incorrectly, zresult_core(1:nc) = crU(1:nc)
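If zcopy fails in several places, the same check can be wrapped in a small helper and used wherever a unit-stride zcopy call appears. What follows is only a sketch under assumptions: checked_copy is a hypothetical name that does not exist in tt-fort or MKL, the arrays are assumed to be double-complex and passed as assumed-shape arguments (so the helper must sit in a module or a contains section), and the demo data are made up. It copies in the same direction as call zcopy(n, x, 1, y, 1) and stops with a message instead of segfaulting when n exceeds either array.

```fortran
program checked_copy_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  complex(dp), allocatable :: src(:), dst(:)
  integer :: nc, i

  ! Made-up test data; in the real code nc would be
  ! sum(ru(1:d)*n(1:d)*ru(2:d+1)).
  nc = 6
  allocate(src(nc), dst(nc))
  src = [(cmplx(i, -i, kind=dp), i = 1, nc)]
  dst = (0.0_dp, 0.0_dp)

  call checked_copy(nc, src, dst)
  print *, 'copy ok: ', all(dst == src)

contains

  ! Hypothetical helper: copy the first n elements of x into y, the same
  ! direction as call zcopy(n, x, 1, y, 1), but refuse to run past the
  ! end of either array.
  subroutine checked_copy(n, x, y)
    integer, intent(in) :: n
    complex(dp), intent(in) :: x(:)
    complex(dp), intent(inout) :: y(:)

    if (n > size(x) .or. n > size(y)) then
      print *, 'checked_copy: n = ', n, ' size(x) = ', size(x), &
               ' size(y) = ', size(y)
      error stop 1
    end if
    if (n > 0) y(1:n) = x(1:n)
  end subroutine checked_copy

end program checked_copy_demo
```

If the helper never trips and the segfault persists, the arrays were at least large enough at the call site, which would point away from a simple overrun in the copy itself.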