I cannot reproduce this so there’s something about your environment that I’m not matching.
What Linux OS are you using?
I suspect that the problem lies with the system C runtime rather than with Intel Fortran, because the error output seems to start from the bowels of /lib64/libc.so.6 . You could try a couple of simple tests to obtain a bit more information. 1) Compile your test program with Gfortran and try redirecting the output of the a.out. 2) Compile using Ifort but use the -traceback and -g options, then try redirection.
Kevin Davis (Intel) wrote:
I cannot reproduce this so there’s something about your environment that I’m not matching.
What Linux OS are you using?
We have systems running RHEL 6.4 and RHEL 6.6, both of which can reproduce this issue. The RHEL 6.4 systems have kernel 2.6.32-358.11.1.el6.x86_64 and glibc-2.12-1.107.el6_4.2.x86_64. The RHEL 6.6 systems have kernel 2.6.32-504.16.2.el6.x86_64 and glibc-2.12-1.149.el6_6.7.x86_64.
mecej4 wrote:
I suspect that the problem lies with the system C runtime rather than with Intel Fortran, because the error output seems to start from the bowels of /lib64/libc.so.6 . You could try a couple of simple tests to obtain a bit more information. 1) Compile your test program with Gfortran and try redirecting the output of the a.out. 2) Compile using Ifort but use the -traceback and -g options, then try redirection.
gfortran works fine, as does the Intel 15.0 ifort compiler. With the Intel 16.0.1 ifort compiler and the -traceback and -g options, here is the stderr:
watrok@amrndhl460:$ ifort -v
	ifort version 16.0.1
	watrok@amrndhl460:$ ifort -traceback -g ok.f90
	watrok@amrndhl460:$ ./a.out
	 ok
	watrok@amrndhl460:$ ./a.out > ok.out
	*** buffer overflow detected ***: ./a.out terminated
	======= Backtrace: =========
	/lib64/libc.so.6(__fortify_fail+0x37)[0x3091302527]
	/lib64/libc.so.6[0x3091300410]
	/lib64/libc.so.6[0x30912ff869]
	/lib64/libc.so.6(_IO_default_xsputn+0xc9)[0x3091274639]
	/lib64/libc.so.6(_IO_vfprintf+0x11d8)[0x30912451a8]
	/lib64/libc.so.6(__vsprintf_chk+0x9d)[0x30912ff90d]
	/lib64/libc.so.6(__sprintf_chk+0x7f)[0x30912ff84f]
	./a.out[0x445add]
	./a.out[0x4473e9]
	./a.out[0x4340ad]
	./a.out[0x4097fe]
	./a.out[0x402e9e]
	./a.out[0x402e1e]
	/lib64/libc.so.6(__libc_start_main+0xfd)[0x309121ed5d]
	./a.out[0x402d29]
	======= Memory map: ========
	00400000-004b1000 r-xp 00000000 fd:02 2883941                            /tmp/watrok/f90/a.out
	006b0000-006b4000 rw-p 000b0000 fd:02 2883941                            /tmp/watrok/f90/a.out
	006b4000-006d2000 rw-p 00000000 00:00 0
	02673000-02694000 rw-p 00000000 00:00 0                                  [heap]
	3090a00000-3090a20000 r-xp 00000000 fd:00 526774                         /lib64/ld-2.12.so
	3090c1f000-3090c20000 r--p 0001f000 fd:00 526774                         /lib64/ld-2.12.so
	3090c20000-3090c21000 rw-p 00020000 fd:00 526774                         /lib64/ld-2.12.so
	3090c21000-3090c22000 rw-p 00000000 00:00 0
	3090e00000-3090e83000 r-xp 00000000 fd:00 535929                         /lib64/libm-2.12.so
	3090e83000-3091082000 ---p 00083000 fd:00 535929                         /lib64/libm-2.12.so
	3091082000-3091083000 r--p 00082000 fd:00 535929                         /lib64/libm-2.12.so
	3091083000-3091084000 rw-p 00083000 fd:00 535929                         /lib64/libm-2.12.so
	3091200000-309138a000 r-xp 00000000 fd:00 528578                         /lib64/libc-2.12.so
	309138a000-309158a000 ---p 0018a000 fd:00 528578                         /lib64/libc-2.12.so
	309158a000-309158e000 r--p 0018a000 fd:00 528578                         /lib64/libc-2.12.so
	309158e000-309158f000 rw-p 0018e000 fd:00 528578                         /lib64/libc-2.12.so
	309158f000-3091594000 rw-p 00000000 00:00 0
	3091600000-3091617000 r-xp 00000000 fd:00 529004                         /lib64/libpthread-2.12.so
	3091617000-3091817000 ---p 00017000 fd:00 529004                         /lib64/libpthread-2.12.so
	3091817000-3091818000 r--p 00017000 fd:00 529004                         /lib64/libpthread-2.12.so
	3091818000-3091819000 rw-p 00018000 fd:00 529004                         /lib64/libpthread-2.12.so
	3091819000-309181d000 rw-p 00000000 00:00 0
	3091a00000-3091a02000 r-xp 00000000 fd:00 528120                         /lib64/libdl-2.12.so
	3091a02000-3091c02000 ---p 00002000 fd:00 528120                         /lib64/libdl-2.12.so
	3091c02000-3091c03000 r--p 00002000 fd:00 528120                         /lib64/libdl-2.12.so
	3091c03000-3091c04000 rw-p 00003000 fd:00 528120                         /lib64/libdl-2.12.so
	3097600000-3097616000 r-xp 00000000 fd:00 534069                         /lib64/libgcc_s-4.4.7-20120601.so.1
	3097616000-3097815000 ---p 00016000 fd:00 534069                         /lib64/libgcc_s-4.4.7-20120601.so.1
	3097815000-3097816000 rw-p 00015000 fd:00 534069                         /lib64/libgcc_s-4.4.7-20120601.so.1
	7f9d8a058000-7f9d8a05d000 rw-p 00000000 00:00 0
	7f9d8a082000-7f9d8a084000 rw-p 00000000 00:00 0
	7fff37f87000-7fff37f9d000 rw-p 00000000 00:00 0                          [stack]
	7fff37fb9000-7fff37fba000 r-xp 00000000 00:00 0                          [vdso]
	ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
	forrtl: error (76): Abort trap signal
	Image              PC                Routine            Line        Source
	a.out              0000000000477435  Unknown               Unknown  Unknown
	a.out              00000000004751F7  Unknown               Unknown  Unknown
	a.out              0000000000444AE4  Unknown               Unknown  Unknown
	a.out              00000000004448F6  Unknown               Unknown  Unknown
	a.out              00000000004259F6  Unknown               Unknown  Unknown
	a.out              00000000004037D8  Unknown               Unknown  Unknown
	libpthread.so.0    000000309160F710  Unknown               Unknown  Unknown
	libc.so.6          0000003091232625  Unknown               Unknown  Unknown
	libc.so.6          0000003091233E05  Unknown               Unknown  Unknown
	libc.so.6          0000003091270537  Unknown               Unknown  Unknown
	libc.so.6          0000003091302527  Unknown               Unknown  Unknown
	libc.so.6          0000003091300410  Unknown               Unknown  Unknown
	libc.so.6          00000030912FF869  Unknown               Unknown  Unknown
	libc.so.6          0000003091274639  Unknown               Unknown  Unknown
	libc.so.6          00000030912451A8  Unknown               Unknown  Unknown
	libc.so.6          00000030912FF90D  Unknown               Unknown  Unknown
	libc.so.6          00000030912FF84F  Unknown               Unknown  Unknown
	a.out              0000000000445ADD  Unknown               Unknown  Unknown
	a.out              00000000004473E9  Unknown               Unknown  Unknown
	a.out              00000000004340AD  Unknown               Unknown  Unknown
	a.out              00000000004097FE  Unknown               Unknown  Unknown
	a.out              0000000000402E9E  MAIN__                      1  ok.f90
	a.out              0000000000402E1E  Unknown               Unknown  Unknown
	libc.so.6          000000309121ED5D  Unknown               Unknown  Unknown
	a.out              0000000000402D29  Unknown               Unknown  Unknown
	Aborted (core dumped)
I've tried to answer mecej4's question twice, but I keep getting "Your comment has been queued for review by site administrators and will be published after approval." Why?
This forum uses an automated system for spam detection, and it is sometimes thrown off by code or diagnostics pasted in as part of the text. Queued messages are reviewed promptly and dealt with.
Thank you. I can mostly match your RHEL 6.4/glibc versions, but I cannot reproduce the issue. Yours is a slightly different kernel variant, but I don't suspect that is the cause. I have 2.6.32-358.el6.x86_64 / glibc-2.12-1.107.el6.x86_64.
This sort of error usually proves hard to identify. I have seen similar failures with a variety of causes. One instance was mixing older shared libs on newer distros or other distros. Another was related to using LD_PRELOAD. Another, older case was mixing the g++ C++ library libstdc++ and the ifort C++ library libcxa.
You might look at your environment and settings such as LD_LIBRARY_PATH, since you are loading a different module configuration. Check whether LD_PRELOAD is in play. Looking at the output of ldd and/or ldconfig might also provide clues.
Running in the debugger might provide more clues. You can set this up as shown below. I do not know whether stepping, or setting other breakpoints within the libc call stack you showed, would reveal anything more.
$ gdb a.out
Reading symbols from /tmp/u607517/a.out...done.
(gdb) set args "> out.txt"
(gdb) br __sprintf_chk
Breakpoint 2 at 0x3f752ff7b0
(gdb) r
Starting program: /tmp/u607517/a.out "> out.txt"
[Thread debugging using libthread_db enabled]
Breakpoint 2, 0x0000003f752ff7b0 in __sprintf_chk () from /lib64/libc.so.6
(gdb) bt
#0 0x0000003f752ff7b0 in __sprintf_chk () from /lib64/libc.so.6
#1 0x0000000000406de6 in for__preconnected_units_create ()
#2 0x0000000000405983 in for_rtl_init_ ()
#3 0x0000000000402e19 in main ()
#4 0x0000003f7521ecdd in __libc_start_main () from /lib64/libc.so.6
#5 0x0000000000402d29 in _start ()
(gdb)
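For the environment check suggested above, a small sketch along these lines may help. The helper name report_var is made up for illustration; it only reports exported variables, because it relies on printenv.

```shell
# report_var is a hypothetical helper: it prints NAME=value if the variable
# is exported in the environment, or "NAME is unset" otherwise.
report_var() {
    if value=$(printenv "$1"); then
        echo "$1=$value"
    else
        echo "$1 is unset"
    fi
}

# The two variables named in this thread.
report_var LD_LIBRARY_PATH
report_var LD_PRELOAD
```

Running ldd a.out afterwards shows which shared libraries the loader actually resolves for the executable.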
Kevin Davis (Intel) wrote:
You might look at your environment and settings such as LD_LIBRARY_PATH, since you are loading a different module configuration. Check whether LD_PRELOAD is in play. Looking at the output of ldd and/or ldconfig might also provide clues.
Running in the debugger might provide more clues. You can set this up as shown below. I do not know whether stepping, or setting other breakpoints within the libc call stack you showed, would reveal anything more.
Neither LD_LIBRARY_PATH nor LD_PRELOAD is set in my shell environment. Here is what it looks like on a RHEL 6.4 system:
$ ldd a.out
        linux-vdso.so.1 =>  (0x00007fffa6dab000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003809000000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003808c00000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003808400000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003808800000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003567c00000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003808000000)
$ gdb a.out
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /tmp/watrok/f90/a.out...done.
(gdb) set args "> out.txt"
(gdb) br __sprintf_chk
Breakpoint 1 at 0x402c08
(gdb) r
Starting program: /tmp/watrok/f90/a.out "> out.txt"
[Thread debugging using libthread_db enabled]
Breakpoint 1, ___sprintf_chk (s=0x7fffffffd8d0 "", flags=1, slen=32, format=0x494404 "FORT%d") at sprintf_chk.c:28
28      {
(gdb) bt
#0  ___sprintf_chk (s=0x7fffffffd8d0 "", flags=1, slen=32, format=0x494404 "FORT%d") at sprintf_chk.c:28
#1  0x0000000000406de6 in for__preconnected_units_create ()
#2  0x0000000000405983 in for_rtl_init_ ()
#3  0x0000000000402e19 in main ()
#4  0x000000380881ecdd in __libc_start_main (main=0x402df0 <main>, argc=2, ubp_av=0x7fffffffdc38, init=<value optimized out>,
    fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fffffffdc28) at libc-start.c:226
#5  0x0000000000402d29 in _start ()
(gdb) f 0
#0  ___sprintf_chk (s=0x7fffffffd8d0 "", flags=1, slen=32, format=0x494404 "FORT%d") at sprintf_chk.c:28
28      {
(gdb) p s
$1 = 0x7fffffffd8d0 ""
I can duplicate the problem on 16.0.2 with CentOS 7.2.
I may be seeing the same problem, but with stdin, on CentOS 7, as the strace output looks almost exactly the same.
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/622937
This problem occurs only if the process has a high PID.
See the post by kgore4:
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/622937#
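To see whether a given system is handing out high PIDs, something like the following can be run in the shell that launches the program. This is only a sketch: pid_digits is a made-up helper name, and it assumes that the digit count of the PID is what matters, as the discussion in this thread suggests.

```shell
# pid_digits is a hypothetical helper: it prints the number of decimal
# digits in the PID passed as its first argument.
pid_digits() {
    pid=$1
    printf '%s\n' "${#pid}"
}

# $$ is the PID of the current shell; a program started from this shell
# will normally receive a PID in the same range.
echo "shell PID $$ has $(pid_digits "$$") digits"
```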
As noted in the earlier cited thread, this has been reproduced and directed to our run-time library team for further analysis/repair.
(Internal tracking id: DPD200585850)
>>*** buffer overflow detected ***: ./a.out terminated
This was from ___sprintf_chk.
If your Fortran program is calling sprintf (directly or indirectly) and passing a format string, it may be that you forgot to append a null character to the format string (relying on uninitialized data to supply the null).
Jim Dempsey
Development identified a workaround for this defect; hopefully it is usable.
Instead of using the redirection symbol ">", you can set an environment variable to direct the output to a file (or /dev/null).
So, instead of doing:
./ok-16.0.1 > ok.out
do this:
    setenv FOR_PRINT ok.out       (assuming the C shell; in sh or bash it would be "export FOR_PRINT=ok.out")
    ./ok-16.0.1
Using "setenv FOR_PRINT /dev/null" also works.
If you are redirecting stderr (unit=0) to a file, then use the env variable name FORT0.
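For sh or bash users, the same workaround can be applied to a single command without changing the rest of the session. The sketch below uses env for that; run_with_for_print is a hypothetical helper name, not something shipped with the compiler.

```shell
# run_with_for_print is a hypothetical helper: it runs a command with
# FOR_PRINT set to the given file for that one command only.
run_with_for_print() {
    outfile=$1
    shift
    env FOR_PRINT="$outfile" "$@"
}

# Example usage (not executed here, since the program is specific to this thread):
#   run_with_for_print ok.out ./ok-16.0.1
#   run_with_for_print /dev/null ./ok-16.0.1
```

Because the variable is set only for the child process, later commands in the same shell still write to the terminal as usual.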
Dear Kevin, dear all,
in our recently deployed cluster we are experiencing a similar issue with ifort 16.0.3 20160415 and CentOS 7.2.1511.
We tried the workaround you suggested
    setenv FOR_PRINT ok.out       (assuming the c shell; otherwise, it would be "export FOR_PRINT")
	    ./ok-16.0.1
and it worked for some use cases but not all of them. Setting pid_max to 999999 (sysctl -w kernel.pid_max=999999) seems to be a more general workaround, or at least it fixed some use cases that still failed. Any thoughts on that? What is Intel's opinion?
Thanks for the help and all the suggestions that this forum provides.
Regards,
CINECA User Support group
	 
I lack the in-depth Linux kernel knowledge to comment much. The suggested setting should avoid the potential of a 7-digit PID; however, in a large cluster the PID range might easily be exhausted under heavy usage. Maybe it is a more reasonable workaround on CentOS, though, given that others' comments in this thread (and the other cited thread) seem to suggest CentOS may start with high PIDs by default (which helped expose this defect).
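To check what a system currently allows, the kernel limit can be read from /proc/sys/kernel/pid_max. The sketch below assumes the usual Linux behavior that PIDs are always allocated below pid_max; seven_digit_pids_possible is a made-up helper name.

```shell
# seven_digit_pids_possible is a hypothetical helper: given a pid_max value,
# it prints "yes" if the largest PID the kernel can allocate (pid_max - 1)
# has seven or more digits, and "no" otherwise.
seven_digit_pids_possible() {
    max_pid=$(( $1 - 1 ))
    if [ "${#max_pid}" -ge 7 ]; then
        echo yes
    else
        echo no
    fi
}

# Report the current setting when the proc file is available (Linux only).
if [ -r /proc/sys/kernel/pid_max ]; then
    current=$(cat /proc/sys/kernel/pid_max)
    echo "pid_max=$current, 7-digit PIDs possible: $(seven_digit_pids_possible "$current")"
fi
```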
It appears our fix for this will be in our upcoming PSXE 2016 Update 4 release, tentatively scheduled for late August. For our upcoming major release later this year, PSXE 2017, it appears the fix will not make the initial release but will be in the first update. I'll confirm.
I confirmed the fix will be in the upcoming PSXE 2016 Update 4 release (late-August timeframe) and PSXE 2017 Update 1 (mid-Q4 '16 timeframe), and not the initial release.
 
					
				
				
			
		
