segfault reading from piped stdin on centos7.2

kgore4 · 03-29-2016

I'm getting a "buffer overflow detected" segfault when reading from stdin via a pipe on CentOS 7.2. It works on Ubuntu 14.04, and it works on both when compiled with gfortran. The Ubuntu machine has the compiler installed. SELinux is enforcing, but the result is the same when permissive.

ifort (IFORT) 16.0.2 20160204

test program

program a
 implicit none
 integer :: b

 read(*,*) b
 print *,b

 stop
end

Compiled with (also tried -assume old_unit_star with no difference)

ifort -O0 -debug -static a.f90

when run with

echo 1 | ./a.out

produces segfault

*** buffer overflow detected ***: ./a.out terminated
======= Backtrace: =========
[0x4a368b]
[0x4d7552]
[0x4d74ee]
[0x4d6f09]
[0x4a958c]
[0x4e6628]
[0x4d6f8c]
[0x4d6eed]
[0x44dbdf]
[0x44daac]
[0x44ec19]
[0x43b8bd]
[0x4059ea]
[0x401128]
[0x4010ae]
[0x49844c]
[0x400f97]
======= Memory map: ========
00400000-0057e000 r-xp 00000000 00:2d 110493698                          /m/work/kgore4/a.a/a.out
0077d000-00783000 rw-p 0017d000 00:2d 110493698                          /m/work/kgore4/a.a/a.out
00783000-007a7000 rw-p 00000000 00:00 0
021ec000-0220f000 rw-p 00000000 00:00 0                                  [heap]
7f28195fe000-7f28195ff000 rw-p 00000000 00:00 0
7ffe6e79d000-7ffe6e7be000 rw-p 00000000 00:00 0                          [stack]
7ffe6e7d1000-7ffe6e7d3000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
a.out              000000000047CE95  Unknown               Unknown  Unknown
a.out              000000000047AAB7  Unknown               Unknown  Unknown
a.out              000000000044C2F4  Unknown               Unknown  Unknown
a.out              000000000044C106  Unknown               Unknown  Unknown
a.out              000000000042AD76  Unknown               Unknown  Unknown
a.out              00000000004012F0  Unknown               Unknown  Unknown
a.out              0000000000497DA0  Unknown               Unknown  Unknown
a.out              000000000049EE77  Unknown               Unknown  Unknown
a.out              00000000004A3690  Unknown               Unknown  Unknown
a.out              00000000004D7552  Unknown               Unknown  Unknown
a.out              00000000004D74EE  Unknown               Unknown  Unknown
a.out              00000000004D6F09  Unknown               Unknown  Unknown
a.out              00000000004A958C  Unknown               Unknown  Unknown
a.out              00000000004E6628  Unknown               Unknown  Unknown
a.out              00000000004D6F8C  Unknown               Unknown  Unknown
a.out              00000000004D6EED  Unknown               Unknown  Unknown
a.out              000000000044DBDF  Unknown               Unknown  Unknown
a.out              000000000044DAAC  Unknown               Unknown  Unknown
a.out              000000000044EC19  Unknown               Unknown  Unknown
a.out              000000000043B8BD  Unknown               Unknown  Unknown
a.out              00000000004059EA  Unknown               Unknown  Unknown
a.out              0000000000401128  Unknown               Unknown  Unknown
a.out              00000000004010AE  Unknown               Unknown  Unknown
a.out              000000000049844C  Unknown               Unknown  Unknown
a.out              0000000000400F97  Unknown               Unknown  Unknown

strace seems to be the same until getpid is called where ubuntu returned 8283 and went on to read the 1 and \n. centos returned 1997017 and went straight into the error above. I wonder if that pid tried to go into a 16bit int? /proc/sys/kernel/pid_max is 4194303 on centos7 and 32768 on ubuntu.

EDIT: On further testing, it may not be the pid. Adding an explicit call to getpid() and a print above the read lets it get past the getpid.

The difference in strace between running it with the pipe and without is

ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffc321491e0) = -1 ENOTTY (Inappropriate ioctl for device)

If I put the 1 into a real file and do "cat testinput | ./a.out", it still crashes. It also crashes with "./a.out <testinput".

EDIT2: add gdb backtrace after the segfault

(gdb) bt
#0  0x000000000049ee77 in abort ()
#1  0x00000000004a3690 in __libc_message ()
#2  0x00000000004d7552 in __fortify_fail ()
#3  0x00000000004d74ee in __chk_fail ()
#4  0x00000000004d6f09 in _IO_str_chk_overflow ()
#5  0x00000000004a958c in _IO_default_xsputn ()
#6  0x00000000004e6628 in vfprintf ()
#7  0x00000000004d6f8c in __vsprintf_chk ()
#8  0x00000000004d6eed in __sprintf_chk ()
#9  0x000000000044d089 in for__compute_filename ()
#10 0x000000000044ec19 in for__open_proc ()
#11 0x000000000043b8bd in for__open_default ()
#12 0x00000000004059ea in for_read_seq_lis ()
#13 0x0000000000401128 in a () at a.f90:5
#14 0x00000000004010ae in main ()
#15 0x000000000049844c in __libc_start_main ()
#16 0x0000000000400f97 in _start ()

kgore4 · 03-31-2016

If I do "echo 32768 >/proc/sys/kernel/pid_max", it runs.

Here's where it gets weird.

cat /proc/sys/kernel/pid_max
4194303  # fails

echo 32768 >/proc/sys/kernel/pid_max    # works
echo 32769 >/proc/sys/kernel/pid_max    # works
echo 4194303 >/proc/sys/kernel/pid_max  # fails
echo 65536 >/proc/sys/kernel/pid_max    # works
echo 65537 >/proc/sys/kernel/pid_max    # works
echo 65538 >/proc/sys/kernel/pid_max    # works
echo 165538 >/proc/sys/kernel/pid_max   # works
echo 1655380 >/proc/sys/kernel/pid_max  # works
echo 2655380 >/proc/sys/kernel/pid_max  # works
echo 4194303 >/proc/sys/kernel/pid_max  # works!  ??HUH??

Yuan_C_Intel · 07-07-2016

Hi,

Thank you for reporting the issue and providing the investigation results.

This is indeed related to a high PID that is 7 digits long, e.g. 1387017.

We have reproduced your issue and entered it in our problem tracking system. We will try to resolve this issue as soon as we can. However, please be advised that the fix may have to be targeted for the next major release. I will let you know when I have an update on this issue.

Thanks.
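
Editor's note: the diagnosis above may also explain the "HUH" result in the earlier pid_max experiment. This is an assumption, not something confirmed in the thread. Linux hands out PIDs sequentially and wraps around when it reaches pid_max, so after pid_max has been lowered and then raised back to 4194303, new processes keep getting small PIDs until the counter climbs past 999999 again. A minimal sketch for checking how many digits a freshly started process's PID has (pid_digits is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: print the number of characters (decimal digits) in a PID.
pid_digits() {
    printf '%s\n' "${#1}"
}

# PID of a freshly started process: the new sh prints its own PID via $$.
fresh_pid=$(sh -c 'echo $$')

# pid_max is only an upper bound on PIDs; the file exists on Linux only.
if [ -r /proc/sys/kernel/pid_max ]; then
    echo "pid_max:   $(cat /proc/sys/kernel/pid_max)"
fi
echo "fresh pid: $fresh_pid ($(pid_digits "$fresh_pid") digits)"
```

If the reported PID has fewer than 7 digits, the affected binary would be expected to run even with pid_max at 4194303, which matches the observation that the last echo "works".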

Kevin_D_Intel · 07-12-2016

Related issue: https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/607517

Simon_C_ · 02-20-2017

I am getting this exact problem using ifort (IFORT) 17.0.0 20160721.

Setting sysctl -w kernel.pid_max=32768 seems to fix the problem.

However, the installation instructions for another program (the Ceph file system) recommend a value of kernel.pid_max=4194303.

kgore4 · 02-20-2017

Yes, the fix isn't in 17.0.0. It should be in 17.0.1 (according to the related issue link in Kevin's post), which has been out for a little while. I haven't had time to test it, though.

Yuan_C_Intel · 02-20-2017

Hi, all

Yes, this has been fixed in Intel® Fortran Linux* 2017 Update 1 or 2016 Update 4.

Thanks.