PXFFORK-ed processes adding up, hitting num of processes limit on Mac

nooj · ‎11-11-2009

I am using PXFFORK to fork my simulation for doing post-processing and whatnot. Like so:

! Mac desktop, 2.93GHz Quad-Core Intel Xeon
! Mac OSX v 10.5.8
! ifort v 11.1
subroutine postprocess_wrapper()
  integer pid, ierror

  call PXFFORK(pid,ierror)
  if(pid>0) return ! We are the parent. Go back to computing.
  if(pid==0) then ! We are the child. Postprocess.
    call postprocess
    stop !!!!!!!!!!!!! STOP !!!!!!
  endif !pid
  if(pid<0) then ! System error. No child created.
    write(*,*) "FORK ERROR: ", pid
    call postprocess
  endif !pid
  return
end subroutine postprocess_wrapper

These forked processes do not completely exit when they reach the STOP command:

localhost:~> ps -u nooj
[snip]
502 83350 ?? 0:00.00 (MyForkProgram)
502 83366 ?? 0:00.00 (MyForkProgram)
502 83367 ?? 0:00.00 (MyForkProgram)
502 83368 ?? 0:00.00 (MyForkProgram)
502 83369 ?? 0:00.00 (MyForkProgram)
502 83370 ?? 0:00.00 (MyForkProgram)
502 83375 ?? 0:00.00 (MyForkProgram)
[snip]
localhost:~>

On Mac, there is a maximum number of processes, with a default of 100. (Google for "fork: Resource temporarily unavailable".) There are ways to increase this number, of course, but that's not the problem. The problem is that my code is done, and each forked process should exit completely (successfully). All the processes do exit completely, and immediately, when the last thread finishes.
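For context: entries that ps typically lists in parentheses with zero CPU time are children that have exited but have not yet been collected by their parent (zombies). They keep a process-table slot until the parent waits for them or exits itself. Below is a minimal sketch in C of non-blocking reaping, on the assumption that the PXF routines behave like the POSIX calls they are named after. The helper names spawn_short_lived_child and reap_finished_children are hypothetical, not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately, standing in
   for the post-processing child. Returns the child's pid, or -1 on error. */
pid_t spawn_short_lived_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        _exit(0);   /* child: "post-processing" is done, leave right away */
    }
    return pid;     /* parent: child's pid, or -1 if fork failed */
}

/* Hypothetical helper: collect every child that has already exited,
   without blocking. Returns how many exited children were reaped. */
int reap_finished_children(void)
{
    int reaped = 0;
    int status;

    for (;;) {
        pid_t rc = waitpid(-1, &status, WNOHANG);
        if (rc > 0) {
            reaped++;           /* one zombie removed from the process table */
            continue;
        }
        if (rc == -1 && errno == EINTR) {
            continue;           /* interrupted by a signal: try again */
        }
        break;  /* 0: children exist but none has exited; -1: no children */
    }
    return reaped;
}
```

Calling reap_finished_children() from the parent between computation steps would clear any zombies that have accumulated. Whether the Fortran side has an equally convenient non-blocking wait is something to check in the compiler's POSIX library documentation.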

Ideas? I think this should be solvable by replacing my code as I have given it with something better.

- Nooj

Steven_L_Intel1 · ‎11-13-2009

Have you tried some method to exit other than STOP?

nooj · ‎11-16-2009

Quoting - Steve Lionel (Intel)

Have you tried some method to exit other than STOP?

I'm not aware of any other Fortran commands to cease execution and exit than STOP.

But at your suggestion, I tested the only other thing I could think of: deliberately causing a crash. I accessed an uninitialized variable with -ftrapuv in force. I get the same behavior: the thread ceases execution, but the process does not exit.

I also tested this (both with STOP and the run-time exception) on Linux (Ubuntu, not sure how to get OS version; ifort version 11.1 20090511). The same behavior occurs there: processes cease execution and go defunct, but do not exit.

$ ps -u nooj
[snip]
9293 pts/4 00:01:08 a.opt
9298 pts/4 00:00:08 a.opt
[snip]
$

- Fred

nooj · ‎11-16-2009

I did a little googling for the problem on Linux, and found this very sensible, very relevant set of comments.

http://unix.derkeiler.com/Mailing-Lists/Tru64-UNIX-Managers/2003-06/0037.html

text only google cache: http://74.125.95.132/search?q=cache:YvhVoqaI7AkJ:unix.derkeiler.com/Mailing-Lists/Tru64-UNIX-Managers/2003-06/0037.html+fork+OR+thread+defunct+linux&strip=1

It seems the poster there piped the output of his program to tee, as did I. There is an explanation of what was happening there, but long story short, the machine's behavior is normal. (Processes that are ostensibly sending their output to a running process do not get reaped by the shell until everyone exits. Or something.)

One commenter suggested the use of named pipes instead of just "myprog | tee logfile":

If you want to achieve this kind of pipelines and also fork and also avoid
zombies, you can try to use named pipes, such as

% mknod /tmp/pipe$$ p
% tee < /tmp/pipe$$ hello.log&
% hello >> /tmp/pipe$$

This worked for the poster. I am trying this technique now (so far unsuccessfully--defunct processes still lingering) and will report back if there is success.

- Fred

nooj · ‎01-09-2010

Okay, so investigations have shown that the proper way to fork without creating zombie processes is the famous double-fork, discussed in Richard Stevens' UNIX Network Programming, Volume 1, Chapter 13, Section 4:

Proper way to create a daemon process:

1. parent: fork
2. parent: exit
3. child 1: setsid
4. child 1: ignore SIGHUP
5. child 1: fork
6. child 1: exit
7. child 2: change working directory to something benign (like '/')
8. child 2: close any open descriptors
9. child 2: redirect stdin, stdout, stderr to /dev/null
10. child 2: conduct business, use syslogd for errors

How can I do this in FORTRAN on MacOSX?

I especially want to do this without the original parent process exiting (step 2 above).
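A minimal sketch in C of that variant, in which the original parent keeps running. It is a sketch under assumptions, not a full daemonization: it covers only the two forks and the exit of child 1, and adds a wait on child 1 in place of step 2. The setsid call, SIGHUP handling, chdir and descriptor cleanup are left out. The names run_detached and postprocess are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real post-processing work. */
void postprocess(void)
{
}

/* Run `work` in a grandchild process. The caller keeps running and is
   left with no zombie: it reaps child 1 itself, and the orphaned
   grandchild is adopted by init (launchd on Mac OS X), which reaps it.
   Returns 0 on success, -1 if the work could not be handed off. */
int run_detached(void (*work)(void))
{
    pid_t child = fork();
    if (child < 0) {
        return -1;                  /* first fork failed */
    }

    if (child == 0) {
        /* child 1: fork again, then exit at once */
        pid_t grandchild = fork();
        if (grandchild < 0) {
            _exit(1);               /* tell the parent the hand-off failed */
        }
        if (grandchild == 0) {
            work();                 /* child 2: conduct business */
            _exit(0);
        }
        _exit(0);                   /* child 1: done, leaves child 2 orphaned */
    }

    /* parent: child 1 exits almost immediately, so this wait is short */
    int status;
    while (waitpid(child, &status, 0) == -1) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The parent would call run_detached(postprocess) and carry on computing. The children leave through _exit rather than exit so that output buffered by the parent before the fork is not flushed a second time. How a Fortran STOP behaves in this respect is not covered by this sketch.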

I am stuck on two issues:

1. To ignore SIGHUP, I should call PXFSIGACTION, which requires a "sigaction" struct created by PXFSTRUCTCREATE. But what (integer) value do I give to my sigaction to mean "ignore"?

2. Does the method of avoiding zombie processes outlined above work if the original parent does not exit? It seems like it should.

Thanks everyone.

Fred