Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Sudden abort of CFD code for large data

Wee_Beng_T_
Beginner
Hi,

I'm running a computational fluid dynamics MPI code. Originally, the grid size was around 2000x1000. When I change it to 4000x1000, the code aborts early in the run.

When I compile with the debug options, it runs fine.

I then tried to pinpoint the location at which the abort occurs in the release version.

I inserted:

call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *, "x"

where x runs from 0 to 10, at different locations in the code.

Strangely, the error occurs right after the code enters a subroutine:

program ....
    ...
    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *, "0"
    call initial
    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *, "2"
    ...
end program

subroutine initial
    integer :: i,j,ierr
    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *, "1"
    ....
end subroutine initial

So it prints "0" and then aborts. If run correctly, it should print "0", "1", "2".

The strange thing is that it does not even print "1". I am only entering a subroutine. There is no data allocation whatsoever. Why does the code abort here?


How should I debug? In the release version, I'm using "-O3 -r8 -w95 -c -save -ipo"

In the debug version, I'm using "-g -debug all -check all -implicitnone -warn unused -fp-stack-check -heap-arrays -ftrapuv -check pointers -check bounds -r8 -w95 -c -O0 -save"


mecej4
Honored Contributor III
> There is no data allocation whatsoever. Why does the code abort here?

Do you have any local variables, specifically arrays or structures of appreciable size, in that subroutine? If so, you may be running out of stack space. What, if any, error messages do you see when the program aborts?

There is a compiler option to have local arrays allocated on the heap, instead of the stack. There is a linker (Windows) or shell (Linux) option to set a specified maximum stack allocation.
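One way to test the stack hypothesis, sketched under assumptions: the debug flags quoted in the original post include `-heap-arrays`, while the release flags do not, so adding that one option to the release set isolates the difference. The `mpif90` wrapper name and the file name `solver.f90` below are placeholders, not taken from the thread, and the command is only printed, not run.

```shell
# Release flags exactly as quoted in the original post.
RELEASE_FLAGS="-O3 -r8 -w95 -c -save -ipo"

# Same flags plus -heap-arrays, which asks the Intel compiler to place
# automatic and temporary arrays on the heap instead of the stack.
TEST_FLAGS="${RELEASE_FLAGS} -heap-arrays"

# Hypothetical compile line: mpif90 and solver.f90 are placeholders.
# The command is echoed for inspection rather than executed.
BUILD_CMD="mpif90 ${TEST_FLAGS} solver.f90"
echo "${BUILD_CMD}"
```

If the release build stops aborting with this single change, stack exhaustion becomes the likely cause.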
Steven_L_Intel1
Employee
On Linux, the stack limit is not a linker option; it is a per-process limit, subject to limits set in the kernel configuration.
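As a minimal sketch of what that looks like in practice, the shell built-in `ulimit` reports and changes the per-process stack limit. The snippet assumes a POSIX-style shell such as bash; the variable names are illustrative.

```shell
# Report the current soft and hard stack limits for this shell
# (a size in KiB, or the word "unlimited").
soft_limit=$(ulimit -S -s)
hard_limit=$(ulimit -H -s)
echo "soft stack limit: ${soft_limit}"
echo "hard stack limit: ${hard_limit}"

# Raise the soft limit as far as the hard limit allows.
# Processes started from this shell afterwards inherit the new value.
ulimit -S -s "${hard_limit}"
echo "new soft stack limit: $(ulimit -S -s)"
```

The change applies only to the current shell and its children. For an MPI job spread over several nodes, the limit on the remote nodes may need to be raised separately, for example in the job script or the shell startup file, depending on how the launcher starts the processes.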
Wee_Beng_T_
Beginner
Thanks for the reply mecej4,

As you see below:

subroutine initial
    integer :: i,j,ierr
    call MPI_Barrier(MPI_COMM_WORLD,ierr); if (myid==0) print *, "1"
    call gen_xy_uv


only the integers i, j and ierr are declared. Before even getting to gen_xy_uv, "1" should be printed.

However, it isn't. I also call MPI_Barrier(MPI_COMM_WORLD,ierr) to ensure that all procs are synchronized before moving forward.

I also got these messages below. I wonder if they are relevant.

[n12-52:08178] mca: base: component_find: unable to open /opt/openmpi-1.5.3/lib/openmpi/mca_ess_tm: /opt/openmpi-1.5.3/lib/openmpi/mca_ess_tm.so: cannot open shared object file: Text file busy (ignored)

[n12-52:08178] mca: base: component_find: unable to open /opt/openmpi-1.5.3/lib/openmpi/mca_plm_rsh: /opt/openmpi-1.5.3/lib/openmpi/mca_plm_rsh.so: cannot open shared object file: Text file busy (ignored)

[n12-52:08178] mca: base: component_find: unable to open /opt/openmpi-1.5.3/lib/openmpi/mca_iof_orted: /opt/openmpi-1.5.3/lib/openmpi/mca_iof_orted.so: cannot open shared object file: Text file busy (ignored)

Thanks

