Hi, I'm working with a .f90 program (compiled with ifort). I modified this program by including a C source file that the Fortran code calls (prog.f90 calls prog.c). In one subroutine, the .f90 program calls the SYSTEM command to create a directory. The modified program runs fine except for one detail: it calls the SYSTEM command, but the command is never executed, which is very strange. The program runs but doesn't execute the command. The unmodified program (without the .c file) executes the SYSTEM command correctly.
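A minimal sketch (not code from the original program; the subroutine and directory name are invented) of checking the SYSTEM return status under ifort, so a silently failing command becomes visible:

subroutine make_output_dir
  use ifport, only: system       ! ifort's portability module provides SYSTEM
  implicit none
  integer :: istat
  ! SYSTEM returns the shell's exit status; nonzero means the command failed.
  istat = system("mkdir -p output")
  if (istat /= 0) write(*,*) 'SYSTEM("mkdir -p output") failed, status = ', istat
end subroutine make_output_dir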
The .f90 program can also run under MPI. When I activate the MPI build (with mpif90), the program compiles, but when I run it I get:
forrtl: error (72): floating overflow
Image PC Routine Line Source
ramses3d 00000000005F3C48 Unknown Unknown Unknown
ramses3d 000000000040634F Unknown Unknown Unknown
ramses3d 0000000000410DC2 Unknown Unknown Unknown
ramses3d 0000000000423273 Unknown Unknown Unknown
ramses3d 0000000000426D48 Unknown Unknown Unknown
ramses3d 000000000043B6C1 cooling_module_mp 323 cooling_module.f90
ramses3d 0000000000450D04 init_time_ 60 init_time.f90
ramses3d 0000000000453801 adaptive_loop_ 21 adaptive_loop.f90
ramses3d 000000000052E1E2 MAIN__ 8 ramses.f90
ramses3d 0000000000404482 Unknown Unknown Unknown
libc.so.6 0000003F8381D974 Unknown Unknown Unknown
ramses3d 00000000004043A9 Unknown Unknown Unknown
The program without the modification (without the .c file) compiles and runs fine in both serial and parallel mode, but when I include the .c file, the program doesn't execute the SYSTEM command in the serial build and shows me error (72) in the parallel build.
The problem must be in my .c file, but what is it?
Can someone give me a clue?
Thank you.
Quoting - jpprieto
It's unclear from what you have sent. What is the statement at line 323 in cooling_module.f90?
I'd also try compiler options -g -traceback -fp-stack-check -check all -warn all
ron
Quoting - Ronald W. Green (Intel)
Hi, Ronald.
Line 323 is:
call evol_single_cell(astart,aend,dasura,h,omegab,omega0,omegaL,-1.0d0,T2end,mu,ne,.false.)
The types of the input variables are:
real(kind=8) :: astart,aend,dasura,T2end,mu,ne
real(kind=8) :: h,omegab,omega0,omegaL
About the compiler options -g -traceback -fp-stack-check -check all -warn all:
I'm compiling with
F90 = mpif90
FFLAGS = -O3 -g -traceback -fpe0 -ftrapuv -cpp -DNDIM=$(NDIM) -DNPRE=$(NPRE) -DSOLVER$(SOLVER) -DNOSYSTEM
I'll recompile it with the options you suggested.
Quoting - jpprieto
The stack trace into evol_single_cell() and all calls below it has no symbolic information. Is that code in C or in Fortran? Something down in evol_single_cell, or something it calls, is blowing up.
If evol_single_cell is Fortran, compile everything with:
-gen-interfaces -warn interfaces
to make sure that the calling sequence is correct.
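For illustration, an explicit interface written by hand from the declarations quoted above might look like the sketch below; the dummy names j21 and flag for the two literal arguments (-1.0d0 and .false.) are invented:

interface
   subroutine evol_single_cell(astart, aend, dasura, h, omegab, &
                               omega0, omegaL, j21, T2end, mu, ne, flag)
      real(kind=8) :: astart, aend, dasura, h, omegab, omega0, omegaL
      real(kind=8) :: j21, T2end, mu, ne   ! j21 receives the -1.0d0 literal
      logical      :: flag                 ! receives the .false. literal
   end subroutine evol_single_cell
end interface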
ron
Quoting - Ronald W. Green (Intel)
evol_single_cell is a Fortran routine; it is the one that calls the C code.
Quoting - jpprieto
I compiled the code with your options
F90 = mpif90
FFLAGS = -O3 -g -traceback -fp-stack-check -ftrapuv -warn all -check all -cpp -DNDIM=$(NDIM) -DNPRE=$(NPRE) -DSOLVER$(SOLVER) -DNOSYSTEM
Now, the error is:
forrtl: warning (402): fort: (1): In call to CMP_CHEM_NONEQ, an array temporary was created for argument #7
forrtl: error (65): floating invalid
Image PC Routine Line Source
ramses3d 0000000000410D55 Unknown Unknown Unknown
ramses3d 0000000000423EA4 Unknown Unknown Unknown
ramses3d 0000000000427B04 Unknown Unknown Unknown
ramses3d 0000000000439F7C cooling_module_mp 487 cooling_module.f90
ramses3d 0000000000463849 Unknown Unknown Unknown
ramses3d 00000000004A4C31 init_time_ 60 init_time.f90
ramses3d 00000000004AB0AD adaptive_loop_ 21 adaptive_loop.f90
ramses3d 0000000000AC83F6 MAIN__ 8 ramses.f90
ramses3d 0000000000404442 Unknown Unknown Unknown
libc.so.6 0000003BCA61D974 Unknown Unknown Unknown
ramses3d 0000000000404369 Unknown Unknown Unknown
Line 487 of cooling_module.f90 contains the call to the C routine:
call cmp_chem_noneq(nH,T2,dt_cool,DT2,mu,aexp,uini(1,1:nvar-ndim-3),ini)
Argument number 7 holds some physical properties of the gas element. Is there something wrong in the way I pass uini to the C routine?
uini is defined as
real(dp),allocatable,dimension(:,:)::uini
and in the C program it is received as a
double uini[]
variable.
I need to get this program compiled and running because I want to perform large hydrodynamical simulations in a cosmological context.
Thank you.
It's hard to say if uini is the cause. From the last trace we see that on the Fortran side an array temporary is created since you are passing a row vector that is discontiguous in memory. This is the correct thing to do, since the C is expecting a vector that is contiguous in memory.
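A sketch of doing that copy explicitly (ubuf and n are invented names; the bounds follow the call quoted above):

integer :: n
real(kind=8), allocatable :: ubuf(:)
! uini(:,:) is stored column-major, so the row slice uini(1,1:n) is strided,
! not contiguous; copying it to an explicit buffer does by hand what the
! compiler's hidden temporary does, and makes the copy-back step visible.
n = nvar - ndim - 3
allocate(ubuf(n))
ubuf(:) = uini(1,1:n)
call cmp_chem_noneq(nH, T2, dt_cool, DT2, mu, aexp, ubuf, ini)
uini(1,1:n) = ubuf(:)      ! copy results back if the C routine modifies them
deallocate(ubuf)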
Without a trace on the C code we don't have enough to go on. Are you compiling the C code with -g?
Do you have a debugger like TotalView you can use to debug MPI? If not, add some code to the C to check the arguments coming in. How have you declared cmp_chem_noneq within cooling_module.f90, and please don't say you just declare it EXTERNAL.
Someone needs to dig deep into this code, the error is not obvious from the little information I have.
ron
Quoting - Ronald W. Green (Intel)
No, I'm compiling the C code without -g. I'm compiling with
gcc -c coolinghd.c
to create the .o file and then linking it with the other .o files from the Fortran code.
I don't have a debugger, but I have checked the arguments coming into the C code and there is no problem: all arguments have the correct values.
About cmp_chem_noneq: the coolinghd.c code starts with
int cmp_chem_noneq_(double *rhob,double *T2,double *dt,double *DT2,double *mu,double *a,double uini[],int *ii)
{
...
Inside cooling_module.f90 I call it as
...
call cmp_chem_noneq(nH,T2,dt_cool,DT2,mu,aexp,uini,ini)
...
I link the two codes through the .o files.
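For reference, a sketch (an illustration, not the program's actual declaration) of a Fortran 2003 ISO_C_BINDING interface matching that prototype; since the Fortran side uses CALL, the C int return value is simply ignored, and declaring the C function void would be the cleaner match:

interface
   subroutine cmp_chem_noneq(rhob, T2, dt, DT2, mu, a, uini, ii) &
              bind(C, name="cmp_chem_noneq_")
      use iso_c_binding, only: c_double, c_int
      real(c_double) :: rhob, T2, dt, DT2, mu, a  ! passed by reference, as double*
      real(c_double) :: uini(*)                   ! assumed size, matches double uini[]
      integer(c_int) :: ii                        ! matches int *ii
   end subroutine cmp_chem_noneq
end interface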
Thank you very much.
I'm waiting for your reply.
Quoting - jpprieto
How is cmp_chem_noneq defined in the Fortran program? Do you have any sort of interface declaration for it, or are you simply calling it as shown?
thanks --
- Lorri
Quoting - jpprieto
Are you sure that you are not accessing UINI out of bounds? As a quick test, you could print the first and last value of UINI from Fortran before the call, and from C after the call.
Also, how does this floating error in a C call relate to a system command call?
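On the Fortran side, that quick test might look like this sketch (names taken from the call quoted earlier in the thread; n is an invented shorthand for the slice length):

integer :: n
n = nvar - ndim - 3
write(*,*) 'before: uini(1,1) =', uini(1,1), '  uini(1,n) =', uini(1,n)
call cmp_chem_noneq(nH, T2, dt_cool, DT2, mu, aexp, uini(1,1:n), ini)
write(*,*) 'after:  uini(1,1) =', uini(1,1), '  uini(1,n) =', uini(1,n)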
Quoting - krahn@niehs.nih.gov
Now I'm compiling with
icc -c -g -traceback -w coolinghd.c
Everything seems OK. I followed uini and its values are fine, but the program ends with
...
rank 1 in job 1 geryon07_46495 caused collective abort of all ranks
exit status of rank 1: killed by signal 9
About the SYSTEM call: that was the first problem, because the command doesn't work in the serial build of the program; I have to create the files manually.
Thank you.
What MPI and version are you using?
And what version of Intel Fortran?
For MPI: what were your configure arguments when you built the MPI package?
ron
Quoting - Ronald W. Green (Intel)
I'm using MPICH2, and
mpif90 for 1.0.6 Version 10.0
I don't know how to find the last piece of information you need (the MPI configure arguments).
Quoting - jpprieto
Important information includes how you set up mpif90 so that it uses ifort and the ifort run-time libraries, rather than GNU libraries, or whatever would be in a default version of mpif90. An example of the most basic instructions:
http://www.contrib.andrew.cmu.edu/~milop/www1/mpif90.html
Note the reference to the installation manual for reconfiguring MPICH2.
Quoting - tim18
Now I'm testing with MPI running on one processor.
Everything seems OK, but suddenly the program stops. All calculated values are fine, but it stops in the middle of the function JJ21(nu,aexp). This function is called by IR1Gl(nu,aexp), which in turn is called by cmp_chem_noneq(...).
I'm compiling the C program with mpicc -g -w -c.
The output of ramses.pe is
rm: cannot remove `/tmp/70484.1.bigmem.q/rsh': No such file or directory
and the .e file reports a segmentation fault without any routine information.
Quoting - jpprieto
# Compilation time parameters
NDIM = 3
NPRE = 8
SOLVER = hydro
PATCH =
EXEC = ramses
# --- MPI, ifort syntax, additional checks -----------
F90 = mpif90
FFLAGS = -O3 -g -traceback -fpe0 -ftrapuv -cpp -DNDIM=$(NDIM) -DNPRE=$(NPRE) -DSOLVER$(SOLVER) -DNOSYSTEM
#############################################################################
MOD = mod
#############################################################################
# MPI librairies
LIBMPI =
#LIBMPI = -lfmpi -lmpi -lelan
LIBS = $(LIBMPI)
#############################################################################
# Sources directories are searched in this exact order
VPATH = $(PATCH):../$(SOLVER):../hydro:../pm:../poisson:../amr
#############################################################################
# All objects
MODOBJ = amr_parameters.o amr_commons.o random.o pm_parameters.o pm_commons.o \
         poisson_parameters.o poisson_commons.o hydro_parameters.o \
         hydro_commons.o coolinghd.o cooling_module.o bisection.o
AMROBJ = read_params.o init_amr.o init_time.o init_refine.o adaptive_loop.o \
         amr_step.o update_time.o output_amr.o flag_utils.o \
         physical_boundaries.o virtual_boundaries.o refine_utils.o \
         nbors_utils.o hilbert.o load_balance.o title.o sort.o \
         cooling_fine.o units.o
# Particle-Mesh objects
PMOBJ = init_part.o output_part.o rho_fine.o synchro_fine.o move_fine.o \
        newdt_fine.o particle_tree.o add_list.o remove_list.o \
        star_formation.o sink_particle.o feedback.o
# Poisson solver objects
POISSONOBJ = init_poisson.o phi_fine_cg.o interpol_phi.o force_fine.o \
             multigrid_coarse.o multigrid_fine_commons.o \
             multigrid_fine_fine.o multigrid_fine_coarse.o gravana.o \
             boundary_potential.o rho_ana.o output_poisson.o
# Hydro objects
HYDROOBJ = init_hydro.o init_flow_fine.o write_screen.o output_hydro.o \
           courant_fine.o godunov_fine.o uplmde.o umuscl.o \
           interpol_hydro.o godunov_utils.o condinit.o hydro_flag.o \
           hydro_boundary.o boundana.o read_hydro_params.o \
           synchro_hydro_fine.o
# All objects
AMRLIB = $(MODOBJ) $(AMROBJ) $(HYDROOBJ) $(PMOBJ) $(POISSONOBJ)
#############################################################################
ramses: $(AMRLIB) ramses.o
	$(F90) $(FFLAGS) $(AMRLIB) ramses.o -o $(EXEC)$(NDIM)d $(LIBS)
#############################################################################
coolinghd.o: coolinghd.c
	mpicc -c -g -w coolinghd.c
#############################################################################
%.o: %.f90
	$(F90) $(FFLAGS) -c $^ -o $@
#############################################################################
clean:
	rm *.o *.$(MOD)
#############################################################################
If you see something wrong, please tell me. I included the .c part above.
Thank you.
Quoting - jpprieto
My guess is that the 'rm' is a failure in the MPI handler, where 'rsh' is a script to launch one of the processes. Have you successfully run another MPI program using your current MPI build? Make sure that some test programs can run. It may all be an MPI configuration problem.
Quoting - krahn@niehs.nih.gov
When I run a test code (unmodified RAMSES), the code runs fine with the following flags:
F90 = mpif90 -O3
FFLAGS = -cpp -DNDIM=$(NDIM) -DNPRE=$(NPRE) -DSOLVER$(SOLVER) -DNOSYSTEM
But with the following flags:
F90 = mpif90
FFLAGS = -O3 -g -traceback -fpe0 -ftrapuv -cpp -DNDIM=$(NDIM) -DNPRE=$(NPRE) -DSOLVER$(SOLVER) -DNOSYSTEM
the program doesn't run.
The error output is
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
ramses3d 0000000000456000 getnborgrids_ 545 nbors_utils.f90
Stack trace terminated abnormally.
And the ramses .pe file again shows
rm: cannot remove `/tmp/70490.1.bigmem.q/rsh': No such file or directory
Thank you.
