Hi, I compiled an OpenMP program with Intel Fortran successfully. These are my macros file and Makefile:
macros
.IGNORE:
RM = rm -f
COMPILER90= ifort -openmp #-fbounds-check
FREESOURCE= -free #-ffree-form
F90FLAGS = -c -I/home/2015051066/soft/netcdf/include
MODFLAG =
LDFLAGS = -O3
CPP = cpp
CPPFLAGS = -P -traditional
#F90FLAGS = -c -I/cm/shared/apps/netcdf/gcc/64/4.5.0/include
#F90FLAGS = -c -I/cm/shared/uaapps/netcdf/fortran/gcc/4.4.4/include
#F90FLAGS = -c -I/home/u4/niug/netcdf-fort-4.4.4/include
RM = rm -f
RM = rm -f
RM = rm -f
RM = rm -f
RM = rm -f
Makefile
# Makefile
#
#INCLUDES=
.SUFFIXES:
.SUFFIXES: .o .f

include ../macros

MAINOBJ1 = ../IO_code/Noah_driver.o

OBJS = \
	../Noah_code/module_Noahlsm.o \
	../Noah_code/module_Noahlsm_param_init.o \
	../Noah_code/module_Noahlsm_utility.o \
	../Noah_code/module_date_utilities.o \
	../IO_code/module_Noah_NC_output.o \
	../IO_code/module_Noahlsm_gridded_input.o

CMD = Noah
all: $(CMD)

Noah: $(OBJS) $(MAINOBJ1)
	@echo ""
	$(COMPILER90) -o $(@) $(OBJS) $(MAINOBJ1) -L/home/2015051066/soft/netcdf/lib -lnetcdff -lnetcdf
#	$(COMPILER90) -o $(@) $(OBJS) $(MAINOBJ1) -L/cm/shared/apps/netcdf/gcc/64/4.5.0/lib -lnetcdff -lnetcdf
#	$(COMPILER90) -o $(@) $(OBJS) $(MAINOBJ1) -L/cm/shared/uaapps/netcdf/fortran/gcc/4.4.4/lib -lnetcdff -lnetcdf
	@echo ""

# This command cleans up

clean:
	rm ../IO_code/*mod ../IO_code/*.o
	rm ../Noah_code/*mod ../Noah_code/*.o
	rm $(CMD)
When I run the model with this PBS script:
#!/bin/sh -f
#PBS -N test
#PBS -m n
#PBS -l mem=1gb
#PBS -l nodes=1:ppn=28
nprocs=`wc -l < $PBS_NODEFILE`
cd $PBS_O_WORKDIR
#date
#mpirun -genv I_MPI_DEVICE ssm -np $nprocs -hostfile $PBS_NODEFILE $PBS_O_WORKDIR/test.sh
#cd /home/2014011989/noahmp/Run
/usr/bin/time ./Noah >&out
#date
The error is:
mdt, minute = 01010000
INPUT LANDUSE = USGS
LANDUSE TYPE = USGS FOUND 27 CATEGORIES
INPUT SOIL TEXTURE CLASSIFICAION = STAS
SOIL TEXTURE CLASSIFICATION = STAS FOUND 19 CATEGORIES
successful initialize general model parameters
------------- successful reading surface data ------------------
30
0.3950000 0.4100000 0.4340000 0.4760000 0.4850000
0.4390000 0.4040000 0.4640000 0.4650000 0.4060000
0.4680000 0.4680000 0.3950000 1.000000 0.2000000
0.4210000 0.4680000 0.2000000 0.3390000 0.0000000E+00
0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00
0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00 0.0000000E+00
fini=arbitrary initialization
------------- successful initialization ------------------
1 1980 1 1 1 0 1
-------------------------------------------
READFORC: opening /home/2014011989/noahmp/Noah_data/forcings/1980/1980010100.nc
The model is losing(-)/gaining(+) fake water
ERRWAT = -5.000397
ix,iy,WA,END_WB,BEG_WB,PRCP*DT,ECAN*DT,EDIR*DT,ETRAN*DT,RUNSRF*DT,RUNSUB*DT
The model is losing(-)/gaining(+) fake water
forrtl: severe (40): recursive I/O operation, unit -1, file unknown
�^H~[^@^@^@^@^@~@Y~W^@^@^@^@^@^@^@^@^@^@^@^@^@^@4.02user 2.11system 0:13.07elapsed 46%CPU (0avgtext+0avgdata 150056maxresident)k
22032inputs+0outputs (0major+51064minor)pagefaults 0swaps
Is the problem caused by a mistake in my script? How can I solve it?
Thank you very much!
Does the error also appear when not using OpenMP? The message says it is recursive I/O, which is pretty obvious. So where does it happen in your code? You haven't shown any of your code. You should rather direct that question to the maintainer of the code.
When I use GFortran to compile the code, it does not show this error. The maintainer of the code actually used GFortran to compile the code, but I want to try Intel Fortran. You said the error is obvious, but I don't understand what you meant. Can you tell me more?
Recursive I/O is when you start an I/O operation on a unit when one is already in progress. The standard says:
An input/output statement that is executed while another input/output statement is being executed is a recursive
input/output statement. A recursive input/output statement shall not identify an external unit that is identified
by another input/output statement being executed except that a child data transfer statement may identify its
parent data transfer statement external unit.
It may be that gfortran allows this as an extension. As noted, you have not shown your code so we don't know what your program is doing. Typically one sees this error if you reference a function in an I/O statement data list that also does I/O on the same unit (PRINT in your case).
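To make the "function in an I/O list" case concrete, here is a minimal sketch. It is not taken from the Noah code; `recursive_io_demo` and `noisy_square` are hypothetical names invented for illustration. The non-conforming line is left commented out, and the conforming variant completes the function call before starting the PRINT.

```fortran
program recursive_io_demo
  implicit none
  integer :: v

  ! Non-conforming: noisy_square PRINTs to the same unit while this PRINT
  ! is still being executed, which is recursive I/O on one external unit.
  ! print *, 'result =', noisy_square(3)

  ! Conforming: let the function (and its I/O) finish first, then print.
  v = noisy_square(3)
  print *, 'result =', v

contains

  ! Hypothetical function that performs I/O as a side effect.
  function noisy_square(n) result(sq)
    integer, intent(in) :: n
    integer :: sq
    print *, 'noisy_square called with n =', n
    sq = n * n
  end function noisy_square

end program recursive_io_demo
```

Whether the commented-out form fails at run time or happens to work depends on the compiler and runtime library, so it is safer not to rely on it.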
This one at least
write(88,'(4i5,9f10.2)')iyloop,imloop,idloop,itime,cosz(ix,iy)&
     ,soldn(ix,iy),lwdn(ix,iy),sfctmp(ix,iy)&
     ,uu(ix,iy),prcp(ix,iy)*dt,q2(ix,iy)*1000,sfcprs(ix,iy)&
     ,co2air(ix,iy)/sfcprs(ix,iy)*1.e6
is inside an !$OMP PARALLEL DO region! And there may be more, as this is not the complete code. Again, do you see the problem without OpenMP?
What do you mean without OpenMP? The code I got just uses OpenMP.
OpenMP is a set of directives (instructions that tell the compiler how to distribute the work of your code across different threads). As you might have noticed, these directives all appear after exclamation marks, so they are just comments according to the Fortran standard. If you compile your code without the OpenMP flag, it will be executed serially. Again, my question: does it work then?
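As an illustration of directives being ordinary comments when OpenMP is switched off, here is a small self-contained sketch (hypothetical program, not part of the Noah code). It relies on the `!$` conditional compilation sentinel: the line after `!$` is compiled only when OpenMP is enabled, so the same source reports which mode it was built in and gives the same sum either way.

```fortran
program omp_sentinel_demo
  implicit none
  integer :: i, total
  logical :: with_openmp

  with_openmp = .false.
  !$ with_openmp = .true.     ! compiled only when OpenMP is enabled

  total = 0
  ! Without the OpenMP flag the next directive is a plain comment
  ! and the loop runs serially.
  !$omp parallel do reduction(+:total)
  do i = 1, 100
     total = total + i
  end do
  !$omp end parallel do

  print *, 'compiled with OpenMP: ', with_openmp
  print *, 'sum 1..100 = ', total     ! 5050 in both builds
end program omp_sentinel_demo
```

Building this once with and once without the compiler's OpenMP option is a quick way to see the difference between the serial and the threaded build of the same source.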
I am checking the write statements now! But I want to compile it as a parallel program! If I do not use -openmp option, it will not be a parallel program, right?
Sure, then it will not be parallel, but before you run your program in parallel you have to make sure that the serial version is correct, preferably with more than one compiler. Then you can start running it in parallel.
I set export OMP_THREAD_NUM=1 and deleted the write statement. However, when I ran the model, it still showed the error:
The model is losing(-)/gaining(+) fake water
ERRWAT = -5.000022
ix,iy,WA,END_WB,BEG_WB,PRCP*DT,ECAN*DT,EDIR*DT,ETRAN*DT,RUNSRF*DT,RUNSUB*DT
71 85 19 4510.00 5131.98 5137.00 0.0000 0.0000 0.0220 0.0000 0.0000 0.0000 0.0000 0.2453 0.3159 0.3588
forrtl: severe (40): recursive I/O operation, unit -1, file unknown
forrtl: severe (40): recursive I/O operation, unit -1, file unknown
0X��{+) fake water
forrtl: severe (28): CLOSE error, unit 30, file "Unknown"
*** Error in `./Noah': free(): invalid pointer: 0x00002ba2522051db ***
forrtl: severe (40): recursive I/O operation, unit -1, file unknown
forrtl: severe (40): recursive I/O operation, unit -1, file unknown
Image PC Routine Line Source
Stack trace terminated abnormally.
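Note that the standard OpenMP environment variable for limiting the thread count is OMP_NUM_THREADS; a variable with a different spelling is simply ignored by the runtime. A small check such as the following sketch (a hypothetical stand-alone program, assumed to be compiled with OpenMP enabled) can confirm how many threads a run will actually use:

```fortran
program show_threads
  use omp_lib
  implicit none

  ! Upper bound on the team size for the next parallel region.
  print *, 'max threads available: ', omp_get_max_threads()

  !$omp parallel
  !$omp single
  ! Only one thread prints, so the output appears once.
  print *, 'threads in this parallel region: ', omp_get_num_threads()
  !$omp end single
  !$omp end parallel
end program show_threads
```

If this still reports more than one thread after setting the environment variable, the setting did not reach the program.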
This seems to be based, at least loosely, on this code: https://ral.ucar.edu/solutions/products/noah-multiparameterization-land-surface-model-noah-mp-lsm, but there are no OpenMP statements anywhere in the build instructions. Did you modify the original instructions, or did the maintainers of the code? It would be a good idea to ask them about the portability of the code.
That is a good idea! Thank you!
I/O done in parallel threads may very well trigger this problem. If you really need to do I/O in a parallel section, enclose it in a critical section.
I am not sure the I/O code is the cause, because I have checked all write statements in parallel regions, but the error is still there! If I export OMP_NUM_THREADS=1, it works, but then it is not a parallel run.
I attached all the code, including the OMP statements. I hope it gives more information.
I wonder how I can find the recursive I/O in parallel regions, because I have checked all the OMP regions and did not find any I/O statements inside them. Is there some code or program to check it?
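One possible way to hunt for such statements is a small guard routine placed just before each suspect WRITE or PRINT, including those in routines that are only called from a parallel loop. The sketch below is an assumption about how one might do this, not an existing tool; `io_guard` and `warn_if_parallel` are hypothetical names. It uses the standard `omp_in_parallel()` function, which is true only inside an active parallel region.

```fortran
module io_guard
  use omp_lib
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
contains
  ! Hypothetical helper: reports when it is reached from an active
  ! parallel region (more than one thread in the team).
  subroutine warn_if_parallel(label)
    character(len=*), intent(in) :: label
    if (omp_in_parallel()) then
       !$omp critical(io_guard_msg)
       write(error_unit, '(3a,i0)') 'I/O inside parallel region at ', &
            label, ', thread ', omp_get_thread_num()
       !$omp end critical(io_guard_msg)
    end if
  end subroutine warn_if_parallel
end module io_guard

program guard_demo
  use io_guard
  implicit none
  integer :: i

  call warn_if_parallel('serial part')      ! silent: no parallel region

  !$omp parallel do
  do i = 1, 4
     call warn_if_parallel('loop body')     ! reports when run with >1 thread
  end do
  !$omp end parallel do
end program guard_demo
```

Every location that produces a message is a candidate for a critical section or for moving the I/O out of the parallel region.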
! *** EXAMPLE of critical section for write statement within parallel region
if(ix == int(nx/2) .and. iy == int(ny/2) ) then
  !$omp critical(write_88)
  write(88,'(4i5,9f10.2)')iyloop,imloop,idloop,itime,cosz(ix,iy)&
       ,soldn(ix,iy),lwdn(ix,iy),sfctmp(ix,iy)&
       ,uu(ix,iy),prcp(ix,iy)*dt,q2(ix,iy)*1000,sfcprs(ix,iy)&
       ,co2air(ix,iy)/sfcprs(ix,iy)*1.e6
  !$omp end critical(write_88)
endif
*** Note: the write statements will not be made in nested loop index order as they would be for a serial (single-threaded) application.
With named critical sections, as used above, a write to unit 88 can overlap with a write to some other unit. You can use any arbitrary name that is not associated with any other global name (e.g. a subroutine name). An alternative typical name might be the name portion of the file being written, as opposed to the unit number used above.
Jim Dempsey
I have a question about the nested loop index. Do I have to write critical statements for nested loops that are not inside the OMP parallel region?
!$omp parallel do private(j)
do i = 1, n      ! the loop iteration/control variable of a parallel do is implicitly private
  do j = 1, m    ! this interior loop iteration/control variable within a parallel region has no implicitness to it
                 ! The private(j) clause dictates this variable is to remain private
    ! ... loop body ...
  end do
end do
!$omp end parallel do
Note, while j may get registerized in one loop and thus have no adverse interactions among threads, in other places j may have a context located in the scope outside the parallel region and thus may exhibit a conflict of use. Using the private clause eliminates the conflicting use.
Jim Dempsey
I solved my problem by adding an !$OMP CRITICAL statement in a subroutine that does not have any OMP statements. Actually, this subroutine is called by a module that does not have any OMP statements, and this module is called in an OMP parallel region. Although I did not think this module was the cause, what I did worked.
Yes, the CRITICAL section was the correct solution. Even though the subroutine itself had no OpenMP directives, that it was called from a parallel region triggered the error.
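For readers who hit the same situation, here is a minimal sketch of that pattern, with hypothetical names (`report_mod`, `report_value`) standing in for the real Noah routines, which were not posted in full. The subroutine contains no parallel loop of its own, but because it is called from inside a parallel region, its I/O is protected by a named critical section.

```fortran
module report_mod
  implicit none
contains
  ! Hypothetical routine with no parallel construct of its own.
  ! It is reached from a parallel region, so several threads may
  ! execute it at the same time.
  subroutine report_value(ix, val)
    integer, intent(in) :: ix
    real, intent(in) :: val
    !$omp critical(report_io)
    print '(a,i4,a,f10.4)', 'ix = ', ix, '  val = ', val
    !$omp end critical(report_io)
  end subroutine report_value
end module report_mod

program orphaned_critical_demo
  use report_mod
  implicit none
  integer :: ix

  !$omp parallel do
  do ix = 1, 8
     call report_value(ix, sqrt(real(ix)))
  end do
  !$omp end parallel do
end program orphaned_critical_demo
```

The lines may appear in any order when run with several threads, but only one thread at a time executes the PRINT.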