Hi, I am running my program in parallel, and it needs to read many files within each thread. I assign a different I/O unit number to each file and make sure that the unit numbers used by different threads do not conflict.
However, when I run my program, it sometimes shows the errors below:
forrtl: No such file or directory
forrtl: severe (29): file not found, unit 5012707, file /'foler'/fort.5012707
This error occurs at different times and in different places. I think the file name associated with the unit is being changed, but I am not sure why: the program opens the correct file and connects it to an I/O unit, and later the unit no longer recognizes the file name.
I have attached my source code and the data it needs.
Can anyone help me with this? Thank you very much.
Best Regards
I can tell you that the filename in the error message is what you get when you have a READ or WRITE to unit 5012707 without OPENing it first. It might be that something closed that unit when it shouldn't.
Which compile options did you choose?
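To narrow down where the unit loses its connection, one option is to ask the run-time library whether the unit is open just before each READ or WRITE. The following is only a minimal sketch: the module name `unit_check` and the routines `unit_is_open` and `guarded_read` are made up for illustration and are not part of the posted code.

```fortran
module unit_check
  implicit none
contains
  ! Returns .true. when unit u is currently connected to a file.
  logical function unit_is_open(u)
    integer, intent(in) :: u
    inquire(unit=u, opened=unit_is_open)
  end function unit_is_open

  ! Reads one integer from unit u, but refuses to touch a unit that is
  ! not open. Without such a guard, a READ on an unopened unit makes the
  ! run-time library look for an implicit file named fort.<unit>.
  subroutine guarded_read(u, val, ok)
    integer, intent(in)  :: u
    integer, intent(out) :: val
    logical, intent(out) :: ok
    integer :: ios
    val = 0
    ok = unit_is_open(u)
    if (.not. ok) then
      write(*, '(a,i0,a)') 'unit ', u, ' is not open; skipping READ'
      return
    end if
    read(u, *, iostat=ios) val
    ok = (ios == 0)
  end subroutine guarded_read
end module unit_check
```

Calling something like `guarded_read` in place of the failing READ at line 185 of updateprofile.f would report the closed unit instead of aborting, which may make it easier to see which thread and which unit are involved.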
Thank you, Steve, I used 'ifort -openmp -o program.exe *.f *.f90 -g -traceback' to compile the program.
When your program prints the error message, it should also have given you a traceback with line numbers, from which you can identify the statement(s) that failed.
Yes, I have them (shown below):
program 00000000005A2CB8 Unknown Unknown Unknown
program 0000000000483507 updateprofile_dai 185 updateprofile.f
program 000000000047F326 hyd_run_ 65 hyd_run.f
program 000000000047ED6F MAIN__ 22 hru_loop.f
libiomp5.so 00001514D0362623 Unknown Unknown Unknown
However, this error occurs at different times and in different places.
As Steve pointed out, I think it might be that something closed that unit. But I am not sure what closed the i/o unit. I cannot figure this out. This only happens with parallel computing and with many threads.
When I ran with only two threads, the program completed. But two threads are too slow for me.
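One possible cause of a unit being closed "by something else" is a unit-number variable that is shared between threads (for example in a COMMON block or a SAVEd local), so that one thread overwrites or closes the unit another thread is still using. Whether that applies to the posted code is an assumption. The sketch below shows the general pattern of letting the run-time library pick a free unit with `NEWUNIT=` and keeping the unit variable private to each thread. The module name `per_thread_io`, the routine `roundtrip`, and the file names `hru_NNNN.tmp` are hypothetical.

```fortran
module per_thread_io
  implicit none
contains
  ! Each iteration writes its index to its own file and reads it back.
  ! total returns the sum of the values read: n*(n+1)/2 if all I/O works.
  subroutine roundtrip(n, total)
    integer, intent(in)  :: n
    integer, intent(out) :: total
    integer :: i, u, val
    character(len=32) :: fname

    total = 0
    ! u, val and fname are private, so no thread can see or close
    ! a unit that belongs to another thread.
    !$omp parallel do private(u, val, fname) reduction(+:total)
    do i = 1, n
      write(fname, '(a,i4.4,a)') 'hru_', i, '.tmp'   ! hypothetical file name
      open(newunit=u, file=fname, status='replace', action='readwrite')
      write(u, *) i
      rewind(u)
      read(u, *) val
      close(u, status='delete')
      total = total + val
    end do
    !$omp end parallel do
  end subroutine roundtrip
end module per_thread_io
```

`NEWUNIT=` is a Fortran 2008 feature, so it needs a reasonably recent compiler. The same code also compiles and runs serially when OpenMP is not enabled, since the directives are then treated as comments.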
What version of the compiler are you using? There are recent fixes regarding I/O with OpenMP threads. The current release is 2021.3.0.
Hi Barbara,
I am not sure which version I am using, since I compile on my school's supercomputer. I will need to ask.
But do you think the errors are caused by that bug? I am not sure whether my source code is correct (I think it is, but I am not certain).
Hi Barbara,
Our support told me the version I am using is:
icc (ICC) 19.0.4.235 20190416.
Do you have any suggestions?
Thank you.
Sorry, the version he gave me was for icc.
For ifort, it should be 'ifort (IFORT) 19.1.1.217 20200306'.
If there is a bug related to this, it will be in the run-time library and not the compiler. Sadly, the 2021 oneAPI installers do not update the run-time library (at least on Windows - not sure how it works on Linux.) You may need to install the separate "standalone" run-time installer from https://software.intel.com/content/www/us/en/develop/articles/oneapi-standalone-components.html
Hi Steve,
Do you think the errors are caused by that bug?
So I should find out which run-time version I am using and then install another one from the website?
The run-time options listed for Linux are APT, YUM and DNF, and Intel oneAPI Runtime Libraries. Am I correct?
Thank you very much.
Your code has some bugs, I think.
There are some local variables that are used before they have been set; there are a couple of instances of DO loops whose index variable is REAL.
It is likely that the presence of undefined variables may cause the program to abort in an unpredictable way. Finding and fixing these bugs is not going to be easy since your program is quite complex and a single run may take many hours, and creates, reads and writes thousands of files.
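As an illustration of the two kinds of problem mentioned above (not code taken from the posted sources), a loop that steps a REAL index can usually be rewritten with an INTEGER counter, and every local variable can be given a value before its first use. The names `loop_index` and `sum_grid` below are hypothetical.

```fortran
module loop_index
  implicit none
contains
  ! Sums x = x0, x0+dx, ..., x0+(n-1)*dx using an INTEGER counter
  ! instead of a REAL loop index.
  function sum_grid(x0, dx, n) result(s)
    double precision, intent(in) :: x0, dx
    integer, intent(in) :: n          ! number of grid points, fixed up front
    double precision :: s, x
    integer :: i
    s = 0.0d0                         ! defined before first use
    do i = 0, n - 1
      x = x0 + dble(i) * dx           ! recomputed from i: no accumulated rounding
      s = s + x
    end do
  end function sum_grid
end module loop_index
```

Compiler diagnostics may also help locate uses of undefined variables, for example `-check all -warn all` with ifort, or `-Wall -fcheck=all -finit-real=snan -ffpe-trap=invalid` with gfortran. Such checks do not catch every case and slow the run down, so they are best tried on a small test problem.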
Thank you for your suggestions, I will check my code again.
I do not know if a bug in the run-time library is responsible for the behavior you see. I was just saying that the compiler itself is not involved.
I may be wrong, but it appears that a 2021 version of Hydrus is available. Your version is dated 2009, which means that many Fortran errors that compilers did not necessarily catch in 2009 are now being caught. There are also 12 years of bug fixes.
If the two Hydrus versions come from the same base code, then when you publish any results, someone like me reviewing the paper will ask why you did not use the latest one.
Of course your Hydrus may be completely different, but it appears to be very similar.
Hi John,
This is the latest version I can obtain. I think the developer is not offering the source code of the 2021 version of HYDRUS. Do you know if I can download the latest 2021 version source code? Thank you.
Siyuliben: In addition to the bugs that I mentioned above, the source code that you posted may have another problem, which it shares with the other codes such as SWMS_2D, etc., written by the same group of authors.
This bug pertains to insufficient accuracy in the calculation of 1/tanh(x) - 1/x for small values of x when the FPU does not promote intermediate results to ten-byte reals, as the X87 did. This inaccuracy may affect subroutine Pecour in file solute.f90.
For details of this bug and a solution, see my post in the PC-Progress user forum.
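For readers without access to that post, a common way to avoid the cancellation is to switch to the Taylor series of 1/tanh(x) - 1/x, which is x/3 - x**3/45 + 2*x**5/945 - ..., when x is small. The sketch below shows that general technique and is not necessarily the fix described in the PC-Progress forum. The names `langevin_mod` and `coth_minus_inv` and the switch-over value `xsmall` are assumptions.

```fortran
module langevin_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Evaluates 1/tanh(x) - 1/x without catastrophic cancellation near x = 0.
  elemental function coth_minus_inv(x) result(f)
    real(dp), intent(in) :: x
    real(dp) :: f, x2
    real(dp), parameter :: xsmall = 1.0e-2_dp   ! assumed switch-over point
    if (abs(x) < xsmall) then
      x2 = x * x
      ! Taylor series: x/3 - x**3/45 + 2*x**5/945
      f = x * (1.0_dp/3.0_dp - x2 * (1.0_dp/45.0_dp - x2 * (2.0_dp/945.0_dp)))
    else
      ! Away from zero the direct formula is accurate enough.
      f = 1.0_dp / tanh(x) - 1.0_dp / x
    end if
  end function coth_minus_inv
end module langevin_mod
```

The series branch also returns exactly zero at x = 0, where the direct formula would divide by zero.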
Hi Mecej4,
I am not sure whether I should ask this question here, but I saw that you also used gfortran to compile the source code.
I tried compiling the program with gfortran and '-fopenmp' to see whether gfortran would work. But when I run the program, it does not create any threads.
The number of threads was set to 64 using 'set omp_num_threads = 64'.
I used 'gfortran -o program.exe *.f *.f90 -fbacktrace -fdollar-ok -fopenmp' to compile the program.
Thank you.
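A quick way to check whether an executable was really built with OpenMP, and how many threads the run-time library intends to use, is to query it from the program itself. This is a small sketch with hypothetical names (`thread_report`, `max_threads`). It relies on the `!$` conditional-compilation sentinel, so the marked lines are compiled only when OpenMP is enabled.

```fortran
module thread_report
  !$ use omp_lib
  implicit none
contains
  ! Returns the number of threads a parallel region would use,
  ! or 0 when the code was compiled without OpenMP support.
  integer function max_threads()
    max_threads = 0
    !$ max_threads = omp_get_max_threads()
  end function max_threads
end module thread_report
```

Printing `max_threads()` at program start shows 0 if OpenMP was not enabled at compile time. Note also that the OpenMP environment variable is spelled `OMP_NUM_THREADS` in upper case, that environment variable names are case-sensitive on Linux, and that it has to be set as an environment variable rather than a shell-local variable, for example with `export OMP_NUM_THREADS=64` in bash or `setenv OMP_NUM_THREADS 64` in csh.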
I recommend that you work first on getting rid of the bugs in your source code, using whichever compiler/OS is most effective in catching and fixing bugs. Only after doing that does it make sense to consider using OpenMP and other ways of parallelizing the program. I also suspect that your modifications to Hydrus create far more temporary files and do needless I/O to those files.
If you were to describe how you modified the original Hydrus sources and what your objectives are, it would be possible to give more constructive comments.
Thank you, I will fix the bug you mentioned above.
The program did not show errors when run without parallelization, so I thought the source code was OK.
I will follow your instructions.
I did not modify much of the original source code. I only changed the I/O units so that the program works in parallel, and I took some variables out of the HYDRUS subroutine and used them to create new input files.