Dear All,
I have a parallelized Fortran code that behaves unusually when compiled with the latest Intel compiler with OpenMP enabled. The code compiles, but when I run it, it is extremely slow and the simulation crashes at the first timestep. Something is definitely wrong.
However, the same code has been thoroughly tested on different platforms, so I wonder whether some special configuration is needed when using the latest Intel Fortran compiler.
Below is my test history:
- Linux workstation-1, Intel 2014: works
- Linux workstation-2, GFortran 5: works
- Windows workstation-3, Intel 2013: works
- Linux cluster-4, GFortran 7: works
- Linux cluster-4, Intel 2017/2018: fails
The makefile looks like this. For Intel 2017/2018, I changed -fopenmp to -qopenmp.
FC = ifort
FFLAGS = -fopenmp -O3
FPPFLAGS = -DLINUX -DRELEASE -DOPENMP
This is the first time I have switched to the Intel 2018 compiler, and I wonder whether any special configuration is needed.
Thanks,
Danyang
Without the code it is really hard to say anything. What do you mean by "the simulation crashes"? Could it be running out of memory?
Among the possibilities: your application violates the OpenMP rules in some way that earlier compilers did not expose, or there is actually a new compiler bug. If you can't figure it out, you could submit a case to the Online Service Center.
I already mentioned a case (using both reduction and lastprivate clauses in one directive) about which I'm uncertain; it first failed with the new compiler.
Juergen R. wrote:
Without the code it is really hard to say anything. What do you mean by "the simulation crashes"? Could it be running out of memory?
The code runs, but it is much slower than the sequential version and it does not converge. I suspect the OpenMP-related parts are not being compiled as expected.
Tim P. wrote:
Among the possibilities: your application violates the OpenMP rules in some way that earlier compilers did not expose, or there is actually a new compiler bug. If you can't figure it out, you could submit a case to the Online Service Center.
I already mentioned a case (using both reduction and lastprivate clauses in one directive) about which I'm uncertain; it first failed with the new compiler.
Thanks, Tim. There is no directive that uses reduction and lastprivate at the same time. Actually, there is only one lastprivate in the code, and that part is not used in my test case. I will run more tests and then submit a case to the service center if the problem persists.
Tim P. wrote:
Among the possibilities: your application violates the OpenMP rules in some way that earlier compilers did not expose, or there is actually a new compiler bug. If you can't figure it out, you could submit a case to the Online Service Center.
I already mentioned a case (using both reduction and lastprivate clauses in one directive) about which I'm uncertain; it first failed with the new compiler.
Today I tested the latest 2019 version. The code compiles and runs without problems in release mode, but there is still a convergence problem in debug mode. It looks as though the OpenMP-related code is not parsed correctly when I include certain functions, even though those functions are never called at runtime. I have reported this problem to the service center with source code and an example.
Thanks
>> but still cause convergence problem in debug mode.
While this can be explained by a compiler bug, it is more often explained by poorly written convergence code. By that I mean writing the convergence test with literal constants (e.g. a hard-coded epsilon) rather than determining at runtime what the epsilon should be. That runtime determination can vary with floating-point optimization levels.
It is possible that, for all these years, the convergence code worked by accident rather than by design.
Usually, convergence issues are the obverse of what you are experiencing: convergence works in Debug but not in Release. Your experience is peculiar. The support center may be able to determine the underlying cause.
Jim Dempsey
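A minimal sketch of the contrast Jim describes (the variable names x_old, x_new and the safety factor of 100 are illustrative, not from the original code): scale the machine epsilon by the magnitude of the iterates instead of hard-coding a literal tolerance, so the test adapts to the precision actually available at that magnitude.

```fortran
program convergence_demo
  implicit none
  real(8) :: x_old, x_new, tol

  x_old = 1.0d6
  x_new = x_old * (1.0d0 + 1.0d-14)   ! difference of about 1.0d-8

  ! Fragile: a literal tolerance like 1.0d-10 never fires at this magnitude,
  ! and whether it "works" can depend on the floating-point optimization level.

  ! More robust: scale the machine epsilon by the magnitude of the iterates.
  tol = 100.0d0 * epsilon(x_old) * max(abs(x_old), abs(x_new), 1.0d0)
  if (abs(x_new - x_old) < tol) then
    print *, 'converged, tol =', tol
  else
    print *, 'not converged, tol =', tol
  end if
end program convergence_demo
```

Here tol works out to roughly 2.2d-8, so the test passes, whereas a literal 1.0d-10 would spin forever at this magnitude.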
jimdempseyatthecove wrote:
>> but still cause convergence problem in debug mode.
While this can be explained by a compiler bug, it is more often explained by poorly written convergence code. By that I mean writing the convergence test with literal constants (e.g. a hard-coded epsilon) rather than determining at runtime what the epsilon should be. That runtime determination can vary with floating-point optimization levels.
It is possible that, for all these years, the convergence code worked by accident rather than by design.
Usually, convergence issues are the obverse of what you are experiencing: convergence works in Debug but not in Release. Your experience is peculiar. The support center may be able to determine the underlying cause.
Jim Dempsey
Hi Jim,
I'm afraid this convergence problem is caused by incorrect code parsing. Take the following code section as an example.
=========Code section===========
#ifdef OPENMP
!$omp do schedule(static, chunk)
#endif
      do ivol = 1, nngl  ! loop over control volumes
#ifdef OPENMP
        tid = omp_get_thread_num() + 1
#else
        tid = 1
#endif
        ...
        ! test to check whether the OpenMP schedule is correct; output goes to temporary file 1000+tid
        write(1000+tid,*) "tid", tid, "ivol", ivol
#ifdef USG
        if (discretization_type > 0) then
          grad_locs(ivol) = gradient_dd_green_gauss_tri(ivol)
        end if
#endif
      end do
=========End of code Section===========
If the code is compiled with the "USG" part enabled, it still crashes because of a race condition, even though the simulation case does not use that part (discretization_type == 0). The race condition arises because the loop "do ivol = 1,nngl" is not scheduled as expected under Intel XE 2017/2018.
For example, if nngl is 100 and chunk is 25, then when this part runs with 4 threads, the correct per-thread output from GFortran, XE 2013 and XE 2019 is
tid 1 ivol 1
tid 1 ivol 2
...
tid 1 ivol 25
tid 2 ivol 26
tid 2 ivol 27
...
tid 2 ivol 50
tid 3 ivol 51
tid 3 ivol 52
...
tid 3 ivol 75
tid 4 ivol 76
tid 4 ivol 77
...
tid 4 ivol 100
However, Intel XE 2017/2018 gives the following wrong results, as if each thread executed the entire loop:
tid 1 ivol 1
tid 1 ivol 2
...
tid 1 ivol 100
tid 2 ivol 1
tid 2 ivol 2
...
tid 2 ivol 100
tid 3 ivol 1
tid 3 ivol 2
...
tid 3 ivol 100
tid 4 ivol 1
tid 4 ivol 2
...
tid 4 ivol 100
Thanks
In your parallel region you write to tid, which defaults to a shared variable. You must identify the per-thread (private) variables:
!$omp do schedule(static, chunk) private(tid) ! note: the do-loop control variable is implicitly private
If you have other private variables, add them to the private clause as well.
Jim Dempsey
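A self-contained sketch of this point (the loop bound n and the owner array are illustrative, not from the original code): tid is declared private so each thread writes its own copy, while the shared owner array records which thread handled each index.

```fortran
program private_demo
  use omp_lib
  implicit none
  integer, parameter :: n = 100
  integer :: ivol, tid
  integer :: owner(n)

!$omp parallel private(tid)        ! tid is per-thread; owner is shared
!$omp do schedule(static, 25)      ! ivol is implicitly private
  do ivol = 1, n
    tid = omp_get_thread_num() + 1
    owner(ivol) = tid              ! each index is written by exactly one thread
  end do
!$omp end do
!$omp end parallel

  print *, 'iterations 1 and', n, 'were handled by threads', owner(1), owner(n)
end program private_demo
```

Without private(tid), all threads race on a single shared tid, so the value written to owner(ivol) can belong to a different thread than the one executing that iteration.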
jimdempseyatthecove wrote:
In your parallel region you write to tid, which defaults to a shared variable. You must identify the per-thread (private) variables:
!$omp do schedule(static, chunk) private(tid) ! note: the do-loop control variable is implicitly private
If you have other private variables, add them to the private clause as well.
Jim Dempsey
Sorry for the confusion. I have actually added these variables to the private clause; I just forgot to copy those lines here. So it is not because of this.
The behavior you describe in the second example is what you would see if the !$omp do were not processed.
BTW, presumably you have an enclosing !$omp parallel prior to the sample code presented.
You should be aware that lines beginning with the sentinel
!$...
do not need the #ifdef OPENMP conditional-compilation enclosures. When compiled without the OpenMP option, they are treated as comments.
My guess is that, for some reason, the compiler is effectively seeing:
!$omp parallel
...
! *** not seen *** !$omp do ...
...
Try something like:
=========Code section===========
!$omp parallel
...
!$omp do schedule(static, chunk) private(tid)
do ivol = 1, nngl  ! loop over control volumes
  tid = 1          ! in the event of non-OpenMP
!$ tid = omp_get_thread_num() + 1  ! overrides in the event of OpenMP
...
Note that with any compiler optimization enabled, the redundant tid = 1 will be removed.
Jim Dempsey
jimdempseyatthecove wrote:
The behavior you describe in the second example is what you would see if the !$omp do were not processed.
BTW, presumably you have an enclosing !$omp parallel prior to the sample code presented. You should be aware that lines beginning with the sentinel
!$...
do not need the #ifdef OPENMP conditional-compilation enclosures. When compiled without the OpenMP option, they are treated as comments.
My guess is that, for some reason, the compiler is effectively seeing:
!$omp parallel
...
! *** not seen *** !$omp do ...
...
Try something like:
=========Code section===========
!$omp parallel
...
!$omp do schedule(static, chunk) private(tid)
do ivol = 1, nngl  ! loop over control volumes
  tid = 1          ! in the event of non-OpenMP
!$ tid = omp_get_thread_num() + 1  ! overrides in the event of OpenMP
...
Note that with any compiler optimization enabled, the redundant tid = 1 will be removed.
Jim Dempsey
Hi Jim,
Thanks anyway. Unfortunately, this does not solve my problem. I will post an update once I get a solution from the support center.
Danyang