I am a new user running code on a Slurm-controlled cluster. I load the Intel 2018 products there, but there is no OpenMP module for me to load. So I downloaded it and installed it in my home folder, and now I would like to compile my code with the -qopenmp flag pointing to the copy in my home folder.
Right now I compile with the -qopenmp flag, but when the program reaches the parallel region that does a matrix multiplication, it seems to skip it and carry on (I included a print statement and nothing is printed).
How do I solve this?
Many Thanks in advance
If your program does not crash with "libiomp5.so not found", then the Intel OpenMP .so file was loaded. Therefore, you have some other issue with your program. It seems that the shell/mpiexec/xwindow/??? is not redirecting the console output to your terminal/workstation.
Try a simple program compiled with OpenMP
!$OMP END PARALLEL
END PROGRAM TEST
Once you get that working (output as expected and no crash) then you should be all set to use your application in a similar environment.
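A minimal complete test of this kind might look like the sketch below. The body is an assumption (it only prints each thread's number via omp_get_thread_num from the standard omp_lib module) and is not necessarily the program originally posted.

```fortran
! Minimal OpenMP smoke test (illustrative sketch, body assumed).
program test
    use omp_lib                      ! omp_get_thread_num, omp_get_num_threads
    implicit none

    !$omp parallel
    ! Each thread in the team executes this statement once.
    print *, 'thread', omp_get_thread_num(), 'of', omp_get_num_threads()
    !$omp end parallel
end program test
```

Compiled with ifort -qopenmp and run with OMP_NUM_THREADS=5, this should print one line per thread, numbered 0 to 4, in no fixed order. Those numbers identify threads within the OpenMP team, not the hardware threads shown by tools such as htop.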
Dear Jim Dempsey,
Thank you very much for your reply. Well, it seems to be working correctly, I guess. If I set OMP_NUM_THREADS=5, it prints 5 numbers, from 0 to 4, but these are not the numbers of the threads the job is allocated. I am running this code on a cluster with 64 threads, and from the output of "htop" I see that my code is not using the first 5 threads.
But when I try the OpenMP do loop, nothing happens, as if this part of the code were ignored. This part of my code is:
im = dcmplx(0.d0,1.d0)
dydt=dcmplx(0.d0,0.d0)
!$OMP parallel default(private) shared (ha,dydt,y)
!$OMP do
do i=1,n
    do j=1,n
        dydt(i) = dydt(i) + ha(i,j) * y(j)/im
    end do
end do
!$OMP end do
!$OMP end parallel
In this code (this part is a subroutine that just does a matrix-vector multiplication), y and dydt are complex vectors of size 67662, and ha is a real square matrix of the same size, all in double precision. I define ha in a module and use it as a global variable.
If I run this code by compiling without -qopenmp:
ifort -mkl -c -fpic -mcmodel=large global_param.f90
ifort -mkl -c -fpic -mcmodel=large dyn.f90
ifort -mkl -fpic -mcmodel=large global_param.o dyn.o -check bounds
The code takes a few seconds to do this matrix multiplication and I get nonzero values in the dydt vector.
Now, if I compile with -qopenmp:
ifort -qopenmp -mkl -c -fpic -mcmodel=large global_param.f90
ifort -qopenmp -mkl -c -fpic -mcmodel=large dyn.f90
ifort -qopenmp -mkl -fpic -mcmodel=large global_param.o dyn.o -check bounds
Then I get only 0.d0 for dydt. If I put any print statement inside the loop, nothing is printed. I tried to pause the code by including a read(*,*) inside the loop, but it is also ignored.
What should I do now?
Many thanks in advance
The variable im is private by default (and thus uninitialized). The variable im should be shared .OR. firstprivate.
Not sure why .../im produces (0.d0,0.d0) instead of junk.
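To illustrate the point about default(private), here is a small sketch (the program name and the critical section around the print are illustrative choices, not from the thread). With default(private), any variable not named in another clause gets a fresh, undefined copy in each thread; firstprivate instead initializes each copy from the value set before the region.

```fortran
! Sketch: carrying a value into a default(private) region with firstprivate.
program firstprivate_demo
    implicit none
    complex(kind(0.d0)) :: im

    im = dcmplx(0.d0, 1.d0)

    ! Without firstprivate(im) (or shared(im)), im would be undefined inside.
    !$omp parallel default(private) firstprivate(im)
    !$omp critical
    print *, 'im inside region =', im     ! every thread sees (0,1)
    !$omp end critical
    !$omp end parallel
end program firstprivate_demo
```

For a read-only value like this, shared(im) would work equally well; firstprivate simply gives each thread its own initialized copy.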
Dear Jim Dempsey,
Thanks again for your reply.
im was defined as a parameter, so the compiler complained when I listed it as shared. Then I tried with im not declared as a parameter, and whether I list it as private or shared it still doesn't work. As I said, when I inserted a line inside the parallel do loop to print something, nothing was printed.
But now, with this modification to include im as shared, I get a compilation error if I insert a write statement.
If I do:
!$OMP parallel default(private) shared (ham,dydt,y,im)
!$OMP do
write(*,*) 'a'
do i=1,n
    do j=1,n
        dydt(i) = dydt(i) + ham(i,j) * y(j)/im
    end do
end do
!$OMP end do
!$OMP end parallel
I get the error:
dyn.f90(405): error #7644: The statement or directive following this OpenMP* directive is incorrect.
!$OMP do
------^
compilation aborted for dyn.f90 (code 1)
ifort: error #10236: File not found: 'dyn.o'
Dear Jim Dempsey,
Thank you again for your help. You are right, after a !$OMP do must come a normal do statement.
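As a sketch of where such a statement can legally go: anything that should run before the loop belongs inside the parallel region but ahead of the !$OMP do directive, and the do statement must come immediately after the directive. The single construct used here is one option for printing once instead of once per thread; the array a and the loop body are hypothetical stand-ins.

```fortran
! Sketch: statement placement around an OpenMP worksharing loop.
program omp_do_placement
    implicit none
    integer, parameter :: n = 8
    integer :: i
    real(kind(0.d0)) :: a(n)

    !$omp parallel default(none) shared(a) private(i)
    ! Executed by one thread only, before the worksharing loop starts.
    !$omp single
    write(*,*) 'entering loop'
    !$omp end single

    !$omp do
    do i = 1, n                 ! the do statement directly follows !$omp do
        a(i) = 2.d0 * i
    end do
    !$omp end do
    !$omp end parallel

    write(*,*) 'sum =', sum(a)  ! the sum is 72
end program omp_do_placement
```

Because n is a named constant (parameter) here, it is not a variable and does not appear in any data-sharing clause, which matches the compiler complaint mentioned earlier about listing a parameter as shared.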
Now, I don't really understand why, but I managed to make it work by changing:
!$OMP parallel default(private) shared (ham,dydt,y,im)
!$OMP do
do i=1,n
    do j=1,n
        dydt(i) = dydt(i) + ham(i,j) * y(j)/im
    end do
end do
!$OMP end do
!$OMP end parallel
to:
!$OMP parallel do shared(dydt)
do i=1,n
    do j=1,n
        dydt(i) = dydt(i) + ham(i,j) * y(j)/im
    end do
end do
!$OMP end parallel do
Thank you very much for the support.
j may need to be private too. If you are going to use explicit shared and private clauses, please include ham, y, im, and n too.
Note that if j was registerized (or if the two loops were collapsed by optimization), the code may have worked as a side effect rather than by design.
The reason the first code in #7 failed to work properly is likely that n was private (undefined).
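Putting these suggestions together, a fully explicit version of the loop might look like the sketch below. The driver program, the small n, and the test values are assumptions for illustration; default(none) is a swapped-in choice that makes the compiler reject any variable whose sharing is not stated, so a forgotten n is caught at compile time instead of producing a loop with undefined bounds.

```fortran
! Sketch: matrix-vector product with every data-sharing attribute spelled out.
program matvec_explicit
    implicit none
    integer, parameter :: dp = kind(0.d0)
    integer :: n, i, j
    real(dp), allocatable :: ham(:,:)
    complex(dp), allocatable :: y(:), dydt(:)
    complex(dp) :: im

    n = 4                          ! small stand-in for the 67662 in the thread
    allocate(ham(n,n), y(n), dydt(n))
    ham  = 1.d0                    ! hypothetical test data
    y    = dcmplx(1.d0, 0.d0)
    im   = dcmplx(0.d0, 1.d0)
    dydt = dcmplx(0.d0, 0.d0)

    ! Inputs and the result vector are shared; loop indices are private.
    ! Each thread writes only its own rows dydt(i), so there is no race.
    !$omp parallel do default(none) shared(ham, y, dydt, im, n) private(i, j)
    do i = 1, n
        do j = 1, n
            dydt(i) = dydt(i) + ham(i,j) * y(j) / im
        end do
    end do
    !$omp end parallel do

    print *, dydt                  ! with this test data each element is -4i
end program matvec_explicit
```

Listing j as private is harmless even in cases where the OpenMP rules for Fortran already treat sequential loop indices inside a parallel construct as private, and it documents the intent.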
The compiler behaves more like Alice than Humpty Dumpty:
"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean—neither more nor less." "The question is," said Alice, "whether you can make words mean so many different things."