As the title says. The detailed code is as follows:
double precision ff(1:Nz+1,1:Nr)
integer ir, ipar(128),tt_type
double precision dpar(3*nz/2+1)
type(dfti_descriptor), pointer :: handle
!init the variables
!parallel loop call TT transform routines
!$omp parallel do private(i,handle,ipar,dpar,ir)
if (ir.ne.0) then
write(*,*) "TT transform error number:",ir
CALL FREE_TRIG_TRANSFORM(handle,ipar,ir)
When I compile with the -openmp option and run the program, a DFT interface error occurs: the program prints "TT transform error number: -1000". If I compile without the -openmp option, it runs fine.
If I put the commit routine inside the parallel do region, it also reports an error; the error number is -100.
The omp_threads number ipar(9) is not altered.
I have tried declaring ff as a private variable in the OpenMP clause, but the same problem occurred.
Any help would be greatly appreciated.
I followed your suggestions: I changed handle, ipar, dpar and ir to shared variables and set the maximum number of threads, and the DFT interface error no longer appears. But something is still incorrect: the transform results for some specific columns of ff are wrong.
For example, suppose the total number of columns of ff is Nr=128 and the maximum number of threads, ipar(9), is set to 4. I run the program with the command "env OMP_NUM_THREADS=4 ./a.out". The transform results are incorrect for i=1, 33, 65, 97, and randomly for i=2, i=34 or other columns. It seems the problem is with the first transform in each thread.
Should I call commit for each thread? If so, how? Do I need to declare handle, ipar, dpar and ir as private?
It's a simple test program I just wrote; it calls the TT staggered cosine routines in parallel. You can compile it by linking against the MKL library.
My environment is Intel Fortran Compiler 10.1 and CMKL 9.0; the CPU is (Xeon E5450/5430)*2, and the job manager is CHESS on Linux.
For the test program, I found that the error is not the same as the one I saw before. For example, with ipar(9) set to 8, if I submit my job with 1, 2 or 3 threads, it runs fine and the results are correct. But if 4 or more threads are used, the results are not correct. Yet I should be able to use up to 8 threads on my CPUs.
Can you give a correct example of calling the TT routines in parallel?
I have checked it. You are correct! In Fortran the first element of an array has index 1 by default, while in C it is zero.
So thank you for your persistence and your helpful suggestions.
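For anyone who hits the same thing, here is a minimal sketch of the off-by-one in plain Fortran, with no MKL calls. It assumes the documentation lists the thread-count parameter by its C-style index 9, as discussed above; the names c_index and ipar_c are made up for this illustration and are not part of any library.

```fortran
program index_offset
  implicit none
  ! Hypothetical: the C-style (zero-based) index given in the documentation.
  integer, parameter :: c_index = 9
  integer :: ipar(128)       ! default Fortran bounds: 1..128
  integer :: ipar_c(0:127)   ! explicit lower bound 0, mimics C indexing

  ipar   = 0
  ipar_c = 0

  ! With default bounds, C element [9] is Fortran element (10).
  ipar(c_index + 1) = 4
  ! With a zero-based declaration, the C index can be used unchanged.
  ipar_c(c_index) = 4

  ! Show the lower bounds of the two declarations.
  print *, 'lower bounds:', lbound(ipar, 1), lbound(ipar_c, 1)
  ! Compare the arrays element by element, position by position.
  print *, 'same contents:', all(ipar == ipar_c)
end program index_offset
```

So with the usual declaration ipar(128), setting ipar(9) in Fortran writes to the element that a C-indexed description would call ipar[8]. If the docs do use C indices, the thread count would go in ipar(10) instead, which is worth checking against the manual for your MKL version.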