Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Compiler bug in version 17 taskloop directive

Ahmed_E_2
Beginner
I have a program that uses the OpenMP taskloop directive together with the collapse clause. The program works fine when I compile it with GCC 7.2, but it crashes when I compile it with Intel compiler v17.0.1. I investigated my code and reduced it to a simple program that reproduces the cause of the crash. The code is below:

int main ()
{
    double * a=(double *) malloc(100*sizeof(double));
    printf("1. a %x\n",a);
    #pragma omp taskloop collapse(3) shared(a)
    for(int i=0;i<1;i++)
    {
        for(int j=0;j<1;j++)
        {
            for (int k=0;k<1;k++)
            {
                printf("2. a %x\n",a);
                double sum=0;
                for(int l=0;l<5;l++)
                {
                    sum+=1;
                }
                printf("%f\n",sum);
            }
        }
    }
    free(a);
}

For GCC, the output is correct:

1. a 11610a0
2. a 11610a0 i 0 j 0 k 0
5.000000

For Intel v17, the output is wrong:

1. a d2e010

I compiled the same code with Intel version 18, and the output is correct and matches GCC 7.2:

1. a 202f010
2. a 202f010 i 0 j 0 k 0
5.000000

In another, more complex code, I also found that shared pointers (like the pointer "a" in the code above) are always changed to zero, and that the indices of the collapsed loops (like i, j, k in the code above) sometimes hold large negative numbers.

My question: is there any fix for these bugs in version 17?
6 Replies
Viet_H_Intel
Moderator

 

I am afraid that this problem will not be addressed in 17.0. Can you use 18.0 instead?

Thanks,

Viet

jimdempseyatthecove
Honored Contributor III

taskloop binds to the current team. In the above code, there is no current team. Try:

int main ()
{
    double * a=(double *) malloc(100*sizeof(double));
    printf("1. a %x\n",a);
    #pragma omp parallel
    {
        #pragma omp master
        {
            #pragma omp taskloop collapse(3) shared(a)
            for(int i=0;i<1;i++)
            {
                for(int j=0;j<1;j++)
                {
                    for (int k=0;k<1;k++)
                    {
                        printf("2. a %x\n",a);
                        double sum=0;
                        for(int l=0;l<5;l++)
                        {
                            sum+=1;
                        }
                        printf("%f\n",sum);
                    }
                }
            }
        }
    }
    free(a);
}

Jim Dempsey

Ahmed_E_2
Beginner

I think the taskloop directive binds its tasks to the threads that already exist in the enclosing context. In this case, that would be a single thread, the main thread. In any case, my original code had #pragma omp parallel and #pragma omp single wrapping the #pragma omp taskloop, and I have the same problem with Intel compiler version 17. I tested the solution proposed by jimdempseyatthecove on the simplified form of my code, shown below, and it did not work.

#include <stdio.h>
#include <stdlib.h>
int main ()
{
    double * a=(double *) malloc(100*sizeof(double));
    printf("1. a %x\n",a);
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskloop collapse(2) shared(a)
    for(int i=0;i<1;i++)
    {
        for(int j=0;j<1;j++)
        {
            for (int k=0;k<1;k++)
            {
                printf("2. a %x i %d j %d k %d \n",a, i, j, k);
                double sum=0;
                for(int l=0;l<5;l++)
                {
                    sum+=1;
                }
                printf("%f\n",sum);
            }
        }
    }
}

jimdempseyatthecove
Honored Contributor III

>> In this case, it would be one thread which is the main thread

However, the OpenMP runtime had not been initialized in the original post. I do not know whether the omp taskloop operation requires the OpenMP runtime system to be initialized first, for example the task enqueuing system and/or other structures used by taskloop.

It appears that the issue relates to collapse(2).

Are i and j correct when you reach the 3rd for?

I suspect that one or both are trashed.

Jim Dempsey

Ahmed_E_2
Beginner

Yes. With the newer version of the Intel compiler (v18), I thought the taskloop problem had gone away. In fact, I discovered that collapse was the original source of the problem: at the third level, the indices are trashed, as you (jimdempseyatthecove) expected. Moreover, I tried another simple code that uses #pragma omp parallel for with collapse(5), and I got a floating point exception although I was only using integer data types. I believe the problem is the way the compiler collapses the loops. Out of curiosity, I may investigate the generated assembly.

To conclude, the best approach is to collapse the loops manually, especially when there are more than two levels. Am I correct?

jimdempseyatthecove
Honored Contributor III

Ahmed,

For now, I would suggest using a conditional compile directive to select either the collapse loops or the manually collapsed loops. That makes it easier to locate the sections of code to update later, once the compiler bug is corrected. In other words, based on compiler vendor/version, #define BUG_COLLAPSE or something like that.

Pick a define that is self-explanatory.

Jim Dempsey
