Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Unexpected return value of omp_get_thread_num() with tasks

saineyl
Beginner

Hi,

I'm using icc version 13.0.0 (gcc version 4.1.2 compatibility) on Linux.

I've created a simple example of the code I'm trying to use.

Compiled with gcc -fopenmp, this code seems to run fine, but compiled with icc -openmp the program hangs.

With icc, the thread numbers reported inside the for loop do not match the ones reported inside the task_func() function.

An example of the output:

For loop: number of threads: 4
For loop: New thread created, id = 0
Enter task_func: omp_thread_num = 0
For loop: New thread created, id = 1
Enter task_func: omp_thread_num = 1
For loop: New thread created, id = 3
Enter task_func: omp_thread_num = 1
For loop: New thread created, id = 2
Enter task_func: omp_thread_num = 3

(I get thread id 1 twice in task_func(), and thread id 2 never shows up there.)

Any idea what the problem is?

Thanks,

The code:

#include <cstdlib>   // atoi, getenv
#include <iostream>

#include "omp.h"

void task_func() {
    #pragma omp critical
    std::cout << "Enter task_func: omp_thread_num = " << omp_get_thread_num() << "\n";

    #pragma omp barrier
}

int main () {

    int threads_cnt = atoi(getenv("OMP_NUM_THREADS"));
    int task_i;

    std::cout << "For loop: number of threads: " << threads_cnt << "\n";

    #pragma omp parallel for
    for (task_i = 0 ; task_i < threads_cnt ; task_i++) {

          #pragma omp critical
          {
          std::cout << "For loop: New thread created, id = " << omp_get_thread_num() << "\n";
          }

          #pragma omp task
          task_func();
    }
}

Feilong_H_Intel
Employee

Hi saineyl,

From the icc documentation: "When a thread encounters a task construct, a task is generated from the code for the associated structured block. The encountering thread may immediately execute the task, or defer its execution. In the latter case, any thread in the team may be assigned the task." So it is not guaranteed that each thread gets exactly one task to work on. Some threads may execute one or more tasks, while others may execute none.

I added a "sleep(1);" before #pragma omp task. Sometimes I got the "perfect" result, in which every thread gets a task. In most cases, I got something similar to this:

$ icpc -fopenmp t.cpp && ./a.out
For loop: number of threads: 8
For loop: New thread created, id = 0
For loop: New thread created, id = 1
For loop: New thread created, id = 2
For loop: New thread created, id = 3
For loop: New thread created, id = 4
For loop: New thread created, id = 5
For loop: New thread created, id = 6
For loop: New thread created, id = 7
Enter task_func: omp_thread_num = 0
Enter task_func: omp_thread_num = 1
Enter task_func: omp_thread_num = 2
Enter task_func: omp_thread_num = 3
Enter task_func: omp_thread_num = 4
Enter task_func: omp_thread_num = 2
Enter task_func: omp_thread_num = 4
Enter task_func: omp_thread_num = 0

$

 

This is perfectly valid behavior. If you make your tasks long enough, like this:

    #pragma omp critical
    {
        std::cout << "Enter task_func: omp_thread_num = " << omp_get_thread_num() << "\n";
        sleep(3);
    }

 

you should see the "perfect" task assignment you expected, although the task model still does not guarantee it.

Another thing to note is the barrier. From the documentation: "Each thread that encounters this pragma must wait until all threads in the team have arrived. After the last thread of the team arrives, all threads are released and may continue execution of the enclosing parallel region."

Because some threads might get no task at all, they never enter task_func() and never reach the #pragma omp barrier inside it. The threads that do reach that barrier then wait for the others indefinitely, which is why your program can hang.

Thanks.
