Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Bug for firstprivate variables in omp tasks

Christof_Soeger
Beginner
741 Views
I think firstprivate variables are not correctly handled when creating an omp task.

I attached an example where a std:vector is used as a firstprivate variable.

At the beginning the vector is filled with
0 1 0 2
then a thread is started

after that the vector is changed to
0 0 -1 2
and another thread starts

both threads do some dummy computation and then print the vector to cout, here is the result:

original hyp1:
0 1 0 2
original hyp2:
0 0 -1 2
in task1 hyp:
0 0 -1 2
in task2 hyp:
0 0 -1 2

As you can see both task have the data of the changed vector. This happens also for other data types, but apperently not for simple typs like int.

I'm using Intel Composer XE for Windows in MS Visual Studio

Am I right that this is a bug?
0 Kudos
10 Replies
Om_S_Intel
Employee
741 Views
You are forcing that the code is to run using single thread. If I remove the "#pragma omp single", the initialization is done outside the firstprivate context.

Nothing wrong with compiler.
0 Kudos
Andreas_Klaedtke
Beginner
741 Views
I am not sure if this is the whole story. Although, I am confused by the tasks being declared inside the single pragma as well, it seems to work. So, there is not only 1 task being executed.

I am not sure if I fully understand this at the moment, but tests show that N threads are started with the beginning of the parallel section. Then it goes into the single mode and executes everything in there, including the setup of the two tasks.
After the single section has ended, the tasks seem to come to life and do their thing which seems to fail. I get an exception being thrown and sometimes a segmentation fault.

This is with the version 11.1 20100414 of icpc on linux 64 bit.


But the firstprivate should properly construct C++ objects (through calling the copy constructor). This also works in "normal" circumstances (e.g. parallel for).
0 Kudos
Andreas_Klaedtke
Beginner
741 Views
Christof,

I think this is an interesting problem you found.

Have a look at the following code snippet:
[cpp]int main(int argc, char* argv[])
{	
  vector hyp;
  hyp = vector(4,0);
  hyp[1]=1; hyp[3]=2;

  cout << report_thread() << hyp << endl;

#pragma omp parallel firstprivate(hyp)
  {
    cout << report_thread() << hyp << endl;

    if (omp_get_thread_num() == 0) {
      hyp[0] = 9;
    }

#pragma omp for firstprivate(hyp)
    for (int i = 0; i < 10; ++i) {
      cout << report_thread() << hyp << 't' << i << endl;
    }
      
  }

  return 0;
}[/cpp]


This code produces the following compiler error:
[bash]t2.cpp(39): error: variable "hyp" in firstprivate or lastprivate clause of an OpenMP pragma must be shared in the enclosing context
  #pragma omp for firstprivate(hyp)
  ^

compilation aborted for t2.cpp (code 2)[/bash]
And this seems to be sensible, because it would not be well defined which version of hyp to use to initialise the "for local" hyp variable. Only shared, "global" variables seem to be sensible.

If you remove the firstprivate statement from the parallel section in line 9, it will compile and run.

In your case, the compiler does not report an error, but compiles something that is, at least to me, not well defined.
No idea which hyp object is used to copy construct the task private hyp objects.

I hope this might shed some light on the problem.
0 Kudos
Andreas_Klaedtke
Beginner
741 Views
I should have tried this in the first place as well:
There is another issue, I do not fully understand. If you replace the pragma omp for in line 17 in the posting above with an #pragma omp task firstprivate(hyp), it will compile but abort with a segmentation fault if you execute it.

This is independent of defining the hyp variable local in the parallel section in the first place.

If the task is started without the firstprivate(hyp), it works if hyp is global in the first place, but segfaults if it was declared firstprivate in the parallel section.

Strange...

0 Kudos
Christof_Soeger
Beginner
741 Views
To make a parallel / single region to create tasks it the standard way of using tasks. The example posted is just a demo for the problem I had with my program. There I want to go over a list of vectors and do the work for each vektor parallel, and since only a single thread should go over the list I need the single pragma. Without the single each thread in the team executes the complete code, so on my maschine with 4 cores the I get the same incorrect output, just 4 times

Compare also with Example A.13.3c form the OpenMP 3.0 specifications.
typedef struct node node; struct node { int data; node * next; }; void process(node * p) { /* do work here */ } void increment_list_items(node * head) { #pragma omp parallel { #pragma omp single { node * p = head; while (p) { #pragma omp task // p is firstprivate by default process(p); p = p->next; } } }
Here is how it works: the parallel starts a team of threads, the single ensures that only one thread executs the while loop (we don't want to iterate over the list with each thread) the task pragma then creates a new task which can be assigned to ANY thread in the team of the enclosing parallel region. I need something very similay but I reduced the test example to the critical point.
0 Kudos
Christof_Soeger
Beginner
741 Views
"I am not sure if I fully understand this at the moment, but tests show that N threads are started with the beginning of the parallel section. Then it goes into the single mode and executes everything in there, including the setup of the two tasks.
After the single section has ended, the tasks seem to come to life and do their thing which seems to fail."

That is how it should work, well the execution of the tasks could start instantly by annother thread of the team, but that doesn't matter here.


"
But the firstprivate should properly construct C++ objects (through calling the copy constructor). This also works in "normal" circumstances (e.g. parallel for)."

Exactly, it should be the same for tasks.
Also compiled with g++ on linux everything works fine (forgot to mention it earlier)



"In your case, the compiler does not report an error, but compiles something that is, at least to me, not well defined.
No idea which hyp object is used to copy construct the task private hyp objects."

In a parallel region shared is default, so in my example there is only one hyp in the region. I also tested to set shared(hyp) explicitly, but it didn't change anything.
0 Kudos
Andreas_Klaedtke
Beginner
741 Views
I have unfortunately never used the task feature of OMP3 before, so this is just a lack of knowledge on my side. I agree that the code is perfectly fine. And I can also verify that gcc does the right thing.

There is only one point in the OMP3 which might be important here:
Page 61 of the OpenMP API document states in line 15-18 that the programmer has the responsibility to "ensure that storage does not reach the end of its lifetime before the explicit task region completes its execution.".
So it might be possible that a barrier or other means of synchronisation is necessary in your example.

I have added some diagnostic output to the examples to view the pointers of the objects and it seems to me that the Intel compiler does not copy construct the projects properly.

0 Kudos
Christof_Soeger
Beginner
741 Views
The mentioned API note is about _shared_ storage, not private data (but it is really importent, you can easily make mistakes there)

I have also done more tests and I also come to the conclusion the data is not properly copyed using the copy constructor.
(Changing the values of hyp in one task changes the values in the other task)


EDIT:
I submitted a problem report. Thanks for looking into it Andreas!
0 Kudos
Andreas_Klaedtke
Beginner
741 Views
Well, I learnt something today. So thank you for this.
0 Kudos
Christof_Soeger
Beginner
741 Views

Just as a follow up, the bug was confirmed and resolved by the intel team after i have reported it to the support.

0 Kudos
Reply