I tried using OpenMP to parallelize the matrix-loading process. Many elements contribute to the matrix, so I used omp parallel for to load the elements' contributions to the Jacobian in parallel. However, I ran into a very strange situation. When the thread count is greater than 1, the matrix seems fine. But when I run with a single thread, either by setting the thread-count environment variable to 1 or by making the if condition false, the matrix ends up with wrong values in some places for some cases.

The pseudo code is as below:

[cpp]
Element** elmAry = (Element**)calloc(size, sizeof(Element*));
// elmAry is set up somewhere.
int idx;
#pragma omp parallel for private(...) firstprivate(...) if (...)
for (idx = 0; idx < size; ++idx) {
    Element* elm = elmAry[idx];
    ... load elm's contribution to matrix ...
}
...
[/cpp]

I only added the parallel for syntax to parallelize the original program. However, when the thread count is 1, the matrix gets wrong values in some places. I tried many cases, and only a few with a large matrix size show the problem, so it is hard to track down with gdb. And of course, if the parallel for is removed, the matrix is correct. The problem occurs with icc 10.1.011. With icc 11.1, the problem disappeared.
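For readers who want something compilable to experiment with, the pattern in the pseudo code can be sketched roughly as follows. This is only an illustration, not the poster's code: the Element fields, the dense row-major matrix, the load_matrix name, and the run_parallel flag are all hypothetical, and the atomic update is an assumption about how entries shared by several elements might be protected. Note that when the if clause evaluates to false, the parallel region is still executed, just by a team of one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element: adds `value` to matrix entry (row, col).
struct Element {
    int row;
    int col;
    double value;
};

// Accumulate every element's contribution into a dense n x n matrix
// stored row-major in `matrix`.
void load_matrix(const std::vector<Element>& elems, int n,
                 std::vector<double>& matrix, bool run_parallel) {
    matrix.assign(static_cast<std::size_t>(n) * n, 0.0);
    double* m = matrix.data();
    const int size = static_cast<int>(elems.size());
    int idx;
    // if(run_parallel): when false, the region still runs, with a team of one thread.
    #pragma omp parallel for if(run_parallel)
    for (idx = 0; idx < size; ++idx) {
        const Element& elm = elems[idx];
        assert(elm.row >= 0 && elm.row < n && elm.col >= 0 && elm.col < n);
        // Assumption: several elements may hit the same entry, so the update is atomic.
        #pragma omp atomic
        m[elm.row * n + elm.col] += elm.value;
    }
}
```

Calling load_matrix once with run_parallel set to true and once with false on the same input, then comparing the two matrices entry by entry, is one way to narrow down where a single-thread run diverges.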

Can anyone help explain what is happening, or what could affect the program's results in single-thread mode when the parallel construct is present? Thanks a lot...

Best Regards

YJ

3 Replies

If you can help us with a testcase then we can review the issue.

Thanks for the reply. I discussed this with my colleagues, who have run into this kind of problem before. Our conclusion is that it is probably an optimization issue. Our product is sensitive to the matrix values, and slightly modified code can produce a differently optimized binary. Small differences in individual values can then add up to completely different results. (In my case the modified code is not even executed, so it is the compiled binary that makes the difference.)
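One mechanism that would be consistent with this explanation, though nothing in the thread confirms it is the cause here, is that floating-point addition is not associative: if a differently optimized binary adds the same contributions in a different order, the rounded result can change. A minimal sketch, with hypothetical function names and sample values chosen to exaggerate the effect:

```cpp
#include <cstddef>
#include <vector>

// Sum the values front to back.
double sum_forward(const std::vector<double>& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        s += v[i];
    }
    return s;
}

// Sum the same values back to front.
double sum_backward(const std::vector<double>& v) {
    double s = 0.0;
    for (std::size_t i = v.size(); i-- > 0; ) {
        s += v[i];
    }
    return s;
}
```

For the input {1e20, -1e20, 1.0}, sum_forward returns 1.0 while sum_backward returns 0.0, because the 1.0 is rounded away when it is added to -1e20 first. Real matrix entries would differ far less dramatically, but a solver that is sensitive to those entries can amplify even small differences.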

I tried moving code up and down and rewriting it in other ways, trying to get a different binary with the same meaning. Anyway, the modified binary now works for my case. I know this is a crude way to work around the problem, but I have no other ideas at the moment, so any comments are welcome. Thanks...

Best Regards

YJ

P.S. I can't provide the code or the test case that shows this problem, since they are owned by the company. The problem also occurs only in large cases, not in small ones, so I have not come up with a small test case yet.

(or lack of copyin, etc...)

You may have a bug in your serial (1 thread parallel) code that was hidden until now.
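To illustrate the kind of hidden bug this reply hints at: under the OpenMP rules, a variable listed in private gets a fresh copy inside the region whose initial value is unspecified, and that holds even when only one thread runs the region. If the loop body reads a value that was set before the loop, firstprivate (or copyin for threadprivate data) is what carries it in. A minimal sketch with hypothetical names (load_contributions, scale), not taken from the poster's code:

```cpp
#include <vector>

// Hypothetical stand-in for the element loop: each "element" adds
// scale * value to a running total.
double load_contributions(const std::vector<double>& values, double scale) {
    double total = 0.0;
    const int n = static_cast<int>(values.size());
    int idx;
    // firstprivate(scale): each thread's copy starts with the value set before the region.
    // With private(scale) the copy would start with an unspecified value instead.
    #pragma omp parallel for firstprivate(scale) reduction(+:total)
    for (idx = 0; idx < n; ++idx) {
        total += scale * values[idx];
    }
    return total;
}
```

With values {1.0, 2.0, 3.0} and scale 2.0 this returns 12.0 for any thread count. Replacing firstprivate with private would make the read of scale inside the loop unreliable, and whether the stale value happens to look right can vary between compiler versions.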

Jim Dempsey
