Could somebody please explain how the Intel C++ compiler handles the parallelization of nested loops?
I would also like to know how nested loops should be written when auto-parallelizing with '-parallel'.
No matter what changes I make to try to make the loops parallelizable, the compiler emits remarks saying the loop cannot be parallelized because of FLOW, ANTI, and OUTPUT dependencies between statements.
Thanks in advance.
Regards
Kiran.
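For readers unfamiliar with the three dependence names in the remarks, here is a minimal sketch of what each one usually means for a loop. The function names are hypothetical and chosen only for illustration; this is the textbook meaning of the terms, not a description of how icpc analyses any particular loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// FLOW (read after write): iteration i reads v[i - 1], which iteration i - 1 wrote.
std::vector<int> prefix_sum(std::vector<int> v)
{
    for (std::size_t i = 1; i < v.size(); ++i)
        v[i] = v[i] + v[i - 1];
    return v;
}

// ANTI (write after read): iteration i reads v[i + 1], which iteration i + 1 overwrites.
std::vector<int> shift_left(std::vector<int> v)
{
    for (std::size_t i = 0; i + 1 < v.size(); ++i)
        v[i] = v[i + 1];
    return v;
}

// OUTPUT (write after write): every iteration writes the same variable,
// and the value written by the last iteration must be the one that survives.
int last_element(const std::vector<int>& v)
{
    assert(!v.empty());
    int last = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        last = v[i];
    return last;
}

// No loop-carried dependence: iteration i reads and writes only v[i],
// so the iterations could run in any order.
std::vector<int> doubled(std::vector<int> v)
{
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = 2 * v[i];
    return v;
}
```

Only the last loop has iterations that can safely run in any order; the first three give different results if the iteration order changes.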
Thanks for the reply, Tim.
As I mentioned in the other post, here is the sample program for your reference, with the complete compiler remarks output.
The program reverses the order of the words in a string,
e.g. "Its a nice intel compiler" (input) -> "compiler intel nice a Its" (output).
The logic is to first reverse the letters of each word in the first (nested) loop, and then reverse the complete string as a whole in the second loop.
I am compiling the code with the Intel C++ compiler (icpc) using the '-parallel' option, with the command:
icpc -parallel -par-report=3 stringrev_.cpp
Below is the code:
1 #include <cstdio>   // assumed; the header names did not survive in the post
2 #include <cstring>  // assumed; needed for strlen
3
4 char s[]="proud to be indian";
5 char temp;
6
7 int main()
8 {
9   int i=0, j=0, k=0;
10   int size = strlen(s);
11   int a[] = {0,6,9,12};   // index of the first letter of each word
12   int b[] = {4,7,10,17};  // index of the last letter of each word
13
14 /**********************THE ERROR BLOCK***************************/
15   //#pragma nounroll
16   while(i<4)
17   {
18     k = b[i];
19     j = a[i];
20     //#pragma nounroll
21     while(j<k)
22     {
23       temp = s[j];
24       s[j] = s[k];
25       s[k] = temp;
26       k--;
27       j++;
28     }
29     i++;
30   }
31 /*************************END BLOCK*****************************/
32
33   i=0;
34   j = size-1;
35 /***********************NO ERRORS SHOWN*****************/
36   while(i < j)
37   {
38     temp = s[i];
39     s[i] = s[j];
40     s[j] = temp;
41     i++;
42     j--;
43   }
44 /****************************END***************************/
45   return 0;
46 }
The compiler reports the following remarks:
procedure: main
stringrev_.cpp(16): (col. 2) remark: loop was not parallelized: existence of parallel dependence.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 25.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven OUTPUT dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 24, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 24.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven OUTPUT dependence between temp line 23, and temp line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven OUTPUT dependence between temp line 23, and temp line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 24.
stringrev_.cpp(24): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 24, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
stringrev_.cpp(23): (col. 5) remark: parallel dependence: proven ANTI dependence between s line 23, and s line 25.
stringrev_.cpp(25): (col. 5) remark: parallel dependence: proven FLOW dependence between s line 25, and s line 23.
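One way to read the remarks, offered as an assumption rather than a statement of what icpc does internally: the global `temp` is written by every iteration (the OUTPUT dependence on `temp`), and `s` is indexed through `j` and `k`, which are updated inside a while loop, so the compiler cannot easily see that different iterations touch different characters. A minimal sketch of the first loop nest rewritten with countable for loops and a loop-local temporary follows. The function names are hypothetical, and whether '-parallel' actually parallelizes this form is not something I have verified.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Reverse the letters of each word in place. starts[w] and ends[w] are the
// inclusive index bounds of word w, like the a[] and b[] arrays in the post.
void reverse_each_word(std::string& s,
                       const std::vector<int>& starts,
                       const std::vector<int>& ends)
{
    assert(starts.size() == ends.size());
    const int words = static_cast<int>(starts.size());

    // Countable outer loop: the trip count is known before the loop starts.
    // Each iteration touches only its own [starts[w], ends[w]] range,
    // provided the word ranges do not overlap.
    for (int w = 0; w < words; ++w) {
        const int len = ends[w] - starts[w] + 1;
        for (int n = 0; n < len / 2; ++n) {
            // tmp lives inside the loop body, so no iteration shares it.
            const char tmp = s[starts[w] + n];
            s[starts[w] + n] = s[ends[w] - n];
            s[ends[w] - n] = tmp;
        }
    }
}

// Reverse the word order: reverse each word, then reverse the whole string.
std::string reverse_word_order(std::string s,
                               const std::vector<int>& starts,
                               const std::vector<int>& ends)
{
    reverse_each_word(s, starts, ends);
    std::reverse(s.begin(), s.end());
    return s;
}
```

With the string and index arrays from the post, `reverse_each_word` gives "duorp ot eb naidni" and `reverse_word_order` gives "indian be to proud". If the compiler still declines to auto-parallelize the outer loop, marking it explicitly with an OpenMP `#pragma omp parallel for` is another option, although with only four short words the threading overhead would likely outweigh any gain.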
I've gone through the compiler optimizations document, and I can see that the compiler can apply the following optimizations to a nested loop:
- Loop Interchange
I don't think this applies here, because interchanging the loops doesn't make sense in this case.
- Unrolling
I tried placing '#pragma nounroll' before the nested loop, but the remarks are the same, so I don't think the compiler is applying unrolling.
- Cache Blocking
I have no idea whether this is applicable here.
- Loop Distribution
I don't think this is applicable here, because the outer loop has only 4 iterations.
- Loop Fusion
This shouldn't apply here.
What do you think? Could any of the above have been applied?
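To make the first item in the list concrete, here is a minimal sketch of what loop interchange means in general: the same loop nest written with the two loops swapped. The function names and the matrix example are hypothetical and unrelated to the string program. The point is that interchange only changes the order in which independent iterations are visited; it does not remove a dependence between them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original order: column index in the outer loop, row index in the inner loop.
long sum_columns_outer(const std::vector<std::vector<int>>& m)
{
    const std::size_t rows = m.size();
    const std::size_t cols = rows ? m[0].size() : 0;
    long total = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            assert(m[r].size() == cols);  // sketch assumes a rectangular matrix
            total += m[r][c];
        }
    }
    return total;
}

// After interchange: row index outside, column index inside. Each row is now
// read contiguously, which is usually friendlier to the cache.
long sum_rows_outer(const std::vector<std::vector<int>>& m)
{
    const std::size_t rows = m.size();
    const std::size_t cols = rows ? m[0].size() : 0;
    long total = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        assert(m[r].size() == cols);  // sketch assumes a rectangular matrix
        for (std::size_t c = 0; c < cols; ++c) {
            total += m[r][c];
        }
    }
    return total;
}
```

Both functions return the same sum for any rectangular matrix; only the traversal order differs.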
Your code is not doing what your comments state.
Your comments say, in effect, that it reverses the word order.
Your first while loop keeps the word order the same and swaps the letter order within each word.
Which is it?
Parallelizing a loop over the few bytes within each word (byte reversal of one word) is hardly worth parallelization.
Parallelizing the loop over words, with the byte reversal done within each word, is worth parallelization.
For a word swap (no byte reversal) you would require an extra buffer of at least the size of the largest word (+/- one byte), or a larger second buffer. Not doing so will result in overstriking words that are still in the process of being read.
a bb ccc ... zzzzzzzzzzzzzzzzzzzzzzzzzz
1) z bb ccc ... zzzzzzzzzzzzzzzzzzzzzzzzza
Notice that you whacked the last letter of the last word; by the time you complete the new first word, it will contain overstruck letters of the last word (not the prior contents of the last word).
Jim Dempsey
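The "larger second buffer" variant mentioned above can be sketched as follows: the words are copied, last to first, into a separate output string, so nothing is overwritten while it is still being read. This is only an illustration under the assumption that words are separated by single space characters; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copy the space-separated words of `in` into a second buffer in reverse order.
std::string reverse_words_with_buffer(const std::string& in)
{
    std::string out;
    out.reserve(in.size());

    std::size_t end = in.size();  // one past the last character of the current word
    while (end > 0) {
        // Find the space that precedes the current word, if there is one.
        const std::size_t space = in.rfind(' ', end - 1);
        const std::size_t begin = (space == std::string::npos) ? 0 : space + 1;
        out.append(in, begin, end - begin);
        if (space == std::string::npos)
            break;  // that was the first word of the input
        out.push_back(' ');
        end = space;
    }

    assert(out.size() == in.size());  // every character is copied exactly once
    return out;
}
```

For the example from the earlier post, "Its a nice intel compiler" becomes "compiler intel nice a Its". The input is never modified, so the overstriking problem does not arise, at the cost of a second buffer as large as the input.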