Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

optimization with xW vectorization

jgoldswo
Beginner
1,261 Views
Having problems using the vectorization switch with ifc 7.1 on a server with P4 Xeon processors running MPI. The following section of code is the problem:
1 do j=1,totpln
2 write (istt,13) j,plnlabel(j),notests(j),bmave
3 write (istt,41) (siave(i,j),i=1,notd)
4 write (istt,42) (ckave(i),i=1,notd)
5 write (istt,15) bmasd
6 write (istt,41) (siasd(i,j),i=1,notd)
7 write (istt,42) (ckasd(i),i=1,notd)
8 enddo
When compiled without the vectorization switch (-xW) the code runs perfectly; however, when compiled with the -xW switch an "Address Error" occurs. Also, when lines 4 & 7 or 3 & 6 are removed, the program runs perfectly with the -xW switch. Does ifc have any problems vectorizing this loop (if it vectorizes it at all), and what could be the possible cause of the runtime error? Any help would be fantastic.
Cheers, Jason

Message Edited by jgoldswo on 04-06-2004 05:25 AM

0 Kudos
7 Replies
gagan_saket
Beginner
1,262 Views
Hi,

I am trying some simple code like this in Fortran:

do ii= 1, 1000
temp = t(ii)
enddo

This loop cannot be vectorized. I am using Intel Fortran compiler version 8.1 on an Intel Xeon CPU at 2.66 GHz with HT.

Actually, I was even stuck on something like the following, which is not vectorized either:

do ii= 1, 1000
t(ii)=temp
enddo
This is in the middle of my code, where t can be a big array of reals.
0 Kudos
Steven_L_Intel1
Employee
1,262 Views
How would you expect to vectorize the first loop? It could be replaced simply by:

temp = t(1000)

The second loop is not vectorizable either. It's just a lot of memory loads. Vectorization comes into play when there's computation being done on array elements, not just moving data around.
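The point about the first loop can be checked with a minimal sketch. The program name and the values stored in t are made up for illustration; only the loop itself comes from the question. Every iteration overwrites temp, so after the loop only the last element's value is left:

```fortran
program last_value_demo
  implicit none
  integer, parameter :: n = 1000
  real :: t(n), temp
  integer :: ii

  ! Fill t with arbitrary, distinct values (illustrative data only).
  do ii = 1, n
     t(ii) = real(ii) * 0.5
  end do

  ! The loop from the question: each iteration overwrites temp.
  do ii = 1, n
     temp = t(ii)
  end do

  ! Only the value from the last iteration survives.
  if (temp /= t(n)) error stop "temp differs from t(n)"
  print *, "temp equals t(1000):", temp == t(n)
end program last_value_demo
```

This should build with any compiler that accepts Fortran 2008 (needed for error stop). It says nothing about what a particular compiler's optimizer does with the loop; it only shows that the loop and the single assignment leave the same value in temp.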
0 Kudos
gagan_saket
Beginner
1,262 Views
Well, I think it was my fault that I put the question incorrectly.

Actually, the proper loop is:

do ii=1,1000
temp=t(ii)
temp=temp+some calculations
cp(ii)=temp
enddo
0 Kudos
Steven_L_Intel1
Employee
1,262 Views
Context is everything. A lot depends on what the "some calculations" were. If you ask for a vectorization report, the compiler may give you more information as to why it did not vectorize. On IA-32, I think you also have to use -O3 to vectorize.
0 Kudos
Intel_C_Intel
Employee
1,262 Views

Dear Gagan,

I have to correct Steve here. First, vectorization is enabled with any of the -xK, -xN, -xW, -xB, or -xP switches; -O3 is optional for higher performance, but not required. Second, the fill loop simply vectorizes:

do ii= 1, 1000
t(ii) = temp
enddo

>ifort -xP joho.f
joho.f(6) : (col. 7) remark: LOOP WAS VECTORIZED.

The generated assembly uses 128-bit wide stores to fill the array. Finally, your example

do ii=1,1000
temp=t(ii)
temp=temp+some calculations
cp(ii)=temp
enddo

may or may not vectorize, depending on what "some calculations" are. Here the switch vec-report2 may give you more insight into potential complications. Let me know.
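As a rough illustration of the favorable case, here is a sketch in which the unspecified "some calculations" is replaced by a hypothetical expression, 2.0*temp + 1.0, chosen only for this example. Each cp(ii) then depends on t(ii) alone, and the scalar temp is reassigned at the top of every iteration, so nothing is carried from one iteration to the next. The same computation can be written in array syntax, and the program checks that both forms agree:

```fortran
program elementwise_demo
  implicit none
  integer, parameter :: n = 1000
  real :: t(n), cp(n), cp_array(n), temp
  integer :: ii

  ! Illustrative input data.
  do ii = 1, n
     t(ii) = real(ii)
  end do

  ! Shape of the loop from the thread. The expression 2.0*temp + 1.0 is a
  ! hypothetical stand-in for the unspecified "some calculations".
  do ii = 1, n
     temp = t(ii)
     temp = temp + (2.0*temp + 1.0)
     cp(ii) = temp
  end do

  ! Same computation in array syntax: each element depends only on t(ii).
  cp_array = t + (2.0*t + 1.0)

  if (any(cp /= cp_array)) error stop "loop and array forms differ"
  print *, "loop and array forms agree:", all(cp == cp_array)
end program elementwise_demo
```

Whether a given compiler actually vectorizes the loop still depends on the compiler version and switches, so the vectorization report remains the thing to check.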

As for Jason's posting, I only just noticed it now. A loop with write statements cannot be vectorized, so any bug you encounter must be elsewhere (the xW switch enables a lot more than just vectorization). If you still have the problem, please give some more context, or file the bug with Premier Support.

Aart Bik
http://www.aartbik.com/

0 Kudos
TimP
Honored Contributor III
1,262 Views
I think Aart meant -vec_report3 to get the diagnostics about why loops don't vectorize. -vec_report2 is on by default, and reports only those which do vectorize.

If you see bugs with -xW in an old compiler, it would be well worthwhile to try a compiler update posted within the last month.
0 Kudos
Intel_C_Intel
Employee
1,262 Views

No, I really meant vec-report2. Folks, please get your facts straight before advising our customers!

The setting vec-report1 is on by default and reports successful vectorization. The setting vec-report2, in addition, reports reasons for failure. The switch vec-report3 is only useful if you want more information on prohibiting data dependences.

So:

do ii= 1, 1000
t(ii) = t(ii-1) * 2
enddo

=> ifort -xP -vec-report2 joho.f
joho.f(5) : (col. 7) remark: loop was not vectorized: existence of vector dependence.

=> ifort -xP -vec-report3 joho.f
joho.f(5) : (col. 7) remark: vector dependence: proven FLOW dependence between T line 6, and T line 6.
joho.f(5) : (col. 7) remark: loop was not vectorized: existence of vector dependence.

If you are interested in a silent compilation, please use vec-report0. All this, and more, can be found in:

http://www.intel.com/cd/ids/developer/asmo-na/eng/65774.htm
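
To see why the flow dependence in t(ii) = t(ii-1) * 2 matters, here is a minimal sketch with illustrative names and n shrunk from 1000 to 8. It runs the loop sequentially and then again as if every iteration had read the old contents of t at once, which is a simplified model of what a naive vectorized version would do. The array is declared with lower bound 0 because the loop reads t(ii-1) when ii = 1.

```fortran
program flow_dependence_demo
  implicit none
  integer, parameter :: n = 8
  real :: t_seq(0:n), t_old(0:n), snapshot(0:n)
  integer :: ii

  ! Same starting contents for both variants (illustrative data).
  t_seq = 1.0
  t_old = 1.0

  ! Sequential semantics, as in the loop from the thread:
  ! iteration ii reads the value written by iteration ii-1.
  do ii = 1, n
     t_seq(ii) = t_seq(ii-1) * 2
  end do

  ! Simplified model of ignoring the dependence:
  ! every iteration reads the values t held before the loop started.
  snapshot = t_old
  do ii = 1, n
     t_old(ii) = snapshot(ii-1) * 2
  end do

  print *, "sequential:      ", t_seq
  print *, "from old values: ", t_old

  ! The two variants must disagree, which is why the loop is rejected.
  if (all(t_seq == t_old)) error stop "expected the results to differ"
end program flow_dependence_demo
```

The sequential version keeps doubling, while the second version only ever doubles the initial value, so the two arrays differ from index 2 onward.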

0 Kudos