Intra vectorization and Intel Linux FORTRAN 90 compiler

scottrwth — Tue, 13 Sep 2005 15:39:40 GMT

Dear Sirs,

My apologies in advance if this question has already been asked.
If I am not mistaken, INTEL processors have intra-vectorization capabilities. I am not sure how this works,
because I have also been told that, unless we are dealing with a CRAY,
processors with vector registers are no longer made (?)

At any rate, this vectorization can be triggered in the
PORTLAND compiler with the -fastsse option. In general, this
is the sse option and variations thereof...

The question is: what is the equivalent intra-vectorization option
for the Intel Linux FORTRAN 90 compiler?

The intra-vectorization capability is a hardware feature of
the chip itself, so I would think it should be accessible
regardless of the language or operating system (in principle).

Any feedback would be greatly appreciated.

Tony Scott
RWTH-Aachen

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

Steven_L_Intel1 — Tue, 13 Sep 2005 20:09:00 GMT

Welcome to the forum, Tony.

Yes, the Intel compilers can perform vectorization when you specify that you are compiling for one or more of the Intel processor types which include vector instructions. There are three generations of vector instructions in our IA-32 processors. SSE (Streaming SIMD Extensions) was introduced with Pentium III and primarily dealt with integers. SSE2 was introduced with Pentium 4 and adds floating types, while SSE3, the newest, was introduced with more recent Pentium 4 and Xeon processors. You use the -x switch to specify processor generation, for example, -xP enables generation of SSE3 instructions. See the Intel Fortran Compiler Compiler Options reference for details on the switches.

Intel Itanium processors also offer vectorization and our compilers for Itanium use this feature.

For more information, refer to the optimization sections of the Intel Fortran Optimizing Applications manual, as well as extensive general discussion of the SSE instructions on the Intel web site.

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

scottrwth — Tue, 13 Sep 2005 23:11:23 GMT

Dear Sir,

Many thanks for the info. I have tried -xP together with
-vec_report1 or -vec_report3 to see what was vectorized and
what wasn't. It seems that all my loops that, e.g., set
a matrix or array to zero were vectorized.

There is an important loop I am trying to vectorize, but
it involves a call to a subroutine, i.e.:

Do 1000 i=1,N
call UMD2SO(....)
1000 continue

Even though each subroutine call is independent, the loop
is NOT vectorized.

Any advice?

Is it a matter of putting the vectorization block inside
the subroutine UMD?

best wishes

Tony Scott

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

Steven_L_Intel1 — Wed, 14 Sep 2005 00:23:37 GMT

A subroutine call will prevent vectorization. You may want to split this into two loops, one which does the computations and one which does the subroutine calls.
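
A minimal sketch of the loop-splitting idea, assuming the loop body mixes plain arithmetic with a call. The array names and the routine check_value are hypothetical stand-ins, not taken from Tony's code:

```fortran
program split_loop
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000
  real(dp) :: a(n), b(n), c(n)
  integer :: i

  ! Set up some input data.
  do i = 1, n
     a(i) = real(i, dp)
     b(i) = 2.0_dp
  end do

  ! Loop 1: arithmetic only, no calls, so the compiler is free
  ! to consider it for vectorization.
  do i = 1, n
     c(i) = a(i) * b(i) + 1.0_dp
  end do

  ! Loop 2: only the subroutine calls. This loop is expected to
  ! stay scalar, but it no longer blocks loop 1.
  do i = 1, n
     call check_value(i, c(i))
  end do

contains

  ! Hypothetical routine standing in for the real subroutine call.
  subroutine check_value(idx, val)
    integer, intent(in) :: idx
    real(dp), intent(in) :: val
    if (val < 0.0_dp) print *, 'negative value at index ', idx
  end subroutine check_value

end program split_loop
```

Whether the first loop is actually vectorized can be checked with the -vec_report switches mentioned earlier in this thread.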

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

TimP — Wed, 14 Sep 2005 00:23:52 GMT

If your subroutine has no loop in it, and you are looking to enable vectorization by in-line expansion in the calling program, you would require the -ipo option (for separate files) or -ip (inline within same file). Pushing the DO loop inside the subroutine may be more satisfactory.
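
As a rough illustration of pushing the DO loop inside the subroutine, here is a sketch with a made-up routine. The names scale_one and scale_all are assumptions for the example only; the real UMD2SO interface is not shown in this thread:

```fortran
module scale_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Before: one element per call, so the caller's loop contains a call.
  subroutine scale_one(x, factor)
    real(dp), intent(inout) :: x
    real(dp), intent(in)    :: factor
    x = factor * x
  end subroutine scale_one

  ! After: the loop lives inside the routine, so the compiler sees
  ! a plain arithmetic loop with no call in its body.
  subroutine scale_all(x, n, factor)
    integer, intent(in)     :: n
    real(dp), intent(inout) :: x(n)
    real(dp), intent(in)    :: factor
    integer :: i
    do i = 1, n
       x(i) = factor * x(i)
    end do
  end subroutine scale_all

end module scale_mod

program push_loop_inside
  use scale_mod
  implicit none
  integer, parameter :: n = 1000
  real(dp) :: x(n)
  integer :: i

  x = 1.0_dp

  ! Original structure: a call in every iteration.
  do i = 1, n
     call scale_one(x(i), 2.0_dp)
  end do

  ! Restructured: a single call, with the loop inside the callee.
  call scale_all(x, n, 2.0_dp)

  print *, x(1), x(n)
end program push_loop_inside
```

This version does not depend on the compiler deciding to inline anything, which is one reason it may be more satisfactory than relying on -ip or -ipo.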

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

Intel_C_Intel — Wed, 14 Sep 2005 02:11:58 GMT

Hi Tony,

As an additional comment, since you use the term intra vectorization, I guess you meant intra-register vectorization, a term I used a few years ago to distinguish vectorization for multimedia extensions from vectorization for traditional vector processors, like the Cray (the term SIMDization seems to have become more popular, however). An online tutorial for vectorization can be found at IDS at:
http://www.intel.com/cd/ids/developer/asmo-na/eng/65774.htm
and if, after reading this, you want to know more about the background of vectorization for multimedia extensions, you may want to consider reading The Software Vectorization Handbook at:
http://www.intel.com/intelpress/sum_vmmx.htm
or some other publications at:
http://www.aartbik.com/pub.html

Aart Bik
http://www.aartbik.com/

Message Edited by abik on 09-16-2005 12:16 PM

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

scottrwth — Wed, 14 Sep 2005 17:42:17 GMT

Dear Sirs,

Again, my apologies. I found out through a test example
that if I use -ip or -ipo, then the loop does vectorize
after all, even if it contains a subroutine or function call.
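
A small test case along these lines might look like the sketch below. The function axpb, the file name, and the build line are illustrative assumptions; only the switches -xP, -ip and -vec_report3 come from this thread:

```fortran
! Hypothetical file name: inline_demo.f90
! A possible build line using the switches named in this thread
! (assumes the compiler driver is invoked as ifort):
!   ifort -xP -ip -vec_report3 inline_demo.f90

module kernel_mod
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Small, loop-free function: a good candidate for in-line expansion.
  function axpb(x) result(y)
    real(dp), intent(in) :: x
    real(dp) :: y
    y = 3.0_dp * x + 1.0_dp
  end function axpb

end module kernel_mod

program inline_demo
  use kernel_mod
  implicit none
  integer, parameter :: n = 1000
  real(dp) :: a(n), b(n)
  integer :: i

  do i = 1, n
     a(i) = real(i, dp)
  end do

  ! The loop body is a function call. If the compiler inlines axpb,
  ! the body becomes plain arithmetic and the loop can be considered
  ! for vectorization; without inlining, the call remains in the way.
  do i = 1, n
     b(i) = axpb(a(i))
  end do

  print *, b(1), b(n)
end program inline_demo
```

Comparing the vectorization report with and without -ip should show whether inlining made the difference for a given loop. With the function and the loop in separate source files, -ipo would be the switch to try instead.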

best wishes

Tony

Re: Intra vectorization and Intel Linux FORTRAN 90 compiler

Steven_L_Intel1 — Thu, 15 Sep 2005 01:51:45 GMT

Aart contacted me offline to let me know that my descriptions of the various Intel processor vectorization features were somewhat "off".

SSE was single-precision, SSE2 was double precision and SSE3 extended features from SSE2.

Itanium processors don't have vectorization, per se, but the architecture there is very different and the compiler can use various features such as rotating registers to do a lot of computations in fewer cycles.