auto-parallelization

david_livshin · ‎08-26-2009

Hi,

I am looking for examples of parallelizable code that cannot be auto-parallelized by Intel's compiler.

Thank you,

David

http://www.dalsoft.com

Mike_Rezny · ‎08-26-2009

Hi David,
1: any loops with a subroutine or function call in them.
The trick then is to get the compiler to inline the call; once it is inlined, the compiler might attempt to auto-parallelize.

2: any loops with dependencies. There are quite a few variants.
The most common are loops with the following in them:
a(j) = ... a(j+1)

3: any loops with CDEC$ noparallel
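
To make item 2 concrete, here is a minimal sketch in C (the fragment above is Fortran-style, and its right-hand side is elided, so the `0.5 *` factor and all function names are assumptions for illustration only). With an ascending loop, iteration j reads the old value of a[j+1] before iteration j+1 overwrites it, so the result depends on the order in which iterations run. Reading from an untouched copy removes that ordering constraint for this particular form. Other variants, such as a(j) = ... a(j-1), are true recurrences and are not helped by a copy.

```c
#include <stddef.h>

/* Ascending serial loop of the form a(j) = ... a(j+1).
   Iteration j reads a[j+1] BEFORE iteration j+1 overwrites it. */
void shift_serial(double *a, size_t n)
{
    for (size_t j = 0; j + 1 < n; ++j)
        a[j] = 0.5 * a[j + 1];
}

/* Same statement, iterations run in the opposite order.
   Now a[j+1] has already been overwritten when it is read,
   so the result generally differs from shift_serial. */
void shift_reversed(double *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t j = n - 1; j-- > 0; )
        a[j] = 0.5 * a[j + 1];
}

/* Reads come from a separate, unmodified copy (old), so every
   iteration is independent and the order no longer matters. */
void shift_from_copy(double *a, const double *old, size_t n)
{
    for (size_t j = 0; j + 1 < n; ++j)
        a[j] = 0.5 * old[j + 1];
}
```

Whether a given compiler performs this kind of rewrite on its own is a separate question; the sketch only shows why the original loop cannot simply have its iterations handed out to threads.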

Beware when using -parallel and -O3 on code with nested loops: the compiler may interchange loops, and then it is not clear where the compiler directive needs to be placed. You have to look very carefully at the opt-report and decipher what the compiler is doing.

regards
Mike

david_livshin · ‎08-26-2009

Quoting - Mike Rezny

1: any loops with a subroutine or function call in them.
The trick then is to get the compiler to inline the call; once it is inlined, the compiler might attempt to auto-parallelize.

2: any loops with dependencies. There are quite a few variants.
The most common are loops with the following in them:
a(j) = ... a(j+1)

1. In order to parallelize loops with a subroutine or function call in them, it is not necessary to inline the call; it is sufficient for the call to be thread safe. The auto-parallelizer I developed handles this successfully. On my web site, two of the code examples I parallelized have function calls inside the loops being parallelized. Do you know if icc can parallelize this code?

2. The loop with dependencies in your example seems to be inherently non-parallelizable.

--
David Livshin

http://www.dalsoft.com

Mike_Rezny · ‎08-26-2009

Quoting - david.livshin

1. In order to parallelize loops with a subroutine or function call in them, it is not necessary to inline the call; it is sufficient for the call to be thread safe. The auto-parallelizer I developed handles this successfully. On my web site, two of the code examples I parallelized have function calls inside the loops being parallelized. Do you know if icc can parallelize this code?

2. The loop with dependencies in your example seems to be inherently non-parallelizable.

Hi David,
I seem to have misunderstood your post. I read it as asking for loops that the Intel Fortran compiler would not parallelize.
Perhaps you meant parallelizable loops that the Fortran compiler would not parallelize.

As far as I know, ifort -parallel will not parallelize ANY loop with a function or subroutine call in it.
In the past, I have had to get the compiler to inline the call (-ip or -ipo). Once that is done, the compiler can determine if the inlined code is threadsafe or not.

Or, put explicit OpenMP directives around it after ensuring that the called subroutine is threadsafe.
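
As a rough illustration of the explicit-directive route, here is a minimal sketch in C (the same idea applies in Fortran with the corresponding OpenMP directive). The names `scale` and `apply_scale` are made up for the example, and `scale` is assumed to be thread safe because it depends only on its argument and touches no shared state. The directive is a promise from the programmer, not something the compiler verifies.

```c
#include <stddef.h>

/* Hypothetical callee: uses only its argument and local values,
   so concurrent calls cannot interfere with each other. */
static double scale(double x)
{
    return 2.0 * x + 1.0;
}

void apply_scale(const double *in, double *out, size_t n)
{
    /* Each iteration writes a distinct out[i], so iterations are
       independent. If OpenMP is not enabled at compile time, the
       pragma is ignored and the loop simply runs serially. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i)
        out[i] = scale(in[i]);
}
```

OpenMP has to be switched on with the compiler's OpenMP option (for example -fopenmp with GCC); the result is the same with or without it, only the execution is threaded.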

I just tried it, and even with !DEC$ PARALLEL before a loop containing a subroutine call, the compiler reported that the loop was not parallelized: existence of parallel dependence.

regards
Mike

Mike_Rezny · ‎08-27-2009

Hi David,
I finally found the reference in the Compiler docs I was looking for:

The compiler can only effectively analyze loops with a relatively simple structure. For example, the compiler cannot determine the thread safety of a loop containing external function calls because it does not know whether the function call might have side effects that introduce dependences. Fortran90 programmers can use the PURE attribute to assert that subroutines and functions contain no side effects. You can invoke interprocedural optimization with the -ipo (Linux* OS and Mac OS X) or /Qipo (Windows) compiler option. Using this option gives the compiler the opportunity to analyze the called function for side effects.
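
To illustrate what "side effects" means in the passage above, here is a small sketch in C rather than Fortran; the function names and the counter are hypothetical. The first function updates hidden shared state on every call, so its result depends on call order and concurrent calls would race. The second modifies nothing outside itself, which is the kind of guarantee the passage associates with PURE.

```c
/* Hidden shared state, updated by next_id on every call. */
static int calls_so_far = 0;

/* Has a side effect: two calls with the same argument return
   different values, and unsynchronized concurrent calls would
   race on calls_so_far. */
int next_id(int base)
{
    calls_so_far += 1;
    return base + calls_so_far;
}

/* No side effects: nothing outside the function is modified,
   so calls can run in any order or at the same time. */
int offset_id(int base)
{
    return base + 1;
}
```

A compiler that sees only a declaration of the callee cannot tell these two cases apart, which is why the documentation points to inlining or interprocedural analysis.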

I tried declaring a PURE external subroutine in a module, but I still could not get the compiler to parallelize the loop containing the subroutine call, using version 11.0.074.

regards
Mike