
Hi,

I have been testing the compiler's auto-vectorization and ran into a strange result. In the following code

    #include <cmath>    // std::sqrt
    #include <cstddef>  // std::size_t
    #include <vector>

    // C-style version: raw pointers plus an explicit length.
    double test80_c(double* x, double* y, int n, int nb_loops) {
        double sum{ 0.0 };
        double a{ 0.0 };
        for (int i = 0; i < nb_loops; ++i) {
            a += 1.0;
            for (int k = 0; k < n; ++k) {
                sum += std::sqrt(x[k] * x[k] + y[k] * y[k] + a);
            }
        }
        return sum;
    }

    // C++ version: same computation on std::vector.
    double test80_cpp(const std::vector<double>& x, const std::vector<double>& y, int nb_loops) {
        double sum{ 0.0 };
        double a{ 0.0 };
        for (int i = 0; i < nb_loops; ++i) {
            a += 1.0;
            for (std::size_t k = 0; k < x.size(); ++k) {
                sum += std::sqrt(x[k] * x[k] + y[k] * y[k] + a);
            }
        }
        return sum;
    }

the first version (the C one) gets vectorized, but the second one does not. Note that the following Fortran code

    function test80_f(x, y, n, nb_loops) bind(c)
      use iso_c_binding
      real(dp), dimension(1:n), intent(in) :: x
      real(dp), dimension(1:n), intent(in) :: y
      integer, intent(in) :: n
      integer, intent(in) :: nb_loops
      real(dp) :: test80_f
      ! local variables
      integer :: i, k
      real(dp) :: a

      test80_f = 0.0_dp
      a = 0.0_dp
      do i = 1, nb_loops
        a = a + 1.0_dp
        do k = 1, n
          test80_f = test80_f + sqrt(x(k)**2 + y(k)**2 + a)
        end do
      end do
    end function test80_f

does not get vectorized either. Can you reproduce this with your compiler? I am using icpc 15.0.0 20140716 under Mac OSX.

Best regards,

Francois


Same thing with this code. The C version and the corresponding Fortran version do not get vectorized, but the C++ version does.

    // C-style version: raw pointer plus an explicit length.
    double test30_c(double* x, int n, int nb_loops) {
        double sum{ 0.0 };
        double a{ 0.0 };
        for (int i = 0; i < nb_loops; ++i) {
            a += 1.0;
            for (int k = 0; k < n; ++k) {
                sum += x[k] + a;
            }
        }
        return sum;
    }

    // C++ version: same computation on std::vector.
    double test30_cpp(std::vector<double>& x, int nb_loops) {
        double sum{ 0.0 };
        double a{ 0.0 };
        for (int i = 0; i < nb_loops; ++i) {
            a += 1.0;
            for (std::size_t k = 0; k < x.size(); ++k) {
                sum += x[k] + a;
            }
        }
        return sum;
    }
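Before comparing vectorization reports, it can help to confirm that the pointer and std::vector variants really compute the same thing. The following is only a minimal sketch: the names sum_ptr, sum_vec and variants_agree are hypothetical and not part of the original test file, and the tolerance is an assumption, needed because a vectorized reduction may add the terms in a different order.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical pointer-based variant, mirroring test30_c.
double sum_ptr(const double* x, int n, int nb_loops) {
    double sum = 0.0;
    double a = 0.0;
    for (int i = 0; i < nb_loops; ++i) {
        a += 1.0;
        for (int k = 0; k < n; ++k) {
            sum += x[k] + a;
        }
    }
    return sum;
}

// Hypothetical std::vector variant, mirroring test30_cpp.
double sum_vec(const std::vector<double>& x, int nb_loops) {
    double sum = 0.0;
    double a = 0.0;
    for (int i = 0; i < nb_loops; ++i) {
        a += 1.0;
        for (std::size_t k = 0; k < x.size(); ++k) {
            sum += x[k] + a;
        }
    }
    return sum;
}

// True when both variants agree within a relative tolerance (assumed value
// supplied by the caller; reductions may be reordered by the vectorizer).
bool variants_agree(const std::vector<double>& x, int nb_loops, double rel_tol) {
    const double p = sum_ptr(x.data(), static_cast<int>(x.size()), nb_loops);
    const double v = sum_vec(x, nb_loops);
    const double scale = std::max(1.0, std::max(std::fabs(p), std::fabs(v)));
    return std::fabs(p - v) <= rel_tol * scale;
}
```

If variants_agree returns true on representative data, any difference in the report is more likely down to how the compiler analyzes each loop than to a difference in the code itself.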


Francois,

I tried your code on Mac OSX* and I was unable to reproduce your findings. I tested the code you provided with this command:

icpc test.cpp -O2 -std=c++11 -S -vec-report3

My file, test.cpp, includes your functions: test80_c, test80_cpp, test30_c, and test30_cpp. In the vectorization report for this code I saw the loops getting vectorized in all functions. If your question comes from the message "loop was not vectorized: inner loop was already vectorized", that message refers to the outer loop only; look at the inner loop's entry in the report to see where the vectorization occurred. This was tested with the Intel® 15.0 Compiler.
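To make the mapping between the report messages and the source concrete, here is a small sketch of the same loop nest with comments marking which loop each message is attached to. The function name nested_sum is hypothetical, and the size check is an addition that the original functions do not have.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for test80_cpp, annotated for reading the report.
double nested_sum(const std::vector<double>& x,
                  const std::vector<double>& y,
                  int nb_loops) {
    assert(x.size() == y.size());  // added precondition, not in the original
    double sum = 0.0;
    double a = 0.0;
    // Outer loop: the report attaches
    // "loop was not vectorized: inner loop was already vectorized" here.
    for (int i = 0; i < nb_loops; ++i) {
        a += 1.0;
        // Inner loop: this is the entry where "LOOP WAS VECTORIZED" appears.
        // It is a sum reduction over k with unit-stride loads of x and y.
        for (std::size_t k = 0; k < x.size(); ++k) {
            sum += std::sqrt(x[k] * x[k] + y[k] * y[k] + a);
        }
    }
    return sum;
}
```

So a "not vectorized" line on the outer loop is expected for this kind of nest and does not by itself mean the function missed vectorization.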

Thank you,

Richard


What did the ICL vectorization report say?


@RICHARD A.

I received "page cannot be found" when I tried to download the optimization report.


Sorry about that, not sure why it's not working. I'll post the contents of the file here:

    Begin optimization report for: main()

        Report from: Vector optimizations [vec]

    LOOP BEGIN at /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/c++/v1/vector(899,20) inlined into test.cpp(60,23)
       remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details
       remark #15346: vector dependence: assumed OUTPUT dependence between __end_ line 1685 and __end_ line 897
    LOOP END

    LOOP BEGIN at /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/c++/v1/vector(899,20) inlined into test.cpp(61,23)
       remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details
       remark #15346: vector dependence: assumed OUTPUT dependence between __end_ line 1685 and __end_ line 897
    LOOP END

    LOOP BEGIN at /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/c++/v1/vector(442,5) inlined into test.cpp(68,1)
       remark #15414: loop was not vectorized: nothing to vectorize since loop body became empty after optimizations
    LOOP END

    LOOP BEGIN at /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/c++/v1/vector(442,5) inlined into test.cpp(68,1)
       remark #15414: loop was not vectorized: nothing to vectorize since loop body became empty after optimizations
    LOOP END
    ===========================================================================

    Begin optimization report for: test30_c(double *, int, int)

        Report from: Vector optimizations [vec]

    LOOP BEGIN at test.cpp(7,5)
       remark #15542: loop was not vectorized: inner loop was already vectorized

       LOOP BEGIN at test.cpp(9,9)
       <Peeled>
       LOOP END

       LOOP BEGIN at test.cpp(9,9)
          remark #15399: vectorization support: unroll factor set to 8
          remark #15300: LOOP WAS VECTORIZED
          remark #15442: entire loop may be executed in remainder
          remark #15448: unmasked aligned unit stride loads: 1
          remark #15475: --- begin vector loop cost summary ---
          remark #15476: scalar loop cost: 10
          remark #15477: vector loop cost: 28.000
          remark #15478: estimated potential speedup: 2.580
          remark #15479: lightweight vector operations: 6
          remark #15480: medium-overhead vector operations: 1
          remark #15488: --- end vector loop cost summary ---
       LOOP END

       LOOP BEGIN at test.cpp(9,9)
       <Remainder>
          remark #15301: REMAINDER LOOP WAS VECTORIZED
       LOOP END

       LOOP BEGIN at test.cpp(9,9)
       <Remainder>
       LOOP END
    LOOP END
    ===========================================================================

    Begin optimization report for: test30_cpp(std::__1::vector<double, std::__1::allocator<double>> &, int)

        Report from: Vector optimizations [vec]

    LOOP BEGIN at test.cpp(19,5)
       remark #15542: loop was not vectorized: inner loop was already vectorized

       LOOP BEGIN at test.cpp(21,39)
       <Peeled>
       LOOP END

       LOOP BEGIN at test.cpp(21,39)
          remark #15399: vectorization support: unroll factor set to 8
          remark #15300: LOOP WAS VECTORIZED
          remark #15442: entire loop may be executed in remainder
          remark #15448: unmasked aligned unit stride loads: 1
          remark #15475: --- begin vector loop cost summary ---
          remark #15476: scalar loop cost: 9
          remark #15477: vector loop cost: 28.000
          remark #15478: estimated potential speedup: 2.350
          remark #15479: lightweight vector operations: 6
          remark #15480: medium-overhead vector operations: 1
          remark #15488: --- end vector loop cost summary ---
       LOOP END

       LOOP BEGIN at test.cpp(21,39)
       <Remainder>
          remark #15301: REMAINDER LOOP WAS VECTORIZED
       LOOP END

       LOOP BEGIN at test.cpp(21,39)
       <Remainder>
       LOOP END
    LOOP END
    ===========================================================================

    Begin optimization report for: test80_c(double *, double *, int, int)

        Report from: Vector optimizations [vec]

    LOOP BEGIN at test.cpp(31,5)
       remark #15542: loop was not vectorized: inner loop was already vectorized

       LOOP BEGIN at test.cpp(33,9)
       <Peeled>
       LOOP END

       LOOP BEGIN at test.cpp(33,9)
          remark #15399: vectorization support: unroll factor set to 4
          remark #15300: LOOP WAS VECTORIZED
          remark #15442: entire loop may be executed in remainder
          remark #15448: unmasked aligned unit stride loads: 2
          remark #15475: --- begin vector loop cost summary ---
          remark #15476: scalar loop cost: 62
          remark #15477: vector loop cost: 92.000
          remark #15478: estimated potential speedup: 2.620
          remark #15479: lightweight vector operations: 13
          remark #15480: medium-overhead vector operations: 1
          remark #15488: --- end vector loop cost summary ---
       LOOP END

       LOOP BEGIN at test.cpp(33,9)
          remark #25460: No loop optimizations reported
       LOOP END

       LOOP BEGIN at test.cpp(33,9)
       <Remainder>
          remark #15301: REMAINDER LOOP WAS VECTORIZED
       LOOP END

       LOOP BEGIN at test.cpp(33,9)
       <Remainder>
       LOOP END
    LOOP END
    ===========================================================================

    Begin optimization report for: test80_cpp(const std::__1::vector<double, std::__1::allocator<double>> &, const std::__1::vector<double, std::__1::allocator<double>> &, int)

        Report from: Vector optimizations [vec]

    LOOP BEGIN at test.cpp(44,5)
       remark #15542: loop was not vectorized: inner loop was already vectorized

       LOOP BEGIN at test.cpp(46,39)
       <Peeled>
       LOOP END

       LOOP BEGIN at test.cpp(46,39)
          remark #15399: vectorization support: unroll factor set to 4
          remark #15300: LOOP WAS VECTORIZED
          remark #15442: entire loop may be executed in remainder
          remark #15448: unmasked aligned unit stride loads: 2
          remark #15475: --- begin vector loop cost summary ---
          remark #15476: scalar loop cost: 60
          remark #15477: vector loop cost: 92.000
          remark #15478: estimated potential speedup: 2.540
          remark #15479: lightweight vector operations: 13
          remark #15480: medium-overhead vector operations: 1
          remark #15488: --- end vector loop cost summary ---
       LOOP END

       LOOP BEGIN at test.cpp(46,39)
          remark #25460: No loop optimizations reported
       LOOP END

       LOOP BEGIN at test.cpp(46,39)
       <Remainder>
          remark #15301: REMAINDER LOOP WAS VECTORIZED
       LOOP END

       LOOP BEGIN at test.cpp(46,39)
       <Remainder>
       LOOP END
    LOOP END
    ===========================================================================
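A report of this size is easier to check with a quick tally than by eye. The sketch below counts a few of the remark numbers that appear in the report above, given the report text as a string. The names count_occurrences, VecSummary and summarize_report are hypothetical helpers, not compiler tooling, and matching on remark numbers assumes they stay the same in other compiler versions.

```cpp
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of `needle` in `text`.
std::size_t count_occurrences(const std::string& text, const std::string& needle) {
    if (needle.empty()) {
        return 0;
    }
    std::size_t count = 0;
    for (std::size_t pos = text.find(needle);
         pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}

// Hypothetical summary of the remarks seen in the report posted above.
struct VecSummary {
    std::size_t vectorized;            // remark #15300: LOOP WAS VECTORIZED
    std::size_t remainder_vectorized;  // remark #15301: REMAINDER LOOP WAS VECTORIZED
    std::size_t dependence_blocked;    // remark #15344: vector dependence prevents vectorization
};

VecSummary summarize_report(const std::string& report) {
    return VecSummary{
        count_occurrences(report, "remark #15300"),
        count_occurrences(report, "remark #15301"),
        count_occurrences(report, "remark #15344"),
    };
}
```

Applied to the report above, the tally would show one vectorized inner loop per test function, with the two dependence-blocked loops both coming from the std::vector code inlined into main().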
