I would like to understand why the report for this loop starts with the message "loop was not vectorized" while, nested right inside it, there is another loop at the same source position with the message "LOOP WAS VECTORIZED". As I understand it, the loop appears to be nested inside itself. Does anyone have a clue?
Did the compiler really nest the loop at (530,6) inside itself?
...
LOOP BEGIN at suktmig2d_OpenMP.c(530,6) inlined into suktmig2d_OpenMP.c(302,3)
   remark #25399: memcopy generated
   remark #15542: loop was not vectorized: inner loop was already vectorized
   remark #25015: Estimate of max trip count of loop=8

   LOOP BEGIN at suktmig2d_OpenMP.c(530,6) inlined into suktmig2d_OpenMP.c(302,3)
      remark #15389: vectorization support: reference datalo[k-?] has unaligned access [ suktmig2d_OpenMP.c(531,7) ]
      remark #15389: vectorization support: reference *(*(lowpass+nc*8)+(k+?-1)*4) has unaligned access [ suktmig2d_OpenMP.c(531,21) ]
      remark #15381: vectorization support: unaligned access used inside loop body
      remark #15305: vectorization support: vector length 8
      remark #15309: vectorization support: normalized vectorization overhead 1.000
      remark #15300: LOOP WAS VECTORIZED
      remark #15450: unmasked unaligned unit stride loads: 1
      remark #15451: unmasked unaligned unit stride stores: 1
      remark #15475: --- begin vector cost summary ---
      remark #15476: scalar cost: 4
      remark #15477: vector cost: 0.750
      remark #15478: estimated potential speedup: 4.000
      remark #15488: --- end vector cost summary ---
      remark #25015: Estimate of max trip count of loop=3
   LOOP END

   LOOP BEGIN at suktmig2d_OpenMP.c(530,6) inlined into suktmig2d_OpenMP.c(302,3)
   <Remainder loop for vectorization>
      remark #25015: Estimate of max trip count of loop=24
   LOOP END
LOOP END
...
2 Replies
I don't think a definitive answer can be given without at least a working example. With inlining, it may be obscure even then.
It is usual for an outer loop not to vectorize once the more useful inner-loop vectorization has been achieved. A memcpy call is generated (the "memcopy generated" remark) where the compiler judges it preferable to inline vectorization.
If this loop takes enough time to be worth further optimization effort, the remarks about alignment may be the most important hints in the report. For example, you might be able to assert alignment if you can be sure of it.
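As a rough illustration of what asserting alignment could look like, here is a minimal sketch. It is not taken from suktmig2d_OpenMP.c: the helper names alloc_aligned_floats and copy_aligned are hypothetical, and the 32-byte figure is an assumption based on the reported vector length of 8 with what look like 4-byte elements. It uses C11 aligned_alloc and the GCC/Clang builtin __builtin_assume_aligned; the Intel compiler also offers __assume_aligned and #pragma vector aligned for the same purpose. The promise is only valid if both pointers really are aligned, so the sketch checks it with assert in debug builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIGNMENT 32 /* assumption: 8 floats x 4 bytes, one 256-bit vector */

/* Hypothetical helper: allocate n floats on an ALIGNMENT-byte boundary.
 * aligned_alloc wants the size to be a multiple of the alignment,
 * so the byte count is rounded up. */
float *alloc_aligned_floats(size_t n)
{
    size_t bytes = n * sizeof(float);
    size_t padded = (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    if (padded == 0)
        padded = ALIGNMENT;
    return aligned_alloc(ALIGNMENT, padded);
}

/* Hypothetical copy loop that promises the compiler both pointers
 * are ALIGNMENT-byte aligned. */
void copy_aligned(float *restrict dst, const float *restrict src, size_t n)
{
    /* Verify the promise at run time (compiled out with -DNDEBUG). */
    assert((uintptr_t)dst % ALIGNMENT == 0);
    assert((uintptr_t)src % ALIGNMENT == 0);

    float *d = __builtin_assume_aligned(dst, ALIGNMENT);
    const float *s = __builtin_assume_aligned(src, ALIGNMENT);

    for (size_t k = 0; k < n; k++)
        d[k] = s[k];
}
```

Calling copy_aligned on pointers that are not actually aligned is undefined behavior, so this only applies when every call site starts at an aligned address.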
I see this behavior even without inlining. It looks like a loop nested inside the same loop.
Asserting alignment is not possible because the loop sometimes starts at index 0, sometimes at 1, and so forth.
LOOP BEGIN at suktmig2d_OpenMP.c(530,6)
   remark #25399: memcopy generated
   remark #15542: loop was not vectorized: inner loop was already vectorized
   remark #25015: Estimate of max trip count of loop=8

   LOOP BEGIN at suktmig2d_OpenMP.c(530,6)
      remark #15389: vectorization support: reference datalo[k-?] has unaligned access [ suktmig2d_OpenMP.c(531,7) ]
      remark #15389: vectorization support: reference *(*(lowpass+nc*8)+(k+?-1)*4) has unaligned access [ suktmig2d_OpenMP.c(531,21) ]
      remark #15381: vectorization support: unaligned access used inside loop body
      remark #15305: vectorization support: vector length 8
      remark #15309: vectorization support: normalized vectorization overhead 1.000
      remark #15300: LOOP WAS VECTORIZED
      remark #15450: unmasked unaligned unit stride loads: 1
      remark #15451: unmasked unaligned unit stride stores: 1
      remark #15475: --- begin vector cost summary ---
      remark #15476: scalar cost: 4
      remark #15477: vector cost: 0.750
      remark #15478: estimated potential speedup: 4.000
      remark #15488: --- end vector cost summary ---
      remark #25015: Estimate of max trip count of loop=3
   LOOP END

   LOOP BEGIN at suktmig2d_OpenMP.c(530,6)
   <Remainder loop for vectorization>
      remark #25015: Estimate of max trip count of loop=24
   LOOP END
LOOP END
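To make the shape of this report easier to picture, here is a hand-written sketch of how one source-level copy loop can turn into several loops that all carry the same source position: a main loop stepping 8 elements at a time (the reported vector length) plus a scalar remainder loop. The function names copy_scalar and copy_strip_mined are hypothetical, and this only mimics the structure of the transformation in plain C. It is not the compiler's actual output, and it does not by itself explain the enclosing non-vectorized loop.

```c
#include <stddef.h>

#define VLEN 8 /* vector length reported by remark #15305 */

/* The loop as it might be written in source: a plain element-wise copy. */
void copy_scalar(float *dst, const float *src, size_t n)
{
    for (size_t k = 0; k < n; k++)
        dst[k] = src[k];
}

/* Same result, restructured the way a vectorizer splits the work. */
void copy_strip_mined(float *dst, const float *src, size_t n)
{
    size_t k = 0;

    /* Main loop: each iteration stands in for one 8-wide unaligned
     * vector load followed by one 8-wide unaligned vector store. */
    for (; k + VLEN <= n; k += VLEN)
        for (size_t lane = 0; lane < VLEN; lane++)
            dst[k + lane] = src[k + lane];

    /* Remainder loop: the elements left over when n is not a
     * multiple of VLEN. */
    for (; k < n; k++)
        dst[k] = src[k];
}
```

Both functions accept a source pointer with any starting offset, for example copy_strip_mined(dst, src + 1, n), which matches the unaligned loads and stores listed in the report.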
The source code is on my GitHub:
https://github.com/rodrigo-prado/kirchhoff-ccpe-2018/blob/master/CodigoKirchhoff/OpenMP/suktmig2d_OpenMP.c
The report is in:
Thanks for your answer!
