Hello,
I'm using ifort to compile a scientific code with OpenMP parallelization. The relevant section of the code is:
!$omp parallel do schedule(static,1)
do j = 2, n - 1
   do i = 2, m - 1
      w(i,j) = 0.25 * (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1))
   end do
end do
!$omp end parallel do
where n and m are very big.
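For anyone who wants to reproduce this, a minimal self-contained version of the program might look roughly like the sketch below. The array sizes, the allocation, the initial values and the final print are assumptions added only to make it compile and run; the loop nest itself is the one shown above. The arrays are default (single-precision) real, which is consistent with the F32 types in the report.

```fortran
program stencil
  implicit none
  ! Sizes are assumed for illustration; the real code only has "very big" n and m.
  integer, parameter :: m = 4096, n = 4096
  real, allocatable :: u(:,:), w(:,:)
  integer :: i, j

  allocate(u(m,n), w(m,n))
  u = 1.0   ! assumed initial data
  w = 0.0

  ! Same loop nest as in the post: the outer j loop is split across threads,
  ! the inner i loop walks the contiguous (first) array dimension.
  !$omp parallel do schedule(static,1)
  do j = 2, n - 1
     do i = 2, m - 1
        w(i,j) = 0.25 * (u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1))
     end do
  end do
  !$omp end parallel do

  ! Print one interior value so the compiler cannot discard the loop.
  print *, w(m/2, n/2)
end program stencil
```

The inner loop index i is a sequential loop variable inside the parallel region, so it is private to each thread by default in Fortran.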
I compile with: ifort -o stencil -qopenmp -Ofast -fno-alias -qopt-report stencil.f90
The interesting part of the optimization report is this:
stencil.f90(#linewithstencil,23):remark #34055: adjacent dense (unit-strided stencil) loads are not optimized. Details: stride { 4 }, step { 8 }, types { F32-V128, F32-V128 }, number of elements { 4 }, select mask { 0x000000003 }.
stencil.f90(#linewithstencil,23):remark #34055: adjacent dense (unit-strided stencil) loads are not optimized. Details: stride { 4 }, step { 8 }, types { F32-V128, F32-V128 }, number of elements { 4 }, select mask { 0x000000003 }.
stencil.f90(#linewithstencil,23):remark #34055: adjacent dense (unit-strided stencil) loads are not optimized. Details: stride { 4 }, step { 8 }, types { F32-V128, F32-V128 }, number of elements { 4 }, select mask { 0x000000003 }.
stencil.f90(#linewithstencil,23):remark #34055: adjacent dense (unit-strided stencil) loads are not optimized. Details: stride { 4 }, step { 8 }, types { F32-V128, F32-V128 }, number of elements { 4 }, select mask { 0x000000003 }.
Is the compiler suggesting some improvement for performance? I must say the speedup is pretty bad, at least on my laptop. Thank you.