Recent versions of ifort have been vectorizing the following loop (s255 from the netlib.org vector benchmark) when omp simd is applied:
x= b(n)
y= b(n-1)
!$omp simd
do i= 1,n
  a(i)= (b(i)+x+y)*.333
  y= x
  x= b(i)
enddo
For longer still, the older Intel-specific directive
!dir$ simd firstprivate(x,y)
has been doing the job. I was somewhat uneasy about omp simd, since the standard doesn't support a firstprivate clause on it, and dir$ simd may give bad results if issued without firstprivate.
The vectorization requires some kind of peeling to break the circular dependency. This can be written out by hand, but the code looks nicer, and performs better, when the directives do it.
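For anyone who wants to see the written-out form, here is a minimal sketch of one way the peeling could look. It assumes n >= 2; the program name, array size, and test data are illustrative and not part of the benchmark. The first two iterations, which are the only ones that read the wrapped-around values b(n) and b(n-1), are peeled off, so the remaining loop reads b(i-1) and b(i-2) directly and no longer carries x and y from one iteration to the next. The serial copy of the original loop is there only to check that the two forms agree.

```fortran
! Minimal sketch: the s255-style loop next to a hand-peeled equivalent.
! The program name, array size, and test data are illustrative assumptions.
program s255_peel_sketch
  implicit none
  integer, parameter :: n = 1000   ! assumed size; the peeling needs n >= 2
  real :: a_ref(n), a_peel(n), b(n)
  real :: x, y
  integer :: i

  ! Arbitrary test data.
  do i = 1, n
    b(i) = 0.5 * real(mod(i, 17))
  end do

  ! Reference: the loop as posted, run serially with no directive.
  x = b(n)
  y = b(n-1)
  do i = 1, n
    a_ref(i) = (b(i) + x + y) * .333
    y = x
    x = b(i)
  end do

  ! Peeled form: iterations 1 and 2 use the wrapped-around elements.
  a_peel(1) = (b(1) + b(n) + b(n-1)) * .333
  a_peel(2) = (b(2) + b(1) + b(n)) * .333

  ! From i = 3 on, x is b(i-1) and y is b(i-2), so the loop has no
  ! scalar carried between iterations.
  !$omp simd
  do i = 3, n
    a_peel(i) = (b(i) + b(i-1) + b(i-2)) * .333
  end do

  ! Compare the two results; stop with an error code on disagreement.
  if (maxval(abs(a_peel - a_ref)) > 1.0e-5) then
    print *, 'mismatch between peeled and reference loops'
    error stop 1
  end if
  print *, 'peeled loop matches the reference loop'
end program s255_peel_sketch
```

Without OpenMP enabled the directive line is just a comment, so the program should build with any Fortran compiler. Note that the sketch does not reproduce the final values of x and y; if they were needed after the loop they would have to be set to b(n) and b(n-1) separately.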
I see a hint in the OpenMP 4.5 standard that the new ordered clause may be required in this situation, but I wonder whether I am understanding this correctly.
I don't know why the official benchmark uses such an inaccurate approximation for 1./3., but it's not relevant to the question about omp simd.