Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

__restrict promotes incorrect vectorization, ICL 16.0

TimP
Honored Contributor III
414 Views

Submitted to Intel Premier Support (IPS) as issue 6000134001.

s4113_ ( float * __restrict a, float * b,
        float * c,  int * ip) {

.......

      for (int i = 1; i <= i__2; ++i)
          a[ip[i]] = b[ip[i]] + c[i];

.....

icl -c -QxHost -Qopt-report4 s4113.cpp

ICL 16.0.0.110 vectorizes this loop (on a Haswell laptop), although the compiler has no way of knowing whether changing the order of memory accesses will break it.  Besides, the vector code is slower, contrary to what the opt-report says, apparently because the cost of the scalar code is overstated (as comparison with the CEAN notation case shows).

So now it is dangerous to insert the __restrict or use the equivalent -Qrestrict extension, even though the arrays in fact don't overlap.

This vectorization might be expected if #pragma ivdep or #pragma simd were given, meaning that the programmer asserts there are no repeated elements in ip[] and maybe doesn't care if the vectorized code is slower.  CEAN notation might imply such an assertion.

11 Replies
Kittur_G_Intel
Employee

Hi Tim,
I tried on both Linux* and Windows with the latest version (on Windows, it's 16.0.0.110 Build 20150815, as you mention). It doesn't vectorize, and I get the output below:
------------------------------------------------------------------------------------------
Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

Non-optimizable loops:

LOOP BEGIN at C:\Users\kganesh1\Documents\tim\main.c(5,28)
   remark #15522: loop was not vectorized: loop control flow is too complex. Try using canonical loop form
LOOP END
===========================================================================

It does vectorize when I use #pragma simd or #pragma ivdep, of course, as you noted. Can you attach a screenshot with the compiler version output and the system info, and maybe the preprocessed file, so we can reproduce it? I am not able to so far. Thanks.

_Kittur

TimP
Honored Contributor III

Kittur,

Now I'm not permitted to sign in from Windows, but I am signed in on Linux.  Something to do with routing paths, I guess.

[tim@tim-wsm net]$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 16.0.0.109 Build 20150815
Copyright (C) 1985-2015 Intel Corporation.  All rights reserved.


The Linux compiler does report "seems inefficient" with a variety of -m settings, while the Windows one makes that report for SSE2 but wants to vectorize for SSE4.1, AVX, or AVX2.  I have to remove the __restrict to get a (correct) report of a (possible) dependency.  As you can see, my big Linux box is an old Westmere (WSM), so I can't test the AVX version.
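
To illustrate the kind of dependency the compiler has to assume once __restrict is removed, here is a minimal C++ sketch. It assumes 0-based indexing and treats the whole loop as a single vector-wide chunk; the function names and the gather-then-scatter emulation are hypothetical and are not what ICL actually generates. When a and b are the same array and ip[] contains a repeated index, loading all lanes before storing any of them loses an update that the scalar loop carries from one iteration to the next.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: one element at a time, in program order.
// a and b are allowed to point at the same storage (no __restrict).
void s4113_scalar(float* a, const float* b, const float* c,
                  const int* ip, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        a[ip[i]] = b[ip[i]] + c[i];
}

// Hypothetical emulation of a vectorized chunk: every lane is loaded
// (gathered) before any lane is stored (scattered).
void s4113_gather_then_scatter(float* a, const float* b, const float* c,
                               const int* ip, std::size_t n) {
    std::vector<float> lanes(n);
    for (std::size_t i = 0; i < n; ++i)   // all loads first
        lanes[i] = b[ip[i]] + c[i];
    for (std::size_t i = 0; i < n; ++i)   // then all stores, in lane order
        a[ip[i]] = lanes[i];
}

// Runs the kernel with a and b aliased to the same array and a repeated
// index in ip[]. The input values are made up for illustration.
std::vector<float> run_aliased(bool vector_like) {
    std::vector<float> ab = {0.f, 0.f, 0.f, 0.f};
    const std::vector<float> c = {1.f, 2.f, 4.f, 8.f};
    const std::vector<int> ip = {0, 1, 1, 3};   // index 1 appears twice
    assert(ip.size() == c.size());
    if (vector_like)
        s4113_gather_then_scatter(ab.data(), ab.data(), c.data(),
                                  ip.data(), ip.size());
    else
        s4113_scalar(ab.data(), ab.data(), c.data(), ip.data(), ip.size());
    return ab;
}
```

With a and b aliased, run_aliased(false) accumulates both updates into element 1 (2 + 4 = 6), while run_aliased(true) keeps only the last lane's value (4). With arrays that truly don't overlap, this flow dependence cannot occur.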

The upload server isn't responding, so I can't attach a preprocessed file or screenshot.  We got the warning that you are planning a shutdown in 8 hrs.

If IPS comes up soon, I will attach it there.

Thanks,

Tim

TimP
Honored Contributor III

I was permitted to upload the preprocessed file to IPS from Linux.  Now I'm posting from Android.  So I guess only selected services were taken down early.

Kittur_G_Intel
Employee

Hi Tim,
Well, that's interesting, as I can't reproduce it on Windows with SSE4.1, AVX, or AVX2 either; it behaves just like icc on Linux. Also, I didn't see any issue in IPS; maybe another support engineer has taken that issue and will address it accordingly. BTW, Tim, what's the number of the IPS issue you attached the preprocessed file to?
Thanks,
Kittur

TimP
Honored Contributor III

I shouldn't have jumped to conclusions about correctness for the /arch: settings up through AVX2, as the stores (done by extract) are scalar and can be performed in order.  I note that the MIC and AVX512 vectorization uses vscatterdps instructions, which I suppose may not store in order.  My MIC test case doesn't include repeated indices, so it can't show whether that may be troublesome.
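
The point about store order can be sketched as follows, assuming a and b do not overlap and ip[] contains a repeated index. This is only a hypothetical model of one vector-wide chunk written in plain C++ with 0-based indexing; the function names and input values are made up, and the reverse-order variant is a worst case for illustration, not a statement about how vscatterdps behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: stores happen in program order, so for a repeated
// index the last iteration's value wins.
std::vector<float> scalar_reference(const std::vector<float>& b,
                                    const std::vector<float>& c,
                                    const std::vector<int>& ip,
                                    std::size_t a_size) {
    assert(ip.size() == c.size());
    std::vector<float> a(a_size, 0.f);
    for (std::size_t i = 0; i < ip.size(); ++i)
        a[ip[i]] = b[ip[i]] + c[i];
    return a;
}

// Hypothetical model of one vectorized chunk: compute every lane, then
// store the lanes either in ascending lane order or in reverse order.
std::vector<float> scatter_chunk(const std::vector<float>& b,
                                 const std::vector<float>& c,
                                 const std::vector<int>& ip,
                                 std::size_t a_size,
                                 bool in_order) {
    assert(ip.size() == c.size());
    const std::size_t n = ip.size();
    std::vector<float> a(a_size, 0.f);
    std::vector<float> lanes(n);
    for (std::size_t i = 0; i < n; ++i)        // gather and add
        lanes[i] = b[ip[i]] + c[i];
    for (std::size_t k = 0; k < n; ++k) {      // scatter
        const std::size_t i = in_order ? k : n - 1 - k;
        a[ip[i]] = lanes[i];
    }
    return a;
}
```

For example, with b = {10, 20, 30, 40}, c = {1, 2, 4, 8} and ip = {0, 1, 1, 3}, the in-order scatter matches the scalar loop (element 1 ends up as 24), while the reverse-order scatter leaves the earlier lane's value (22) in element 1. Since a is never read in the loop, the store order for repeated indices is the only thing that can differ when the arrays don't overlap.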

Today, the attachments server appears to be working even for Windows; pre-processed source attached.

Kittur_G_Intel
Employee

Tim, I still couldn't find the number of the IPS issue you attached the file to. Can you share that number? Thanks.

TimP
Honored Contributor III

It still appears as 6000134001 in my view.  I did include the comment that you were looking for it, so if anyone views it, they ought to notify you of whatever internal designation it has.

TimP
Honored Contributor III

The "new activity" notices are coming out as being sent via Drupal on email.amazonses, so there seems to be no way to keep them out of the spam box.  This is particularly hard to deal with when people make anonymous derogatory comments by "send author a message."

Kittur_G_Intel
Employee

Thanks, Tim. Yes, my peer Amanda is looking at it as well, and I'll try to reproduce it with the preprocessed file you've attached there and update you accordingly. Much appreciated.
_Kittur

Kittur_G_Intel
Employee

Tim, BTW, I've passed your feedback on the new activity notices on to the group, FYI.

_Kittur

Kittur_G_Intel
Employee

Hi Tim,
I took the attached file from the issue and could reproduce the problem this time. I've filed the issue with the developers and will update you as soon as I hear from them. I appreciate your patience until then.
_Kittur 
