About vectorizer and std::vector as class member

darietti7 · ‎05-05-2006

I am observing strange behaviour with the vectorizer and a class that has members of type std::vector.

Consider the following code (test.cpp):

Code:

#include <cmath>
#include <vector>

typedef std::vector<double> DoubleVector;

struct Test
{
 DoubleVector v1;
 DoubleVector v2;
 
 void test1(const double x); 
 void test2(const double x);
 void test3(const double x);
 void test4(const double x);
};

void f1(const double x, const DoubleVector& v1, DoubleVector& v2)
{
 const DoubleVector::size_type size = v1.size();

 // this IS vectorized!
 for(DoubleVector::size_type i = 0; i < size; ++i)
 {
  v2[i] = exp(v1[i] * x);
 }
}

void Test::test1(const double x)
{
 // this IS vectorized!
 f1(x, v1, v2);
}

void Test::test2(const double x)
{
 const DoubleVector::size_type size = v1.size();
 
 // this IS NOT vectorized
 for(DoubleVector::size_type i = 0; i < size; ++i)
 {
  v2[i] = exp(v1[i] * x);
 }
}

void Test::test3(const double x)
{
 const DoubleVector::size_type size = v1.size();
 
 // this IS NOT vectorized 
 // not even with the 'ignore dependencies' pragma
 #pragma ivdep
 for(DoubleVector::size_type i = 0; i < size; ++i)
 {
  v2[i] = exp(v1[i] * x);
 }
}

void Test::test4(const double x)
{
 const DoubleVector::size_type size = v1.size();
 
 const double* p1 = &v1[0];
 double* p2 = &v2[0];

 // this IS vectorized!
 // but goodbye to the vector abstraction
 for(DoubleVector::size_type i = 0; i < size; ++i)
 {
  p2[i] = exp(p1[i] * x);
 }
}

Compiled with icl 9.0 (W_CC_C_9.0.030), options /O3 /QxP /Qvec-report3.

f1 is a stand-alone function that takes the vectors as parameters, and its for loop is vectorized:

test.cpp(23) : (col. 2) remark: LOOP WAS VECTORIZED.

test1 calls f1 (with inline expansion?), so it is vectorized too:

test.cpp(32) : (col. 2) remark: LOOP WAS VECTORIZED.

Great work with that vectorization!

And now the bad news :(

test2 uses the vectors v1 and v2, which are class members, and the for loop is NOT vectorized:

test.cpp(40) : (col. 2) remark: vector dependence: assumed FLOW dependence between reference at line 39 and this line 39.

test.cpp(40) : (col. 2) remark: loop was not vectorized: existence of vector dependence.

test3 is like test2, but I tried forcing the compiler to ignore the nonexistent dependencies. No success:

test.cpp(55) : (col. 11) remark: loop was not vectorized: dereference too complex.

Let's help with that 'complex' dereference:

test4 unwraps the vectors into plain pointers, and the loop is vectorized. OK, but bye bye to the vector abstraction:

test.cpp(68) : (col. 2) remark: LOOP WAS VECTORIZED.

So the solution is to use pointers? But f1 didn't need any pointer exposure... Maybe the "dereference too complex" is because v1 is actually this->v1?

Any help with this? Please don't tell me I have to use pointers...

Thanks in advance

Intel_C_Intel · ‎05-08-2006

Dear Darietti7,

Thanks for this constructive feedback! Unfortunately, address analysis has to make conservative assumptions on the reference through this->v1 to the vector abstraction, which is responsible for the failure to vectorize test2/test3 (either data dependence analysis is too conservative or, when this is overridden with IVDEP, actual vectorization fails due to lack of subscript knowledge):

x.cpp(22) : (col. 2) remark: LOOP WAS VECTORIZED.
x.cpp(31) : (col. 2) remark: LOOP WAS VECTORIZED.
x.cpp(39) : (col. 2) remark: loop was not vectorized: existence of vector dependence.
x.cpp(54) : (col. 11) remark: loop was not vectorized: dereference too complex.
x.cpp(67) : (col. 2) remark: LOOP WAS VECTORIZED.
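
For readers following along, here is a minimal sketch of the pattern that the reports above show as vectorized: the member function only forwards its members to a free function, so the loop sees the vectors as reference parameters rather than through this->v1. The names exp_scaled, Holder and apply are hypothetical and do not appear in the original test case, and the size check is an added assumption. Whether a particular compiler version vectorizes this loop is not guaranteed, so the vectorization report should still be checked.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

typedef std::vector<double> DoubleVector;

// Hypothetical free helper: the vectors arrive as reference parameters,
// so the loop body never dereferences through this->...
inline void exp_scaled(const double x, const DoubleVector& in, DoubleVector& out)
{
    // Assumption: the caller has already sized the output vector.
    assert(in.size() == out.size());

    const DoubleVector::size_type size = in.size();
    for (DoubleVector::size_type i = 0; i < size; ++i)
    {
        out[i] = std::exp(in[i] * x);
    }
}

// Hypothetical class mirroring the Test struct from the original post.
struct Holder
{
    DoubleVector v1;
    DoubleVector v2;

    // The member function only forwards its members to the helper.
    void apply(const double x) { exp_scaled(x, v1, v2); }
};
```

The vector abstraction stays intact at the call site, and only the helper needs to be revisited if the compiler's handling of member accesses improves later.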

We always try to keep the performance penalty of abstraction as low as possible, but we are not always successful. Perhaps we can improve this test case in the future, but for now I am afraid I have to give you the answer you do not want to hear.
Thanks again for the test case, much appreciated.

Aart Bik
http://www.aartbik.com/

darietti7 · ‎05-09-2006

I have been trying the iterator + algorithm approach (very STL'ish):

Code:

#include <cmath>
#include <vector>

#include <algorithm>
#include <functional>

typedef std::vector<double> DoubleVector;

struct Test1
{
 DoubleVector v1;
 DoubleVector v2;

 void test2algo(const double x);
};

class exp_of_mult : std::unary_function<double, double>
{
private:
 const double m_;
public:
 inline exp_of_mult(const double m) : m_(m) {}
 inline double operator() (const double x) { return exp(x * m_); }
};

void Test1::test2algo(const double x)
{
 // v2 = exp( v1 * x )
 // this IS vectorized
 std::transform(v1.begin(), v1.end(), v2.begin(), exp_of_mult(x) );
}

test1.cpp(31) : (col. 2) remark: LOOP WAS VECTORIZED.

This approach has the advantage that it allows reusing the algorithm, but when no reuse is needed it may be confusing, because it moves the 'logic' from a class method to a helper functor.
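
As a side note, with a compiler that supports C++11 lambdas (an assumption: icl 9.0 predates them), the functor can stay inside the method. The sketch below uses the hypothetical names LambdaTest and test2lambda, and the size check is an added assumption. It only illustrates how the logic can stay local and has not been checked against any vectorization report.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

typedef std::vector<double> DoubleVector;

// Hypothetical class; same layout as Test1 in the post.
struct LambdaTest
{
    DoubleVector v1;
    DoubleVector v2;

    // v2 = exp( v1 * x ), with the per-element logic written in place.
    void test2lambda(const double x)
    {
        // Assumption: v2 has already been sized to match v1.
        assert(v1.size() == v2.size());

        std::transform(v1.begin(), v1.end(), v2.begin(),
                       [x](const double e) { return std::exp(e * x); });
    }
};
```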

My conclusion is that anything that moves the this->v1 reference to a v1 function parameter or unwraps the vectors into pointers/iterators helps vectorization.

Our main concern is that we are developing financial/scientific code that makes extensive use of std::vector, and we were not expecting this 'abstraction penalty' just because a std::vector is a class member.

Right now it would be very costly for us to transform all the code that uses this->v1.operator[]

Is it possible to issue some kind of 'request for feature' so that this issue is solved in a future version of the Intel compiler?


Thanks for the info, aartbik. My feedback is pure selfishness: if your compiler gets better, our code gets faster and we will all be happier :)
