Intel® C++ Compiler

Please help with vectorization questions

diehard2
Beginner
Hi guys,

I'm new to vectorizing loops, and I have a couple of questions. I have a user-defined type:

struct MyComplex{
    double re;
    double im;
    MyComplex(double real = 0, double imaginary = 0)
    {
        re = real;
        im = imaginary;
    }
    //overload addition, subtraction, multiplication, division
    MyComplex& operator=(const MyComplex& comp)
    {
        re = comp.re;
        im = comp.im;
        return *this;
    }

    // const member function: scaling does not modify this object
    const MyComplex operator*(const double s) const
    {
        return MyComplex(s*re,s*im);
    }

    friend MyComplex operator+(MyComplex lhs,MyComplex rhs)
    {
        return MyComplex(lhs.re+rhs.re,lhs.im+rhs.im);
    }
    // etc., etc.
};

Now, I would like to be able to vectorize the scalar operations on these complex numbers.

So, I can get the following to vectorize (with arrays test, a, b of type MyComplex):
for (int i = 0; i < counter; i++)
{
    test[i].re = a[i].re + b[i].re;
    test[i].im = a[i].im + b[i].im;
}


but not

for (int i = 0; i < counter; i++)
{
    test[i] = a[i] + b[i];
}

The latter is preferable, as I could then write complicated expressions mixing complex addition, subtraction, multiplication, and division (as below) instead of needing a separate function for each, which would really complicate things.


for (int i = 0; i < counter; i++)
{
    test[i] = a[i] + b[i] + b[i]*a[i]/b[i];
}

The reason the compiler gives for not vectorizing is mixed data types. I understand the basic concept of mixed data types, but I don't quite understand why it applies here.
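For comparing the two loop forms side by side under the compiler's vectorization report, a minimal sketch follows. The names Cplx, add_fieldwise, and add_with_operator are hypothetical and not from the thread. The operators are written as inline non-members taking const references, which is an assumption about what keeps them easy to inline; nothing in this thread confirms that it changes what the 9.1 vectorizer reports.

```cpp
#include <cstddef>

// Hypothetical stand-in for the MyComplex type in the question.
struct Cplx {
    double re;
    double im;
    Cplx(double real = 0.0, double imaginary = 0.0) : re(real), im(imaginary) {}
};

// Inline non-member operators taking const references (an assumption:
// this keeps the calls simple to inline; it is not a confirmed fix).
inline Cplx operator+(const Cplx& lhs, const Cplx& rhs) {
    return Cplx(lhs.re + rhs.re, lhs.im + rhs.im);
}
inline Cplx operator*(const Cplx& c, double s) {
    return Cplx(s * c.re, s * c.im);
}

// Field-by-field form: the one reported to vectorize.
void add_fieldwise(Cplx* out, const Cplx* a, const Cplx* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i].re = a[i].re + b[i].re;
        out[i].im = a[i].im + b[i].im;
    }
}

// Operator form: the one reported not to vectorize.
void add_with_operator(Cplx* out, const Cplx* a, const Cplx* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions compute the same result, so any difference in the vectorization report comes from how the loop body is expressed rather than from the arithmetic.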

Sorry, that was long-winded. My second question involves the vectorization of loops with trig functions. I've run across some things on the web which say that this is possible, but I haven't gotten it to work. For instance, even the following gives an "unvectorizable statement" message:

for (int i=0; i< counter; i++)
{
cos(3.43524525);

}

Any help would be greatly appreciated. Thanks.

~ Steve


5 Replies
TimP
Honored Contributor III
Some of the complex math can be vectorized, when using C99 complex types (supported as an extension to C++), with SSE3 options (e.g. -O3 -QxP -Qansi_alias).
I've sometimes wished for better diagnostics about vectorizability, but I think the decision not to provide warnings about dead code was conscious. Intel compilers spend a lot of time pruning dead code, even without the requirement for human readable reports.
Elementary (not complex) math functions can be vectorized by automatic substitution of svml library calls. Unfortunately, disabling svml vectorization is lumped into correctness options like -fp:precise. By now, you should see the importance of including more relevant information in your message.
diehard2
Beginner
Hi Tim,

Thanks for the reply. I considered using the C99 extensions, but they don't work in C++.

From mathimf.h:
#if !defined(__cplusplus) /* No _Complex or GNU __complex__ types available for C++ */

As I need other C++ constructs, I can't switch to C, so I'm stuck with my own complex struct. I don't quite understand your comment about more relevant information and dead code. I gave the reason the compiler says it isn't vectorizing; I just don't understand the message. My basic question is whether there is an easy way to vectorize the easily readable loop, and whether there is a way to vectorize loops with trig functions on real numbers. Thanks for any help.

~ Steve
TimP
Honored Contributor III
You haven't told us whether you asked the compiler to vectorize with SSE3, or whether you used -fp:precise. You can't ask the vectorizer to make a loop out of the dead code you quote. How would it know what you want? If you mean to fill an array with a constant, the compiler can vectorize that with several reasonable ways of expressing it, including some ways which are peculiar to C++. If you mean to make an array of cos() values of an array argument, svml functions can enable that.
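As a sketch of the "fill an array with a constant" reading of the cos(3.43524525) loop, here are two ways of expressing it: a plain loop and the C++-specific std::fill. The function names fill_loop and fill_std are hypothetical. Whether a given compiler version vectorizes either one is an assumption to check against its own report.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Plain loop: cos(x) does not depend on i, so it is computed once
// and the loop only stores the result into each element.
void fill_loop(double* out, std::size_t n, double x) {
    const double c = std::cos(x);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = c;
    }
}

// The same fill written with the standard library algorithm.
void fill_std(double* out, std::size_t n, double x) {
    std::fill(out, out + n, std::cos(x));
}
```

Unlike the original fragment, both versions store their result, so the compiler has something to keep rather than dead code to prune.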
It's just a fact that you can't always write strict C++ and get full optimization. icpc chose to do it by permitting the relevant C99 syntax as an extension, in some cases with a command line option to enable use of the extension. g++ has its own extensions (__restrict__ for example) which are disabled when you set strictness options. In the latter case, you can easily switch back and forth between icpc and g++ with -D parameters along with enabling the extensions.
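One way the -D switching might look, as a sketch: the macro name RESTRICT and the function add_arrays are hypothetical. With no definition the macro expands to nothing and the code is strict C++; building with g++ and -DRESTRICT=__restrict__ turns the extension on. The spelling another compiler accepts may differ, so treat the value passed to -D as an assumption to verify per compiler.

```cpp
#include <cstddef>

// RESTRICT is a hypothetical macro. By default it expands to nothing,
// so the file compiles as strict C++. Pass e.g. -DRESTRICT=__restrict__
// on the g++ command line to enable the no-aliasing extension.
#ifndef RESTRICT
#define RESTRICT
#endif

// With RESTRICT enabled, the compiler may assume out, a, and b do not
// overlap, which removes one common obstacle to vectorization.
void add_arrays(double* RESTRICT out,
                const double* RESTRICT a,
                const double* RESTRICT b,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Keeping the qualifier behind a macro means the same source builds under strictness options, where the extension is disabled, without edits.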

diehard2
Beginner
Hi Tim,

Thanks for the response. I am not using the high-precision option, and I am optimizing for SSE3 instructions with the Intel 9.1 compiler for Windows. Here is a more useful example:
double asdf[20000];
double b[20000];
for(int i = 0; i < 20000; i++)
{
    b[i] = i;
}

#pragma ivdep
#pragma vector always
for(int i = 0; i < 20000; i++)
{
    test[i] = cos(b[i]);
}

This also does not vectorize. The message is that the loop body contains an unvectorizable statement. Does anyone have an example of a loop vectorizing with a sin or cos? I glanced at the svml, and that looks quite scary! Thanks for the help.

~ Steve
TimP
Honored Contributor III
Taking a guess on how you might have made a program from your fragment, by adding int main(){} and declaring test[] as double, I get:
icl -QxB -Qc99 dh.c
Intel C++ Compiler for 32-bit applications, Version 9.1 Build 20070109Z
Copyright (C) 1985-2007 Intel Corporation. All rights reserved.

dh.c
dh.c(5) : (col. 3) remark: LOOP WAS VECTORIZED.
dh.c(12) : (col. 3) remark: LOOP WAS VECTORIZED.
Microsoft Incremental Linker Version 7.10.6030
Copyright (C) Microsoft Corporation. All rights reserved.

-out:dh.exe
dh.obj
So the compiler is not making harsh judgements that I can see about your example. Any SSE2 vectorization option would be OK for this one.
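The file dh.c itself is not shown in the thread, so the following is only a guess at the shape of such a test, written as C++ with the two loops split into functions. The names N, result, fill_b, and cos_b are hypothetical; result plays the role of test[] declared as double. Call fill_b() and then cos_b() from main to exercise both loops.

```cpp
#include <cmath>

// Fixed-size arrays at file scope, so the compiler knows the trip count
// and that the two arrays are distinct objects.
const int N = 20000;
static double b[N];
static double result[N];  // stands in for test[] from the thread

// First loop from the fragment: fill b with 0, 1, 2, ...
void fill_b() {
    for (int i = 0; i < N; i++) {
        b[i] = i;
    }
}

// Second loop: one cos() per element, stored so it is not dead code.
// Whether the call is replaced by a vector math routine depends on the
// compiler and its options.
void cos_b() {
    for (int i = 0; i < N; i++) {
        result[i] = std::cos(b[i]);
    }
}
```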