Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

How to multiply vectors in order to get a matrix

apolo74
Beginner
578 Views
Hi there,

I need to multiply two vectors of dimensions aa=Mx1 and bb=1xN to generate a matrix cc=MxN. A typical IPP vector is 1xM, so I guess I have to transpose it first. In MATLAB it is something like this (note that vector aa is being transposed):

aa = [1 2 3 4];
bb = [5 8 6];
cc = aa' * bb

cc =
5 8 6
10 16 12
15 24 18
20 32 24

It seems that ippm has this kind of functionality, but for small matrices. I need to work with vectors of between 100 and 400 elements... so the final matrix will be around 400x400. I then tried ipps and ippi, but I can't get them to transpose and then multiply to create the matrix for this kind of vector multiplication. Any help and suggestions will be greatly appreciated.

Boris
5 Replies
apolo74
Beginner
Sorry for that, it was a stupid question... but just in case someone else needs to see it:

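/* requires ipps.h */
Ipp32f src1[4] = {1.0f, 2.0f, 3.0f, 4.0f};
Ipp32f src2[3] = {5.0f, 8.0f, 6.0f};
Ipp32f dst[4*3] = {0.0f};   /* 4x3 result, stored row by row */

/* Row i of dst is src2 multiplied by the scalar src1[i]. */
for( int i=0; i<4; i++ )
    ippsMulC_32f( src2, src1[i], dst+3*i, 3 );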


I wonder if there is a better way of doing it... I just don't like FOR loops, a waste of computation time. Hope you guys have a better solution.

Boris
PaulF_IntelCorp
Employee
Hello Boris,

Did you review the matrix multiply functions? After all, a vector is simply a matrix with one side equal to one. See this link to the documentation:

http://software.intel.com/sites/products/documentation/hpc/ipp/ippm/index.htm
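
To make the "vector is a matrix with one side equal to one" point concrete, here is a minimal plain-C sketch rather than an actual ippm call. The names matmul_rowmajor and outer_as_matmul are hypothetical and chosen only for illustration. Since aa is stored contiguously, it has the same memory layout whether it is read as 1xM or Mx1, so no explicit transpose step is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Generic row-major matrix product C (m x n) = A (m x k) * B (k x n).
   Plain C for illustration only; this is not an ippm function. */
static void matmul_rowmajor(const float *a, const float *b, float *c,
                            size_t m, size_t k, size_t n)
{
    assert(a && b && c);
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Outer product: read aa as an m x 1 matrix and bb as a 1 x n matrix,
   so the inner dimension k is 1 and each sum has a single term. */
static void outer_as_matmul(const float *aa, const float *bb, float *cc,
                            size_t m, size_t n)
{
    matmul_rowmajor(aa, bb, cc, m, 1, n);
}
```

Calling outer_as_matmul(aa, bb, cc, 4, 3) on the vectors from the first post fills cc row by row with the same 4x3 matrix that MATLAB printed.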

Paul
apolo74
Beginner
Hi Paul,

Yes, I checked that library, but according to the documentation ippm is optimized for working with small vectors and matrices (3x3, 4x4, 5x5 and 6x6; vectors of length up to 6), and I need to work with matrices and vectors of length up to 360.

Boris
PaulF_IntelCorp
Employee
Boris,

Those functions will still work on larger matrices as well. "Optimized for small vectors" simply means that performance falls off as the matrices get larger. If you are using the Intel compiler, you may find that the compiler generates better results, as it can optimize some of those operations and you won't incur the overhead of managing parameters and calls to the IPP functions.
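
As a rough sketch of the compiler-only route, the plain C99 loop below involves no IPP calls at all. The function name outer_product_plain is hypothetical. Whether the inner loop actually gets vectorized depends on the compiler and its optimization flags, so treat this as something to measure rather than a guaranteed win.

```c
#include <assert.h>
#include <stddef.h>

/* cc (m x n, row-major) = outer product of aa (length m) and bb (length n).
   `restrict` tells the compiler the three buffers do not overlap, which
   removes one common obstacle to vectorizing the inner loop. */
static void outer_product_plain(const float *restrict aa,
                                const float *restrict bb,
                                float *restrict cc,
                                size_t m, size_t n)
{
    assert(aa && bb && cc);
    for (size_t i = 0; i < m; i++) {
        const float scale = aa[i];   /* constant for the whole row */
        float *row = cc + i * n;     /* start of row i in cc */
        for (size_t j = 0; j < n; j++)
            row[j] = scale * bb[j];
    }
}
```

The caller must pass non-overlapping buffers for the `restrict` promise to hold, with cc providing room for m*n floats.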

Paul
Chao_Y_Intel
Moderator
Boris,

ippsMulC looks to be a good choice for such functionality if N is large.

for( int i=0; i<M; i++ )
    ippsMulC_32f( src2, src1[i], dst+N*i, N );

Using ippsMulC makes it use the vectorized code. Another optimization opportunity is to thread the "for" loop with Intel Cilk, TBB, or OpenMP* (if needed, and if there is no high-level threading in the application).
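
One way the threading suggestion might look with OpenMP is sketched below. To keep it buildable without IPP, a hypothetical helper scale_row stands in for ippsMulC_32f, with the same argument order (source, scalar, destination, length). The name outer_product_rows is also an assumption made for illustration. Built without OpenMP support, the pragma is ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ippsMulC_32f( pSrc, val, pDst, len ) so the
   sketch builds without IPP; substitute the real call when linking IPP. */
static void scale_row(const float *src, float val, float *dst, int len)
{
    for (int j = 0; j < len; j++)
        dst[j] = src[j] * val;
}

/* dst (m x n, row-major) = outer product of src1 (length m) and src2 (length n).
   Each iteration writes only its own row of dst, so the rows can be
   handled by different threads without any synchronization. */
static void outer_product_rows(const float *src1, const float *src2,
                               float *dst, int m, int n)
{
    int i;
    assert(src1 && src2 && dst && m >= 0 && n >= 0);
    #pragma omp parallel for
    for (i = 0; i < m; i++)
        scale_row(src2, src1[i], dst + (size_t)i * (size_t)n, n);
}
```

For matrices of around 400x400 the cost of starting threads may be comparable to the work itself, so it would be worth timing the threaded loop against the serial one before keeping it.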

Thanks,
Chao