Intel® C++ Compiler
Support and discussions for creating C++ code that runs on platforms based on Intel® processors.

pointer matrix multiplication


A is a 4x4 matrix of shorts, loaded as a 2x8 matrix. I want to multiply a constant (identity) matrix C with A and store the results in B. Please advise how I can go about it, especially given that A is in pointer format. The code below shows my intentions.

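tom.cpp file
void tom::multiply(void* btr)
{
    short* A = (short*)btr;

    int i, j, k;

    short C[4][4] = {{1,1,1,1},{2,1,-1,-2},{1,-1,-1,1},{1,-2,2,-1}};

    short B[4][4];

    for (i = 0; i < 4; ++i)
    {
        for (j = 0; j < 4; ++j)
        {
            for (k = 0; k < 4; ++k)
            {
                // loop body missing from the post; presumably it accumulates
                // C[i][k] times element (k,j) of A into B[i][j]
            }
        }
    }
}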

3 Replies
Black Belt
If your matrix C is a literal constant, I suggest unrolling the 4x4 product by hand, thereby bypassing the *1 and *-1 multiplications (use -= for *-1). This eliminates 12/16ths of the multiplications, the array C itself, and the initialization of B to 0.

You can use #defines and #ifs to customize the unrolling in the event that C changes.

#define C_00 1
#define C_01 1
...
#define C_33 (-1)

#if (C_00 == 1)
B[0][0] = A[0][0];
#elif (C_00 == -1)
B[0][0] = -A[0][0];
#else
B[0][0] = A[0][0] * C_00;
#endif

*** Note: experiment without the #if, using only

B[0][0] = A[0][0] * C_00;

as the compiler may optimize out the *1 and *-1.
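To make the unrolling concrete, here is a minimal sketch rather than a drop-in implementation. It assumes the product is B = C x A with C on the left, that A and B are row-major flat arrays of 16 shorts, and that C is the literal from the question. The names multiply_generic and multiply_unrolled are made up for illustration. The +1 and -1 coefficients turn into plain adds and subtracts; the +2 and -2 coefficients stay as multiplies, which a compiler will typically reduce to shifts or adds.

```cpp
#include <cassert>

// Reference version: generic triple loop computing B = C * A.
// A and B are row-major flat arrays of 16 shorts (an assumption).
void multiply_generic(const short C[4][4], const short* A, short* B) {
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            int sum = 0;
            for (int k = 0; k < 4; ++k)
                sum += C[i][k] * A[k * 4 + j];
            B[i * 4 + j] = static_cast<short>(sum);
        }
    }
}

// Hand-unrolled version for the literal
//   C = { 1, 1, 1, 1,   2, 1,-1,-2,   1,-1,-1, 1,   1,-2, 2,-1 }.
// Each column j of A is read once; no C array and no zeroing of B needed.
void multiply_unrolled(const short* A, short* B) {
    assert(A != nullptr && B != nullptr);
    for (int j = 0; j < 4; ++j) {
        const int a0 = A[j], a1 = A[4 + j], a2 = A[8 + j], a3 = A[12 + j];
        B[j]      = static_cast<short>(a0 + a1 + a2 + a3);          // row 0 of C
        B[4 + j]  = static_cast<short>(2 * a0 + a1 - a2 - 2 * a3);  // row 1 of C
        B[8 + j]  = static_cast<short>(a0 - a1 - a2 + a3);          // row 2 of C
        B[12 + j] = static_cast<short>(a0 - 2 * a1 + 2 * a2 - a3);  // row 3 of C
    }
}
```

Running both versions on the same input is a cheap way to check the unrolled code, and timing them shows whether the compiler had already done the job by itself.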

Jim Dempsey
New Contributor II
That looks like a great optimization and reduction in multiplies, one that usually escapes our programming sense.

As for short* A being indexed as A[0][0], how will that happen?
Also, can a void* pass a 2-D array? If it can, I thought it would pass a 1-D array (1x16). Or should the void* parameter be changed to something like short* btr[] or btr[][4]?

Or should A be indexed in a form like A[i*4+j]? Is there another method with minimal change of semantics?
Black Belt


The original poster had

short* A = (short*)btr;

so I can only assume that btr points to a contiguous array of 16 shorts (a 1D array).
Had this been

short** A = (short**)btr;

then this would have been an array of pointers to 1D arrays of 4 shorts.
This would be inefficient programming, but it may be what is required.
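The difference between these layouts can be sketched as follows. This is an illustration only: the helper names read_flat, read_rows and read_2d are hypothetical, and each one is correct only if the caller really passed the layout named in its comment, since a void* carries no type information. The third variant, a pointer to rows of 4 shorts, keeps the A[i][j] syntax while still using contiguous storage.

```cpp
#include <cassert>

// Assumes btr addresses 16 contiguous shorts (short[16]).
short read_flat(void* btr, int i, int j) {
    assert(btr != nullptr);
    short* A = static_cast<short*>(btr);
    return A[i * 4 + j];  // manual row-major index
}

// Assumes btr addresses an array of 4 pointers, each to a row of 4 shorts.
short read_rows(void* btr, int i, int j) {
    assert(btr != nullptr);
    short** A = static_cast<short**>(btr);
    return A[i][j];  // two loads: first the row pointer, then the element
}

// Assumes btr addresses a genuine short[4][4] object.
short read_2d(void* btr, int i, int j) {
    assert(btr != nullptr);
    short (*A)[4] = static_cast<short (*)[4]>(btr);
    return A[i][j];  // compiler computes i*4+j; storage is contiguous
}
```

Passing a flat buffer to read_rows, or an array of row pointers to read_flat, compiles without complaint and reads garbage, so the layout has to be agreed on by caller and callee.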


With

short* A = (short*)btr;

the user would have to convert the logical (pseudo code) A[i][j] and B[i][j] into a linear index, preferably using literal constants

#define foobar(X,i,j) (X)[(i)*4+(j)]

foobar(B,2,3) += foobar(A,3,3) * C_23; // one term of B = C x A; something like this

Change the macro name to something you find suitable.
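If a macro feels too loose, the same linear-index idea can be sketched with an inline function that returns a reference. This is only one possible shape: the names at and multiply_flat are hypothetical, the product is assumed to be B = C x A, and B is assumed to be a caller-supplied flat array of 16 shorts rather than the local short B[4][4] of the original code.

```cpp
#include <cassert>

// Hypothetical replacement for the macro: maps the logical (i, j) of a
// row-major 4x4 matrix onto a flat array and returns a reference, so it
// can appear on either side of an assignment.
inline short& at(short* X, int i, int j) {
    assert(i >= 0 && i < 4 && j >= 0 && j < 4);
    return X[i * 4 + j];
}

// Hypothetical free-function version of the multiply: B = C * A.
// Assumes btr addresses 16 contiguous shorts and B has room for 16 shorts.
void multiply_flat(void* btr, short* B) {
    short* A = static_cast<short*>(btr);
    static const short C[4][4] = {
        {1, 1, 1, 1}, {2, 1, -1, -2}, {1, -1, -1, 1}, {1, -2, 2, -1}};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            int sum = 0;  // accumulate in int, narrow once at the end
            for (int k = 0; k < 4; ++k)
                sum += C[i][k] * at(A, k, j);
            at(B, i, j) = static_cast<short>(sum);
        }
    }
}
```

With optimization enabled the accessor is normally inlined, so it should cost no more than the macro, while the arguments are type-checked and evaluated exactly once.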

Jim Dempsey