Hi there!
In our FEM-based application we have used several IPP matrix/vector array operations for small matrices, and the results have been rewarding. Unfortunately, I noticed that some operations, e.g. "transposed matrix array x constant array", are missing from release 5.2.
I know that "matrix array x constant array" can be handled with "vector array x constant array", since the column stride (= stride1) is not needed in such a (trivial) case. My question is how to accomplish the original task when (single) matrix transpositions are involved. For the time being, I have contented myself with the following routine:
case 1.
-- "this" represents the constant array
-- "ma" represents the matrix array to be transposed
-- "pDst" is the resulting matrix array
//----------------------------------------------------------------
template <class T>
IPPMatrixArray<T>& IPPVectorArray<T>::operator* (const IPPMatrixArray<T>& ma)
{
Subscript count = this->NoOfVecs(), isize = ma.Size()/count, rows = ma.NoOfRows(), icols = ma.Size()/(rows*ma.NoOfSmallMatrices());
if ((this->Size() == count) && (count == ma.NoOfSmallMatrices())){
IPPMatrixArray<T> *pDst = new IPPMatrixArray<T>(icols, rows, count);
// Standard description for source matrices
int src1Stride2 = sizeof(T);
int src1Stride0 = (icols*rows)*sizeof(T);
int src1Stride1 = icols*sizeof(T);
// Standard description for destination matrices
int dstStride2 = sizeof(T);
int dstStride1 = rows*sizeof(T);
int dstStride0 = (rows*icols)*sizeof(T);
IppStatus status;
T ival;
//
// NOTE. The following for-loop should be replaced by "transposed matrix array - constant array" operation
//
for (Subscript i = 0; i < count; i++){
ival = this->ptr[i];
// multiply (i+1)-th transposed single matrix by (i+1)-th constant
#ifdef ETYPE_FLOAT
status = ippmMul_tc_32f(
ma.GetPtr()+i*isize, src1Stride1, src1Stride2, ival,
pDst->GetPtr()+i*isize, dstStride1, dstStride2, rows, icols );
#elif defined(ETYPE_DOUBLE)
status = ippmMul_tc_64f(
ma.GetPtr()+i*isize, src1Stride1, src1Stride2, ival,
pDst->GetPtr()+i*isize, dstStride1, dstStride2, rows, icols );
#endif
// if( status < ippStsNoErr ) {
// printf( "-- error %d, %s ", status, ippGetStatusString( status ));
// }
}
return *pDst;
}
else{
if (count != ma.NoOfSmallMatrices())
cerr << " Fatal error in IPPVectorArray<T>::operator* (const IPPMatrixArray<T>&): Source array dimensions do not match!\n";
if (this->Size() != count)
cerr << " Fatal error in IPPVectorArray<T>::operator* (const IPPMatrixArray<T>&): Left-hand side should be constant array!\n";
this->size = 0;
this->no_of_vecs = 0;
return (IPPMatrixArray<T>&)*this;
}
};
//----------------------------------------------------------------
Even so, it is remarkably faster than the following simple two-step approach:
case 2.
-- vector array x constant array, w/ "ippmMul_vaca" ( = matrix array x constant array)
-- matrix array transposition, w/ "ippmTranspose_ma"
The slow-down in case 2 was significant; I tested it with 3-by-4 matrices and an array size of 1000000. I think that even case 1 could be sped up if the proper function were available. Have I missed something? Perhaps something stride-related?
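For checking results independently of IPP, the two cases can be sketched in plain C++. This is only a reference sketch, not an IPP routine: the names scaleTransposedArray and scaleThenTranspose are hypothetical, and the layout is assumed to be the one described above (count contiguous, row-major rows x cols matrices, with the result stored as cols x rows). The first function does the work in one pass, as case 1 does per matrix; the second mirrors case 2 and needs an intermediate buffer and a second pass over the data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of IPP): single pass.
// For each small matrix i, dst_i = consts[i] * transpose(src_i).
template <class T>
std::vector<T> scaleTransposedArray(const std::vector<T>& src,
                                    const std::vector<T>& consts,
                                    std::size_t rows, std::size_t cols)
{
    const std::size_t count = consts.size();
    const std::size_t msize = rows * cols;   // elements per small matrix
    assert(src.size() == count * msize);     // source dimensions must match

    std::vector<T> dst(src.size());
    for (std::size_t i = 0; i < count; ++i) {
        const T* s = src.data() + i * msize;
        T* d = dst.data() + i * msize;
        const T c = consts[i];
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t k = 0; k < cols; ++k)
                d[k * rows + r] = c * s[r * cols + k];  // dst(k,r) = c * src(r,k)
    }
    return dst;
}

// Hypothetical helper (not part of IPP): two steps, as in case 2.
// Step 1 scales each matrix as a flat vector; step 2 transposes it.
template <class T>
std::vector<T> scaleThenTranspose(const std::vector<T>& src,
                                  const std::vector<T>& consts,
                                  std::size_t rows, std::size_t cols)
{
    const std::size_t count = consts.size();
    const std::size_t msize = rows * cols;
    assert(src.size() == count * msize);

    std::vector<T> tmp(src.size());          // intermediate buffer
    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t e = 0; e < msize; ++e)
            tmp[i * msize + e] = consts[i] * src[i * msize + e];

    std::vector<T> dst(src.size());
    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t r = 0; r < rows; ++r)
            for (std::size_t k = 0; k < cols; ++k)
                dst[i * msize + k * rows + r] = tmp[i * msize + r * cols + k];
    return dst;
}
```

Both functions should return identical arrays for the same input, so either can serve as a baseline when validating an IPP-based version.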
Regards,
Villesamuli
Hello Villesamuli,
I'm sorry for the delayed answer; our expert was on vacation. Here is his answer:
Both cases are correct but not efficient.
The IPP small matrix domain doesn't provide such functionality yet.
You might submit a feature request through the IPP Technical Support channel; it will then be reviewed at the next version planning stage.
Regards,
Vladimir
