Issue when replacing the IPP DCT function with the MKL DCT function

Ashwin_A_1 · ‎12-22-2014

Hi,

I want to replace my IPP-based DCT function with an MKL-based DCT function.

I am getting different output data when I cross-check the IPP DCT output against the MKL DCT output.

I used the following IPP library functions to compute the DCT:

ippsDCTFwdInitAlloc_32f
ippsDCTFwd_32f
ippsDCTFwdFree_32f

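Below is my code:

// Please find fileinput.txt attached.

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include "mkl_dfti.h"
#include "mkl_trig_transforms.h"

using namespace std;

int main(int argc, char* argv[]){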

   float *dpar;
   float *out;
   MKL_INT *ipar;
   MKL_INT tt_type,stat,n_1,nn;
   FILE *fp,*fw,*fonce;
   fp = fopen( "D:\\dump\\fileinput.txt","r" );
   if(fp == NULL){
       cout<<"could not open input file"<<endl;
       return 1;
   }
   DFTI_DESCRIPTOR_HANDLE handle = 0;
   int n = 65; //Hardcoded to run for my code TODO:going to change after integrating into my main codebase
   nn = (MKL_INT)n;
   tt_type = MKL_STAGGERED_COSINE_TRANSFORM;

   n_1 = nn + 1 ;
   out = (float*)malloc((n+1)*sizeof(float));
   dpar= (float*)malloc((5*n_1/2+2)*sizeof(float));
   ipar= (MKL_INT*)malloc((128)*sizeof(MKL_INT));
   s_init_trig_transform(&n_1,&tt_type,ipar,dpar,&stat);
   for (int srcSize = 0; srcSize < n; srcSize++)
   {
       fscanf(fp,"%f\n",&out[srcSize]);
   }
   fclose(fp);
   if (stat != 0)
   {
       printf("\n============================================================================\n");
       printf("FFTW2MKL FATAL ERROR: MKL TT initialization has failed with status=%d\n",(int)stat);
       printf("Please refer to the Trigonometric Transform Routines Section of MKL Manual\n");
       printf("to find what went wrong...\n");
       printf("============================================================================\n");
       return 1;
   }
   ipar[10] = 1; //nx, that is, the number of intervals along the x-axis, in the Cartesian case.
   ipar[14] = n_1; //specifies the internal partitioning of the dpar array.
   ipar[15] = 1; //value of ipar[14]+1,Specifies the internal partitioning of the dpar array.
   s_commit_trig_transform(out,&handle,ipar,dpar,&stat);
   if (stat != 0)
   {
       printf("\n============================================================================\n");
       printf("FFTW2MKL FATAL ERROR: MKL TT commit step has failed with status=%d\n",(int)stat);
       printf("Please refer to the Trigonometric Transform Routines Section of MKL Manual\n");
       printf("to find what went wrong...\n");
       printf("============================================================================\n");
       return 1;
   }
   s_forward_trig_transform(out,&handle,ipar,dpar,&stat);
   if (stat != 0)
   {
       printf("\n============================================================================\n");
       printf("FFTW2MKL FATAL ERROR: MKL TT forward step has failed with status=%d\n",(int)stat);
       printf("Please refer to the Trigonometric Transform Routines Section of MKL Manual\n");
       printf("to find what went wrong...\n");
       printf("============================================================================\n");
       return 1;
   }
   free_trig_transform(&handle,ipar,&stat);
   free(out);
   free(dpar);
   free(ipar);
   printf("\n===== DCT GOT OVER ======== \n");

   return 0;

}
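
The status check is repeated after every MKL call in the listing above. A minimal sketch of a shared helper follows; `tt_step_ok` is a hypothetical name introduced here for illustration and is not part of MKL. It assumes, as the posted code does, that a zero status means success.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not an MKL function): prints one diagnostic message
// for a failed step and reports whether the status code signals success.
// Assumption taken from the posted code: status 0 means success.
bool tt_step_ok(const char* step, long long stat) {
    assert(step != nullptr);
    if (stat == 0) {
        return true;
    }
    std::fprintf(stderr,
                 "MKL TT %s step has failed with status=%lld\n"
                 "Please refer to the Trigonometric Transform Routines "
                 "section of the MKL manual.\n",
                 step, stat);
    return false;
}
```

It could then be called as `if (!tt_step_ok("init", stat)) return 1;` after each of the init, commit, and forward calls.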

Ashwin_A_1 · ‎12-24-2014

An update:

I passed an 8x8 matrix as input to the code above, with tt_type = MKL_COSINE_TRANSFORM:

255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255
255   255   255   255   255   255   255   255

The DCT output from the above code came out wrong:

32385 -727.349243 -619.955444   -458.675903   -267.323547   -74.151535 92.318069 207.505844
764.404419 717.50531 602.316772 435.846893 242.6754 51.322887   -109.956779   -217.350159
-762.619629   -706.582764   -583.869446   -412.598663   -218.058334   -28.972321   126.74057 226.088898
759.648132 694.603516 564.651367 388.979492   193.525208 7.147152 -142.635284 -233.70639
-755.498901   -681.59967 -544.712891   -365.049927   -169.138 14.096443 157.598663 240.180206
750.180115 667.597656 524.094971 340.860809   144.948898   -34.714363   -171.600922 -245.499771
-743.705566   -652.633789   -502.85144 -316.473694   -121.019257   54.652824   184.604614 249.648224
736.088623 636.739502 481.026154 291.940552 97.400169   -73.870857   -196.583725 -252.620102

It is totally wrong.

Can anyone point out the mistake in my code above?

Please note that I computed the direct DCT by setting tt_type = MKL_COSINE_TRANSFORM.

Ying_H_Intel · ‎12-29-2014

Hi Ashwin,

Just to let you know, I'm investigating the problem with our developer, but there may be a slight delay because of the New Year holiday. Sorry for the delay.

Happy new year!

Thanks

Ying

Ying_H_Intel · ‎01-15-2015

Hi Ashwin,

We found the problem:

s_init_trig_transform(&n_1,&tt_type,ipar,dpar,&stat);

The n_1 argument of the init routine is the number of intervals for the cosine transform, so it equals the number of points minus one.

If you try the fix below,

int n = 64; //Hardcoded to run for my code TODO:going to change after integrating into my main codebase

nn = (MKL_INT)n;
tt_type = MKL_COSINE_TRANSFORM;

n_1 = n - 1 ;

you will get the correct result.

ipar[10] and the other entries are internal parameters; users are not supposed to change them.
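
The point-versus-interval relationship described above can be written down as a small sizing helper. This is only a sketch: `TtSizes` and `cosine_transform_sizes` are hypothetical names, and the parameter-array length `5*n/2 + 2` and the 128-entry `ipar` are copied from the code posted in this thread rather than checked against the MKL manual.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of MKL): buffer sizes for a cosine
// transform over `points` samples, using the sizing seen in this thread.
struct TtSizes {
    std::size_t intervals;  // value to pass as n to s_init_trig_transform
    std::size_t data_len;   // samples in the data array: intervals + 1
    std::size_t par_len;    // floats in the parameter array: 5*intervals/2 + 2
    std::size_t ipar_len;   // integers in ipar: fixed at 128 in the posted code
};

TtSizes cosine_transform_sizes(std::size_t points) {
    assert(points >= 2);  // at least one interval is needed
    TtSizes s;
    s.intervals = points - 1;             // number of points minus one
    s.data_len = s.intervals + 1;         // equals the number of points
    s.par_len = 5 * s.intervals / 2 + 2;  // integer division, as in the thread
    s.ipar_len = 128;
    return s;
}
```

For 64 points this yields 63 intervals, a 64-element data array, and a 159-element parameter array.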

===== DCT GOT OVER ========

pDst 16065.000000
pDst 0.000000
pDst 0.000122
pDst 0.000052
pDst 0.000000
pDst 0.000052
pDst 0.000036
pDst 0.000145
pDst -0.000002
pDst 0.000220

But please note that it is about 8 times greater than the IPP result, because the IPP result is divided by sqrt(N).

Best Regards,

Ying
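
The effect of a sqrt(N) normalization can be checked without either library. The following is a minimal sketch of a naive DCT-II with and without orthonormal scaling; `dct2_reference` is a hypothetical name. It assumes that the "divided by sqrt N" remark refers to the orthonormal DCT-II convention, which has not been verified against the IPP manual, and it does not model MKL's transform.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive O(N^2) DCT-II reference (hypothetical helper, not IPP or MKL).
// With orthonormal == true, output k is scaled by sqrt(1/N) for k == 0 and
// by sqrt(2/N) otherwise; with false, no scaling is applied.
std::vector<double> dct2_reference(const std::vector<double>& x, bool orthonormal) {
    assert(!x.empty());
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<double> y(n, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            sum += x[i] * std::cos(pi * (2.0 * i + 1.0) * k / (2.0 * n));
        }
        if (orthonormal) {
            sum *= (k == 0) ? std::sqrt(1.0 / n) : std::sqrt(2.0 / n);
        }
        y[k] = sum;
    }
    return y;
}
```

For 64 samples that are all 255, the unscaled DC term is 64 × 255 = 16320 and the orthonormal DC term is 16320 / 8 = 2040, so the two conventions differ by sqrt(64) = 8 on the DC term.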

And in case you want to use MKL_STAGGERED_COSINE_TRANSFORM, according to the documentation,

// f[64] for staggered cosine transforms = 0;

int n = 65;

tt_type = MKL_STAGGERED_COSINE_TRANSFORM; // MKL_COSINE_TRANSFORM;

float pSrc[65]={
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
255, 255, 255, 255, 255, 255, 255, 255,
0};

n_1 = n - 1 ;

s_init_trig_transform(&n_1,&tt_type,ipar,spar,&stat);

Palanivel_G_ · ‎02-24-2015

Hi Ying,

I tried your suggested fixes and below are the comparative results between IPP and MKL implementations.

Input : 8x8 with all elements 255

IPP Output:

2040   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0

MKL Output:

510   0   -2.12484e-007   -8.95694e-008   2.02821e-006   1.98985e-006   -2.44188e-007   1.36767e-006
1.25338e-008   8.65079e-007   -4.82064e-007   1.93344e-006   2.85854e-008   1.90692e-006   0   1.90692e-006
-2.95537e-008   1.90544e-006   6.00573e-007   2.19466e-006   -1.37825e-007   2.03849e-006   -6.08452e-008   1.85739e-006
-3.83681e-008   2.36938e-006   8.6844e-009   2.35244e-006   0   2.35244e-006   -1.61479e-008   2.35742e-006
4.02007e-007   2.39765e-006   -1.16682e-007   2.33411e-006   -7.94424e-008   2.23449e-006   9.64763e-008   2.57738e-006
1.70139e-009   2.56381e-006   0   2.56381e-006   -1.02961e-008   2.57161e-006   3.07488e-007   2.49331e-006
-1.0582e-007   2.47736e-006   -8.96722e-008   2.42256e-006   1.75386e-007   2.66651e-006   -2.62343e-009   2.65502e-006
0   2.65502e-006   -6.2848e-009   2.66475e-006   2.39635e-007   2.50137e-006   -9.76727e-008   2.52111e-006

The MKL output is ~4 times smaller than the IPP DCT output.

Could you please let me know what is going wrong here?

The code snippet used with MKL is as follows:

int n = 64; // 8x8 input, so 64 coefficients
nn = (MKL_INT)n;
tt_type = MKL_COSINE_TRANSFORM;

n_1 = n - 1 ; // number of intervals
out = (float*) malloc ((n + 1) * sizeof(float));
dpar= (float*) malloc (((5 * n_1) / 2 + 2) * sizeof(float));
ipar= (MKL_INT*) malloc ((128) * sizeof(MKL_INT));

s_init_trig_transform(&n_1, &tt_type, ipar, dpar, &stat);

/*
** Populate the input
*/
for (int k = 0; k < n; k++) {
    out[k] = 255.0f;
}

s_commit_trig_transform(out,&handle,ipar,dpar,&stat);

s_forward_trig_transform(out,&handle,ipar,dpar,&stat);

Thanks in advance.

Best regards

Palanivel
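
When two libraries are suspected to differ only by a normalization constant, the constant can be estimated from the outputs themselves. A minimal sketch, assuming both outputs are stored in equally sized arrays; `best_scale` and `max_residual` are hypothetical helper names, not IPP or MKL functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Least-squares scale s that minimises sum((ref[i] - s * test[i])^2).
double best_scale(const std::vector<double>& ref, const std::vector<double>& test) {
    assert(ref.size() == test.size());
    double num = 0.0;
    double den = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        num += ref[i] * test[i];
        den += test[i] * test[i];
    }
    assert(den > 0.0);  // test must not be all zeros
    return num / den;
}

// Largest absolute difference that remains after applying the scale.
double max_residual(const std::vector<double>& ref, const std::vector<double>& test,
                    double scale) {
    assert(ref.size() == test.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double diff = std::fabs(ref[i] - scale * test[i]);
        if (diff > worst) {
            worst = diff;
        }
    }
    return worst;
}
```

Applied to the first rows of the two outputs above (2040, 0, ... against 510, 0, -2.12484e-007, ...), the estimated scale is 4 to within rounding, and the remaining differences stay below 1e-5.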

Ying_H_Intel · ‎02-24-2015

Hi Palanivel,

I have uploaded the whole cpp file. You can check what differs in the result. (Sorry, I uploaded it again.)

Regards,

Ying

Palanivel_G_ · ‎02-24-2015

Hi Ying,

Thanks for the reply.

Could you please upload the complete cpp file?

Thanks,

Palanivel

Ying_H_Intel · ‎03-03-2015

Hi Palanivel,

I attached the test code with MKL_COSINE_TRANSFORM in reply #6 and the one with MKL_STAGGERED_COSINE_TRANSFORM in this reply.

Best Regards,

Ying