I attached a complete example

David_P_8 · ‎03-27-2013

I've having trouble doing a 2D cosign transform. I tried a simple test with an 8x8 matrix by doing a forward transform and then a reverse, but I don't get the origil matrix back. I attached the code that does the 2D transform. What am I missing?

Thanks,

Zhang_Z_Intel · ‎03-27-2013

Hi,

I had a quick look at your code. There are two things you may want to check:

There isn't a s_trig_transform() function defined in the current MKL release. It sould be either s_forward_trig_transform() or s_backward_trig_transform(). Which version of MKL are you using?

For staggered cosine transform, the input array must be size (n+1), where n is the size of transform. And the last element must be set to 0.0. Did you take care of this?

SergeyKostrov · ‎03-27-2013

>>...There isn't a s_trig_transform() function defined in the current MKL release... I tried to find it in MKL 10.3.12 ( headers & samples ) without success. It looks like older version of MKL ( before version 10 ) is used.

David_P_8 · ‎03-28-2013

Oh, I accidentatly deleted some lines when sanitizing the code I posted. I'm using a function pointer called s_tring_transform to select forward or backward based on the direction input variable. Here's the lines:

    void (*trig_transform)(Float *, DFTI_DESCRIPTOR_HANDLE *, int *, Float *, int *);
   if(direction < 0)
        trig_transform = s_forward_trig_transform;
    else
        trig_transform = s_backward_trig_transform;

I also made sure the size of the input array was n+1 and ended with a zero. I've also tried using MKL_COSINE_TRANSFORM and MKL_STAGGERED_COSINE_TRANSFORM, but I still couldn't get the original 2D array back.

SergeyKostrov · ‎03-28-2013

Thanks and I see the difference in sources of the test. Why wouldn't you add a main function with a call to dct2 function and parameters you use ( in order to reproduce your issues completely )?

David_P_8 · ‎03-28-2013

I attached a complete example and below the output I get. I print out the original matrix, after the forward_dct, and after the backward_dct. I don't completely understand dct, but I thought I should end up with a matrix similar to what I started with.

11 12 13 14 15 16 17 18
21 22 23 24 25 26 27 28
31 32 33 34 35 36 37 38
41 42 43 44 45 46 47 48
51 52 53 54 55 56 57 58
61 62 63 64 65 66 67 68
71 72 73 74 75 76 77 78
81 82 83 84 85 86 87 88

   183.047    4.26973   -13.6921     12.576   -13.3736    13.0362   -13.3722    13.1317
-39.3769 -0.193602    3.04743   -2.74132    2.72896   -2.75311    2.72749   -2.74693
-22.2031   -3.91173    1.05791   -1.36401    1.37637   -1.35223    1.37785    -1.3584
   14.6098    3.40551 -0.551685   0.857791   -0.87015   0.846003 -0.871624   0.852176
-22.2031   -3.91173    1.05791   -1.36401    1.37637   -1.35223    1.37785    -1.3584
    18.813    3.68572 -0.831899      1.138   -1.15036    1.12622   -1.15184    1.13239
-22.2031   -3.91173    1.05791   -1.36401    1.37637   -1.35223    1.37785    -1.3584
   19.7666     3.7493 -0.895476    1.20158   -1.21394    1.18979   -1.21541    1.19597

-6.58086    36.9207   -18.6316    35.9385   -10.7853    49.3869    4.92431    64.7248
   20.2915    15.3686    36.3422    20.3508    32.4959    10.9024    20.7864 -0.435492
   44.1692    26.1707    32.1184    25.1885    39.9647    38.6369    55.6743    53.9748
   34.3433    41.3168     50.394     46.299    46.5477    36.8507    34.8381    25.5127
   65.4192    44.9207    53.3684    43.9385    61.2147    57.3869    76.9243    72.7248
   55.0329    60.6273    71.0836    65.6095    67.2373    56.1611    55.5277    44.8232
   86.6691    63.6707    74.6184    62.6885    82.4647    76.1369    98.1743    91.4748
   76.1557    79.5044    92.2064    84.4866    88.3601    75.0382    76.6505    63.7003

SergeyKostrov · ‎03-28-2013

Thanks David for the updated test case and data.

Zhang_Z_Intel · ‎03-29-2013

David,

MKL engineers are going to take a look at the problem you reported. Meanwhile, I'd suggest you to take a look at the linear transform functions in Intel Integrated Performance Primitives (Intel IPP). If what you want is 8x8 DCT, then IPP has already had a function for this. Please take a look at the IPP reference manual, chapter 10: https://software.intel.com/sites/products/documentation/doclib/ipp_sa/71/ipp_manual/ippi.pdf

Thanks.

SergeyKostrov · ‎03-29-2013

Hi David, >>...I print out the original matrix, after the forward_dct, and after the backward_dct. I don't completely understand dct, >>but I thought I should end up with a matrix similar to what I started with... >> >>11 12 13 14 15 16 17 18 >>21 22 23 24 25 26 27 28 >>31 32 33 34 35 36 37 38 >>41 42 43 44 45 46 47 48 >>51 52 53 54 55 56 57 58 >>61 62 63 64 65 66 67 68 >>71 72 73 74 75 76 77 78 >>81 82 83 84 85 86 87 88 I did a verification with an image processing application that allows to do DCT and IDCT for an image. I've created a source image that has 64 blocks ( your original 8x8 matrix ) and I was able to complete the test. Here are screenshoots: [ Source 512x512 image ]

[ DCT512x512 image ]

[ Output 512x512 image ]

However, I don't know if the same algorithm is used in MKL but the algorithm in the application I used ( implemented in C/C++ with processing on a GPU / very fast with more than 128 frames ( 512x512 ) per second ) is based on a classic one described on en.wikipedia.org/wiki/Discrete_cosine_transform.

Bernard · ‎03-30-2013

>>>implemented in C/C++ with processing on a GPU / very fast with more than 128 frames ( 512x512 ) per second ) is based on a classic one described on>>>

What GPU do you have?

SergeyKostrov · ‎03-30-2013

>>...What GPU do you have? That actually doesn't matter because this is Not related to a problem / issue with DCT functions of MKL.

David_P_8 · ‎04-01-2013

Sergey, thanks for the test results. I'm using dct in an advanced image processing algorithm that does a 2D dct on the entire image. I was getting odd results, which is why I tried testing dct on a small 8x8 matrix. At least now I know where the problem is.

It looks like the Intel IPP library would do the 2D transform for me, but I don't have a license for that, just MKL. I wonder how hard this is to implement.

SergeyKostrov · ‎04-01-2013

Hi David, >> 183.047 4.26973 -13.6921 12.576 -13.3736 13.0362 -13.3722 13.1317 >> -39.3769 -0.193602 3.04743 -2.74132 2.72896 -2.75311 2.72749 -2.74693 >> -22.2031 -3.91173 1.05791 -1.36401 1.37637 -1.35223 1.37785 -1.3584 >> 14.6098 3.40551 -0.551685 0.857791 -0.87015 0.846003 -0.871624 0.852176 >> -22.2031 -3.91173 1.05791 -1.36401 1.37637 -1.35223 1.37785 -1.3584 >> 18.813 3.68572 -0.831899 1.138 -1.15036 1.12622 -1.15184 1.13239 >> -22.2031 -3.91173 1.05791 -1.36401 1.37637 -1.35223 1.37785 -1.3584 >> 19.7666 3.7493 -0.895476 1.20158 -1.21394 1.18979 -1.21541 1.19597 It looks wrong... So, I've done another verification with ippsDCTFwd_32f and ippsDCTInv_32f functions from Digital Signal Processing domain of IPP. This is what I have after Forward DCT was applied: 396.00000 -182.92722 0.00000000 -19.884794 0.00000000 -6.8192778 0.00000000 -3.1852565 0.00000000 -1.6359950 0.00000000 -0.75465244 0.00000000 -0.0095326053 0.00000000 1.8327000 0.00000000 -2.6851103 0.00000000 -0.97898060 0.00000000 -0.55619341 0.00000000 -0.32662323 0.00000000 -0.15689079 0.00000000 0.0079023652 0.00000000 0.24817109 0.00000000 1.1542168 0.00000000 -1.3432851 0.00000000 -0.45008636 0.00000000 -0.23733211 0.00000000 -0.11914375 0.00000000 -0.024629775 0.00000000 0.078209206 0.00000000 0.24637732 0.00000000 0.92925960 0.00000000 -0.98908645 0.00000000 -0.30937901 0.00000000 -0.14793864 0.00000000 -0.056380428 0.00000000 0.020121917 0.00000000 0.10802869 0.00000000 0.25874245 0.00000000 0.88681680 After I've applied Inverse DCT I have almost the same input matrix with rounding errors (!) since I used Single Precision IPP functions: 11.805534 12.030663 12.527290 13.352337 14.522095 15.975449 17.574728 19.147879 20.553452 21.736849 22.749239 23.718441 24.784969 26.033978 27.456161 28.956169 30.403439 31.699112 32.825127 33.851059 34.896088 36.066406 37.400642 38.851196 40.310585 41.668491 42.869469 43.941322 44.979618 46.096813 47.362919 48.768784 50.231216 51.637081 52.903187 54.020382 55.058678 56.130531 57.331509 58.689415 60.148804 61.599358 62.933594 64.103912 65.148941 66.174873 67.300888 68.596558 70.043831 71.543839 72.966019 74.215027 75.281555 76.250763 77.263153 78.446548 79.852119 81.425270 83.024551 84.477905 85.647659 86.472710 86.969337 87.194466 [ Edited ] I fixed an issue in my test codes and reconstruction works well. Let me know if you need the source code of IPP based test and it is ~40 C/C++ code lines.

SergeyKostrov · ‎04-01-2013

[ Source codes ( Without error processing ) ] ... IppStatus st = ippStsNoErr; const int LEN = 64; Ipp32f fInpS[LEN] = { // Input data set ( signal ) 11, 12, 13, 14, 15, 16, 17, 18, 21, 22, 23, 24, 25, 26, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 41, 42, 43, 44, 45, 46, 47, 48, 51, 52, 53, 54, 55, 56, 57, 58, 61, 62, 63, 64, 65, 66, 67, 68, 71, 72, 73, 74, 75, 76, 77, 78, 81, 82, 83, 84, 85, 86, 87, 88, }; Ipp32f fOut[LEN] = { 0.0f }; // Calculated DCT for input data set ( signal ) Ipp32f fInpR[LEN] = { 0.0f }; // Reconstructed Input data set ( signal ) IppsDCTFwdSpec_32f *pFwdSpec = NULL; IppsDCTInvSpec_32f *pInvSpec = NULL; st = ::ippsDCTFwdInitAlloc_32f( &pFwdSpec, LEN, ippAlgHintNone ); // Initializes DCT structure ( for Forward computations ) st = ::ippsDCTInvInitAlloc_32f( &pInvSpec, LEN, ippAlgHintNone ); // Initializes DCT structure ( for Inverse computations ) st = ::ippsDCTFwd_32f( fInpS, fOut, pFwdSpec, 0 ); // Does Forward DCT ( No Scaling / Last parameter is 0 ) st = ::ippsDCTInv_32f( fOut, fInpR, pInvSpec, 0 ); // Does Inverse DCT ( No Scaling / Last parameter is 0 ) st = ::ippsDCTFwdFree_32f( pFwdSpec ); st = ::ippsDCTInvFree_32f( pInvSpec ); ... [ Reconstructed data set ] 11.000008 12.000004 13.000000 14.000000 15.000004 16.000004 17.000004 18.000004 21.000004 22.000006 23.000004 24.000000 25.000002 26.000004 27.000000 28.000002 31.000000 31.999998 33.000000 34.000000 35.000000 36.000000 37.000000 38.000000 41.000000 41.999996 43.000004 44.000000 45.000000 46.000000 47.000000 47.999992 51.000008 52.000000 53.000000 54.000000 55.000000 55.999996 57.000004 58.000000 61.000000 62.000000 63.000000 64.000000 65.000000 66.000000 67.000000 68.000000 71.000000 72.000000 73.000000 74.000000 75.000000 76.000000 76.999992 78.000000 81.000000 82.000000 83.000000 84.000000 85.000000 86.000000 87.000000 87.999992 Note: Everything is correct with Inverse processing and I fixed a small issue in my test case. All rounding errors you saw are not related to limitations of single precision IPP functions.

SergeyKostrov · ‎04-01-2013

David, Here are two notes and please take a look: [ Note 1 ] Use Double precision functions for improved accuracy of calculations in case of a large source image: ... IPPAPI ( IppStatus, ippsDCTFwdFree_64f, ( IppsDCTFwdSpec_64f *pDCTSpec ) ) IPPAPI ( IppStatus, ippsDCTInvFree_64f, ( IppsDCTInvSpec_64f *pDCTSpec ) ) ... and let me know if you need help [ Note 2 - Regarding a DCT algorithm implemented in DSP domain of IPP ( from IPP docs ) ] ... If length is a power of 2, the functions use an efficient algorithm that is significantly faster than the direct computation of DCT. For other values of length, these functions use the direct formulas; however, the symmetry of the cosine function is taken into account, which allows to perform about half of the multiplication operations in the direct formulas. ...

David_P_8 · ‎04-02-2013

I ended up writing my own 1D dct solution, which was fairly easy, but it is slow. I can at least continue algorithm development now.

Hopefully the MKL will get fixed or I may have to find another library or get an IPP license.

SergeyKostrov · ‎04-02-2013

>>...Hopefully the MKL will get fixed... Thanks for the update. It is still Not clear if MKL's DCT is broken and I'm glad to see that you have a workaround.

Rodney_H_ · ‎11-21-2013

I read this thread with interest some months later and just thought I'd post a folllow-up for anyone that might consider using this library.

For me the attached code reproduces the initial matrix perfectly upon transforming and inverting, using the staggered cosine.

For the staggered sine a warning is issued but the matrix is still reproduced.

I'm using MKL 11.0.5.1

Regards

Rodney

P.S. File attachment doesn't seem to work so here's the code:

#include "cmath"

#include "mkl_trig_transforms.h"

#include <string>

#include <iostream>

#include <fstream>

#include <iomanip>

#include "mkl.h"

#define _USE_MATH_DEFINES

using namespace std;

void dct2(float** data, MKL_INT num_rows, MKL_INT num_cols, int direction) {

// compute 2D forward cosine transform

DFTI_DESCRIPTOR_HANDLE handle;

MKL_INT stat = 0;

MKL_INT ipar[128];

float * par = new float[(3*num_rows/2) + 2]();

// use function pointers to handle whether float is a float or double

// and direction of the transform

void (*s_trig_transform)(float *, DFTI_DESCRIPTOR_HANDLE *, MKL_INT *, float *, MKL_INT *);

if(direction < 0)

s_trig_transform = s_forward_trig_transform;

else

s_trig_transform = s_backward_trig_transform;

MKL_INT tt_type = MKL_STAGGERED_SINE_TRANSFORM;

// first do the rows

s_init_trig_transform(&num_cols, &tt_type, ipar, par, &stat);

if(stat != 0){

cerr << "Error initializing row transform: " << stat << endl;

return;

}

s_commit_trig_transform(data[0], &handle, ipar, par, &stat);

if(stat != 0){

cerr << "Error committing row transform: " << stat << endl;

return;

}

for(unsigned int row = 0; row < num_rows; ++row){

s_trig_transform(data[row], &handle, ipar, par, &stat);

if(stat != 0){

cerr << "Error in row transform: " << stat << endl;

return;

}

free_trig_transform(&handle, ipar, &stat);

if(stat != 0){

cerr << "Error freeing row transform: " << stat << endl;

return;

}

// now do the cols

s_init_trig_transform(&num_rows, &tt_type, ipar, par, &stat);

if(stat != 0){

cerr << "Error initializing col transform: " << stat << endl;

return;

}

float * temp_col = new float[num_rows+1]();

s_commit_trig_transform(temp_col, &handle, ipar, par, &stat);

if(stat != 0){

cerr << "Error committing col transform: " << stat << endl;

return;

}

for(unsigned int col = 0; col < num_cols; ++col){

// copy the column into a contiguous vector

for(unsigned int row = 0; row < num_rows; ++row){

temp_col[row] = data[row][col];

}

s_trig_transform(temp_col, &handle, ipar, par, &stat);

if(stat != 0){

cerr << "Error in col transform: " << stat << endl;

return;

}

// copy the results back

for(unsigned int row = 0; row < num_rows; ++row){

data[row][col] = temp_col[row];

}

free_trig_transform(&handle, ipar, &stat);

if(stat != 0){

cerr << "Error freeing col transform: " << stat << endl;

return;

}

delete[] temp_col;

delete[] par;

//#]

}

float log2(float x) {

return log(x)/log(2);

}

int main(int argc, char ** argv){

int size = 8;

float ** temp = new float*[size+1]();

temp[0] = new float[(size+1)*(size+1)]();

for (unsigned int row = 1; row < size+1; row++) {

temp[row] = &(temp[0][row*(size+1)]);

}

for(int i = 0; i < size; ++i){

for(int j = 0; j < size; ++j){

//if(i == 0 || i == size-1

// || j == 0 || j == size-1)

// temp = 0;

//else

temp[i][j] = (i+1)*10 + j + 1;

cout << temp[i][j] << " ";

}

cout << endl;

}

cout << endl;

dct2(temp, size, size, -1);

for(int i = 0; i < size; ++i){

for(int j = 0; j < size; ++j){

cout <<setw(10)<< temp[i][j] << " ";

}

cout << endl;

}

cout << endl;

dct2(temp, size, size, 1);

for(int i = 0; i < size; ++i){

for(int j = 0; j < size; ++j){

cout <<setw(10)<< temp[i][j] << " ";

}

cout << endl;

}

cout << endl;

return 0;

}

Trigonometric Transform