Add comments,

Dmitry_G_2 · ‎12-29-2015

Hello,

I modified an example from the intel website to use std::complex<double> type with the zgemm function from MKL.

Unfortunately I am getting wrong results for complex matrix multiplication. I am wondering what I am doing wrong.

Thank in advance

Dmitry

=========================================================================

#include <complex>
#include <iostream>
#include <iomanip>
#define MKL_Complex16 std::complex<double>
#include "mkl.h"
typedef std::complex<double> Complex;
const int Arows = 2, Acols = 3, Brows = 3, Bcols = 2, Crows = 2, Ccols = 2;

int main()
{
       const char *notrans = "n";
       const char *trans = "t";
       const char *conj = "c";

       Complex A[Arows][Acols], A1[Arows][Acols], B[Brows][Bcols], B1[Brows][Bcols], C[Crows][Ccols], C1[Crows][Ccols];

       Complex alpha(1.0, 0.0);
       Complex beta(0.0, 0.0);


       for (int i = 0; i < Arows; i++){
         for(int j = 0; j <Acols; j++){
           A = Complex(i+1,j+1);
           A1 = Complex(i+1,j+1);
         }
       }

       for (int i = 0; i < Brows; i++){
         for(int j = 0; j < Bcols; j++){
           B = -Complex(j+1,i+1);
           B1 = -Complex(j+1,i+1);
         }
       }

       for (int i = 0; i < Crows; i++){
         for(int j = 0; j < Ccols; j++){
           C = Complex(0.0, 0.0);
           C1 = Complex(0.0, 0.0);
         }
       }
       Complex* pma = &(A[0][0]);
       Complex* pmb = &(B[0][0]);
       Complex* pmc = &(C[0][0]);

       std::cout << "Matrix A" << std::endl;
       for (int i = 0; i < Arows; i++){
         for(int j = 0; j < Acols; j++){
           std::cout << std::setw(4) << A;
         }
         std::cout << std::endl;
       }

       std::cout << "Matrix B" << std::endl;
       for (int i = 0; i < Brows; i++){
         for(int j = 0; j < Bcols; j++){
           std::cout << std::setw(4) << B;
         }
         std::cout << std::endl;
       }

       int iArows = Arows, iAcols = Acols, iBrows = Brows, iBcols = Bcols, iCrows = Crows, iCcols = Ccols;


       cblas_zgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, iArows, iBcols, iAcols, &alpha, pma, iAcols, pmb, iBrows, &beta, pmc, iArows);

       for (int i = 0; i < Arows; i++){
        for (int j = 0; j < Bcols; j++){
          for (int k = 0; k < Acols; k++){
                    C1 += A1 * B1;
          };
        };
      };

       std::cout << "From zgemm" << std::endl;
       for (int i = 0; i < Crows; i++){
         for(int j = 0; j < Ccols; j++){
           std::cout <<std::setw(4) << C;
         }
         std::cout << std::endl;
       }

       std::cout << "From direct" << std::endl;
       for (int i = 0; i < Crows; i++){
         for(int j = 0; j < Ccols; j++){
           std::cout << std::setw(4) << C1;
         }
         std::cout << std::endl;
       }


       return 0;
}

========================================================================

Output:

(1,1)(1,2)(1,3)
(2,1)(2,2)(2,3)
Matrix B
(-1,-1)(-2,-1)
(-1,-2)(-2,-2)
(-1,-3)(-2,-3)
From zgemm
(4,-12)(5,-15)
(0,-16)(0,-20)
From direct
(11,-12)(8,-18)
(8,-18)(2,-24)

mecej4 · ‎12-29-2015

You passed the wrong values of ldA, ldB and ldC in the argument list to cblas_zgemm. The correct invocation is:

cblas_zgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, 
                        iArows, iBcols, iAcols, 
                       &alpha, pma, iAcols,  pmb, iBcols, 
                       &beta, pmc, iCcols);

Since all three matrices are stored as C-native (i.e., row-major) arrays, the ldA, ldB and ldC arguments should be the column-counts of A, B and C.

You may find clarification on this issue by looking at the following lines from the file cblas_zgemmx.c in the MKL examples.

      if( layout == CblasRowMajor ) {
          lda=cmaxa;
          ldb=cmaxb;
          ldc=cmaxc;
      } else {
          lda=rmaxa;
          ldb=rmaxb;
          ldc=rmaxc;
      }

I suspect that there are a number of errors in the C version of the documentation (https://software.intel.com/en-us/node/520775) in this regard. The MKL-C documentation was recently separated from the MKL-Fortran documentation and, given that Fortran stores 2-D arrays by columns and C does so by rows, such confusion is quite common. The Fortran routine zgemm (and its siblings) provide only for native (column-major) arrays, but the C routines provide for both native (row-major) and alien (column-major) arrays, so the C-documentation cannot avoid more complexity.

Ying_H_Intel · ‎12-29-2015

Add comments,

You may understand the lda, ldb, ldc as the distance of row when row-major , the value is column counts. or the distance of column when column-major, the value is row counts.

Best Regards,

Ying

Dmitry_G_2 · ‎01-07-2016

Thank you very much!

Dmitry_G_2 · ‎02-23-2016

Sorry, but I still have trouble with zgemm. Now I want to calculate AT * B. My code is now

// A*B
// cblas_zgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, iArows, iBcols, iAcols, &alpha, pma, iAcols, pmb, iBcols, &beta, pmc, iCcols);
//AT*B
cblas_zgemm( CblasRowMajor, CblasTrans, CblasNoTrans, iArows, iBcols, iAcols, &alpha, pma, iArows, pmb, iBcols, &beta, pmc, iCcols);

It looks like I have only to change the lda parameter compared to the previous case (A * B ).

But I have still different results:

Matrix A
(1,1)(1,2)
(2,1)(2,2)
(3,1)(3,2)
Matrix B
(-1,-1)(-2,-1)
(-1,-2)(-2,-2)
(-1,-3)(-2,-3)
From zgemm
(2,-8)(-1,-11)
(0,-10)(-4,-13)
From direct
(0,-28)(-11,-34)
(6,-20)(0,-26)

Could you please correct or explain this?

Best wishes

Dmitry

Ying_H_Intel · ‎02-24-2016

Hi Dmitry,

I change the code as below and get the result

#include <complex>
#include <iostream>
#include <iomanip>
#define MKL_Complex16 std::complex<double>
#include "mkl.h"
typedef std::complex<double> Complex;
const int Arows = 3, Acols = 2, Brows = 3, Bcols = 2, Crows = 2, Ccols = 2;

int main()
{
        const char *notrans = "n";
        const char *trans = "t";
        const char *conj = "c";

        Complex A[Arows][Acols], A1[Arows][Acols], B[Brows][Bcols], B1[Brows][Bcols], C[Crows][Ccols], C1[Crows][Ccols];

        Complex alpha(1.0, 0.0);
        Complex beta(0.0, 0.0);


        for (int i = 0; i < Arows; i++){
          for(int j = 0; j <Acols; j++){
            A = Complex(i+1,j+1);
            A1 = Complex(i+1,j+1);
          }
        }

        for (int i = 0; i < Brows; i++){
          for(int j = 0; j < Bcols; j++){
            B = -Complex(j+1,i+1);
            B1 = -Complex(j+1,i+1);
          }
        }

        for (int i = 0; i < Crows; i++){
          for(int j = 0; j < Ccols; j++){
            C = Complex(0.0, 0.0);
            C1 = Complex(0.0, 0.0);
          }
        }
        Complex* pma = &(A[0][0]);
        Complex* pmb = &(B[0][0]);
        Complex* pmc = &(C[0][0]);

        std::cout << "Matrix A" << std::endl;
        for (int i = 0; i < Arows; i++){
          for(int j = 0; j < Acols; j++){
            std::cout << std::setw(4) << A;
          }
          std::cout << std::endl;
        }

       std::cout << "Matrix B" << std::endl;
        for (int i = 0; i < Brows; i++){
          for(int j = 0; j < Bcols; j++){
            std::cout << std::setw(4) << B;
          }
          std::cout << std::endl;
        }

        int iArows = Arows, iAcols = Acols, iBrows = Brows, iBcols = Bcols, iCrows = Crows, iCcols = Ccols;


        cblas_zgemm( CblasRowMajor, CblasTrans, CblasNoTrans, iAcols, iBcols, iArows, &alpha, pma, iAcols, pmb, iBcols, &beta, pmc, iCcols);

        for (int i = 0; i < Acols; i++){
         for (int j = 0; j < Bcols; j++){
           for (int k = 0; k < Arows; k++){
                     C1 += A1 * B1;
           };
         };
       };

       std::cout << "From zgemm" << std::endl;
        for (int i = 0; i < Crows; i++){
          for(int j = 0; j < Ccols; j++){
            std::cout <<std::setw(4) << C;
          }
          std::cout << std::endl;
        }

        std::cout << "From direct" << std::endl;
        for (int i = 0; i < Crows; i++){
          for(int j = 0; j < Ccols; j++){
            std::cout << std::setw(4) << C1;
          }
          std::cout << std::endl;
        }


        return 0;
}.

MKL C manual have some explanation and table about m, n, k, lda etc.

The key is as the m is Op(A)'s row, not A's row. So mxk * kxn = mxn in your case, (2x3)* (3x2 ) = 2 * 2 , m=2, n=2, k=3.

const int Arows = 3, Acols = 2, Brows = 3, Bcols = 2, Crows = 2, Ccols = 2;

cblas_zgemm( CblasRowMajor, CblasTrans, CblasNoTrans, iAcols, iBcols, iArows, &alpha, pma, iAcols, pmb, iBcols, &beta, pmc, iCcols);

m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C.
The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B).
The value of k must be at least zero.

lda Specifies the leading dimension of a as declared in the calling
(sub)program.
transa=CblasNoTrans transa=CblasTrans or
transa=CblasConjTrans
Layout =
CblasColMajor
lda must be at least
max(1, m).
lda must be at least
max(1, k)
Layout =
CblasRowMajor
lda must be at least
max(1, k)
lda must be at least
max(1, m).

Best Regards,

Ying

Matrix A
(1,1)(1,2)
(2,1)(2,2)
(3,1)(3,2)
Matrix B
(-1,-1)(-2,-1)
(-1,-2)(-2,-2)
(-1,-3)(-2,-3)
From zgemm
(0,-17)(-6,-20)
(6,-20)(0,-26)
From direct
(0,-17)(-6,-20)
(6,-20)(0,-26)

Dmitry_G_2 · ‎02-26-2016

Thank you very much!

std::complex in zgemm