Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

std::complex in zgemm

Dmitry_G_2
Beginner
956 Views

Hello,

I modified an example from the intel website to use std::complex<double>  type with the zgemm function from MKL.

Unfortunately I am getting wrong results for complex matrix multiplication. I am wondering what I am doing wrong.

Thank in advance

Dmitry

 

=========================================================================

 

#include <complex>
#include <iostream>
#include <iomanip>
#define MKL_Complex16 std::complex<double>
#include "mkl.h"
typedef std::complex<double> Complex;
const int Arows = 2, Acols = 3, Brows = 3, Bcols = 2, Crows = 2, Ccols = 2;

int main()
{
       const char *notrans = "n";
       const char *trans = "t";
       const char *conj = "c";
              
       Complex A[Arows][Acols], A1[Arows][Acols], B[Brows][Bcols], B1[Brows][Bcols], C[Crows][Ccols], C1[Crows][Ccols];
       
       Complex alpha(1.0, 0.0);
       Complex beta(0.0, 0.0);
       
        
       for (int i = 0; i < Arows; i++){
         for(int j = 0; j <Acols; j++){
           A = Complex(i+1,j+1);
           A1 = Complex(i+1,j+1);      
         }
       }
       
       for (int i = 0; i < Brows; i++){
         for(int j = 0; j < Bcols; j++){
           B = -Complex(j+1,i+1);
           B1 = -Complex(j+1,i+1);
         }
       }
       
       for (int i = 0; i < Crows; i++){
         for(int j = 0; j < Ccols; j++){
           C = Complex(0.0, 0.0);
           C1 = Complex(0.0, 0.0);
         }
       }
       Complex* pma = &(A[0][0]);
       Complex* pmb = &(B[0][0]);
       Complex* pmc = &(C[0][0]);
       
       std::cout << "Matrix A" << std::endl;
       for (int i = 0; i < Arows; i++){
         for(int j = 0; j < Acols; j++){
           std::cout << std::setw(4) << A;
         }
         std::cout << std::endl;
       }

       std::cout << "Matrix B" << std::endl;
       for (int i = 0; i < Brows; i++){
         for(int j = 0; j < Bcols; j++){
           std::cout << std::setw(4) << B;
         }
         std::cout << std::endl;
       }
       
       int iArows = Arows, iAcols = Acols, iBrows = Brows, iBcols = Bcols, iCrows = Crows, iCcols = Ccols;
       
       
       cblas_zgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, iArows, iBcols, iAcols, &alpha, pma, iAcols,  pmb, iBrows, &beta, pmc, iArows);
      
       for (int i = 0; i < Arows; i++){
        for (int j = 0; j < Bcols; j++){
          for (int k = 0; k < Acols; k++){
                    C1 += A1 * B1;
          };
        };
      };
      

       std::cout << "From zgemm" << std::endl;
       for (int i = 0; i < Crows; i++){
         for(int j = 0; j < Ccols; j++){
           std::cout <<std::setw(4) << C;
         }
         std::cout << std::endl;
       }
       
       std::cout << "From direct" << std::endl;
       for (int i = 0; i < Crows; i++){
         for(int j = 0; j < Ccols; j++){
           std::cout << std::setw(4) << C1;
         }
         std::cout << std::endl;
       }
       
       
       return 0;
}

========================================================================

Output:

(1,1)(1,2)(1,3)
(2,1)(2,2)(2,3)
Matrix B
(-1,-1)(-2,-1)
(-1,-2)(-2,-2)
(-1,-3)(-2,-3)
From zgemm
(4,-12)(5,-15)
(0,-16)(0,-20)
From direct
(11,-12)(8,-18)
(8,-18)(2,-24)

 

 

 

0 Kudos
6 Replies
mecej4
Honored Contributor III
956 Views

You passed the wrong values of ldA, ldB and ldC in the argument list to cblas_zgemm. The correct invocation is:

cblas_zgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, 
                        iArows, iBcols, iAcols, 
                       &alpha, pma, iAcols,  pmb, iBcols, 
                       &beta, pmc, iCcols);

Since all three matrices are stored as C-native (i.e., row-major) arrays, the ldA, ldB and ldC arguments should be the column-counts of A, B and C.

You may find clarification on this issue by looking at the following lines from the file cblas_zgemmx.c in the MKL examples.

      if( layout == CblasRowMajor ) {
          lda=cmaxa;
          ldb=cmaxb;
          ldc=cmaxc;
      } else {
          lda=rmaxa;
          ldb=rmaxb;
          ldc=rmaxc;
      }

I suspect that there are a number of errors in the C version of the documentation (https://software.intel.com/en-us/node/520775) in this regard. The MKL-C documentation was recently separated from the MKL-Fortran documentation and, given that Fortran stores 2-D arrays by columns and C does so by rows, such confusion is quite common. The Fortran routine zgemm (and its siblings) provide only for native (column-major) arrays, but the C routines provide for both native (row-major) and alien (column-major) arrays, so the C-documentation cannot avoid more complexity.

0 Kudos
Ying_H_Intel
Employee
956 Views

Add comments,

You may understand the lda, ldb, ldc as the distance of row  when row-major , the value is column counts.  or the distance of column when column-major, the value is row counts. 

Best Regards,

Ying

0 Kudos
Dmitry_G_2
Beginner
956 Views

Thank you very much!

0 Kudos
Dmitry_G_2
Beginner
956 Views

Sorry, but I still have trouble with zgemm. Now I want to calculate AT * B. My code is now

// A*B
 //      cblas_zgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, iArows, iBcols, iAcols, &alpha, pma, iAcols,  pmb, iBcols, &beta, pmc, iCcols);
 //AT*B
         cblas_zgemm( CblasRowMajor, CblasTrans,   CblasNoTrans, iArows, iBcols, iAcols, &alpha, pma, iArows,  pmb, iBcols, &beta, pmc, iCcols);

It looks like I have only to change the lda parameter compared to the previous case (A * B ).

But I have still different results:

Matrix A
(1,1)(1,2)
(2,1)(2,2)
(3,1)(3,2)
Matrix B
(-1,-1)(-2,-1)
(-1,-2)(-2,-2)
(-1,-3)(-2,-3)
From zgemm
(2,-8)(-1,-11)
(0,-10)(-4,-13)
From direct
(0,-28)(-11,-34)
(6,-20)(0,-26)

Could you please correct or explain this?

Best wishes

Dmitry

 

0 Kudos
Ying_H_Intel
Employee
956 Views

Hi Dmitry,

I change the code as below and get the result

#include <complex>
 #include <iostream>
 #include <iomanip>
 #define MKL_Complex16 std::complex<double>
 #include "mkl.h"
 typedef std::complex<double> Complex;
 const int Arows = 3, Acols = 2, Brows = 3, Bcols = 2, Crows = 2, Ccols = 2;

int main()
 {
        const char *notrans = "n";
        const char *trans = "t";
        const char *conj = "c";
              
        Complex A[Arows][Acols], A1[Arows][Acols], B[Brows][Bcols], B1[Brows][Bcols], C[Crows][Ccols], C1[Crows][Ccols];
       
        Complex alpha(1.0, 0.0);
        Complex beta(0.0, 0.0);
       
        
        for (int i = 0; i < Arows; i++){
          for(int j = 0; j <Acols; j++){
            A = Complex(i+1,j+1);
            A1 = Complex(i+1,j+1);     
          }
        }
       
        for (int i = 0; i < Brows; i++){
          for(int j = 0; j < Bcols; j++){
            B = -Complex(j+1,i+1);
            B1 = -Complex(j+1,i+1);
          }
        }
       
        for (int i = 0; i < Crows; i++){
          for(int j = 0; j < Ccols; j++){
            C = Complex(0.0, 0.0);
            C1 = Complex(0.0, 0.0);
          }
        }
        Complex* pma = &(A[0][0]);
        Complex* pmb = &(B[0][0]);
        Complex* pmc = &(C[0][0]);
       
        std::cout << "Matrix A" << std::endl;
        for (int i = 0; i < Arows; i++){
          for(int j = 0; j < Acols; j++){
            std::cout << std::setw(4) << A;
          }
          std::cout << std::endl;
        }

       std::cout << "Matrix B" << std::endl;
        for (int i = 0; i < Brows; i++){
          for(int j = 0; j < Bcols; j++){
            std::cout << std::setw(4) << B;
          }
          std::cout << std::endl;
        }
       
        int iArows = Arows, iAcols = Acols, iBrows = Brows, iBcols = Bcols, iCrows = Crows, iCcols = Ccols;
       
       
        cblas_zgemm( CblasRowMajor, CblasTrans, CblasNoTrans, iAcols, iBcols, iArows, &alpha, pma, iAcols,  pmb, iBcols, &beta, pmc, iCcols);
      
        for (int i = 0; i < Acols; i++){
         for (int j = 0; j < Bcols; j++){
           for (int k = 0; k < Arows; k++){
                     C1 += A1 * B1;
           };
         };
       };
      

       std::cout << "From zgemm" << std::endl;
        for (int i = 0; i < Crows; i++){
          for(int j = 0; j < Ccols; j++){
            std::cout <<std::setw(4) << C;
          }
          std::cout << std::endl;
        }
       
        std::cout << "From direct" << std::endl;
        for (int i = 0; i < Crows; i++){
          for(int j = 0; j < Ccols; j++){
            std::cout << std::setw(4) << C1;
          }
          std::cout << std::endl;
        }
       
       
        return 0;
 }.

 

MKL C  manual  have some explanation and table about m, n, k, lda etc.

The key is as the m is Op(A)'s row,  not A's row.  So   mxk * kxn = mxn   in your case,  (2x3)* (3x2 ) = 2 * 2 , m=2, n=2, k=3.

const int Arows = 3, Acols = 2, Brows = 3, Bcols = 2, Crows = 2, Ccols = 2;

cblas_zgemm( CblasRowMajor, CblasTrans, CblasNoTrans, iAcols, iBcols, iArows, &alpha, pma, iAcols,  pmb, iBcols, &beta, pmc, iCcols);

m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C.
The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B).
The value of k must be at least zero.

 

 

lda Specifies the leading dimension of a as declared in the calling
(sub)program.
transa=CblasNoTrans transa=CblasTrans or
transa=CblasConjTrans
Layout =
CblasColMajor
lda must be at least
max(1, m).
lda must be at least
max(1, k)
Layout =
CblasRowMajor
lda must be at least
max(1, k)
lda must be at least
max(1, m).

Best Regards,

Ying

Matrix A
(1,1)(1,2)
(2,1)(2,2)
(3,1)(3,2)
Matrix B
(-1,-1)(-2,-1)
(-1,-2)(-2,-2)
(-1,-3)(-2,-3)
From zgemm
(0,-17)(-6,-20)
(6,-20)(0,-26)
From direct
(0,-17)(-6,-20)
(6,-20)(0,-26)
 

 

0 Kudos
Dmitry_G_2
Beginner
956 Views

Thank you very much!

0 Kudos
Reply