Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Question about ldb in mkl_?omatcopy for transpose of column-major matrices

davydden1
Beginner
916 Views

Dear all,

I need to do out-of-place transpose of the matrix (actually I have plenty of them if various sizes/shapes). I have a question about ldb parameter in mkl_?omatcopy. Documentation says that if a matrix is in column major format and I am doing transpose, then ldb must be at least equal to number of columns in B. Is it indeed the case or it’s just a typo?

My matrices are stored with minimum leading dimension — number of rows for column major format. So that means I am out of luck using this function if number of columns in B is more than number of rows? That is unfortunate as I would expect this function to handle such cases internally in the best possible way.

Denis.

0 Kudos
6 Replies
davydden1
Beginner
916 Views

On the related note, I have issues using this function on macOS for anything but square matrices (it simply does not work). Given that I store everything in column major format and the leading dimension is number of rows, I can not see where I could mis-use this function, see below, compiled with:

clang++ -std=c++11  -m64 -I/opt/intel/mkl/include example.cc -o example -L/opt/intel/mkl/lib -Wl,-rpath,/opt/intel/mkl/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl

// compile with:
// clang++ -std=c++11  -m64 -I/opt/intel/mkl/include example.cc -o example -L/opt/intel/mkl/lib -Wl,-rpath,/opt/intel/mkl/lib -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl

#include <iostream>
#include <vector>

#include <mkl.h>

void print(const std::vector<double> &A, const unsigned int m, const unsigned int n)
{
  for (unsigned int i = 0; i < m; ++i)
    {
      for (unsigned int j = 0; j < n; ++j)
        std::cout << A[i+j*m] << " ";

      std::cout << std::endl;
    }
}

bool compare(const std::vector<double> & A, const std::vector<double> &B, const unsigned int Am, const unsigned int An)
{
  bool ret = true;
  for (unsigned int i = 0; i < Am; ++i)
    for (unsigned int j = 0; j < An; ++j)
      ret &= (A[i+j*Am] == B[j+i*An]);

  return ret;
}

void test(const unsigned int m, const unsigned int n)
{
  std::cout << "Test " << m << "x" << n << std::endl;
  const auto &Am = n;
  const auto &An = m;
  const auto &Bm = m;
  const auto &Bn = n;
  std::vector<double> A(Am*An,0.);
  std::vector<double> B(Bm*Bn,0.);

  for (unsigned int i = 0; i < Am*An; ++i)
    A = 1+i;

  mkl_domatcopy('C', 'T', Bm, Bn, 1., A.data(), Am, B.data(), Bm);

  std::cout << "Input:" << std::endl;
  print(A,Am,An);
  std::cout << "Output:" << std::endl;
  print(B,Bm,Bn);
  std::cout << "Correct: " << compare(A,B,Am,An) << std::endl<< std::endl;
}

int main(int argc, char **argv)
{
  test(11,27);
  test(15,4);
  test(10,10);
}

 

0 Kudos
Gennady_F_Intel
Moderator
916 Views

We would  recommend you take a look at the domatcopy.c example (mklroot\examples\transc\source\ dir)

0 Kudos
davydden1
Beginner
916 Views

Hi Gennady, 

Thanks for the prompt reply. I quickly translated your example to C++ and noticed the following. The mkl_domatcopy is being called with rows=2 and cols=4 (3rd and 4th argument). However, the matrix B (which is clear both from in-code comments and its dst_stride=2) is 4x2. That is, it has 4 rows and 2 columns and is stored in row-major format  (stride=2).

However the MKL documentation states:

rows    The number of rows in matrix B (the destination matrix).

cols       The number of columns in matrix B (the destination matrix).

To me this indicates that MKL's documentation is confusing and not consistent with what is actually meant by rows/cols. Probably, what is actually meant is the number of rows/cols which define a submatrix of A, on which the operation will be applied.

As another proof, If I swap rows/cols in my code above, everything works for all cases!

I would recommend you to reword the documentation to avoid confusion for other users.

Regards, Denis.

0 Kudos
Ye_C_1
Beginner
916 Views

I have the same question.

I think the manual is wrong, the rows is the row number of destination matrix B without operation.

But I am not sure.

 

davydden wrote:

Hi Gennady, 

Thanks for the prompt reply. I quickly translated your example to C++ and noticed the following. The mkl_domatcopy is being called with rows=2 and cols=4 (3rd and 4th argument). However, the matrix B (which is clear both from in-code comments and its dst_stride=2) is 4x2. That is, it has 4 rows and 2 columns and is stored in row-major format  (stride=2).

However the MKL documentation states:

rows    The number of rows in matrix B (the destination matrix).

cols       The number of columns in matrix B (the destination matrix).

To me this indicates that MKL's documentation is confusing and not consistent with what is actually meant by rows/cols. Probably, what is actually meant is the number of rows/cols which define a submatrix of A, on which the operation will be applied.

As another proof, If I swap rows/cols in my code above, everything works for all cases!

I would recommend you to reword the documentation to avoid confusion for other users.

Regards, Denis.

0 Kudos
Pamela_H_Intel
Moderator
916 Views

Folks,

It looks like the documentation is incorrect. I will verify with the MKL developers and get back to this thread..

Pamela

0 Kudos
Pamela_H_Intel
Moderator
916 Views

Here is the corrected info. We will update the official documentation. Thank you for pointing out this issue!

According to our examples the description of parameters should be as follows:

rows - the number of rows in matrix A

cols - the number of columns in matrix A

alpha - this parameter scales the input matrix by alpha.

a - array of size at least lda*rows in case of Row-major ordering (ordering = 'R'). And lda*cols in case of Column-major ordering (ordering = 'C')

lda - If ordering = 'R' or 'r', lda represents the number of elements in array a between adjacent rows of matrix A; lda must be at least equal to cols. 

        If ordering = 'C' or 'c', lda represents the number of elements in array a between adjacent columns of matrix A; lda must be at least equal to rows.

b - If trans == 'R' or 'N' then it is array of size at least ldb*rows in case of Row-major ordering (ordering = 'R'). And ldb*cols in case of Column-major ordering (ordering = 'C').

     If trans == 'T' or 'C' then it is array of size at least ldb*cols in case of Row-major ordering (ordering = 'R'). And ldb*rows in case of Column-major ordering (ordering = 'C').

ldb - If ordering = 'R' or 'r', lda represents the number of elements in array a between adjacent rows of matrix B; ldb must be at least equal to cols if trans=='R' or 'N'. And rows if trans=='C' or 'T'

        If ordering = 'C' or 'c', lda represents the number of elements in array a between adjacent columns of matrix B; ldb must be at least equal to rows if trans=='R' or 'N'. And cols if trans=='C' or 'T'

 

The same changes should be applied to ?omatcopy2 routine + the following:

stridea - If ordering = 'R' or 'r', stridea represents the number of elements in array 'a' between adjacent columns of matrix A. stridea must be at least 1.

              If ordering = 'C' or 'c', stridea represents the number of elements in array 'a' between adjacent rows of matrix A. stridea must be at least 1.

strideb - If ordering = 'R' or 'r', strideb represents the number of elements in array 'b' between adjacent columns of matrix B. strideb must be at least 1.

              If ordering = 'C' or 'c', strideb represents the number of elements in array 'b' between adjacent rows of matrix B. strideb must be at least 1.

0 Kudos
Reply