Solved: Re: Re:cblas_zgemm not producing correct results

jeff222 · ‎11-17-2021

Hello,

I am trying to run cblas_zgemm() to multiply a 3X1 by a 1X3 matrix to create a 3X3 matrix.

Using C Unit Test I compiled and ran my software on a 64 bit machine

Not sure what went wrong i believe i followed the example code exactly.

Any help on what might be going wrong will be appreciated

Thanks,

I have included below

- source code of cblas_zgemm()

- test code using CUnit Test

- Output

Example Source Code:

void mkl_matrixMultiply(vector *result, vector *operand1, int op1_row, int op1_col, int trans1, vector *operand2, int op2_row, int op2_col, int trans2, int order)
{
/*Looping Index */
int i;

MKL_INT m,n,k, lda, ldb, ldc;
MKL_Complex16 alpha, beta;
/*Matrix Declaration*/
MKL_Complex16 *a, *b, *c;

CBLAS_ORDER cblas_order;
CBLAS_TRANSPOSE transA, transB;

/*matrixC = alpha*matrixA*matrixB */
a = (MKL_Complex16*)malloc(sizeof(MKL_Complex16));
b = (MKL_Complex16*)malloc(sizeof(MKL_Complex16));
c = (MKL_Complex16*)malloc(sizeof(MKL_Complex16));

/*SCALAR */
alpha.real = 1.0;
alpha.imag = 1.0;
beta.real = 1.0;
beta.imag = 1.0;

/*Transpose*/
if(trans1 == 0)
{
transA = CblasNoTrans;
}
else
{
transA = CblasTrans;
}

if(trans2 == 0)
{
transB = CblasNoTrans;
}
else
{
transB = CblasTrans;
}
/*Organization of data */
if(order == 0)
{
cblas_order = CblasColMajor;
}
else
{
cblas_order = CblasRowMajor;
}

m = op1_row; /* Matrix A row */
n = op2_col; /* Matrix B Column */
k = op1_col; /* Matrix B Row, Matrix A Column */

lda = 1;
ldb = 1;
ldc = 1;

/* Convert the vector arrays into complext_type arrays of data for Win32 library calls. */
/* Add the vector input values into a temporary complex array. */
for (i = 0; i < k; i++)
{
/* A = M x K */
a[i].real = operand1->real[i];
a[i].imag = operand1->imag[i];

/* B = K X N */
b[i].real = operand2->real[i];
b[i].imag = operand2->imag[i];
}

cblas_zgemm(cblas_order, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

for( i = 0; i < 9; i++ )
{
result->real[i] = c[i].real;
result->imag[i] = c[i].imag;
}

free(a);
free(b);
free(c);
}/* end of mkl_matrixMultiply */

Test Code:

static void test_mkl_matrixMultiply(void)
{
#define s_Row 3
#define s_Col 1
#define t_Row 3
#define t_Col 1

int op1_col = s_Col;
int op1_row = s_Row;
int trans1 = 0;

int op2_row = t_Row;
int op2_col = t_Col;
int trans2 = 1;

vector *s = vnewz(3);
vector *t = vnewz(3);
vector *r = vnewz(9);

float treal[3] = {
15.9840, 8.1929, 9.3919};

float timag[3] = {
1.5051, 4.9810, 3.9310};

float sreal[3] = {
15.9840, 8.1929, 9.3919};

float simag[3] = {
1.5051, 4.9810, 3.9310};

float result_real[9] = {253.22, 123.46, 144.20, 123.46, 42.31, 57.37, 144.20, 57.37, 72.76};
float result_imag[9] = {48.12, 91.95, 76.97, 91.95, 81.26, 78.99, 76.97, 78.99, 73.84};

int i;

if ((NULL == r) || (NULL == s) ||(NULL == t) )
{
printf("Failed to locate vectors. Test discarded\n");
if (r)
{
vfree(r);
}
if (s)
{
vfree(s);
}
if (t)
{
vfree(t);
}
return;
}

for ( i = 0; i < 3; i++)
{
s->real[i] = sreal[i];
s->imag[i] = simag[i];

t->real[i] = treal[i];
t->imag[i] = timag[i];
}

/* column Major*/
int order = 0;

mkl_matrixMultiply(r, s, op1_row, op1_col, trans1, t, op2_row, op2_col, trans2, order); /* USES INTEL MKL */

for (i = 0; i < 9; i++)
{
result_real[i] *= 0.001;
result_imag[i] *= 0.001;
CU_ASSERT(fabs(result_real[i] - r->real[i]) < 0.0005);
CU_ASSERT(fabs(result_imag[i] - r->imag[i]) < 0.0005);
}

#if 1 /*def V_LIB_TST_VERBOSE*/
printf("\n");
for (i = 0; i < 9; i++)
{
printf("\t%d: r->real = %f - rreal= %f\n", i, r->real[i], result_real[i]);
printf("\t r->imag = %f - rimag = %f\n", r->imag[i], result_imag[i]);
}
#endif

if (r)
{
vfree(r);
}
if (s)
{
vfree(s);
}
if (t)
{
vfree(t);
}
return;

} /* test_mkl_matrixMultiply */

Output

r->real is the output from cblas_zgemm

rreal is output from matlab

Test: mkl_matrixMultiply ...
0: r->real = 205.107895 - rreal= 0.253220
r->imag = 301.337982 - rimag = 0.048120
1: r->real = 0.000000 - rreal= 0.123460
r->imag = 0.000000 - rimag = 0.091950
2: r->real = 0.000000 - rreal= 0.144200
r->imag = 0.000000 - rimag = 0.076970
3: r->real = 0.000000 - rreal= 0.123460
r->imag = 0.000000 - rimag = 0.091950
4: r->real = 0.000000 - rreal= 0.042310
r->imag = 0.000000 - rimag = 0.081260
5: r->real = 0.000000 - rreal= 0.057370
r->imag = 0.000000 - rimag = 0.078990
6: r->real = 0.000000 - rreal= 0.144200
r->imag = 0.000000 - rimag = 0.076970
7: r->real = 0.000000 - rreal= 0.057370
r->imag = 0.000000 - rimag = 0.078990
8: r->real = 0.000000 - rreal= 0.072760
r->imag = 0.000000 - rimag = 0.073840

ShanmukhS_Intel · ‎12-30-2021

Hi,

Reminder:

Is your issue resolved? Kindly let us know if we could close this thread from our end.

Below are the CBLAS_ZGEMM data calculations.

C B L A S _ Z G E M M EXAMPLE PROGRAM

INPUT DATA

M=3 N=3 K=1

ALPHA =( 1.0, 0.0 ) BETA =( 1.0, 0.0 )

TRANSA = CblasNoTrans TRANSB = CblasNoTrans

LAYOUT = CblasRowMajor

ARRAY A LDA=1

( 83.00, 10.00)

( 31.00, 30.00)

( 23.00, 4.00)

ARRAY B LDB=3

( 83.00,-10.00) ( 31.00,-30.00) ( 23.00, -4.00)

ARRAY C LDC=3

( 0.00, 0.00) ( 0.00, 0.00) ( 0.00, 0.00)

OUTPUT DATA

ARRAY C LDC=3

(6989.00, 0.00) (2873.00,-2180.00) (1949.00,-102.00)

(2873.00,2180.00) (1861.00, 0.00) (833.00,566.00)

(1949.00,102.00) (833.00,-566.00) (545.00, 0.00)

Best Regards,

Shanmukh.SS

View solution in original post

ShanmukhS_Intel · ‎11-18-2021

Hi,

Thanks for posting in intel communities. We would like to request you to share the complete sample reproducer, so that we could investigate your issue further.

Best Regards,

Shanmukh.SS

jeff222 · ‎11-22-2021

hi shanmukh,

I am not sure what you are looking for could you elaborate on what you need from me?

Thanks,

ShanmukhS_Intel · ‎11-25-2021

Hi,

We will try reproducing the issue at our end. To perform the cblas_zgemm calculations, we need a sample reproducer. Could you please share us the sample working reproducer and the steps to reproduce, so that we could look into your issue further.

Best Regards,

Shanmukh.SS

jeff222 · ‎11-29-2021

Hi,

What do you mean by reproducer? does that mean source code,

I am going to explain what I am trying to achieve to the best of my ability. And have shown my source code below

Since posting my question I have had a little bit more success. I am trying to multiply op(a) * op(b)

op(a) = { 83 + 10i, 31 + 30i, 23 + 4i} row 3 (m) col 1 (k)

op(b) = {83 - 10i, 31 - 30i, 23 - 4i} row 1 (k) col 3 (n)

the resultant vector from cblas_zgemm

6989 + 6989i, 693 + 5053i, 1847 + 2051i,

5053 + 693i, 1861 + 1861i, 1399 + 267i,

2051 + 1847i, 267 + 1399i, 545 + 545i

Matlab results of same multiplication

6989 + 0000i, 2873 - 2180i, 1949 - 102i
2873 + 2180i, 1861 + 0000i, 833 + 566i
1949 + 102i, 833 - 566i, 545 + 0000i

Seems like the diagonal matrix is being calculated correctly minus the imaginary part not being 0.

I have included the source code below.

Any help would be greatly appreciated

Thanks,
Jeff

Source code of calling

void mkl_matrixMultiply(vector *result, vector *operand1, int op1_row, int op1_col, int trans1, vector *operand2, int op2_row, int op2_col, int trans2, int order)
{
/*Looping Index */
int i;

MKL_INT m,n,k, lda, ldb, ldc;
MKL_Complex16 alpha, beta;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;

/*Matrix Declaration*/
MKL_Complex16 *a, *b, *c;

/*Column_Major || Row_Major*/
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;

/*SCALAR */
alpha.real = 1.0;
alpha.imag = 1.0;

beta.real = 0.0;
beta.imag = 0.0;

m = 3; /* Matrix A row */
n = 3; /* Matrix B Column */
k = 1; /* Matrix B Row, Matrix A Column */

transA = CblasNoTrans;
transB = CblasNoTrans;

layout = CblasRowMajor;

if( transA == CblasNoTrans )
{
rmaxa = m + 1;
cmaxa = k;
}
else
{
rmaxa = k + 1;
cmaxa = m;
}

if( transB == CblasNoTrans )
{
rmaxb = k + 1;
cmaxb = n;
}
else
{
rmaxb = n + 1;
cmaxb = k;
}

rmaxc = m + 1;
cmaxc = n;

if( layout == CblasRowMajor )
{
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
}
else
{
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}

/*matrixC = alpha*matrixA*matrixB */
a = (MKL_Complex16*)mkl_calloc(rmaxa*cmaxa,(sizeof(MKL_Complex16)),64);
b = (MKL_Complex16*)mkl_calloc(rmaxb*cmaxb,(sizeof(MKL_Complex16)),64);
c = (MKL_Complex16*)mkl_calloc(rmaxc*cmaxc,(sizeof(MKL_Complex16)),64);

/* Convert the vector arrays into complext_type arrays of data for Win32 library calls. */
/* Add the vector input values into a temporary complex array. */
for (i = 0; i < 3; i++)
{
/* A = M x K */
a[i].real = operand1->real[i];
a[i].imag = operand1->imag[i];

/* B = K X N */
b[i].real = operand2->real[i];
b[i].imag = operand2->imag[i];
}

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

for( i = 0; i < 9; i++ )
{
result->real[i] = c[i].real;
result->imag[i] = c[i].imag;
}

mkl_free(a);
mkl_free(b);
mkl_free(c);

} /* end of mkl_matrixMultiply */

jeff222 · ‎11-29-2021

Currently Using 2018 version of intel MKL

jeff222 · ‎11-29-2021

Hi,

I am not sure what you mean by reproducer i am sorry.

If you could kindly inform me what information you need i can give you the answers to the best of my ability

I will try and best describe what i am trying to accomplish as well as show the matlab results v cblas_zgemm results

Using 2018 MKL library

Since posting my question I have had a little bit more success. I am trying to multiply op(a) * op(b)

op(a) = { 83 + 10i, 31 + 30i, 23 + 4i} row 3 (m) col 1 (k)

op(b) = {83 - 10i, 31 - 30i, 23 - 4i} row 1 (k) col 3 (n)

the resultant vector from cblas_zgemm

6989 + 6989i, 693 + 5053i, 1847 + 2051i,

5053 + 693i, 1861 + 1861i, 1399 + 267i,

2051 + 1847i, 267 + 1399i, 545 + 545i

Matlab results of same multiplication

6989 + 0000i, 2873 - 2180i, 1949 - 102i
2873 + 2180i, 1861 + 0000i, 833 + 566i
1949 + 102i, 833 - 566i, 545 + 0000i

Seems like the diagonal matrix is being calculated correctly minus the imaginary not being 0 for the diagonal.

Source code of calling

void mkl_matrixMultiply(vector *result, vector *operand1, int op1_row, int op1_col, int trans1, vector *operand2, int op2_row, int op2_col, int trans2, int order)
{
/*Looping Index */
int i;

MKL_INT m,n,k, lda, ldb, ldc;
MKL_Complex16 alpha, beta;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;

/*Matrix Declaration*/
MKL_Complex16 *a, *b, *c;

/*Column_Major || Row_Major*/
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;

/*SCALAR */
alpha.real = 1.0;
alpha.imag = 1.0;

beta.real = 0.0;
beta.imag = 0.0;

m = 3; /* Matrix A row */
n = 3; /* Matrix B Column */
k = 1; /* Matrix B Row, Matrix A Column */

transA = CblasNoTrans;
transB = CblasNoTrans;

layout = CblasRowMajor;

if( transA == CblasNoTrans )
{
rmaxa = m + 1;
cmaxa = k;
}
else
{
rmaxa = k + 1;
cmaxa = m;
}

if( transB == CblasNoTrans )
{
rmaxb = k + 1;
cmaxb = n;
}
else
{
rmaxb = n + 1;
cmaxb = k;
}

rmaxc = m + 1;
cmaxc = n;

if( layout == CblasRowMajor )
{
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
}
else
{
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}

/*matrixC = alpha*matrixA*matrixB */
a = (MKL_Complex16*)mkl_calloc(rmaxa*cmaxa,(sizeof(MKL_Complex16)),64);
b = (MKL_Complex16*)mkl_calloc(rmaxb*cmaxb,(sizeof(MKL_Complex16)),64);
c = (MKL_Complex16*)mkl_calloc(rmaxc*cmaxc,(sizeof(MKL_Complex16)),64);

/* Convert the vector arrays into complext_type arrays of data for Win32 library calls. */
/* Add the vector input values into a temporary complex array. */
for (i = 0; i < 3; i++)
{
/* A = M x K */
a[i].real = operand1->real[i];
a[i].imag = operand1->imag[i];

/* B = K X N */
b[i].real = operand2->real[i];
b[i].imag = operand2->imag[i];
}

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

for( i = 0; i < 9; i++ )
{
result->real[i] = c[i].real;
result->imag[i] = c[i].imag;
}

mkl_free(a);
mkl_free(b);
mkl_free(c);

} /* end of mkl_matrixMultiply */

ShanmukhS_Intel · ‎12-01-2021

Hi,

Could you please share us the sample reproducer (the complete source code which you are working on to perform the desired functionalities), the example code followed by you so that we could investigate the issue.

Best Regards,

Shanmukh.SS

jeff222 · ‎12-01-2021

Hi,

Since posting my question I have had a little bit more success. I am trying to multiply op(a) * op(b)

op(a) = { 83 + 10i, 31 + 30i, 23 + 4i} row 3 (m) col 1 (k)

op(b) = {83 - 10i, 31 - 30i, 23 - 4i} row 1 (k) col 3 (n)

the resultant vector from cblas_zgemm

6989 + 0i, 693 + 5053i, 1847 + 2051i,

5053 + 693i, 1861 + 0i, 1399 + 267i,

2051 + 1847i, 267 + 1399i, 545 + 0i

Matlab results of same multiplication

6989 + 0000i, 2873 - 2180i, 1949 - 102i
2873 + 2180i, 1861 + 0000i, 833 + 566i
1949 + 102i, 833 - 566i, 545 + 0000i

Seems like the diagonal matrix is being calculated correctly minus the imaginary not being 0 for the diagonal.

Below I have shared the sample code from 2018 version

Thanks,

Jeff

example source code

/*******************************************************************************
* Copyright 1999-2018 Intel Corporation.
*
* This software and the related documents are Intel copyrighted materials, and
* your use of them is governed by the express license under which they were
* provided to you (License). Unless the License provides otherwise, you may not
* use, modify, copy, publish, distribute, disclose or transmit this software or
* the related documents without Intel's prior written permission.
*
* This software and the related documents are provided as is, with no express
* or implied warranties, other than those that are expressly stated in the
* License.
*******************************************************************************/

/*
! Content:
! C B L A S _ Z G E M M Example Program Text ( C Interface )
!******************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

#include "mkl_example.h"

int main(int argc, char *argv[])
{
FILE *in_file;
char *in_file_name;

MKL_INT m, n, k;
MKL_INT lda, ldb, ldc;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;
MKL_Complex16 alpha, beta;
MKL_Complex16 *a, *b, *c;
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;
MKL_INT ma, na, mb, nb;

printf("\n C B L A S _ Z G E M M EXAMPLE PROGRAM\n");

/* Get input parameters */

if( argc == 1 ) {
printf("\n You must specify in_file data file as 1-st parameter");
return 1;
}
in_file_name = argv[1];

/* Get input data */

if( (in_file = fopen( in_file_name, "r" )) == NULL ) {
printf("\n ERROR on OPEN '%s' with mode=\"r\"\n", in_file_name);
return 1;
}
if( GetIntegerParameters(in_file, &m, &n, &k) != 3 ) {
printf("\n ERROR of m, n, k reading\n");
return 1;
}
if( GetScalarsZ(in_file, &alpha, &beta) != 2 ) {
printf("\n ERROR of alpha, beta reading\n");
return 1;
}
if( GetCblasCharParameters(in_file, &transA, &transB, &layout) != 3 ) {
printf("\n ERROR of transA, transB, layout reading\n");
return 1;
}

if( transA == CblasNoTrans ) {
rmaxa = m + 1;
cmaxa = k;
ma = m;
na = k;
} else {
rmaxa = k + 1;
cmaxa = m;
ma = k;
na = m;
}
if( transB == CblasNoTrans ) {
rmaxb = k + 1;
cmaxb = n;
mb = k;
nb = n;
} else {
rmaxb = n + 1;
cmaxb = k;
mb = n;
nb = k;
}
rmaxc = m + 1;
cmaxc = n;
a = (MKL_Complex16 *)mkl_calloc(rmaxa*cmaxa, sizeof(MKL_Complex16), 64);
b = (MKL_Complex16 *)mkl_calloc(rmaxb*cmaxb, sizeof(MKL_Complex16), 64);
c = (MKL_Complex16 *)mkl_calloc(rmaxc*cmaxc, sizeof(MKL_Complex16), 64);
if ( a == NULL || b == NULL || c == NULL ) {
printf("\n Can't allocate memory arrays");
return 1;
}
if( layout == CblasRowMajor ) {
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
} else {
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &ma, &na, a, &lda) != 0 ) {
printf("\n ERROR of array A reading\n");
return 1;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &mb, &nb, b, &ldb) != 0 ) {
printf("\n ERROR of array B reading\n");
return 1;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &m, &n, c, &ldc) != 0 ) {
printf("\n ERROR of array C reading\n");
return 1;
}
fclose(in_file);

/* Print input data */

printf("\n INPUT DATA");
printf("\n M="INT_FORMAT" N="INT_FORMAT" K="INT_FORMAT, m, n, k);
printf("\n ALPHA =(%5.1f,%5.1f ) BETA =(%5.1f,%5.1f )",
alpha.real, alpha.imag, beta.real, beta.imag);
PrintParameters("TRANSA, TRANSB", transA, transB);
PrintParameters("LAYOUT", layout);
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &ma, &na, a, &lda, "A");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &mb, &nb, b, &ldb, "B");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &m, &n, c, &ldc, "C");

/* Call ZGEMM subroutine ( C Interface ) */

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

/* Print output data */

printf("\n\n OUTPUT DATA");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &m, &n, c, &ldc, "C");

mkl_free(a);
mkl_free(b);
mkl_free(c);

return 0;
}

Source code written by me

void mkl_matrixMultiply(vector *result, vector *operand1, int op1_row, int op1_col, int trans1, vector *operand2, int op2_row, int op2_col, int trans2, int order)
{
/*Looping Index */
int i;

MKL_INT m,n,k, lda, ldb, ldc;
MKL_Complex16 alpha, beta;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;

/*Matrix Declaration*/
MKL_Complex16 *a, *b, *c;

/*Column_Major || Row_Major*/
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;

/*SCALAR */
alpha.real = 1.0;
alpha.imag = 1.0;

beta.real = 0.0;
beta.imag = 0.0;

m = 3; /* Matrix A row */
n = 3; /* Matrix B Column */
k = 1; /* Matrix B Row, Matrix A Column */

transA = CblasNoTrans;
transB = CblasNoTrans;

layout = CblasRowMajor;

if( transA == CblasNoTrans )
{
rmaxa = m + 1;
cmaxa = k;
}
else
{
rmaxa = k + 1;
cmaxa = m;
}

if( transB == CblasNoTrans )
{
rmaxb = k + 1;
cmaxb = n;
}
else
{
rmaxb = n + 1;
cmaxb = k;
}

rmaxc = m + 1;
cmaxc = n;

if( layout == CblasRowMajor )
{
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
}
else
{
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}

/*matrixC = alpha*matrixA*matrixB */
a = (MKL_Complex16*)mkl_calloc(rmaxa*cmaxa,(sizeof(MKL_Complex16)),64);
b = (MKL_Complex16*)mkl_calloc(rmaxb*cmaxb,(sizeof(MKL_Complex16)),64);
c = (MKL_Complex16*)mkl_calloc(rmaxc*cmaxc,(sizeof(MKL_Complex16)),64);

/* Convert the vector arrays into complext_type arrays of data for Win32 library calls. */
/* Add the vector input values into a temporary complex array. */
for (i = 0; i < 3; i++)
{
/* A = M x K */
a[i].real = operand1->real[i];
a[i].imag = operand1->imag[i];

/* B = K X N */
b[i].real = operand2->real[i];
b[i].imag = operand2->imag[i];
}

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

for( i = 0; i < 9; i++ )
{
result->real[i] = c[i].real;
result->imag[i] = c[i].imag;
}

mkl_free(a);
mkl_free(b);
mkl_free(c);

} /* end of mkl_matrixMultiply */

jeff222 · ‎12-01-2021

Hi,

Since posting my question I have had a little bit more success. I am trying to multiply op(a) * op(b)

op(a) = { 83 + 10i, 31 + 30i, 23 + 4i} row 3 (m) col 1 (k)

op(b) = {83 - 10i, 31 - 30i, 23 - 4i} row 1 (k) col 3 (n)

the resultant vector from cblas_zgemm

6989 + 0i, 693 + 5053i, 1847 + 2051i,

5053 + 693i, 1861 + 0i, 1399 + 267i,

2051 + 1847i, 267 + 1399i, 545 + 0i

Matlab results of same multiplication

6989 + 0000i, 2873 - 2180i, 1949 - 102i
2873 + 2180i, 1861 + 0000i, 833 + 566i
1949 + 102i, 833 - 566i, 545 + 0000i

Seems like the diagonal matrix is being calculated correctly minus the imaginary not being 0 for the diagonal.

Below I have shared the sample code from 2018 version

Thanks,

Jeff

example source code

/*******************************************************************************
* Copyright 1999-2018 Intel Corporation.
*
* This software and the related documents are Intel copyrighted materials, and
* your use of them is governed by the express license under which they were
* provided to you (License). Unless the License provides otherwise, you may not
* use, modify, copy, publish, distribute, disclose or transmit this software or
* the related documents without Intel's prior written permission.
*
* This software and the related documents are provided as is, with no express
* or implied warranties, other than those that are expressly stated in the
* License.
*******************************************************************************/

/*
! Content:
! C B L A S _ Z G E M M Example Program Text ( C Interface )
!******************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

#include "mkl_example.h"

int main(int argc, char *argv[])
{
FILE *in_file;
char *in_file_name;

MKL_INT m, n, k;
MKL_INT lda, ldb, ldc;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;
MKL_Complex16 alpha, beta;
MKL_Complex16 *a, *b, *c;
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;
MKL_INT ma, na, mb, nb;

printf("\n C B L A S _ Z G E M M EXAMPLE PROGRAM\n");

/* Get input parameters */

if( argc == 1 ) {
printf("\n You must specify in_file data file as 1-st parameter");
return 1;
}
in_file_name = argv[1];

/* Get input data */

if( (in_file = fopen( in_file_name, "r" )) == NULL ) {
printf("\n ERROR on OPEN '%s' with mode=\"r\"\n", in_file_name);
return 1;
}
if( GetIntegerParameters(in_file, &m, &n, &k) != 3 ) {
printf("\n ERROR of m, n, k reading\n");
return 1;
}
if( GetScalarsZ(in_file, &alpha, &beta) != 2 ) {
printf("\n ERROR of alpha, beta reading\n");
return 1;
}
if( GetCblasCharParameters(in_file, &transA, &transB, &layout) != 3 ) {
printf("\n ERROR of transA, transB, layout reading\n");
return 1;
}

if( transA == CblasNoTrans ) {
rmaxa = m + 1;
cmaxa = k;
ma = m;
na = k;
} else {
rmaxa = k + 1;
cmaxa = m;
ma = k;
na = m;
}
if( transB == CblasNoTrans ) {
rmaxb = k + 1;
cmaxb = n;
mb = k;
nb = n;
} else {
rmaxb = n + 1;
cmaxb = k;
mb = n;
nb = k;
}
rmaxc = m + 1;
cmaxc = n;
a = (MKL_Complex16 *)mkl_calloc(rmaxa*cmaxa, sizeof(MKL_Complex16), 64);
b = (MKL_Complex16 *)mkl_calloc(rmaxb*cmaxb, sizeof(MKL_Complex16), 64);
c = (MKL_Complex16 *)mkl_calloc(rmaxc*cmaxc, sizeof(MKL_Complex16), 64);
if ( a == NULL || b == NULL || c == NULL ) {
printf("\n Can't allocate memory arrays");
return 1;
}
if( layout == CblasRowMajor ) {
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
} else {
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &ma, &na, a, &lda) != 0 ) {
printf("\n ERROR of array A reading\n");
return 1;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &mb, &nb, b, &ldb) != 0 ) {
printf("\n ERROR of array B reading\n");
return 1;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &m, &n, c, &ldc) != 0 ) {
printf("\n ERROR of array C reading\n");
return 1;
}
fclose(in_file);

/* Print input data */

printf("\n INPUT DATA");
printf("\n M="INT_FORMAT" N="INT_FORMAT" K="INT_FORMAT, m, n, k);
printf("\n ALPHA =(%5.1f,%5.1f ) BETA =(%5.1f,%5.1f )",
alpha.real, alpha.imag, beta.real, beta.imag);
PrintParameters("TRANSA, TRANSB", transA, transB);
PrintParameters("LAYOUT", layout);
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &ma, &na, a, &lda, "A");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &mb, &nb, b, &ldb, "B");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &m, &n, c, &ldc, "C");

/* Call ZGEMM subroutine ( C Interface ) */

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

/* Print output data */

printf("\n\n OUTPUT DATA");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &m, &n, c, &ldc, "C");

mkl_free(a);
mkl_free(b);
mkl_free(c);

return 0;
}

Source code written by me

void mkl_matrixMultiply(vector *result, vector *operand1, int op1_row, int op1_col, int trans1, vector *operand2, int op2_row, int op2_col, int trans2, int order)
{
/*Looping Index */
int i;

MKL_INT m,n,k, lda, ldb, ldc;
MKL_Complex16 alpha, beta;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;

/*Matrix Declaration*/
MKL_Complex16 *a, *b, *c;

/*Column_Major || Row_Major*/
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;

/*SCALAR */
alpha.real = 1.0;
alpha.imag = 1.0;

beta.real = 0.0;
beta.imag = 0.0;

m = 3; /* Matrix A row */
n = 3; /* Matrix B Column */
k = 1; /* Matrix B Row, Matrix A Column */

transA = CblasNoTrans;
transB = CblasNoTrans;

layout = CblasRowMajor;

if( transA == CblasNoTrans )
{
rmaxa = m + 1;
cmaxa = k;
}
else
{
rmaxa = k + 1;
cmaxa = m;
}

if( transB == CblasNoTrans )
{
rmaxb = k + 1;
cmaxb = n;
}
else
{
rmaxb = n + 1;
cmaxb = k;
}

rmaxc = m + 1;
cmaxc = n;

if( layout == CblasRowMajor )
{
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
}
else
{
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}

/*matrixC = alpha*matrixA*matrixB */
a = (MKL_Complex16*)mkl_calloc(rmaxa*cmaxa,(sizeof(MKL_Complex16)),64);
b = (MKL_Complex16*)mkl_calloc(rmaxb*cmaxb,(sizeof(MKL_Complex16)),64);
c = (MKL_Complex16*)mkl_calloc(rmaxc*cmaxc,(sizeof(MKL_Complex16)),64);

/* Convert the vector arrays into complext_type arrays of data for Win32 library calls. */
/* Add the vector input values into a temporary complex array. */
for (i = 0; i < 3; i++)
{
/* A = M x K */
a[i].real = operand1->real[i];
a[i].imag = operand1->imag[i];

/* B = K X N */
b[i].real = operand2->real[i];
b[i].imag = operand2->imag[i];
}

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

for( i = 0; i < 9; i++ )
{
result->real[i] = c[i].real;
result->imag[i] = c[i].imag;
}

mkl_free(a);
mkl_free(b);
mkl_free(c);

} /* end of mkl_matrixMultiply */

jeff222 · ‎12-01-2021

Hi,

Since posting my question I have had a little bit more success. I am trying to multiply op(a) * op(b)

op(a) = { 83 + 10i, 31 + 30i, 23 + 4i} row 3 (m) col 1 (k)

op(b) = {83 - 10i, 31 - 30i, 23 - 4i} row 1 (k) col 3 (n)

the resultant vector from cblas_zgemm

6989 + 0i, 693 + 5053i, 1847 + 2051i,

5053 + 693i, 1861 + 0i, 1399 + 267i,

2051 + 1847i, 267 + 1399i, 545 + 0i

Matlab results of same multiplication

6989 + 0000i, 2873 - 2180i, 1949 - 102i
2873 + 2180i, 1861 + 0000i, 833 + 566i
1949 + 102i, 833 - 566i, 545 + 0000i

Seems like the diagonal matrix is being calculated correctly minus the imaginary not being 0 for the diagonal.

Below I have shared the sample code from 2018 version

Thanks,

Jeff

example source code

/*******************************************************************************
* Copyright 1999-2018 Intel Corporation.
*
* This software and the related documents are Intel copyrighted materials, and
* your use of them is governed by the express license under which they were
* provided to you (License). Unless the License provides otherwise, you may not
* use, modify, copy, publish, distribute, disclose or transmit this software or
* the related documents without Intel's prior written permission.
*
* This software and the related documents are provided as is, with no express
* or implied warranties, other than those that are expressly stated in the
* License.
*******************************************************************************/

/*
! Content:
! C B L A S _ Z G E M M Example Program Text ( C Interface )
!******************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

#include "mkl_example.h"

int main(int argc, char *argv[])
{
FILE *in_file;
char *in_file_name;

MKL_INT m, n, k;
MKL_INT lda, ldb, ldc;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;
MKL_Complex16 alpha, beta;
MKL_Complex16 *a, *b, *c;
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;
MKL_INT ma, na, mb, nb;

printf("\n C B L A S _ Z G E M M EXAMPLE PROGRAM\n");

/* Get input parameters */

if( argc == 1 ) {
printf("\n You must specify in_file data file as 1-st parameter");
return 1;
}
in_file_name = argv[1];

/* Get input data */

if( (in_file = fopen( in_file_name, "r" )) == NULL ) {
printf("\n ERROR on OPEN '%s' with mode=\"r\"\n", in_file_name);
return 1;
}
if( GetIntegerParameters(in_file, &m, &n, &k) != 3 ) {
printf("\n ERROR of m, n, k reading\n");
return 1;
}
if( GetScalarsZ(in_file, &alpha, &beta) != 2 ) {
printf("\n ERROR of alpha, beta reading\n");
return 1;
}
if( GetCblasCharParameters(in_file, &transA, &transB, &layout) != 3 ) {
printf("\n ERROR of transA, transB, layout reading\n");
return 1;
}

if( transA == CblasNoTrans ) {
rmaxa = m + 1;
cmaxa = k;
ma = m;
na = k;
} else {
rmaxa = k + 1;
cmaxa = m;
ma = k;
na = m;
}
if( transB == CblasNoTrans ) {
rmaxb = k + 1;
cmaxb = n;
mb = k;
nb = n;
} else {
rmaxb = n + 1;
cmaxb = k;
mb = n;
nb = k;
}
rmaxc = m + 1;
cmaxc = n;
a = (MKL_Complex16 *)mkl_calloc(rmaxa*cmaxa, sizeof(MKL_Complex16), 64);
b = (MKL_Complex16 *)mkl_calloc(rmaxb*cmaxb, sizeof(MKL_Complex16), 64);
c = (MKL_Complex16 *)mkl_calloc(rmaxc*cmaxc, sizeof(MKL_Complex16), 64);
if ( a == NULL || b == NULL || c == NULL ) {
printf("\n Can't allocate memory arrays");
return 1;
}
if( layout == CblasRowMajor ) {
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
} else {
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &ma, &na, a, &lda) != 0 ) {
printf("\n ERROR of array A reading\n");
return 1;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &mb, &nb, b, &ldb) != 0 ) {
printf("\n ERROR of array B reading\n");
return 1;
}
if( GetArrayZ(in_file, &layout, GENERAL_MATRIX, &m, &n, c, &ldc) != 0 ) {
printf("\n ERROR of array C reading\n");
return 1;
}
fclose(in_file);

/* Print input data */

printf("\n INPUT DATA");
printf("\n M="INT_FORMAT" N="INT_FORMAT" K="INT_FORMAT, m, n, k);
printf("\n ALPHA =(%5.1f,%5.1f ) BETA =(%5.1f,%5.1f )",
alpha.real, alpha.imag, beta.real, beta.imag);
PrintParameters("TRANSA, TRANSB", transA, transB);
PrintParameters("LAYOUT", layout);
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &ma, &na, a, &lda, "A");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &mb, &nb, b, &ldb, "B");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &m, &n, c, &ldc, "C");

/* Call ZGEMM subroutine ( C Interface ) */

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

/* Print output data */

printf("\n\n OUTPUT DATA");
PrintArrayZ(&layout, FULLPRINT, GENERAL_MATRIX, &m, &n, c, &ldc, "C");

mkl_free(a);
mkl_free(b);
mkl_free(c);

return 0;
}

Source code written by me

void mkl_matrixMultiply(vector *result, vector *operand1, int op1_row, int op1_col, int trans1, vector *operand2, int op2_row, int op2_col, int trans2, int order)
{
/*Looping Index */
int i;

MKL_INT m,n,k, lda, ldb, ldc;
MKL_Complex16 alpha, beta;
MKL_INT rmaxa, cmaxa, rmaxb, cmaxb, rmaxc, cmaxc;

/*Matrix Declaration*/
MKL_Complex16 *a, *b, *c;

/*Column_Major || Row_Major*/
CBLAS_LAYOUT layout;
CBLAS_TRANSPOSE transA, transB;

/*SCALAR */
alpha.real = 1.0;
alpha.imag = 1.0;

beta.real = 0.0;
beta.imag = 0.0;

m = 3; /* Matrix A row */
n = 3; /* Matrix B Column */
k = 1; /* Matrix B Row, Matrix A Column */

transA = CblasNoTrans;
transB = CblasNoTrans;

layout = CblasRowMajor;

if( transA == CblasNoTrans )
{
rmaxa = m + 1;
cmaxa = k;
}
else
{
rmaxa = k + 1;
cmaxa = m;
}

if( transB == CblasNoTrans )
{
rmaxb = k + 1;
cmaxb = n;
}
else
{
rmaxb = n + 1;
cmaxb = k;
}

rmaxc = m + 1;
cmaxc = n;

if( layout == CblasRowMajor )
{
lda=cmaxa;
ldb=cmaxb;
ldc=cmaxc;
}
else
{
lda=rmaxa;
ldb=rmaxb;
ldc=rmaxc;
}

/*matrixC = alpha*matrixA*matrixB */
a = (MKL_Complex16*)mkl_calloc(rmaxa*cmaxa,(sizeof(MKL_Complex16)),64);
b = (MKL_Complex16*)mkl_calloc(rmaxb*cmaxb,(sizeof(MKL_Complex16)),64);
c = (MKL_Complex16*)mkl_calloc(rmaxc*cmaxc,(sizeof(MKL_Complex16)),64);

/* Convert the vector arrays into complext_type arrays of data for Win32 library calls. */
/* Add the vector input values into a temporary complex array. */
for (i = 0; i < 3; i++)
{
/* A = M x K */
a[i].real = operand1->real[i];
a[i].imag = operand1->imag[i];

/* B = K X N */
b[i].real = operand2->real[i];
b[i].imag = operand2->imag[i];
}

cblas_zgemm(layout, transA, transB, m, n, k, &alpha,
a, lda, b, ldb, &beta, c, ldc);

for( i = 0; i < 9; i++ )
{
result->real[i] = c[i].real;
result->imag[i] = c[i].imag;
}

mkl_free(a);
mkl_free(b);
mkl_free(c);

} /* end of mkl_matrixMultiply */

ShanmukhS_Intel · ‎12-08-2021

Hi,

Thanks for sharing the requested details. We are working on this issue internally, we will get back to you soon with an update.

Best Regards,

Shanmukh.SS

jeff222 · ‎12-08-2021

would i still be able to used the 2018 version of mkl?

jeff222 · ‎12-08-2021

is there another Matrix multiplication that could be used?

ShanmukhS_Intel · ‎12-15-2021

Hi,

Could you please let us know which version of Matlab you are using and which version of MKL is being used by that particular Matlab. Since Matlab uses MKL as a background, we need the info to check in that corresponding version of MKL.

Best Regards,

Shanmukh.SS

jeff222 · ‎12-15-2021

Hello,

I am not using MKL with matlab. Matlab was simply used to verify the results of the matrix multiplication being performed by cblas_zgemm. The version of matlab used MATLAB R2018B.

i believe the MKL version being used is the 2019 version of MKL. As i am seeing 2019 everywhere in the top level of the intel directory installed onto the machine.

I will be integrating the matrix multiplication onto a system using an intel processor.

Currently i am using CUnit test on a 64 bit machine to UT the results before trying it out on the system.

These are the libraries that i have built in when using CUnit Test

libmkl_core.a

libmkl_gf_lp64.a

libmkl_sequential.a

I have had success using this same set up to verify the results of lapacke_zgetri.

When a solution is found would intel provide me with a patch? Or would i have to reinstall the latest version of MKL?

Thanks,

Jeff

ShanmukhS_Intel · ‎12-23-2021

Hi,

>>When a solution is found would intel provide me with a patch? Or would i have to reinstall the latest version of MKL?

We would like to inform you that the shared functionality is working fine and it's not a bug.

Kindly execute the code with correct alpha and beta values mentioned below in your code and check the result. We would also like to inform you that the code got executed fine with correct results (matching with MATLAB results shared by you) from our side.

alpha.real = 1.0; alpha.imag = 0.0;

beta.real = 1.0; beta.imag = 0.0;

Best Regards,

Shanmukh.SS

ShanmukhS_Intel · ‎12-30-2021

Hi,

Reminder:

Is your issue resolved? Kindly let us know if we could close this thread from our end.

Below are the CBLAS_ZGEMM data calculations.

C B L A S _ Z G E M M EXAMPLE PROGRAM

INPUT DATA

M=3 N=3 K=1

ALPHA =( 1.0, 0.0 ) BETA =( 1.0, 0.0 )

TRANSA = CblasNoTrans TRANSB = CblasNoTrans

LAYOUT = CblasRowMajor

ARRAY A LDA=1

( 83.00, 10.00)

( 31.00, 30.00)

( 23.00, 4.00)

ARRAY B LDB=3

( 83.00,-10.00) ( 31.00,-30.00) ( 23.00, -4.00)

ARRAY C LDC=3

( 0.00, 0.00) ( 0.00, 0.00) ( 0.00, 0.00)

OUTPUT DATA

ARRAY C LDC=3

(6989.00, 0.00) (2873.00,-2180.00) (1949.00,-102.00)

(2873.00,2180.00) (1861.00, 0.00) (833.00,566.00)

(1949.00,102.00) (833.00,-566.00) (545.00, 0.00)

Best Regards,

Shanmukh.SS

jeff222 · ‎12-31-2021

Thank you for all your help and collaboration. I appreciate it

ShanmukhS_Intel · ‎01-03-2022

Hi,

Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.

Best Regards,

Shanmukh.SS