Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library
- Waht is wrong with cblas_dgemv

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

pawan_k_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
10:09 AM

444 Views

Waht is wrong with cblas_dgemv

Hello,

I am interested in the following operation:

Y = A^T * X

using the cblas_dgemv or cblas_dgemm routine in MKL. In the code I use small letters: y, a, and x.

I have a test example as follows:

#include <cstdio>

#include <cstdlib>

#include "mkl.h"

void main()

{

double a[6] = {2., 3., 1., 1., 0., 2. }; // 3 x 2 matrix

double x[3] = {1., 1., 1.};

double y[2] = {0., 0.};

int m = 3;

int n = 2;

double alpha = 1.0;

double beta = 1.0;

int incx = 1;

int incy = 1;

int i,j;

for (i=0; i<m; i++){

for (j=0; j<n; j++)

printf("%g ", a[i*n + j]);

printf("\n");

}

cblas_dgemv(CblasRowMajor, CblasTrans, m, n, alpha, a, m, x, incx, beta, y, incy );

for (i=0; i<n; i++)

printf("%g\n", y*);*

}

After running the code, I get wrong results. Can someone point out the mistake?

Regards,

Pawan

2 Solutions

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
01:03 PM

444 Views

The seventh argument to cblas_dgemv should be n. The convention is that a matrix is passed by specifying four attributes:

- the address of the base, i.e., the address of the (1,1) element in Fortran or the [0][0] element in C/C++,
- the number of rows, m
- the number of columns, n
- the stride from one row to the next, if using row-major order as in C/C++, or the stride from one column to the next, if using column-major order as in Fortran.

You are passing a 3 X 2 matrix using row-major order; therefore, the stride from each row to the next is 2.

There is some avoidable confusion in the MKL documentation; this is caused by using Fortran-centric terms such as "leading dimension", which a C/C++ programmer may take to be something different.

Ying_H_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
10:25 PM

444 Views

Hi Pawan,

Thank you a lot for your comments about dgemv and dgemm.

Just add some

based on the documentation of dgemm: C_{mxn=}op(A)_{mxk}*op(B)_{kxn}

m INTEGER. Specifies the number of rows of the matrix **op(A) **and of the matrix C. The value of m must be at least zero

k, INTEGER. Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B).

Where the documetnation of dgemv, the m. n is from A and LDA is A.

m INTEGER. Specifies the number of rows of the matrix A. The value of m must be at least zero.

n INTEGER. Specifies the number of columns of the matrix A. The value of n must be at least zero.

So you are exctaly right, that for dgemm, we are required to put number of rows and number of colums of A^T when transA=T.

But **LDA is the stride of A both the dgemv and dgemm, whatever transA= N or T.**

I guess, you may have tried

1. cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, 2, 1, 3, 1.0, a, 2, x, 1, 0, y, 1); get y=(6, 3)

2. cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 1, 3, 1.0, a, 3, x, 1, 0, y, 1); get y=(3,6)

thus happen to get the conclustion of LDA=A.nrows when trans='N'. Actually, it is not correct because the A are different under the two cases. for example, if you change the original A = {2., 3., **5.,** 1., 0., 2. }, you will see the difference more obviously.

2 3 )T 1 m=3, n=2 2 3 5 1 m=2, n=3

5 1 * 1 = 7 1 0 2 * 1 = 10

(0 2 * 1 6 1 3

Hope it can lighten some confustion.

Best Regards,

Ying

Link Copied

13 Replies

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
01:03 PM

445 Views

The seventh argument to cblas_dgemv should be n. The convention is that a matrix is passed by specifying four attributes:

- the address of the base, i.e., the address of the (1,1) element in Fortran or the [0][0] element in C/C++,
- the number of rows, m
- the number of columns, n
- the stride from one row to the next, if using row-major order as in C/C++, or the stride from one column to the next, if using column-major order as in Fortran.

You are passing a 3 X 2 matrix using row-major order; therefore, the stride from each row to the next is 2.

There is some avoidable confusion in the MKL documentation; this is caused by using Fortran-centric terms such as "leading dimension", which a C/C++ programmer may take to be something different.

pawan_k_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
03:08 PM

444 Views

pawan_k_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
04:23 PM

444 Views

For dgemm, it was confusing as well. I put an working

example below for someone in need.

Goal: C := alpha A^T * B + beta * C, with alpha = 1; beta = 0;

Use the following:

cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, A.ncols(), B.ncols(), A.nrows(),

alpha, A.vec, A.ncols(), B.vec, B.ncols(), beta, C.vec, B.ncols())

Here I take:

LDA: stride for A = A.ncols(); // it would have been A.nrows(), if transA = 'N'

NOTE: Notice that this is a bit confusing when compared to dgemv, because for dgemm, the number of rows and number of columns of A^T are input. By this logic, one may take LDA to be A.nrows() above, because the rows of A^T have strides of A.nrows(). In other words, although for dgemm, we are required to put number of rows and number of columns of A^T, for LDA we must consider strides for A not for A^T.

LDB: stride for B = B.ncols();

LDC: stride for C = B.ncols();

Regards,

Pawan

Ying_H_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-06-2014
10:25 PM

445 Views

Hi Pawan,

Thank you a lot for your comments about dgemv and dgemm.

Just add some

based on the documentation of dgemm: C_{mxn=}op(A)_{mxk}*op(B)_{kxn}

m INTEGER. Specifies the number of rows of the matrix **op(A) **and of the matrix C. The value of m must be at least zero

k, INTEGER. Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B).

Where the documetnation of dgemv, the m. n is from A and LDA is A.

m INTEGER. Specifies the number of rows of the matrix A. The value of m must be at least zero.

n INTEGER. Specifies the number of columns of the matrix A. The value of n must be at least zero.

So you are exctaly right, that for dgemm, we are required to put number of rows and number of colums of A^T when transA=T.

But **LDA is the stride of A both the dgemv and dgemm, whatever transA= N or T.**

I guess, you may have tried

1. cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, 2, 1, 3, 1.0, a, 2, x, 1, 0, y, 1); get y=(6, 3)

2. cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, 2, 1, 3, 1.0, a, 3, x, 1, 0, y, 1); get y=(3,6)

thus happen to get the conclustion of LDA=A.nrows when trans='N'. Actually, it is not correct because the A are different under the two cases. for example, if you change the original A = {2., 3., **5.,** 1., 0., 2. }, you will see the difference more obviously.

2 3 )T 1 m=3, n=2 2 3 5 1 m=2, n=3

5 1 * 1 = 7 1 0 2 * 1 = 10

(0 2 * 1 6 1 3

Hope it can lighten some confustion.

Best Regards,

Ying

pawan_k_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-07-2014
01:19 AM

444 Views

Anas_A_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-06-2014
07:53 AM

444 Views

Hello!

I have a problem with the function dgemv so I thought to post it here. If I should open a new topic please let me know to fix it.

Let A1, A2, A3 be 4x4 matrices.

A1 = 5 6 7 8 , A2 = 21 22 23 24 , A3 = 37 38 39 40

9 10 11 12 25 26 27 28 41 42 43 44

13 14 15 16 29 30 31 32 45 46 47 48

17 18 19 20 33 34 35 36 49 50 51 52

I defined in this way in my code

double A[48]; double y[4]; for( int i=0; i<48; i++) {A= i+5;}.

I want to multiply the matrix A1 with the vector

double pi0 = {0.1, 0.2, 0.3, 0.4};

so I did this

cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, &A[0:15], 4, pi0, 1, 0.0, y, 1);

The result I got is {21, 25, 29,31} , although the true result is {7, 11 , 15 ,19};

One solution that worked was to copy the first 15 values of the matrix A to another matrix B, then when I did

int B[16]; for(int i=0; i<16; i++) {B=A; } cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, B, 4, pi0, 1, 0.0, y, 1);

it worked.

The problem is that I don't understand what went wrong in the first way. Also if I first copy the matrix I think that this will require more time. Please, help me.

Thank you.

pawan_k_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-06-2014
12:54 PM

444 Views

Hi

Replace &A[0:15] by A.

Does it work now?

cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, A, 4, pi0, 1, 0.0, y, 1);

Regards,

Pawan

Anas_A_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-06-2014
03:36 PM

444 Views

Hi! Yes, now it works. Could I ask you something? How do we find the lda? I have understood that if we have a matrix mxn and we save it row-by-row then the lda is equal to the number of columns. But, how do we decide in a situation like here? I mean when we have 3 matrices which we have defined firstly the first matrix row-by-row, then the second matrix row-by-row and etc.

If we want to compute the product of the matrix A2 (or A3) with the vector pi0 how do we select the lda?

In general how do we decide the lda in we have P matrices (not 3). (I have a problem like this but with P matrices, where P =0,1,2,3,...)

If you know a book, notes , anything that could help me with that please tell me.

Thanks for helping me!

Ying_H_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-06-2014
06:04 PM

444 Views

Hi Anas,

Right, lda is generally equal to the number of columns when A is mxn.

About the complex situation like A[48], it is a 1-D array in C. and

A1={A[0], A[15]};

A2={A[16], A[31]};

A3={A[32], .. A[47];

As you see, the A1, A2, A3, can be accessed by regular patten as a 1-D C-array. The adress of A2[mxn] is &A[16]. and it's mxn elements are continous elements in A[16] .. A[31], so lda is still n. you can just move the first pointer to access A2, A3. for example,

if A2 * pio, call cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, &A[16], 4, pi0, 1, 0.0, y, 1); or A+16

if A3* pio, call cblas_dgemv(CblasRowMajor, CblasNoTrans, 4, 4, 1.0, A+32 , pi0, 1, 0.0, y, 1); or &A[32]

The key point is how you access the array in C array. if only the array can be accessed by some regular pattern as a C array, then you can decide the input parameter of A.

Best Regards,

Ying

P.S. Please ignore the syntax I mentioned last time about Array notation like A[0:15]. It is special syntax only for Array notation, can't be taken as a C pointer to pass to MKL funtion.

Anas_A_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-09-2014
11:07 AM

444 Views

Eman_H_

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

11-13-2016
06:08 AM

444 Views

can anyone please explain what does each of the parameters incX and incY indicate ?

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

11-13-2016
06:28 AM

444 Views

Eman H. wrote:

can anyone please explain what does each of the parameters incX and incY indicate ?

These are described in the MKL documentation, as well at http://www.netlib.org/lapack/explore-html/dc/da8/dgemv_8f.html .

The ?gemv routines compute α A x + β y or α A^{T} x + β y, where A is a matrix and x, y are vectors. Sometimes, we wish to extract a row from a matrix and regard that as a vector. In Fortran, storage is by columns, so the distance between successive row elements to be extracted is not equal to 1, as in a column vector of the same matrix, but is equal to the number of columns in the matrix. Similar considerations apply to the destination of the computed vector.

Ying_H_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

11-13-2016
07:53 PM

444 Views

Hi Eman,

In simple word, the incX is the distance of each element (who will involved the compute A*X) or the stride of X[1] and X[2]. In most of case, the vector(involved the compute) is continuous, it is 1.

In some times, if the element (will involve the compute) is not continuous, but can be **accessed by regular way**, for example, X[1], X[3], X[5], then the incX = 2.

Best Regards,

Ying

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.