 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Hello,
I'm trying to call mkl_dbsrmm once instead of calling mkl_dbsrmv multiple times, but I'm getting different results, like:
X
1.000000e+00 2.000000e+00
2.000000e+00 3.000000e+00
1.000000e+00 2.000000e+00
2.000000e+00 3.000000e+00
Y = AX (bsrmm)
1.746125e01 8.390122e01
1.469818e+00 1.939678e+00
1.974694e01 2.000000e+00
3.128180e+00 2.000000e+00
Y = AX (bsrmv x2)
1.746125e01 3.594702e01
1.469818e+00 2.228417e+00
1.974694e01 5.919458e01
3.128180e+00 2.125353e+00
Is there a regression somewhere inside mkl_dbsrmm or am I doing something wrong? I've been dealing with csrmm in the past with no such problem. Thanks in advance.
PS: compile line is, with MKL 2017
icpc bsrmm.cpp L/opt/intel/mkl/lib/intel64 lmkl_core lmkl_rt lmkl_intel_lp64 lmkl_intel_thread liomp5
Link Copied
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Hi qweasd,
I think you misunderstand usage of bsrmm & bsrmv. You set matdescra parameter by zerobase but set indx, pntrb & pntre by onebase. Thus the result of bsrmm & bsrmv are totally different. Another point is that the bsrmm operate multiplication of Matrix & vector, you'd better define input value for bsrmm as X_v
int ia[M + 1] = { 0, 2, 4 }; //zerobase int ja[NNZ] = {0, 1, 0,1}; int pointerB={0,2}; int pointerE ={2,4}; double a[NNZ * BS * BS] = { 0.505208, 0.015657, 0.015657, 0.530954, 0.310105, 0.0630386, 0.00541174, 0.180264, 0.310105, 0.00541174, 0.0630386, 0.180264 , 0.511346, 0.000116003, 0.000116003, 0.524502}; double X[M* BS] = { {1, 2}, {1, 2}, {2, 3}, {2, 3} }; //input for bsrmm double X_v [M*BS]={{1,1,2,3},{2,2,3,3}}; //input for bsrmv double Y[M* BS] = { {0, 0}, {0, 0}, {0, 0}, {0, 0} }; double Temp [M * BS] = {{0, 0, 0, 0}, {0, 0, 0, 0} }; int m = M; int n = N; int k = K; int bs = BS; double alpha = 1.0; double beta = 0.0; char matdescra[6] = { 'T', 'L', 'N', 'C', '0', '0' }; mkl_dbsrmm("N", &m, &n, &k, &bs, &alpha, matdescra, a, ja, ia, &ia[1], &(X[0][0]), &m, &beta, &(Y[0][0]), &k); std::cout << std::scientific; std::cout << "X" << std::endl; for(int i = 0; i < M * BS; ++i) { for(int j = 0; j < N; ++j) { std::cout << X[M * BS * j + i] << "\t"; } std::cout << std::endl; } std::cout << "Y = AX (bsrmm)" << std::endl; for(int i = 0; i < M * BS; ++i) { for(int j = 0; j < N; ++j) { std::cout << Y << "\t"; } std::cout << std::endl; } std::cout << "Y = AX (bsrmv x2)" << std::endl; for(int i = 0; i < N; ++i) { mkl_dbsrmv("N", &m, &k, &bs, &alpha, matdescra, a, ja, pointerB, pointerE, &(X_v[0]), &beta, &(Temp[0])); } for(int j = 0; j < M*BS; j++) { for(int i = 0; i < N; i++) { std::cout << Temp << "\t"; } std::cout << std::endl; }
Best regards,
Fiona
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Thank you for your answer. Sorry, I set matdescra[4] to N instead of F. I have fixed the example now but the problem obviously remains. My BSR matrices use onebased indexing so I don't see how your code can help me. Plus, as stated in the example, I don't have such problem with mkl_dcsrmm which is also used for matrix * matrix products. Could you let me know if I'm calling mkl_dbsrmm right, or if there is a bug with the routine with onebased indexed matrices?
Thanks
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
On a side note, if I change in your example {
'T'
,
'L'
,
'N'
,
'C'
,
'0'
,
'0'
}; to {
'G'
,
'0'
,
'0'
,
'C'
,
'0'
,
'0'
};
I get different results between, so there is another problem right there.
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Hi qweasd,
Sorry for my written mistake, the input of bsrmv should be
double X_v
In your program, there are 3 places lead errors:
1. 'F' could only used for onebased indexing, normally used for Fortran. However in c code, normally used with zerobased indexing. You'd better use G,0,0,C, or change all array's value & start from 1.
2. In bsrmm function, A is 4*4, B is 4*2 (B[0]={1,2}); the result C should be 4*2 as well, the last two value is 2 because you set onebased indexing that first two values of A, B has not been calculated.
3. In bsrmv function, A is 4*4, B is 2*4 (B[0]={1,2,1,2}) . How could you expect the result of this calculation equals to previous one? The result of bsrmv in your program is wrong. The fist column of result is the same that is because the equation is C=alpha*A*B+C. So, in your program, the bsrmv function is action doing work of Y=alpha*A*B+bsrmm_result.
4. seems you didn't use sparse matrix, length of a is 16, all values are nonzero. Thus, the functionality should be same as dgemm. I checked with Dgemm, the return is correct by using following code(fix value setting of X_v).
Please use following code, the result should be as same as Dgemm:
int ia[M + 1] = { 0, 2, 4}; //zerobase for C program int ja[NNZ] = {0,1, 0,1}; int pointerB={0,2}; int pointerE ={2,4}; double a[NNZ * BS * BS] = { 0.505208, 0.015657, 0.015657, 0.530954, 0.310105, 0.0630386, 0.00541174, 0.180264, 0.310105, 0.00541174, 0.0630386, 0.180264 , 0.511346, 0.000116003, 0.000116003, 0.524502}; double X[M* BS] = { {1, 2}, {1, 2}, {2, 3}, {2, 3} }; //input for srmm double X_v [M*BS]={{1,1,2,2},{2,2,3,3}}; //input for srmv double Y[M* BS] = { {0, 0}, {0, 0}, {0, 0}, {0, 0} }; double Temp [M * BS] = {{0, 0, 0, 0}, {0, 0, 0, 0} }; int m = M; int n = N; int k = K; int bs = BS; double alpha = 1.0; double beta = 0.0; char matdescra[6] = { 'G', '0', '0', 'C', '0', '0' }; mkl_dbsrmm("N", &m, &n, &k, &bs, &alpha, matdescra, a, ja, ia, &ia[1], &(X[0][0]), &m, &beta, &(Y[0][0]), &k); std::cout << std::scientific; std::cout << "X" << std::endl; for(int i = 0; i < M * BS; ++i) { for(int j = 0; j < N; ++j) { std::cout << X << "\t"; } std::cout << std::endl; } std::cout << "Y = AX (bsrmm)" << std::endl; for(int i = 0; i < M * BS; ++i) { for(int j = 0; j < N; ++j) { std::cout << Y << "\t"; } std::cout << std::endl; } std::cout << "Y = AX (bsrmv x2)" << std::endl; for(int i = 0; i < N; ++i) { mkl_dbsrmv("N", &m, &k, &bs, &alpha, matdescra, a, ja, pointerB, pointerE, &(X_v[0]), &beta, &(Temp[0])); } for(int j = 0; j < M*BS; j++) { for(int i = 0; i < N; i++) { std::cout << Temp << "\t"; } std::cout << std::endl; } std::cout << "Results are same!" << std::endl;
Or you could test with Dgemm function, you will find there's no problem with bsrmm.
Best regards,
Fiona
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Wow, that’s kind of funny and sad at the same time how this conversation is going nowhere.
Let me start over. In my application, I have BSR matrix A (of course it has hundred thousands of rows/columns so I cannot use gemm…). I need to compute the product of A and multiple vectors (stored columnwise as a matrix B)
For instance, if m = k = 2, lb = 2, and n = 2, with
B = [1, 2;
3, 4;
5, 6;
7, 8];
stored in memory as B = {1, 3, 5, 7, 2, 4, 6, 8 }; (this is how B must be stored if you need to solve A^1 B with PARDISO)
I want to compute C = A * B, stored in memory as C = { c_11, c_21, c_31, c_41, c_12, c_22, c_32, c_42 };
I cannot change how B is stored in memory because it is how PARDISO takes it as an argument and I don’t want to change the layout of B everytime I call bsrmm and then PARDISO.
In this thread answered by a colleague of yours https://software.intel.com/enus/forums/intelmathkernellibrary/topic/365430#comment1724657, he suggests to use a onebased CSR matrix for the routine csrmm given my layout of B and C, now what I need to do with A to get the desired results as a BSR matrix? I can do a loop over (B + j * lb * m), j = 0..n1 and call bsrmv n times, but I would rather use bsrmm one time. Are you telling me that the memory layout of the input matrices B are different between bsrmm and csrmm/PARDISO?
I’m attaching another example, the same matrix A is stored as a CSR and a BSR matrix. The calls to csrmm and bsrmv are the same, but not with bsrmm. I cannot change how B and C are stored, otherwise, how can I call PARDISO afterwards? I can change the call to bsrmm however (for example use a zerobased BSR matrix for A).
Thank you in advance for your help.
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Hi qweasd:
I think there are some errors in your code:
1. you print value of y like these: y[j * m * bs + i], is that mean the length of y is 22 ? However, you defined the length of y is 10 as: double y[k*m] (k=5, m=2). Please pay attention to the length of array.
2. You get correct value of csrmv & bsrmv, but it doesn't mean you calculated correctly, the length of y is incorrect. And Number of block rows of the matrix A is also incorrect.
3. By using srmv/crmv/srmm ,the matrix A is normally sparse matrix. Well, if you plan to calculate like a normally matrix in this program, you can. But, you should define the Number of block rows of the matrix A as 1, it means length of block's row not length of A's row. In this example code, the block is 1*1, you have 4 blocks for matrix A that block1 is 10, block2 is 2, block3 is 2, block4 is 10.
If you set Number of block rows of the matrix A, that means each block is a 2*2 matrix.
4. You should set Number of columns of the matrix C as k, not m. The number of columns of C is 5, not 2.
Please read developer guide to help you understand the usage completely. I attached the code below:
int m_block=1 int m_x=2; mkl_dbsrmm("N", &m_block, &k, &m_block, &bs, &one, matdescra, a, ja, ia, &ia[1], x, &m_x, &two, y, &m_x); std::cout << "Y = AX + 2Y (bsrmm F):" << std::endl; for(int i = 0; i <m_x; ++i) { for(int j = 0; j <k; ++j) { std::cout << y[j*m_x+i] << "\t"; } std::cout << std::endl; }
regards,
Fiona
 Mark as New
 Bookmark
 Subscribe
 Mute
 Subscribe to RSS Feed
 Permalink
 Email to a Friend
 Report Inappropriate Content
Alright, I now get the correct results. The documentation here https://software.intel.com/enus/node/520833 is quite misleading:
 ldb

Specifies the leading dimension (in blocks) of b as declared in the calling (sub)program.
 ldc

Specifies the leading dimension (in blocks) of c as declared in the calling (sub)program.
If I want my first example to return the correct result, I have to set ldb and ldc to M * BS, and not M. I think you should remove "(in blocks)" from the documentation.
 Subscribe to RSS Feed
 Mark Topic as New
 Mark Topic as Read
 Float this Topic for Current User
 Bookmark
 Subscribe
 Printer Friendly Page