Intel® oneAPI Math Kernel Library

How do I use the MKL function ‘cblas_dgemm_compute’?

Are there any examples showing how to use the functions cblas_dgemm_pack_get_size(), cblas_dgemm_pack(), and cblas_dgemm_compute()? I would like to implement a specialized GEMM with a packed matrix B. Thanks.
My code is below. I have two questions:
① cblas_dgemm_pack_get_size() returns a very large value, 7767808, when I pack matrix B, whose elements are double and whose dimensions are 256 * 256. I expected the buffer for the packed B to need no more than 256 * 256 * 8 * 2 bytes.
② The difference between C1 and C2 is larger than 1.0e-6, so I think something is wrong.
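To put numbers on question ①, here is a minimal sketch in plain C (no MKL calls) that compares a reported packed size with the raw size of the matrix. The helper names raw_matrix_bytes and pack_overhead_ratio are made up for illustration, and 7767808 is simply the value reported above, not something the sketch computes.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to store a rows x cols matrix of doubles with no padding. */
static size_t raw_matrix_bytes(size_t rows, size_t cols)
{
    return rows * cols * sizeof(double);
}

/* Ratio between a reported packed-buffer size and the unpadded matrix size.
 * packed_bytes is whatever the size query returned; it is an input here. */
static double pack_overhead_ratio(size_t packed_bytes, size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);   /* avoid dividing by zero */
    return (double)packed_bytes / (double)raw_matrix_bytes(rows, cols);
}
```

For the 256 * 256 case, raw_matrix_bytes(256, 256) is 524288 when sizeof(double) is 8, so pack_overhead_ratio(7767808, 256, 256) comes out at roughly 14.8, well above the factor of 2 I expected.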
#include <stdio.h>
#include <stdlib.h>
#include <mkl.h>

int main(int argc, const char* argv[])
{
    // matrix parameters
    int M, N, K;
    int LDA, LDB, LDC;
    printf("[INPUT] input M N K\n");
    if(scanf("%d %d %d", &M, &N, &K) == 3){
        printf("[TRUE] true parameters for scanf\n");
    } else {
        printf("[FALSE] false parameters for scanf\n");
        return 1;   // M, N, K are not valid, so stop here
    }
    // matrix buffer, column major
    LDA = M, LDB = K, LDC = M;
    double *A = NULL,
           *B = NULL, *B_PACK = NULL,
           *C1 = NULL, *C2 = NULL;
    double alpha = 0.000001, beta = 0.000001;

    A = (double *) malloc (sizeof(double) * M * K);
    B = (double *) malloc (sizeof(double) * K * N);
    C1 = (double *) malloc (sizeof(double) * M * N);
    C2 = (double *) malloc (sizeof(double) * M * N);
    // initialize matrices A, B, C1
    gen_matrix(A, M, K);
    gen_matrix(B, K, N);
    gen_matrix(C1, M, N);
    matrix_copy(C1, M, N, C2);  // copy the values from C1 to C2

    // pack B (alpha is applied during packing)
    B_PACK = (double *) malloc (cblas_dgemm_pack_get_size(CblasBMatrix, M, N, K));
    cblas_dgemm_pack(CblasColMajor, CblasBMatrix, CblasNoTrans, M, N, K, alpha, B, LDB, B_PACK);

    // reference result in C1, packed result in C2
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, LDA, B, LDB, beta, C1, LDC);
    cblas_dgemm_compute(CblasColMajor, CblasNoTrans, CblasNoTrans, M, N, K, A, LDA, B_PACK, LDB, beta, C2, LDC);

    double diff = max_abs_diff(M, N, C1, LDC, C2, LDC);
    printf("diff = %lf\n", diff);

    free(A); free(B); free(B_PACK); free(C1); free(C2);
    return 0;
}
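The helpers gen_matrix, matrix_copy, and max_abs_diff are not shown in the code above. The following is only a sketch of what they might look like, inferred from how they are called; the bodies are assumptions, including the column-major layout with the leading dimension equal to the row count and the choice of random values in [0, 1).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed behavior: fill a rows x cols matrix (stored contiguously)
 * with pseudo-random values in [0, 1). */
static void gen_matrix(double *a, int rows, int cols)
{
    size_t count = (size_t)rows * (size_t)cols;
    for (size_t i = 0; i < count; ++i)
        a[i] = (double)rand() / ((double)RAND_MAX + 1.0);
}

/* Assumed behavior: copy a contiguous rows x cols matrix from src to dst. */
static void matrix_copy(const double *src, int rows, int cols, double *dst)
{
    memcpy(dst, src, sizeof(double) * (size_t)rows * (size_t)cols);
}

/* Assumed behavior: largest |x(i,j) - y(i,j)| over two column-major
 * m x n matrices with leading dimensions ldx and ldy. */
static double max_abs_diff(int m, int n,
                           const double *x, int ldx,
                           const double *y, int ldy)
{
    assert(ldx >= m && ldy >= m);
    double worst = 0.0;
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double d = x[i + (size_t)j * ldx] - y[i + (size_t)j * ldy];
            if (d < 0.0)
                d = -d;
            if (d > worst)
                worst = d;
        }
    }
    return worst;
}
```

With helpers like these, diff is the largest element-wise absolute difference between C1 and C2, which is the value I compare against 1.0e-6 in question ②.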