Hi all!
I'm new to using Intel MKL routines. I have the following function that multiplies two dense matrices:
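(The function itself isn't shown in the post; below is a minimal sketch of what a dense multiply through MKL's cblas_dgemm typically looks like. The function name denseMult, the dimensions, and the row-major layout are assumptions, not the original code.)

```cpp
#include <mkl.h>

// Sketch: C = A * B with A (m x k), B (k x n), C (m x n), all row-major.
// denseMult is a placeholder name; the original function is not shown.
void denseMult(const double* A, const double* B, double* C,
               int m, int n, int k)
{
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k,
                1.0, A, k,   // alpha = 1.0, lda = k
                     B, n,   // ldb = n
                0.0, C, n);  // beta = 0.0, ldc = n
}
```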
I compile with the following flags:
g++ -std=c++11 -O3 -I/opt/intel/oneapi/2022/mkl/latest/include laicoMult.cpp -L/opt/intel/oneapi/2022/mkl/latest/lib/intel64 -Wl,-rpath,/opt/intel/oneapi/mkl/2024.0/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl -fopenmp -march=native -fopt-info-vec -ffast-math -ftree-vectorize -DARMA_DONT_USE_WRAPPER -o laicoMult
At the moment I'm using an Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz with 16 GB of DDR4 memory.
The code is pretty fast, but I'd like to know if there is anything I could do to make it faster.
Can anyone help me?
Thanks in advance!
Yes, you could link against the threaded version of MKL instead of the sequential one (-lmkl_sequential). Please check the MKL Link Line Advisor ( https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-link-line-advisor.html ) for how to link against the OpenMP or TBB threaded runtimes. You might also check the MKL Developer Guide, where you can find many such examples.
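For example, with g++ and -fopenmp the Link Line Advisor typically suggests replacing -lmkl_sequential with the GNU-threaded layer; the original command would then look roughly like the line below (please verify the exact line for your MKL version and compiler against the Advisor's output):

g++ -std=c++11 -O3 -I/opt/intel/oneapi/2022/mkl/latest/include laicoMult.cpp -L/opt/intel/oneapi/2022/mkl/latest/lib/intel64 -Wl,-rpath,/opt/intel/oneapi/mkl/2024.0/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldl -fopenmp -march=native -o laicoMult

The thread count used by the threaded layer can then be controlled with the MKL_NUM_THREADS environment variable, or at runtime, for instance:

```cpp
#include <mkl.h>

int main() {
    // Limit the MKL thread count at runtime; by default the threaded
    // layer uses one thread per physical core. The value 8 is only
    // an example for the 8-core i7-10700 mentioned above.
    mkl_set_num_threads(8);
    // ... call cblas_dgemm or other MKL routines here ...
    return 0;
}
```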