Hi,
The oneMKL gemm function can be called from the host as shown below:
gemm_done = oneapi::mkl::blas::gemm(main_queue, transA, transB, m, n, k, alpha, A, ldA, B, ldB, beta, C, ldC, gemm_dependencies);
Can the gemm function also be called from within a user's kernel code? For example:
sycl::queue queue;
queue.submit([&](sycl::handler& cgh) {
    cgh.parallel_for(range, [=](…) {
        oneapi::mkl::blas::gemm(...); // calling routine from user's kernel code
    });
});
If so, do we need to pass "queue" again as the first argument to oneapi::mkl::blas::gemm(..)?
Thanks.
Hi Rajitha,
Thanks for posting in Intel communities.
Currently, oneapi::mkl::blas::gemm can only be called from host code, with the queue passed as the first argument, as in the host example you showed.
Thanks and Regards,
Praneeth Achanta
Hi @TestDR
As @PraneethA_Intel mentioned, oneMKL BLAS routines (including oneapi::mkl::blas::gemm) can only be called from host code currently, not from within a kernel.
The good news is that in the next oneAPI major release, we are planning to introduce experimental APIs that would let you call oneMKL GEMM within a kernel. If you're able to provide more details (how large the matrices are, how GEMM fits into the calculation your kernel performs), we could assess if these APIs would be a good fit for your use case.
Best, Peter
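Since gemm cannot currently be called inside a kernel, one possible fallback for small matrices is to write the product by hand. The sketch below is plain C++ rather than SYCL, so it compiles without oneMKL. It only illustrates the arithmetic, assuming column-major storage and no transposition. The names gemm_element and naive_gemm are hypothetical helpers and are not part of oneMKL. In a real kernel, each work-item could evaluate gemm_element for its own (i, j) in place of the two outer loops.

```cpp
#include <cstddef>

// Hypothetical helper (NOT a oneMKL function): computes one element of
// C = alpha * A * B + beta * C for column-major, non-transposed matrices.
// A is m x k with leading dimension ldA, B is k x n with leading dimension ldB.
// c_old is the existing value of C(i, j).
inline float gemm_element(std::size_t i, std::size_t j, std::size_t k,
                          float alpha,
                          const float* A, std::size_t ldA,
                          const float* B, std::size_t ldB,
                          float beta, float c_old) {
    float acc = 0.0f;
    for (std::size_t p = 0; p < k; ++p) {
        // Column-major indexing: A(i, p) = A[i + p * ldA], B(p, j) = B[p + j * ldB]
        acc += A[i + p * ldA] * B[p + j * ldB];
    }
    return alpha * acc + beta * c_old;
}

// Hypothetical host-side driver (NOT a oneMKL function): loops over every
// (i, j) of the m x n result. Inside a SYCL kernel, these two loops would be
// replaced by the work-item index.
inline void naive_gemm(std::size_t m, std::size_t n, std::size_t k,
                       float alpha,
                       const float* A, std::size_t ldA,
                       const float* B, std::size_t ldB,
                       float beta,
                       float* C, std::size_t ldC) {
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t i = 0; i < m; ++i) {
            C[i + j * ldC] = gemm_element(i, j, k, alpha, A, ldA, B, ldB,
                                          beta, C[i + j * ldC]);
        }
    }
}
```

A loop like this will not match the performance of the library routine, so for large matrices it is likely better to keep calling oneapi::mkl::blas::gemm from the host between kernels.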
Hi Rajitha,
We assume that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.
Thanks and Regards,
Praneeth Achanta