Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Matrix Q of QR decomposition

Jorge_Lorente
Beginner
Hello,
I need to obtain the matrix Q of a QR decomposition, so I've been using *geqrf followed by *orgqr, and this works well. The problem appears with threaded MKL: I get a good speedup with *geqrf but none with *orgqr. According to the user's manual, *orgqr does not seem to be a threaded function. I have also tested 'dormqr()' with an identity matrix as C and, as expected, I get the same results as with 'dorgqr()'. The manual lists 'dormqr()' as a threaded function, so it gives me matrix Q faster than 'dorgqr()'. To my surprise, though, when I run both functions with sequential MKL, 'dormqr()' is still nearly 3 times faster than 'dorgqr()'. How is this possible if both functions do the same thing and 'dormqr()' additionally performs a matrix-matrix multiplication?
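
For reference, the two call sequences described above can be sketched as follows. This is a minimal illustration, not the original test program: it assumes SciPy's low-level LAPACK wrappers (`scipy.linalg.lapack`) rather than a direct MKL call from C or Fortran, the helper name `thin_q_two_ways` is made up for this example, and C is taken to be the first n columns of the identity so that both routes return the same m-by-n Q.

```python
import numpy as np
from scipy.linalg import lapack


def thin_q_two_ways(a):
    """Return the thin Q of a = QR, built once with dorgqr and once with dormqr."""
    m, n = a.shape

    # Factor A; qr holds R in its upper triangle and the Householder vectors below it.
    qr, tau, _, info = lapack.dgeqrf(a)
    if info != 0:
        raise RuntimeError(f"dgeqrf failed, info={info}")

    # Route 1: dorgqr generates the first n columns of Q directly.
    q_org, _, info = lapack.dorgqr(qr, tau)
    if info != 0:
        raise RuntimeError(f"dorgqr failed, info={info}")

    # Route 2: dormqr applies Q from the left to C = first n columns of I.
    c = np.asfortranarray(np.eye(m, n))
    _, work, _ = lapack.dormqr("L", "N", qr, tau, c, -1)  # workspace query
    q_orm, _, info = lapack.dormqr("L", "N", qr, tau, c, int(work[0]))
    if info != 0:
        raise RuntimeError(f"dormqr failed, info={info}")

    return q_org, q_orm


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q_org, q_orm = thin_q_two_ways(rng.standard_normal((300, 299)))
    print("max |Q_orgqr - Q_ormqr| =", np.abs(q_org - q_orm).max())
```

Passing a full m-by-m identity as C would instead return the full square Q, which is more work than dorgqr does for the thin factor.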

thanks!! :)
Jorge
Gennady_F_Intel
Moderator
Hi Jorge,
1. Yes, you are right. None of the ?(or/un)gqr functions are threaded. We will do that in one of the next updates.
2. Regarding dormqr: what are the task sizes?
--Gennady
Jorge_Lorente
Beginner
I use dorgqr and dormqr with the same matrix dimensions (2250x2249) for matrix Q. Does that matter for their performance?
Thanks Gennady
Gennady_F_Intel
Moderator
For dormqr, these sizes are large enough to see the performance benefits of threading.
Jorge_Lorente
Beginner
I know, but my question is why dormqr is faster (nearly 3 times) than dorgqr when I'm running both with sequential MKL.
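
A small harness along these lines could be used to reproduce the single-threaded comparison. It is only a sketch under several assumptions: it goes through SciPy's LAPACK wrappers, whose backend may or may not be MKL depending on the build; it relies on the `MKL_NUM_THREADS` and `OMP_NUM_THREADS` environment variables, which only take effect if set before the library is loaded; and the function names `best_time` and `compare_q_timings` are made up for this example. Both routines are given their queried workspace size so that neither is held back by a small default.

```python
import os

# Request one thread; this must happen before the BLAS/LAPACK library is loaded.
os.environ.setdefault("MKL_NUM_THREADS", "1")
os.environ.setdefault("OMP_NUM_THREADS", "1")

import time

import numpy as np
from scipy.linalg import lapack


def best_time(fn, repeats=3):
    """Return the smallest wall-clock time of `repeats` calls to fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_q_timings(m, n, repeats=3, seed=0):
    """Time dorgqr against dormqr (C = thin identity) on one m-by-n factorization."""
    rng = np.random.default_rng(seed)
    a = np.asfortranarray(rng.standard_normal((m, n)))
    qr, tau, _, info = lapack.dgeqrf(a)
    if info != 0:
        raise RuntimeError(f"dgeqrf failed, info={info}")

    c = np.asfortranarray(np.eye(m, n))

    # Workspace queries (lwork = -1) for both routines.
    _, work, _ = lapack.dorgqr(qr, tau, lwork=-1)
    lwork_org = int(work[0])
    _, work, _ = lapack.dormqr("L", "N", qr, tau, c, -1)
    lwork_orm = int(work[0])

    t_org = best_time(lambda: lapack.dorgqr(qr, tau, lwork=lwork_org), repeats)
    t_orm = best_time(lambda: lapack.dormqr("L", "N", qr, tau, c, lwork_orm), repeats)
    return t_org, t_orm


if __name__ == "__main__":
    t_org, t_orm = compare_q_timings(2250, 2249)
    print(f"dorgqr: {t_org:.4f} s   dormqr: {t_orm:.4f} s")
```

The wrappers copy their input arrays on each call unless told to overwrite them, so the absolute numbers include some copying overhead that a direct C or Fortran call would not have.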