topic Matrix Q of QR decomposition in Intel® oneAPI Math Kernel Library

Matrix Q of QR decomposition

Jorge_Lorente — Thu, 24 Feb 2011 15:48:58 GMT

Hello,
I need to obtain the matrix Q of a QR decomposition, so I've been using *geqrf followed by *orgqr, and it works well. The problem comes when I use these functions with threaded MKL: I get a good speedup with *geqrf but none with *orgqr. I've checked the user's manual and it seems that *orgqr is not a threaded function. I have also tested 'dormqr()' with an identity matrix as C and, as expected, I get the same results as with 'dorgqr()'. According to the manual, 'dormqr()' is a threaded function, so I get matrix Q faster than with 'dorgqr()'. To my surprise, though, when I run both functions in sequential MKL, 'dormqr()' is nearly 3 times faster than 'dorgqr()'. How is that possible if both functions do the same thing and 'dormqr()' also performs a matrix-matrix multiplication?

thanks!! :)
Jorge

Gennady_F_Intel — Fri, 25 Feb 2011 08:34:55 GMT

Hi Jorge,

1. Yes, you are right. None of the ?(or/un)gqr functions are threaded. We will thread them in one of the next updates.

2. Regarding dormqr: what are the problem sizes?

--Gennady

Jorge_Lorente — Wed, 02 Mar 2011 15:50:12 GMT

I use dorgqr and dormqr with the same dimensions (2250x2249) for matrix Q. Does that matter for their performance?
Thanks Gennady

Gennady_F_Intel — Fri, 11 Mar 2011 17:50:10 GMT

For dormqr, these sizes are large enough to see the performance benefits of threading.

Jorge_Lorente — Mon, 14 Mar 2011 08:48:34 GMT

I know, but my question is why dormqr is nearly 3 times faster than dorgqr when I run both in sequential MKL?