Matrix Q of QR decomposition

Jorge_Lorente · 02-24-2011

Hello,
I need to obtain the matrix Q of a QR decomposition, so I've been using *geqrf followed by *orgqr, and this works well. The problem arises when I use these functions with threaded MKL: I get a good speedup with *geqrf but none with *orgqr. According to the user's manual, *orgqr is not a threaded function. I have also tested 'dormqr()' with the matrix C set to the identity and, as expected, I get the same results as with 'dorgqr()'. According to the manual, 'dormqr()' is threaded, so it gives me Q faster than 'dorgqr()'. To my surprise, however, when I run both functions with sequential MKL, 'dormqr()' is still nearly 3 times faster than 'dorgqr()'. How is that possible if both functions do the same thing and 'dormqr()' additionally performs a matrix-matrix multiplication?
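To make the two routes to Q concrete, here is a minimal pure-Python sketch of the idea, not MKL's implementation. It factors a small matrix with unblocked Householder reflectors (in the spirit of *geqrf), then forms Q once by accumulating the reflectors into the leading columns of the identity (in the spirit of *orgqr) and once by applying Q to an explicit identity block C (in the spirit of *ormqr). The function names `geqrf_like`, `q_orgqr_style` and `q_ormqr_style` are hypothetical, and the sketch says nothing about the relative speed of the real routines.

```python
import math


def apply_reflector(v, tau, c, k, cols):
    """Apply H = I - tau * v * v^T from the left to the given columns of c.

    v is zero above index k, so only rows k..m-1 are touched.
    """
    m = len(c)
    for j in cols:
        s = sum(v[i] * c[i][j] for i in range(k, m))
        for i in range(k, m):
            c[i][j] -= tau * s * v[i]


def geqrf_like(a):
    """Unblocked Householder QR of an m x n matrix (list of rows, m >= n).

    Returns (vs, taus, r) with Q = H_0 H_1 ... H_{n-1} and
    H_k = I - taus[k] * vs[k] * vs[k]^T, where vs[k][k] == 1.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    vs, taus = [], []
    for k in range(min(m, n)):
        x0 = r[k][k]
        norm = math.sqrt(sum(r[i][k] ** 2 for i in range(k, m)))
        v = [0.0] * m
        v[k] = 1.0
        tau = 0.0
        if norm != 0.0:
            beta = -math.copysign(norm, x0)  # new diagonal entry of R
            for i in range(k + 1, m):
                v[i] = r[i][k] / (x0 - beta)
            tau = (beta - x0) / beta
        apply_reflector(v, tau, r, k, range(k, n))
        vs.append(v)
        taus.append(tau)
    return vs, taus, r


def identity_cols(m, n):
    """First n columns of the m x m identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(m)]


def q_orgqr_style(vs, taus, m):
    """Form the first n columns of Q by backward accumulation.

    At step k, columns j < k still hold unit vectors that are zero in
    rows k..m-1, so H_k leaves them unchanged and they can be skipped.
    """
    n = len(vs)
    q = identity_cols(m, n)
    for k in reversed(range(n)):
        apply_reflector(vs[k], taus[k], q, k, range(k, n))
    return q


def q_ormqr_style(vs, taus, m):
    """Form Q * C with C = identity block, treating C as a general matrix."""
    n = len(vs)
    c = identity_cols(m, n)
    for k in reversed(range(n)):
        apply_reflector(vs[k], taus[k], c, k, range(n))
    return c


if __name__ == "__main__":
    a = [[2.0, -1.0, 0.5], [1.0, 3.0, -2.0], [0.0, 1.0, 4.0], [-1.0, 2.0, 1.0]]
    vs, taus, r = geqrf_like(a)
    q1 = q_orgqr_style(vs, taus, len(a))
    q2 = q_ormqr_style(vs, taus, len(a))
    diff = max(abs(q1[i][j] - q2[i][j]) for i in range(4) for j in range(3))
    print("max |Q_orgqr - Q_ormqr| =", diff)
```

Both routes apply the same reflectors in the same order, so they should agree to rounding error; the only difference in this sketch is which columns each step touches.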

thanks!! :)
Jorge

Gennady_F_Intel · 02-25-2011

Hi Jorge,

1. Yes, you are right. None of the ?(or/un)gqr functions are threaded. We will add threading in one of the next updates.

2. Regarding dormqr: what are the problem sizes?

--Gennady

Jorge_Lorente · 03-02-2011

I use dorgqr and dormqr with the same dimensions for matrix Q (2250x2249). Does that matter for their performance?
Thanks Gennady

Gennady_F_Intel · 03-11-2011

For dormqr, these sizes are large enough to see the performance benefits of threading.

Jorge_Lorente · 03-14-2011

I know, but my question is: why is dormqr nearly 3 times faster than dorgqr when I run both with sequential MKL?