Hello,
I need to obtain the matrix Q of a QR decomposition, so I've been
using function *geqrf followed by *orgqr and it performs well. The
problem arises when I use these functions with threaded MKL: I get a good
speedup with *geqrf, but no speedup with *orgqr. I've checked the user's
manual and it seems that *orgqr is not a threaded function.
I have also tested 'dormqr()' with matrix C set to an identity matrix and,
as expected, I get the same results as with 'dorgqr()'. According to
the manual, 'dormqr()' is a threaded function, so I get
matrix Q faster than with 'dorgqr()'. To my surprise, though, when I run
both functions in sequential MKL, 'dormqr()' is nearly 3 times faster than
'dorgqr()'. How is that possible if both functions do the same thing?
Thanks!! :)
Jorge
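To make the two routes in the post above concrete, here is a minimal pure-Python sketch of the underlying math. It is not MKL's or LAPACK's implementation (those are blocked and store the reflectors differently), and the function names `householder_qr`, `form_q_orgqr_style` and `apply_q_ormqr_style` are hypothetical, chosen only for illustration. It assumes a real m x n matrix with m >= n and shows that generating Q directly and applying Q to identity columns give the same matrix.

```python
import math

def apply_reflector(v, tau, c):
    """Overwrite c (a list of rows) with (I - tau * v v^T) c."""
    for col in range(len(c[0])):
        s = tau * sum(v[i] * c[i][col] for i in range(len(c)))
        for i in range(len(c)):
            c[i][col] -= s * v[i]

def householder_qr(a):
    """Householder factorization in the spirit of *geqrf (unblocked).

    Returns (vs, taus, r) with Q = H_0 H_1 ... H_{n-1} and
    H_j = I - taus[j] * vs[j] vs[j]^T. Unlike LAPACK, v_j is not
    scaled so that its leading entry equals 1.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    vs, taus = [], []
    for j in range(n):
        norm = math.sqrt(sum(r[i][j] ** 2 for i in range(j, m)))
        v = [0.0] * m
        tau = 0.0
        if norm > 0.0:
            alpha = -math.copysign(norm, r[j][j])  # sign chosen to avoid cancellation
            v[j] = r[j][j] - alpha
            for i in range(j + 1, m):
                v[i] = r[i][j]
            tau = 2.0 / sum(t * t for t in v)
            apply_reflector(v, tau, r)
        vs.append(v)
        taus.append(tau)
    return vs, taus, r

def form_q_orgqr_style(vs, taus, m):
    """Generate the first k columns of Q, as *orgqr does conceptually.

    Starts from identity columns and uses the fact that H_j only
    touches rows j.. and, at this point, columns j.. of the result.
    """
    k = len(vs)
    q = [[1.0 if i == c else 0.0 for c in range(k)] for i in range(m)]
    for j in reversed(range(k)):
        v, tau = vs[j], taus[j]
        for col in range(j, k):
            s = tau * sum(v[i] * q[i][col] for i in range(j, m))
            for i in range(j, m):
                q[i][col] -= s * v[i]
    return q

def apply_q_ormqr_style(vs, taus, c):
    """Compute Q * c for a general c, as *ormqr('L', 'N', ...) does conceptually."""
    out = [row[:] for row in c]
    for j in reversed(range(len(vs))):
        apply_reflector(vs[j], taus[j], out)
    return out

if __name__ == "__main__":
    a = [[2.0, -1.0, 0.5], [1.0, 3.0, -2.0], [0.0, 4.0, 1.0], [-3.0, 0.5, 2.0]]
    vs, taus, r = householder_qr(a)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(4)]
    q_gen = form_q_orgqr_style(vs, taus, 4)
    q_mul = apply_q_ormqr_style(vs, taus, eye)
    diff = max(abs(q_gen[i][j] - q_mul[i][j]) for i in range(4) for j in range(3))
    print("max |Q_gen - Q_mul| =", diff)
```

Both functions apply the same reflectors in the same order; the first one simply skips entries that are known to be unchanged. The sketch says nothing about the relative speed of the real routines, which depends on how each is blocked and optimized inside the library.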
Hi Jorge,
1. Yes, you are right. None of the ?(or/un)gqr functions are threaded. We will do that in one of the next updates.
2. Regarding dormqr: what are the problem sizes?
--Gennady
I use dorgqr and dormqr with the same dimensions (2250x2249) for matrix Q. Is that important for their performance?
Thanks Gennady
For dormqr, these sizes are large enough to see the performance benefits of threading.
I know, but my question is: why is dormqr nearly 3 times faster than dorgqr if I'm running both in sequential MKL?