topic Matrix Q of QR decomposition in Intel® oneAPI Math Kernel Library
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Matrix-Q-of-QR-decomposition/m-p/794485#M2541
Hello,<BR />I need to obtain the matrix Q of a QR decomposition, so I have been
calling *geqrf followed by *orgqr, and this works well. The problem
appears with threaded MKL: I get a good speedup with *geqrf but no
speedup with *orgqr. According to the user's manual, *orgqr does not
seem to be a threaded function. I have also tested 'dormqr()' with the
matrix C set to the identity matrix and, as expected, I get the same
result as with 'dorgqr()'. The manual lists 'dormqr()' as a threaded
function, so it gives me the matrix Q faster than 'dorgqr()'. To my
surprise, however, when I run both functions with sequential MKL,
'dormqr()' is nearly 3 times faster than 'dorgqr()'. How is this
possible if both functions compute the same thing and 'dormqr()'
additionally performs a matrix-matrix multiplication?<BR /><BR />Thanks!! :)<BR />Jorge<BR />
Jorge_Lorente, Thu, 24 Feb 2011 15:48:58 GMT
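For reference, the identity behind this comparison can be written out for the real (d-prefixed) case. The symbols are assumptions following the usual LAPACK convention, not notation from the post: $v_i$ and $\tau_i$ are the Householder vectors and scalars returned by *geqrf, and $k$ is the number of reflectors.

```latex
Q = H_1 H_2 \cdots H_k, \qquad H_i = I - \tau_i \, v_i v_i^{T}

\text{*orgqr:}\quad \text{forms the leading columns of } Q \text{ explicitly}

\text{*ormqr with } C = I:\quad Q\,C = Q\,I = Q
```

This only restates why the two calls return the same matrix; it does not by itself say anything about their relative speed.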
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Matrix-Q-of-QR-decomposition/m-p/794486#M2542
Hi Jorge,<DIV>1. Yes, you are right. None of the ?(or/un)gqr functions are threaded. We will thread them in one of the next updates.</DIV><DIV>2. Regarding dormqr: what are the problem sizes?</DIV><DIV>--Gennady</DIV>
Gennady_F_Intel, Fri, 25 Feb 2011 08:34:55 GMT
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Matrix-Q-of-QR-decomposition/m-p/794487#M2543
I call dorgqr and dormqr with the same dimensions (2250x2249) for the matrix Q. Does the size matter for their performance?<BR />Thanks, Gennady
Jorge_Lorente, Wed, 02 Mar 2011 15:50:12 GMT
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Matrix-Q-of-QR-decomposition/m-p/794488#M2544
<DIV id="_mcePaste">For dormqr, these sizes are large enough to see the performance benefits of threading.</DIV>
Gennady_F_Intel, Fri, 11 Mar 2011 17:50:10 GMT
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Matrix-Q-of-QR-decomposition/m-p/794489#M2545
I know, but my question is: why is dormqr nearly 3 times faster than dorgqr when I run both with sequential MKL?
Jorge_Lorente, Mon, 14 Mar 2011 08:48:34 GMT
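The premise of the question, that both routes apply the same reflectors and arrive at the same Q, can be checked on a small example. The sketch below is plain Python and does not call MKL or LAPACK: the function names (`householder_qr`, `apply_q`, `identity`) are made up for illustration, and it uses unblocked, full-length reflectors rather than LAPACK's compact storage. It says nothing about the timing gap itself, since MKL's internals are not described in this thread.

```python
import math

def apply_reflector(v, tau, c):
    """Overwrite c with (I - tau * v v^T) c, one column at a time."""
    for col in range(len(c[0])):
        w = tau * sum(v[i] * c[i][col] for i in range(len(v)))
        for i in range(len(v)):
            c[i][col] -= w * v[i]

def householder_qr(a):
    """Return (vs, taus, r) such that H_k ... H_1 a = r (upper triangular)."""
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]
    vs, taus = [], []
    for j in range(min(m, n)):
        norm = math.sqrt(sum(r[i][j] ** 2 for i in range(j, m)))
        v = [0.0] * m
        tau = 0.0
        if norm > 0.0:
            # Choose the sign that avoids cancellation in v[j].
            v[j] = r[j][j] + math.copysign(norm, r[j][j])
            for i in range(j + 1, m):
                v[i] = r[i][j]
            tau = 2.0 / sum(t * t for t in v)
            apply_reflector(v, tau, r)
        vs.append(v)
        taus.append(tau)
    return vs, taus, r

def apply_q(vs, taus, c):
    """Overwrite c with Q c, where Q = H_1 H_2 ... H_k (H_k is applied first)."""
    for v, tau in zip(reversed(vs), reversed(taus)):
        apply_reflector(v, tau, c)
    return c

def identity(m, n):
    """Leading m-by-n block of the identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(m)]

# Small illustrative matrix (made-up values), m = 4 rows, n = 3 columns.
a = [[2.0, -1.0, 0.5],
     [1.0, 3.0, -2.0],
     [0.0, 1.0, 4.0],
     [-1.0, 2.0, 1.0]]
m, n = len(a), len(a[0])
vs, taus, r = householder_qr(a)

# Route 1, in the spirit of *orgqr: build the first n columns of Q.
q_thin = apply_q(vs, taus, identity(m, n))
# Route 2, in the spirit of *ormqr with C = I: compute Q times the m-by-m identity.
q_full = apply_q(vs, taus, identity(m, m))

max_diff = max(abs(q_thin[i][j] - q_full[i][j])
               for i in range(m) for j in range(n))
print("max difference on the shared columns:", max_diff)
```

In this sketch the two results coincide on the first n columns, and the identity route additionally returns the remaining m - n columns of the full Q. The real routines use blocked algorithms, so the sketch reflects only the mathematics, not the performance.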