topic Hi Nimrod, in Intel® oneAPI Math Kernel Library
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inverse-FLOPS/m-p/969575#M16465
<P>Hi Nimrod,</P>
<P>The approximate flop count for (S/D)POTRF is 1/3*N^3 and for (S/D)POTRI is 2/3*N^3; for the complex case these are multiplied by four.<BR />
More precise formulas for the complex case, which make sense for such a small size, are:</P>
<P>CPOTRF_FLOPS = 6 * N * (N * (N * 1./6. + .5) + 1./3.) + 2 * N * 1./6. * (N * N - 1.);</P>
<P>CPOTRI_FLOPS = 6 * N * (N * (N * 1./3. + 1.) + 2./3.) + 2 * N * (N * (N * 1./3. - .5) + 1./6.)</P>
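<P>As a quick check, the formulas above can be evaluated for the 16x16 case from the question. The following is only a sketch: the function names <code>cpotrf_flops</code>, <code>cpotri_flops</code> and <code>approx_flops</code> are made up for illustration and are not part of MKL; the expressions themselves are copied from the formulas above.</P>

```python
def cpotrf_flops(n):
    """Flop count for CPOTRF, transcribed from the formula quoted above."""
    return 6 * n * (n * (n * 1./6. + .5) + 1./3.) + 2 * n * 1./6. * (n * n - 1.)


def cpotri_flops(n):
    """Flop count for CPOTRI, transcribed from the formula quoted above."""
    return 6 * n * (n * (n * 1./3. + 1.) + 2./3.) + 2 * n * (n * (n * 1./3. - .5) + 1./6.)


def approx_flops(n):
    """Leading-order estimates: the real-case counts (1/3*N^3, 2/3*N^3) times four."""
    potrf = 4 * (1. / 3.) * n ** 3
    potri = 4 * (2. / 3.) * n ** 3
    return potrf, potri


if __name__ == "__main__":
    n = 16  # matrix order from the question (16x16 MKL_Complex8)
    exact_f, exact_i = cpotrf_flops(n), cpotri_flops(n)
    approx_f, approx_i = approx_flops(n)
    print(f"CPOTRF: precise {exact_f:.0f}, approximate {approx_f:.0f}")
    print(f"CPOTRI: precise {exact_i:.0f}, approximate {approx_i:.0f}")
    print(f"Total : precise {exact_f + exact_i:.0f}, approximate {approx_f + approx_i:.0f}")
```

<P>For N = 16 this gives 6256 flops for CPOTRF and 12272 for CPOTRI (18528 in total), compared with roughly 5461 and 10923 from the leading-order estimates, so the lower-order terms are not negligible at this size.</P>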
<P> </P>
<P>There is usually a performance difference between 32-bit and 64-bit code, which comes from the richer set of registers in the Intel 64 architecture and other improvements in the x86-64 Application Binary Interface (ABI).</P>
<P>Unfortunately, I don't have clock counts for these functions.</P>
<P> </P>
<P>W.B.R., Alexander<BR />
</P>Thu, 03 Apr 2014 09:48:34 GMT Alexander_K_Intel3 2014-04-03T09:48:34Z
matrix inverse FLOPS
https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inverse-FLOPS/m-p/969574#M16464
<P>Hi,</P>
<P>How many floating-point operations are required to invert a 16x16 MKL_Complex8 matrix using cpotrf and then cpotri?</P>
<P>How many CPU clock cycles should it take on an Atom E3826 CPU and an i5-3470 CPU?</P>
<P>Is there any performance difference between a 32-bit Linux operating system and a 64-bit Linux operating system (for those specific CPUs)?</P>
<P>Thanks, Nimrod</P>
<P> </P>Thu, 03 Apr 2014 05:22:32 GMThttps://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/matrix-inverse-FLOPS/m-p/969574#M16464Nimrod_H_2014-04-03T05:22:32Z