Inverting a very small matrix and MKL help

dboggs · ‎03-08-2018

My application will need to invert many thousands of small (2x2, 3x3, 4x4) matrices. Regrettably, F90 includes only a few matrix ops but not inverse, not even determinant.

For the 2x2 and 3x3 versions I am seriously considering coding this using a "by hand" algorithm much like we learned in elementary linear algebra, using the adjoint matrix and determinate. Code for this is farily simple and can be found on line, along with claims that it is faster than LAPACK--which has a lot of overhead and is really designed for efficiency on much much much larger matrices. Does anyone have experience with this?

In the grander scheme of things, I WOULD like to try MKL just for the learning and experience. I have never used MKL before. Doing this for the first time, trying to follow the Intel documentation, is daunting. Can some one give me a simple example of using say the MKL > LAPACK subroutine ?pftri, including the necessary setup steps (e.g. set project properties to link with the necessary MKL library, but I think there must be more), USE and INCLUDE files, etc? Surely it is simpler than it seems for a novice going at it the first time.

And btw, I'm well aware of the advice to not use the matrix inverse method to solve a system of linear equations. I'm not doing that and I really want to get the matrix inverses--so please dispense with that lecture!

mecej4 · ‎03-08-2018

Are the matrices symmetric? Positive definite?

Have you looked at the MKL example source codes that are included as Zip files with the MKL or Parallel Studio distributions?

Once you have found a suitable example for the problem type that you have, you may find the MKL Link Line Advisor tool helpful. For selecting the Lapack routine to use, see https://software.intel.com/en-us/articles/intel-mkl-function-finding-advisor: .https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/ . Compiling and linking can be as simple as adding /Qmkl to your usual ifort command line.

Brian_Murphy · ‎03-08-2018

Put this in a module, and you can do Ainv = .i. A

 use BLAS95
 use LAPACK95
 use mkl_pardiso
 use F95_PRECISION, only: WP => DP 
 implicit none

 interface operator(.i.)
  module procedure myInvert
 end interface

  function myInvert(a)
   real(WP), dimension(:,:), intent(in) :: a
   real(WP), dimension(size(a,1),size(a,2)) :: myInvert
   integer, dimension(size(a,1)) :: ipiv
   integer :: info
   myInvert = a
   call getrf(myInvert, ipiv, info)
   call getri(myInvert, ipiv, info)
  end function myInvert

dboggs · ‎03-09-2018

Thanks Brian, this is the type of example I think I needed. I tried to condense it to the bare bones, as follows:

   PROGRAM MKLinvert
      use blas95
      USE LAPACK95 !, only: getrf
      !use mkl95_lapack ! appears in Intel mkl samples
      !USE mkl_pardiso
      USE F95_PRECISION, ONLY: WP => DP
      IMPLICIT NONE
      
      integer ipiv(2), info
      
      REAL(WP)  :: A(2, 2)
      
      A(1,1) = 10.
      A(1,2) = 3.
      A(2,1) = A(1,2)
      A(2,2) = 5.
      
      CALL getrf (A, ipiv, info)
      !CALL getri (A)
      
      PRINT *, A
   
   END PROGRAM MKLinvert

This will not compile, giving me error unresolved external symbol _DGETRF_F95 referenced in function _MAIN_. I have tried specifying the path to the file lapack.f90 (where the interfaces in LAPACK95 are defined), as well as placing this file directly in my project directory. Also, I suspect that USE BLAS95 is not needed, and I don't know about USE mkl_pardiso. Have tried all combinations.

Any ideas?

mecej4 · ‎03-09-2018

Here is a corrected version of your program. [Anyone reading this thread later should note that this approach (computing the inverse explicitly) should not be used for solving simultaneous linear equations. The Lapack95 interface is compact and convenient, but behind the scenes the calls get converted to Lapack calls, with scratch arrays allocated, used and deallocated with every call to GETRF and GETRI.]

File mkl_inv.f90:

PROGRAM MKLinvert
   USE LAPACK95, only: getrf, getri
   USE F95_PRECISION, ONLY: DP
   IMPLICIT NONE
   
   integer ipiv(2), info
   
   REAL(DP)  :: A(2, 2)
   
   A = reshape([10d0, 3d0, 3d0, 5d0], [2,2])
   
   CALL getrf (A, ipiv, info)
   CALL getri (A, ipiv)
   
   PRINT '(2ES12.3)', A

END PROGRAM MKLinvert

Compile, link and run (showing IA32 version):

> ifort /Qmkl mkl_inv.f90 mkl_lapack95.lib
> mkl_inv
   1.220E-01  -7.317E-02
  -7.317E-02   2.439E-01

TimP · ‎03-10-2018

For performance on small matrices, you would try the MKL_DIRECT_CALL facility. Still, MKL may not show advantage until the problem is large enough to benefit from both multi-threading and AVX.

mecej4 · ‎03-10-2018

If the performance of the matrix inversion is considered important, you should spend some time studying the nature of the matrices to see if there are patterns that can be exploited.

For example, are the 4 X 4 matrices block-diagonal? ABD (Almost Block Diagonal)? There are special methods for such matrices. Would tearing algorithms speed up the calculation?

dboggs · ‎03-10-2018

mecej4: Thank you for this "corrected version," although I don't see what you "corrected"--can you point out what was wrong? Granted, I did not show the compile library or the link library. In fact I didn't realize I had to link with an mkl library at all! Thanks for providing this info. It's difficult if not impossible to figure this out from the documentation. Much better to ask on this forum.

But I now have it running, still with some trial and error. I do have a couple of comments that I would like to be addressed:

(1) I should have noted that I am using VS. It's kind of odd that VS will lead you through the mkl library (Project > Properties > Fortran > Libraries, but there is nothing like that under Properties > Linker. Hence my omission.

(2) That said, when assigning the mkl library in the > Fortran > Libraries section, the only options are /Qmkl: parallel, /Qmkl:sequential, and /Qmkl:clustered. I had no idea so I just guessed. Now I see the only way to get it right is to avoid this and add /Qmkl manually as an Additional Command Line Option. Is there a better suggestion?

(3) Similarly, I specified the linker library mkl_lapack95.lib as an Additional Dependencies (manual command line option would have worked also).

(4) I find it is not necessary to give the path to either of these libraries. How does ifort know where they are?

(5) The MKL Link Line Advisor tool was not very helpful. It asked several questions that I didn't understand; the suggested link line included (correctly) mkl_lapack95.lib, and also three other libs that are not needed. It also indicated the use of two compiler options specifying the path of two include directories that are not needed.

(6) The mkl sample programs are also not very helpful. They give the needed USE statements, but not the compile and link specs.

Sorry about all these questions. I'm still trying to learn and frustrated about the difficulty of getting it from the documentation.

Also fwiw: the matrices are always symmetric and positive-definite. Not very sparse (but they are so small it hardly matters). I will consider switching to some other method if they get larger than 10 or so.

mecej4 · ‎03-10-2018

If your matrices will always be positive-definite, you should be using POTRF and POTRI instead, since inversion of a pos. def. matrix can be done in about 50 percent of the time required for inverting a general matrix.

What did I correct? I removed the comment mark and added the second argument (IPIV) in the line that called GETRI. I removed the unnecessary USE statements.

The Intel Fortran and C compilers have lots of knobs, bells and whistles. MKL is a huge library and adds to the complexity of the compile/link process. Therefore, if MKL is not needed in an application, it is not linked in. That is why switches such as /Qmkl (and corresponding settings in VS) are provided.

An Ifort command window and the Intel-customized VS integrations are configured to know where the MKL (and IMSL, if installed) components (include files, module files, libraries and runtime DLLs) are located.

dboggs · ‎03-10-2018

Yes, thank you. My careless omission of ipiv in getri was deadly, and the compile error lead me to believe that the proper things were simply not getting linked. It happened because ipiv IS described as optional in getrf. Curiously, I believe the documentation is in error. If my program has (A, B are identically declared arrays)

B = A
CALL getrf (B) ! note "optional" argument ipiv missing
CALL getri (B)
PRINT *, A

the print statement shows that half of matrix A has been overwritten by part of matrix B.

I also tried potrf and potri. Very easy to do, since I now know how. And indeed I see in the documentation that the no. of flops in potri is half of getri. Thanks for the tip.

mecej4 · ‎03-11-2018

I can see that the way the documentation is organized can lead to misinterpretation when a number of MKL routines are to be called in a coordinated sequence.

The documentation of GETRI ( https://software.intel.com/en-us/mkl-developer-reference-fortran-getri ;)shows clearly that IPIV is a required argument, that it is an input argument, and says that IPIV is "as returned by GETRF". It is at this point that a little application of logic becomes helpful. Where/how is the user to get IPIV defined, if (s)he forgot to ask for it in the preceding call to GETRF? The values in IPIV are generated as part of the factorization algorithm.

There may be other use cases where a call to GETRF is not followed by a call to GETRI, and it is conceivable that GETRF is not asked to return IPIV in such a situation.

Note that in the positive definite case partial pivoting is not necessary, which is why the argument lists of POTRF and POTRI do not contain IPIV.

mecej4 · ‎03-11-2018

dboggs wrote:
.... If my program has (A, B are identically declared arrays)

B = A
CALL getrf (B) ! note "optional" argument ipiv missing
CALL getri (B)
PRINT *, A

the print statement shows that half of matrix A has been overwritten by part of matrix B.

If the compiler accepted the second call, with a single argument, you probably did not provide the required explicit interface to GETRI. With "USE LAPACK95, only: getrf, getri", and those four lines of code replacing the corresponding lines in the previous version, I get

Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on IA-32, Version 18.0.1.156 Build 20171018
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

boggs2.f90(14): error #6285: There is no matching specific subroutine for this generic subroutine call.   [GETRI]
   CALL getri (B)
--------^
compilation aborted for boggs2.f90 (code 1)

You have to be much more careful with the argument lists to these Fortran 95 interfaces to Lapack and BLAS than with the corresponding F77 calls. Many of the former have optional arguments and, therefore, require explicit interfaces. If the interface is not provided, the compilation may go through with an implicit (and incorrect) interface, but you will probably see linker errors or, worse, bewildering failures at run time.

dboggs · ‎03-14-2018

Thank you for these clear explanations. I appreciate it.