- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I need to calculate Singula value decomposition of matrix of max size 16x8 (but very often, let say 10000 for different matrices). I use it to calculate least squares interpation
When I use GESVD routine (mkl95_LAPACK module) it is about 2x slower than SVD from numerical recipes.
I think that problem can by in GESVD which each time allocates internal arrays. But declaration of GESVD for FORTRAN95
call gesvd(a, s [,u] [,vt] [,ww] [,job] [,info])
does not contain possibility to specify working array.
FORTRAN77 version have this possibility
call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
Is it possible speed up FORTRAN95 interface or I should use FORTRAN77
Jakub
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
the F95 interface is nothing more than, well, interface to F77 routine. So there is nothing you can speed up in F95 version. I believe, you're left with 2 choices: (a) just use F77 (b) write your own (extend existing) wrapper to F95 and provide working array as argument.
A.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,Jakub,
You can find the code of the F95 dgesvd interface in the MKL release
MKLROOT/interfaces/lapack95/source/dgesvd.f90
As you can see its really nothing more than awrapper over the F77 routine.
So to get the high performance you should call theoriginal F77 routine.
Thanks,
Vladimir
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Jakub, one more comment to your question:
because of the inputmatrices are very small,I wouldrecommendyou to try IPP's functionality. Please refer to the IPP image processing manual, You can find there the simialr functionality ( e.g.ippsSVD(, ,,,,,,) which performs single value decompositin of matrix). IPP functionality is better optimized for small sizes, but supports only C API.
--Gennady
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I wrote my GESVD calling routine
[fortran]SUBROUTINE DGESVD_MKL(A,M,N,LDA,S,VT,LDVT,LWORK,INFO)
IMPLICIT NONE
INTEGER, PARAMETER :: WP = KIND(1.0D0)
INTEGER, INTENT(OUT):: INFO
REAL(WP), INTENT(INOUT) :: A(:,:)
REAL(WP), INTENT(OUT) :: S(:)
REAL(WP), INTENT(OUT),TARGET :: VT(:,:)
CHARACTER(LEN=1) :: JOBU
CHARACTER(LEN=1) :: JOBVT
INTEGER :: M
INTEGER :: N
INTEGER :: LDA
INTEGER :: LDU
INTEGER :: LDVT
INTEGER :: LWORK
REAL(WP), POINTER :: O_U(:,:)
REAL(WP) WORK(LWORK)
LDU = 1
JOBU = 'O'
JOBVT = 'A'
CALL F77_GESVD(JOBU,JOBVT,M,N,A,LDA,S,O_U,LDU,VT,LDVT,WORK,LWORK,INFO)
if (INFO /=0) then
CALL F77_XERBLA('GESVD',INFO)
endif
END SUBROUTINE DGESVD_MKL[/fortran]
and used following calling routine
[sectionBody]subroutine testspeedmklF77(A,B,x,m,n)
USE mkl95_PRECISION, ONLY: WP => DP
use MKL_SVD
use mkl95_blas , only: GEMV
IMPLICIT NONE
INTEGER :: I, J, INFO, M, N
real*8 A(m,n),B(m),x(m)
real*8 S(min(m,n)),VT(n,n),aux(m)
real*8 AA(m,n)
integer*4 lda,ldvt,lwork
AA=A
!M = SIZE(A,1)
!N = SIZE(A,2)
!LDA = MAX(1,SIZE(A,1))
!LDVT = MAX(1,SIZE(VT,1))
LDA=M
LDVT=N
lwork=max(3*min(m, n)+max(m, n), 5*min(m,n))
CALL DGESVD_MKL( AA,m,n,LDA,S,VT,LDVT,lWork,INFO ) ! aa vraci u
call GEMV(aa,b,aux,trans='T')
aux=aux/S
call GEMV(vt,aux,x,trans='T')
end subroutine
[/sectionBody]
- I obtained following results for 100000 calls to SVD
- Numerical recipes routine - 0.22 sec
- MKL FORTRAN 95 wrapper - 1.05 sec
- MKL my F77 wrapper - 0.87 sec
- So MKL is about 4 times slower. I think that it is due to small matrix and probably MKL internal allocations.
- I add my test code.
- Jakub
[sectionBody][/sectionBody]
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I did not found ippsSVD function in "Reference Manual, Volume 2: Image and Video Processing".
Jakub
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
sorry, my fault - this is signal processing manual ( ippsman.chm)
and as an additionally,see, ippsr.h e,g:IPPAPI(IppStatus, ippsSVD_64f_D2,(const Ipp64f* pSrcA, Ipp64f* pDstU, int height,
--Gennady
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I downloaded evaluation copy of IPP and tried to call it
[fortran]subroutine testspeedipp(A,B,x,m,n) USE mkl95_PRECISION, ONLY: WP => DP use mkl95_blas , only: GEMV IMPLICIT NONE interface !ippsSVD_64f_D2L_I(Ipp64f** mSrcDstA, int height, Ipp64f* pDstW,Ipp64f** mDstV, int width, int nIter); integer*4 function ippsSVD_64f_D2L_I(mSrcDstA,height,pDstW,mDstV,width,nIter) !DEC$ IF DEFINED(_X86_) !DEC$ ATTRIBUTES STDCALL, ALIAS : '_ippsSVD_64f_D2L_I@24' :: ippsSVD_64f_D2L_I !DEC$ ELSE !DEC$ ATTRIBUTES STDCALL, ALIAS : 'ippsSVD_64f_D2L_I' :: ippsSVD_64f_D2L_I !DEC$ ENDIF !DEC$ ATTRIBUTES REFERENCE :: mSrcDstA,mDstV,pDstW !DEC$ ATTRIBUTES VALUE :: height,width,nIter use ifwinty integer*4 height,width,nIter real*8 mSrcDstA(width,height) real*8 mDstV(width,width) real*8 pDstW(width) end function end interface INTEGER*4 :: M, N real*8 A(m,n),B(m),x(m) real*8 S(min(m,n)),V(n,n),aux(m) real*8 AA(m,n) integer*4 iret,nIter AA=A nIter=60 iret=ippsSVD_64f_D2L_I(AA,m,s,v,n,nIter) call GEMV(aa,b,x,trans='T') aux=x/S call GEMV(v,aux,x,trans='N') end [/fortran]But I obtained error message
Unhandled exception at 0x0245b7ac in Program.exe: 0xC0000005: Access violation reading location 0x00000000.
when I tried to pass line calling ippsSVD_64f_D2L_I
I suppose that routine ippsSVD_64f_D2L_I uses STDCALL calling convention so I defined my FORTRAN interface correctly.
I called testspeedipp(A,B,x,m,n) routine with A(1:5,1:5),B(1:5),x(1:5) and m=n=5 values.
Where can be problem?
Jakub
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I suppose that routine ippsSVD_64f_D2L_I uses STDCALL calling convention so I defined my FORTRAN interface correctly.
___________________
Unlikely;you could easily find out by dumpbin or equivalent.
May require iso_c_binding, as it isn't written to accomodate Fortran.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I do not understand why to use dumpbin on ipp library. What I am able to find with?
I tried to use
[fortran] interface !ippsSVD_64f_D2L_I(Ipp64f** mSrcDstA, int height, Ipp64f* pDstW,Ipp64f** mDstV, int width, int nIter); integer*4 function ippsSVD_64f_D2L(mSrcDstA,height,pDstW,mDstV,width,nIter) bind(C,name="ippsSVD_64f_D2L_I@24") !DEC$ ATTRIBUTES REFERENCE :: mSrcDstA,mDstV,pDstW !DEC$ ATTRIBUTES VALUE :: height,width,nIter use ifwinty use, intrinsic :: iso_c_binding integer(C_INT):: height,width,nIter real(C_DOUBLE):: mSrcDstA(width,height) real(C_DOUBLE) mDstV(width,width) real(C_DOUBLE) pDstW(width) end function end interface [/fortran]But I stil did not succeeded.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Tim's suggestion is to use dumpbin to get the exported symbols to check the calling convention related issues.
A.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page