- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
I need to calculate Singula value decomposition of matrix of max size 16x8 (but very often, let say 10000 for different matrices). I use it to calculate least squares interpation
When I use GESVD routine (mkl95_LAPACK module) it is about 2x slower than SVD from numerical recipes.
I think that problem can by in GESVD which each time allocates internal arrays. But declaration of GESVD for FORTRAN95
call gesvd(a, s [,u] [,vt] [,ww] [,job] [,info])
does not contain possibility to specify working array.
FORTRAN77 version have this possibility
call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
Is it possible speed up FORTRAN95 interface or I should use FORTRAN77
Jakub
Link kopiert
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
Hi,
the F95 interface is nothing more than, well, interface to F77 routine. So there is nothing you can speed up in F95 version. I believe, you're left with 2 choices: (a) just use F77 (b) write your own (extend existing) wrapper to F95 and provide working array as argument.
A.
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
Hello,Jakub,
You can find the code of the F95 dgesvd interface in the MKL release
MKLROOT/interfaces/lapack95/source/dgesvd.f90
As you can see its really nothing more than awrapper over the F77 routine.
So to get the high performance you should call theoriginal F77 routine.
Thanks,
Vladimir
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
Jakub, one more comment to your question:
because of the inputmatrices are very small,I wouldrecommendyou to try IPP's functionality. Please refer to the IPP image processing manual, You can find there the simialr functionality ( e.g.ippsSVD(, ,,,,,,) which performs single value decompositin of matrix). IPP functionality is better optimized for small sizes, but supports only C API.
--Gennady
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
I wrote my GESVD calling routine
[fortran]SUBROUTINE DGESVD_MKL(A,M,N,LDA,S,VT,LDVT,LWORK,INFO)
IMPLICIT NONE
INTEGER, PARAMETER :: WP = KIND(1.0D0)
INTEGER, INTENT(OUT):: INFO
REAL(WP), INTENT(INOUT) :: A(:,:)
REAL(WP), INTENT(OUT) :: S(:)
REAL(WP), INTENT(OUT),TARGET :: VT(:,:)
CHARACTER(LEN=1) :: JOBU
CHARACTER(LEN=1) :: JOBVT
INTEGER :: M
INTEGER :: N
INTEGER :: LDA
INTEGER :: LDU
INTEGER :: LDVT
INTEGER :: LWORK
REAL(WP), POINTER :: O_U(:,:)
REAL(WP) WORK(LWORK)
LDU = 1
JOBU = 'O'
JOBVT = 'A'
CALL F77_GESVD(JOBU,JOBVT,M,N,A,LDA,S,O_U,LDU,VT,LDVT,WORK,LWORK,INFO)
if (INFO /=0) then
CALL F77_XERBLA('GESVD',INFO)
endif
END SUBROUTINE DGESVD_MKL[/fortran]
and used following calling routine
[sectionBody]subroutine testspeedmklF77(A,B,x,m,n)
USE mkl95_PRECISION, ONLY: WP => DP
use MKL_SVD
use mkl95_blas , only: GEMV
IMPLICIT NONE
INTEGER :: I, J, INFO, M, N
real*8 A(m,n),B(m),x(m)
real*8 S(min(m,n)),VT(n,n),aux(m)
real*8 AA(m,n)
integer*4 lda,ldvt,lwork
AA=A
!M = SIZE(A,1)
!N = SIZE(A,2)
!LDA = MAX(1,SIZE(A,1))
!LDVT = MAX(1,SIZE(VT,1))
LDA=M
LDVT=N
lwork=max(3*min(m, n)+max(m, n), 5*min(m,n))
CALL DGESVD_MKL( AA,m,n,LDA,S,VT,LDVT,lWork,INFO ) ! aa vraci u
call GEMV(aa,b,aux,trans='T')
aux=aux/S
call GEMV(vt,aux,x,trans='T')
end subroutine
[/sectionBody]
- I obtained following results for 100000 calls to SVD
- Numerical recipes routine - 0.22 sec
- MKL FORTRAN 95 wrapper - 1.05 sec
- MKL my F77 wrapper - 0.87 sec
- So MKL is about 4 times slower. I think that it is due to small matrix and probably MKL internal allocations.
- I add my test code.
- Jakub
[sectionBody][/sectionBody]
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
I did not found ippsSVD function in "Reference Manual, Volume 2: Image and Video Processing".
Jakub
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
sorry, my fault - this is signal processing manual ( ippsman.chm)
and as an additionally,see, ippsr.h e,g:IPPAPI(IppStatus, ippsSVD_64f_D2,(const Ipp64f* pSrcA, Ipp64f* pDstU, int height,
--Gennady
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
I downloaded evaluation copy of IPP and tried to call it
[fortran]subroutine testspeedipp(A,B,x,m,n) USE mkl95_PRECISION, ONLY: WP => DP use mkl95_blas , only: GEMV IMPLICIT NONE interface !ippsSVD_64f_D2L_I(Ipp64f** mSrcDstA, int height, Ipp64f* pDstW,Ipp64f** mDstV, int width, int nIter); integer*4 function ippsSVD_64f_D2L_I(mSrcDstA,height,pDstW,mDstV,width,nIter) !DEC$ IF DEFINED(_X86_) !DEC$ ATTRIBUTES STDCALL, ALIAS : '_ippsSVD_64f_D2L_I@24' :: ippsSVD_64f_D2L_I !DEC$ ELSE !DEC$ ATTRIBUTES STDCALL, ALIAS : 'ippsSVD_64f_D2L_I' :: ippsSVD_64f_D2L_I !DEC$ ENDIF !DEC$ ATTRIBUTES REFERENCE :: mSrcDstA,mDstV,pDstW !DEC$ ATTRIBUTES VALUE :: height,width,nIter use ifwinty integer*4 height,width,nIter real*8 mSrcDstA(width,height) real*8 mDstV(width,width) real*8 pDstW(width) end function end interface INTEGER*4 :: M, N real*8 A(m,n),B(m),x(m) real*8 S(min(m,n)),V(n,n),aux(m) real*8 AA(m,n) integer*4 iret,nIter AA=A nIter=60 iret=ippsSVD_64f_D2L_I(AA,m,s,v,n,nIter) call GEMV(aa,b,x,trans='T') aux=x/S call GEMV(v,aux,x,trans='N') end [/fortran]But I obtained error message
Unhandled exception at 0x0245b7ac in Program.exe: 0xC0000005: Access violation reading location 0x00000000.
when I tried to pass line calling ippsSVD_64f_D2L_I
I suppose that routine ippsSVD_64f_D2L_I uses STDCALL calling convention so I defined my FORTRAN interface correctly.
I called testspeedipp(A,B,x,m,n) routine with A(1:5,1:5),B(1:5),x(1:5) and m=n=5 values.
Where can be problem?
Jakub
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
I suppose that routine ippsSVD_64f_D2L_I uses STDCALL calling convention so I defined my FORTRAN interface correctly.
___________________
Unlikely;you could easily find out by dumpbin or equivalent.
May require iso_c_binding, as it isn't written to accomodate Fortran.
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
I do not understand why to use dumpbin on ipp library. What I am able to find with?
I tried to use
[fortran] interface !ippsSVD_64f_D2L_I(Ipp64f** mSrcDstA, int height, Ipp64f* pDstW,Ipp64f** mDstV, int width, int nIter); integer*4 function ippsSVD_64f_D2L(mSrcDstA,height,pDstW,mDstV,width,nIter) bind(C,name="ippsSVD_64f_D2L_I@24") !DEC$ ATTRIBUTES REFERENCE :: mSrcDstA,mDstV,pDstW !DEC$ ATTRIBUTES VALUE :: height,width,nIter use ifwinty use, intrinsic :: iso_c_binding integer(C_INT):: height,width,nIter real(C_DOUBLE):: mSrcDstA(width,height) real(C_DOUBLE) mDstV(width,width) real(C_DOUBLE) pDstW(width) end function end interface [/fortran]But I stil did not succeeded.
- Als neu kennzeichnen
- Lesezeichen
- Abonnieren
- Stummschalten
- RSS-Feed abonnieren
- Kennzeichnen
- Anstößigen Inhalt melden
Tim's suggestion is to use dumpbin to get the exported symbols to check the calling convention related issues.
A.

- RSS-Feed abonnieren
- Thema als neu kennzeichnen
- Thema als gelesen kennzeichnen
- Diesen Thema für aktuellen Benutzer floaten
- Lesezeichen
- Abonnieren
- Drucker-Anzeigeseite