Re: Looking for fast exponential function

Intel_C_Intel · ‎06-24-2001

Hi, all,
I have been using the profiler to check function times, and the results are below. They show that my program calls the exponential function a lot. Is there any documentation for this function? If the algorithm is slow, I would try to find a faster version. I also don't understand the difference between FIexp and FIIfexp; FIexp shows up if I turn argument checking on. One other thing: why does my main program show up twice in the profile (first and second lines)? If the first two lines represent the same thing, then the total function time would be less than 100%. I am really puzzled. Can someone enlighten me? Thanks a lot.

NP

     Func          Func+Child             Hit
     Time     %         Time     %      Count  Function
---------------------------------------------------------------------------
17218.950  22.4    59671.473  77.6          1  _MAIN$VFID03@0 (vfid03.obj)
17218.950  22.4    59671.473  77.6          1  _MAIN__ (vfid03.obj)
12131.031  15.8    12131.031  15.8   28595273  __FIexp (intrin.obj)
10852.623  14.1    10852.623  14.1   26621304  __FIIfexp_ (intrini.obj)
10465.588  13.6    12024.093  15.6     110904  _DSVRGP@16 (dsvrgp.obj)
 7056.773   9.2    19080.866  24.8     110904  _INTERP1@24 (interp1.obj)
 1359.777   1.8     1359.777   1.8     110904  _DCOPY@20 (dcopy.obj)
  217.451   0.3     1104.492   1.4          2  _DEBTPRICE@104 (debtprice.obj)

Steven_L_Intel1 · ‎06-24-2001

The algorithm isn't slow, but it is called as a routine rather than generated inline. The compiler can generate inline instructions for this operation if the Math Library:Fast option is enabled. Turning on argument checking makes no difference here.

FIIfexp is called when Check:Power is enabled, FIfexp if that is disabled. Check:Power disallows 0**0.

Steve

TimP · ‎06-25-2001

The inlined code obtained with /fast is the best you can do on processors other than the P4, unless you have a special case. For the P4, there are methods using SSE code that are faster, while the internal intrinsics carried over from earlier processors are not so competitive.

isn-removed200637 · ‎06-26-2001

If you want to speed things up you can do several things:
a) Improve your code to call EXP less often.
b) Sacrifice accuracy and use a faster routine.

For example, if your arguments are integer multiples of a real number X, save EXP(X) (or EXP(-X)) in a local variable and then raise it to the required integer power.
Alternatively, use the REAL*4 version instead of the REAL*8.
If you require complex arguments, then evaluate the real and imaginary parts separately.
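
As a minimal sketch of the integer-multiples idea above (free-form Fortran 90; the module name exp_multiples_sketch and the routine fill_exp_table are made up for illustration), EXP is called once and every further value costs one multiplication. Rounding error grows slowly with the number of multiplications, so this presumably suits tables of modest length:

```fortran
module exp_multiples_sketch
  implicit none
  private
  public :: fill_exp_table
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Fills table(k) with exp(k*x) for k = 0 .. ubound(table,1),
  ! calling EXP only once. (Hypothetical helper, not a library routine.)
  subroutine fill_exp_table(x, table)
    real(dp), intent(in)  :: x
    real(dp), intent(out) :: table(0:)
    real(dp) :: base
    integer :: k
    base = exp(x)                   ! the only call to EXP
    table(0) = 1.0_dp               ! exp(0*x)
    do k = 1, ubound(table, 1)
      table(k) = table(k-1) * base  ! exp(k*x) = exp((k-1)*x) * exp(x)
    end do
  end subroutine fill_exp_table
end module exp_multiples_sketch
```

A caller would declare something like REAL(KIND(1.0D0)) :: TAB(0:N), call fill_exp_table(X, TAB), and then read TAB(K) wherever it previously evaluated EXP(K*X).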

What about precision?
Since EXP(X) = 10**(X*log10(e)), X*log10(e) will have an integer part and a fractional part. You therefore only need to evaluate 10**(the fractional part); the integer part just scales the result by a power of 10. If you can justify errors on the order of 10**-5 or so, you can program your own polynomial approximation to 10**X for 0<X<1 (see, for example, Hastings' formulae in Abramowitz and Stegun's 'Handbook of Mathematical Functions').
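
The range reduction described above could be sketched as follows. This is only an illustration under stated assumptions: it rounds to the nearest integer so the fractional part lies in [-0.5, 0.5] rather than [0, 1), and it uses a plain degree-8 Taylor polynomial in place of Hastings' coefficients, which are not reproduced here. The names fast_exp_sketch and approx_exp are hypothetical. Whether it is actually faster than the intrinsic EXP would have to be measured.

```fortran
module fast_exp_sketch
  implicit none
  private
  public :: approx_exp
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: log10_e = 0.43429448190325182765_dp
  real(dp), parameter :: ln_10   = 2.30258509299404568402_dp
contains
  ! Approximates exp(x) as 10**n * 10**f, with n an integer and |f| <= 0.5.
  ! Intended for moderate |x|; no overflow or underflow handling.
  function approx_exp(x) result(y)
    real(dp), intent(in) :: x
    real(dp) :: y, s, f, t, p
    integer :: n, k
    s = x * log10_e        ! exp(x) = 10**s
    n = nint(s)            ! integer part, rounded to nearest
    f = s - real(n, dp)    ! fractional part, |f| <= 0.5
    t = f * ln_10          ! 10**f = exp(t), |t| <= about 1.16
    ! Degree-8 Taylor polynomial of exp(t), evaluated by Horner's rule
    ! (an assumption of this sketch, not Hastings' formula).
    p = 1.0_dp
    do k = 8, 1, -1
      p = 1.0_dp + p * t / real(k, dp)
    end do
    y = p * 10.0_dp**n     ! scale by the integer power of 10
  end function approx_exp
end module fast_exp_sketch
```

With this truncation the relative error should stay at a few times 10**-5 at worst, which is the accuracy level mentioned above; a tuned minimax polynomial would do better for the same degree.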

regards

Tony Richards

Intel_C_Intel · ‎06-27-2001

1. Thanks to tprince and Tony for your advice. As Tony suggested, I improved my code so that EXP is called less often. Now I can live with its speed. However, I notice that the documentation says that on Alpha systems, if you specify /fast, a tuned routine for EXP will be used. That is why I am looking for something similar on x86 systems.

2. Thanks, Steve, for your prompt reply, but what you said is very wrong. First, the exponential function is defined on the whole complex plane, so /check:power will never affect this function. Second, 0**0 is not an exp function. Third, /math_library:fast or /math_library:check does make a difference.

this is part of the profile with /math_library:fast

     Func          Func+Child             Hit
     Time     %         Time     %      Count  Function
  924.326   8.0      924.326   8.0    1798503  __FIIfexp_ (intrini.obj)

this is part of the profile with /math_library:check

     Func          Func+Child             Hit
     Time     %         Time     %      Count  Function
 1436.312  10.9     1436.312  10.9    2640051  __FIexp (intrin.obj)
 1030.355   7.8     1030.355   7.8    1798503  __FIIfexp_ (intrini.obj)

Please look at the difference carefully. I hope you can tell me what FIexp did in the second case.

(I've pointed out in another post that there's a possible bug in /check:power)

3. I am still puzzled by the first two lines of my profiling result. I think they represent the same thing, so why do they appear twice?

     Func          Func+Child             Hit
     Time     %         Time     %      Count  Function
 5301.114  45.8     6251.427  54.0          1  _MAIN$VFID03@0 (vfid03.obj)
 5301.114  45.8     6251.427  54.0          1  _MAIN__ (vfid03.obj)
  924.326   8.0      924.326   8.0    1798503  __FIIfexp_ (intrini.obj)

4. I got the following link warning:
vfid03.exe : warning LNK4084: total image size 752377856 exceeds max(268435456); image may not run
I am wondering if I can lift the total image size limit of 256MB.

Sorry for all the questions. Your help will be greatly appreciated.

NP

Intel_C_Intel · ‎08-12-2001

Hi, I have one question. My code is below. After compiling, I get this warning:
"Debug/calsen.exe : warning LNK4084: total image size 274325504 exceeds max (268435456); image may not run"

What is the total image size? I would like to build an executable from this code.
Thank you.

C---------------------------------------------------------
C     CALCULATE SEN
C-----------------------------------------------------------
      PROGRAM CALSEN
      integer i,j,KELN,KDOF,KNOF,KRN,KNON,KRER,NM
      REAL sk(5000,5000)
      REAL sm(5000,5000)
      REAL db,SEIG,SEN2,SENS
      REAL sivec(4500)
      REAL sivec2(4500)
      integer nod(5000)
c     DIMENSION NOD(1)

      OPEN(5001,FILE='k.txt',status='OLD')
      OPEN(5003,FILE='m.txt',status='OLD')
      OPEN(5002,FILE='k2.txt',status='OLD')
      OPEN(5004,FILE='m2.txt',status='OLD')
      OPEN(2005,FILE='info.txt',status='OLD')
      OPEN(2003,FILE='TEST.txt',status='OLD')
      OPEN(2002,FILE='info2.txt',status='OLD')
      OPEN(2001,FILE='EIG.txt',status='OLD')
      OPEN(2004,FILE='EIGV.txt',status='OLD')
      open(7001,file='michin.txt')
      open(7002,file='michin1.txt')
      open(999,file='michin2.txt')
      open(998,file='NOchin.txt')

C------------------------------------------------------------

      READ(2005,10)KELN,KDOF,KNOF,KRN,KNON,DB
   10 FORMAT(1X,5(I7),E20.12)
      READ(2002,20)NM
   20 FORMAT(1X,I5)

      DO I=1,KNON*KELN
        READ(2003,100)NOD(I)
      END DO
  100 FORMAT(1X,I5)

      KRER=5001
      CALL SEN(KRER,DB,KELN,KDOF,KNOF,KNON,NOD,SK)
      KRER=5003
      CALL SEN(KRER,DB,KELN,KDOF,KNOF,KNON,NOD,SM)

      do i=1,nm*knof
        write(7001,300)sk(i,1)
      end do
      do i=1,nm*knof
        write(7002,300)sk(1,i)
      end do

      DO I=1,NM*KNOF
        READ(2001,300)SIVEC(I)
  300   FORMAT(1X,E20.12)
      END DO

      READ(2004,300)SEIG

      DO I=1,NM*KNOF
        DO J=1,NM*KNOF
          SK(I,J)=SK(I,J)-SEIG*SM(I,J)
        END DO
      END DO

C     SEN2 must be zero before the first accumulation
      SEN2=0
      DO I=1,NM*KNOF
        DO J=1,NM*KNOF
          SEN2=SIVEC(J)*SK(J,I)+SEN2
        END DO
        SIVEC2(I)=SEN2
        SEN2=0
      END DO

      SENS=0
      DO I=1,NM*KNOF
        SENS=SIVEC2(I)*SIVEC(I)+SENS
      END DO

      OPEN(7000,FILE='SEN.TXT')
      WRITE(7000,300)SENS

      STOP
      END

C-------------------------------------------------------------
      SUBROUTINE SEN(KRER,DB,KELN,KDOF,KNOF,KNON,NOD,SK1)
C-------------------------------------------------------------
c     IMPLICIT DOUBLE PRECISION(A-H,O-Z)
C     double precision
      REAL sk1(5000,5000)
      REAL sm(5000,5000)
      REAL ak(100)
      REAL ak2(100)
      integer nod(5000)
      INTEGER KELN,KNON,KNOF,KDOF,L,M,K,K2,J2
      M=0
      L=0
  200 FORMAT(1X,E20.12)
c201  FORMAT(1X,2(E20.12))

      DO N=1,KELN
        DO K=1,KNON
          DO K2=1,KNOF
            DO I=1,KDOF
              READ(KRER,200)AK(I)
              READ(KRER+1,200)AK2(I)
            END DO
            L=0
            DO J=1,KNON
              DO J2=0,KNOF-1
                SK1((NOD(M+K)-1)*KNOF+K2,((NOD(M+J)-1)*KNOF+1)+J2)
     *            =((AK2(L+J2+1)-AK(L+J2+1))/DB)+
     *            SK1((NOD(M+K)-1)*KNOF+K2,((NOD(M+J)-1)*KNOF+1)+J2)
              END DO
              L=L+KNOF
            END DO
          END DO
        END DO
        M=M+KNON
      END DO
c--------------------------------------------------------------
      RETURN
      END
C-----------------------------------------------------------------

Steven_L_Intel1 · ‎08-12-2001

See this Knowledge Base article.
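
The article itself is not reproduced in this thread. As a general sketch, assuming the warning comes from the statically declared 5000x5000 arrays (each takes 5000*5000*4 = 100,000,000 bytes if REAL is 4 bytes), one common approach is to make such arrays ALLOCATABLE so that their storage is requested at run time instead of being counted in the image. The module and routine names below are hypothetical:

```fortran
module big_array_sketch
  implicit none
  private
  public :: matrix_bytes, try_allocate
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Bytes needed for an n-by-n array of 4-byte REALs (4-byte size assumed).
  function matrix_bytes(n) result(nbytes)
    integer, intent(in) :: n
    real(dp) :: nbytes
    nbytes = 4.0_dp * real(n, dp) * real(n, dp)
  end function matrix_bytes

  ! Allocates two n-by-n work arrays at run time instead of declaring
  ! them with fixed bounds; ok is .false. if the allocation fails.
  subroutine try_allocate(n, ok)
    integer, intent(in)  :: n
    logical, intent(out) :: ok
    real, allocatable :: sk(:,:), sm(:,:)
    integer :: ierr
    allocate(sk(n,n), sm(n,n), stat=ierr)
    ok = (ierr == 0)
    if (ok) then
      sk = 0.0          ! explicit zeroing; ALLOCATE does not initialize
      sm = 0.0
      ! ... the real work on sk and sm would go here ...
      deallocate(sk, sm)
    end if
  end subroutine try_allocate
end module big_array_sketch
```

Sizing the arrays from the values read at run time (for example NM*KNOF) rather than a fixed 5000 would also reduce the memory actually needed, if the problem is smaller than the declared bounds.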

Steve

edunlop · ‎08-14-2001

In my experience it is much faster to compute Y=EXP(P*LOG(X)) than Y=X**P; try it!
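
For anyone wanting to try the comparison, a minimal sketch follows (the names pow_sketch and pow_via_exp are made up for illustration). Note that the identity only holds for X > 0, and that whether it is faster than X**P presumably depends on the compiler, the options, and the processor:

```fortran
module pow_sketch
  implicit none
  private
  public :: pow_via_exp
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Computes x**p as exp(p*log(x)). Only meaningful for x > 0;
  ! zero or negative x needs separate handling.
  function pow_via_exp(x, p) result(y)
    real(dp), intent(in) :: x, p
    real(dp) :: y
    y = exp(p * log(x))
  end function pow_via_exp
end module pow_sketch
```

Timing a loop of pow_via_exp(X, P) against the same loop with X**P, with identical compiler options, would show whether the substitution pays off for a given program.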

Best wishes, Edmund Dunlop