Performance issues

eubie · 04-17-2012

Dear everybody,

I have a C app calling a piece of Fortran code, namely a function called F (say).

I have both a C version and a Fortran version of F; both calculate in the same way and give the same results. Lastly, I have a test app in which I call F 100K times; the Fortran F is supplied in a static Release library.

When I use the C version of F in this app, the computation takes 2 seconds. With the Fortran version of F, however, it takes around 80 seconds. Is there a way to solve this? I pretty much need the Fortran version to have the same speed as the C version (the reason is that this F function is fed to a numerical integrator which I only have in Fortran).

Any help on this is much appreciated,

Daniel

bmchenry · 04-17-2012

Without seeing the F function in C vs. Fortran, I can only guess whether it is a coding issue.
The other issue is that you have a C app calling the F function both in C and in Fortran.
So the question becomes: what are you passing to the F function that might be slowing it down in Fortran but not in C?
Lastly, you mention that the F function is fed into a Fortran numerical integrator. Shouldn't you test whether the F function is faster in C or in Fortran when it is fed to the Fortran integrator?
Just a few thoughts to maybe move you forward on the project.

eubie · 04-17-2012

Hello bmchenry,

The function is

[fortran]
      real*8 function logpweibull(x, kappa, p)
      COMMON/weibullCommon/ CKAPPA,CP
      real*8 CKAPPA
      real*8 CP
      real*8 kappa
      real*8 p
      real*8 x
      real*8 logpweibulIntegrand
      CKAPPA = kappa
      CP = p
      logpweibull = logpweibulIntegrand(x)
      end

      real*8 function logpweibulIntegrand(x) !logpweibulIntegrand
      real*8 x
      COMMON/weibullCommon/ CKAPPA,CP
      real*8 CKAPPA
      real*8 CP
      real*8 kappa
      real*8 p
      real*8 xi
      real*8 xitokappa
      real*8 retval1
      real*8 retval2
      real*8 retval3
      real*8 retval4
      real*8 dlgamma
      kappa = CKAPPA
      p = CP
      !kappa = 1.5
      !p = 2.0
      xi = exp(dlgamma( 1.0 + 1.0/kappa ))
      xitokappa = xi ** kappa
      retval1 = log(x) ** p
      retval2 = kappa * xitokappa
      retval3 = x ** ( kappa - 1.0 )
      retval4= exp( -xitokappa * (x ** kappa) )
      logpweibulIntegrand = retval1*retval2*retval3*retval4
      end
[/fortran]

and in C it is

[cpp]
double Logp_Weibull( double x, void * params )
{
    double kappa = ((double*)params)[0];
    double p = ((double*)params)[1];
    double arg = 1.0 + 1.0/kappa;
    double xi = GAMMA( & arg);
    double dXiToKappa = pow(xi, kappa);
    double RetVal1 = pow( log(x), p );
    double RetVal2 = kappa * dXiToKappa;
    double RetVal3 = pow( x, kappa - 1 );
    double RetVal4 = exp( -dXiToKappa * pow(x, kappa) );
    return RetVal1 * RetVal2 * RetVal3 * RetVal4;
}
[/cpp]

It is pretty straightforward, so I don't think a coding issue is causing this. Also, when I create a Fortran application that calls this logpweibull function 1M times, it runs in 9 seconds. The same thing in C, i.e. a main() calling the C Logp_Weibull function 1M times, takes 21 seconds. So Fortran per se is faster. That my original app slows down when it uses a Fortran library is weird. I was thinking along the lines of a "context change" when going from C code to Fortran, but I don't know whether there is such a thing.

The reason I'm not supplying the integrand as a C function is that it needs to be parametrized by kappa and p, yet can take only one argument (x). Hence, I need to do it like this.

Thanks again,

Daniel

TimP · 04-17-2012

If you profiled this, you'd likely shed more light on it. If you don't like Intel AmplifierXE, there's Linux.
Surely you could replace xi**kappa * x**kappa by (xi*x)**kappa in case one of the compilers doesn't perform such an algebraic optimization.
If you expect p to take an integral value, pow() is expected to perform appropriate optimization internally, while Fortran expects you to write that explicitly. It's even possible that pow() may perform optimizations involving sqrt(), in case your comments about expected values are significant.

IanH · 04-17-2012

(Assuming that GAMMA(&arg) in the C code is equivalent to exp(dlgamma(1.0+1.0/kappa)) in the Fortran)

Which C compiler? What command line options are in use for both the C and Fortran compilers, in all the cases that you are comparing? Are you using any optimisation? Any inter-procedural optimisation? Any inter-procedural optimisation that works across languages?

Why are you putting kappa and p into common and not passing them as arguments? That means that the logpweibull Fortran function has side effects. The C variant does not have these side effects, so the two variants aren't equivalent, particularly from the point of view of a code optimiser.

eubie · 04-18-2012

First off, thank you both for writing.

TimP: I tried profiling it yesterday and the results were 2 seconds for C with C and 80 seconds for C with Fortran. Today I recompiled, and now the difference is negligible (2.5 vs 2.8 seconds, Fortran still slower). I don't have a clue what changed because, from my perspective, nothing did. I wanted to put up a file with all the projects, but now that the slowdown is gone, there is no point.

IanH:

The C compiler is ICC, and the compile and link settings are the default presets in MSVC 2010 and the Intel Fortran compiler. As for how the function looks, it is like this. The C version of QAGI (integration over an infinite or semi-infinite interval) takes a function like "double f(double x, void * Params)", so I can put kappa and p into Params and pass this function to the integrator without problems. However, the Fortran version of QAGI only takes a function like "real*8 f(x) ... real*8 x..." (the point is that it cannot have more parameters). That's the reason for this workaround, as logpweibull needs to access kappa and p.

Thank you,

Daniel

P.S. Neither kappa nor p is limited to integer values; I forgot to mention that.

John_Campbell · 04-18-2012

If it is a compute time problem, I'd be looking to minimise the use of " ** double ", as this is much less efficient than " ** integer ".
There was a time when a**b was replaced by exp(b*log(a)).
It's been a while, but I think that gives retval4 = exp (-xitokappa * exp(kappa*log(x))),
or replace x ** kappa with retval3 * x, giving retval4 = exp (-xitokappa*retval3*x)
Also, if log(x)**p = exp(p*log(log(x))), what does this mean when log(x) is not positive, i.e. for x <= 1?
Some potential changes there.

Alternatively it may be in the C > F and F > C interface.
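
One way to check the interface hypothesis is to time calls to a routine that does almost no work, so that nearly all of the measured time is call overhead. The C sketch below is an assumption-laden illustration: `near_empty` and `time_calls` are hypothetical names, and here the callee is a C function. To measure the C-to-Fortran boundary, one would pass a near-empty Fortran routine from the static library instead; the declaration needed for that depends on the compiler's naming and calling conventions and is not shown.

```c
#include <time.h>

typedef double (*fn_x_only)(double x);

/* Trivial callee: almost all measured time is call overhead. */
double near_empty(double x)
{
    return x + 1.0;
}

/* Returns the CPU seconds spent calling f n_calls times. */
double time_calls(fn_x_only f, long n_calls)
{
    volatile double sink = 0.0; /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (long i = 0; i < n_calls; ++i)
        sink += f(1.0 + (double)i * 1e-6);
    clock_t stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

If 100K calls to a near-empty routine across the boundary take a negligible fraction of the total run time, the interface is unlikely to be the bottleneck and a profiler is the better next step.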

eubie · 04-18-2012

John, is there a way to figure out whether it could be the C > F and F > C interface, in case this thing happens to me in the future? I am new to Fortran, so if the issue is more complicated, I will be glad for some manual reference or guide.

Thank you,

Daniel

jimdempseyatthecove · 04-18-2012

C++ >> double arg = 1.0 + 1.0/kappa;
IVF >> xi = exp(dlgamma(1.0 + 1.0/kappa))

In C++ 1.0 is a double precision literal
In IVF 1.0 is a single precision literal (when using default options)

Try changing your floating point literals to double precision (for example 1.0d0 instead of 1.0).

Jim Dempsey

TimP · 04-18-2012

It's generally considered better form to make all your constants consistent data types (and eliminate the non-standard real*8), e.g. by
integer, parameter :: dp = selected_real_kind(12)
real(dp) :: kappa
xi = exp(dlgamma(1_dp + 1_dp/kappa))
but in this case the code generated should be identical.
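
The reason the literal kind makes no difference for this particular expression can be shown with a C analogy (C's `f` suffix is used here in place of a Fortran default-kind literal; this is an analogy, not the thread's code, and the function names are invented). The value 1.0 is exactly representable in single precision, and in a mixed expression it is converted to double without loss, so both forms compute the same argument. A value such as 0.1, which is not exactly representable, would behave differently.

```c
#include <math.h>

/* Mixed expression: the float literals are converted to double exactly. */
double arg_with_float_literals(double kappa)
{
    return 1.0f + 1.0f / kappa;
}

/* Same expression with double literals. */
double arg_with_double_literals(double kappa)
{
    return 1.0 + 1.0 / kappa;
}

/* Gap between the float and double literals for 0.1,
   which is not exactly representable in binary floating point. */
double tenth_literal_gap(void)
{
    return fabs((double)0.1f - 0.1);
}
```

The first two functions return identical values for any kappa, while tenth_literal_gap() is nonzero; that is the case where single precision literals in double precision code start to cost accuracy.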

JVanB · 04-18-2012

Alert! 1_dp is an INTEGER literal. The real literal would be 1.0_dp.