Hi mecej4, - Page 2

John_B_8 · ‎08-14-2013

Attached is the latest version of the ported code from VBA/Excel and a listing of errors that the compiler is throwing up

Casey · ‎08-16-2013

Thanks for adding those comments mecej4. I only wanted to address the fortran and keep out of the statistics / math aspect of the problem.

mecej4 · ‎08-16-2013

Casey,

My comments were addressed mainly to the OP ( I wish the forum software assigned sequence numbers to responses, so that it would be easy to say, "re #12..."). Specifically, in http://software.intel.com/en-us/comment/reply/413921/1747524?quote=1#comment-form he wrote about having suspicions that double precision might not be sufficient, and speculated that quadruple precision might be needed.

This is a Fortran related forum and, properly so, you (Casey) responded within that context. When the results are not quite right, one is often tempted into asking whether tweaking the code or the compiler options would remove the errors. This thread furnishes an example that shows why one must get the problem specified correctly before it makes sense to discuss the reasons for the failure of the program to deliver correct results.

jimdempseyatthecove · ‎08-16-2013

Fri, 08/16/2013 - 13:02>> ( I wish the forum software assigned sequence numbers to responses, so that it would be easy to say, "re #12...")

You could use the date and time (more to type)

Another gripe related to this:

While you can Select, Drag and Drop from prior post, performing a Select, Ctrl-C (copy), Ctrl-V (paste) crashes the comment edit box.

Jim Dempsey

John_B_8 · ‎08-19-2013

mecej4,

I was also wondering how the algorithm worked.

When I looked at it and started going through the loop manually (and then I temporarily put print statements into the code in order to see what was happening to all the variables from loop to loop), it gave me the idea that a recurrence relationship was involved in computing the Poisson probabilities. If you want to know why the textbook exponential equation is actually not used in statistical code ('behind the scenes') then you have to look at the paper by prof. Knusel, which I attached to an earlier post, (see page 37) (I don't see it in the thread so I'll re-attach). Sections 6 onwards deal with this problem.

Anyway to be doubly sure I contacted retired Prof. John Pezzullo, who wrote the original Excel/VBA code, and his response is given in the attached word doc.

Yes, I remember reading in one of my old computing textbooks that the bisection algorithm is not the most efficient at finding roots of equations, alternatives being secant method, method of false positives, and Newton-Raphson, (see 'Numerical methods for engineers and scientists' by A.C. Bajpai, I.M. Calus, and J.A. Fairley, section 'Solution of non-linear equation', pp 28-81), but it is robust, and by some yardsticks termed 'efficient', but I guess not in terms of machine cycles!

Finally ,I found a less technical article on the recurrence relationship for computing Poisson probablities, see page 10 and the pages leading up to it in the article (attached) on Poisson distributions.

Cheers,

John

mecej4 · ‎08-19-2013

John B.:

I read the references that you gave and reexamined the VB code. What it does is to sum terms in the series expansion of e^-λ, stopping when the last term is smaller than the current sum by a factor of 1E-10. That works because the exponential series is absolutely convergent with infinite radius of convergence.

Here is a shorter replacement for the part of your code that computes the CDF using the intrinsic EXP function. To compensate for the forum software's harsh treatment of leading blanks in the source code, I have also attached the source file.

This program gives the result 0.879107, which agrees with the value given by Knüsel's ELV.EXE (http://knuesel.userweb.mwn.de/elv/). [Notes: Prof. Knüsel died in April this year, and some of the links in his Web pages are extinct. ELV.EXE is a DOS program, and does not run in Windows 8. However, it runs inside DosBox, which is available at http://www.dosbox.com/download.php?main=1 ].

The VBA code in your XLS file was rejected by Excel 2013 running on W8-64.

[fortran]

program rpoisp
integer :: x1=2,x2=8
double precision :: z = 5.3
write(*,'(F11.8)')PoisP(z,x1,x2)

CONTAINS
double precision Function PoisP(z,x1,x2)
implicit none
double precision z,q,S
integer x1,x2,k
!
If(z < 0) Then
write(*,*) 'PoisP: z < 0'
PoisP = 0
Return
Endif
!
q = 1
do k=1,x1-1
q=q*z/k
end do
S = 0
DO k = x1,x2
q = q*z/k
S = S + q
End Do
PoisP = S*exp(-z)
Return
End Function PoisP
End Program RPoisP

[/fortran]

John_B_8 · ‎08-21-2013

Hi mecej4,

The reason for the code work in this particular fashion is to avoid underflow and over flow in the cumulative probabilities. What Prof. Pezzullo does is arbitrarily set P(0)=1, so in the normal recurrence formula for Poisson probabilities he sidesteps having to evaluate P(0) at all, i.e. by using the normal formula for the first term, i.e. e^(-lambda). [In some problems lambda can easily be 300 or much more ! Such numbers can get rounded down to zero]. Now you can construct the cumulative probabilities for the Poisson distribution from P(0) upwards, as P(0) * (some terms), due to the recurrence relationship between successive probability terms, Pk=(lambda/k)*Pk-1. He computes those 'some terms' correctly, even though P(0) is deliberately incorrect (he watches for the possible overflow of the variable Tot at large lambda, and just rescales everything if the numbers involved do get too large, but his real 'trick' is to use the same P(0) for the partial sum of probabilities, S, between the desired limits. Thus when he divides S/Tot, the same incorrect P(0) s in the numerator and denominator simply cancel out, and you have the correct answer for the integrated probability between the limits, without having to calculate P(0) using the exponential formula, which as Knusel clearly illustrates can be a significant computational problem. A bit of reflection indicates why P(0)=1 is good choice as a 'dummy' value for P(0).

His other neat 'trick' is to avoid the difficulties with the range of lambda, i.e. from zero to infinity, in PoisHigh and PoisLow. This he does by mapping lambda to another variable v, defined as v=lambda/(1+lambda+obs) [obs = observed number of events, pz, or vz, etc. depending upon if you are looking at the VBA code or the fortran code]. When lambda =0, v=0, and when lambda = infinity, v=1; and when lambda = obs, v has some value between 0 and 1 (near 0.5 for large obs) That makes the half interval search more well behaved, and once the routine has found a value for v which makes the appropriate sum of Poisson terms equal to say 0.025, then he can calculate lambda as (1 + obs ) * v/(1-v), as done in the code.

Original explanation is all due to Prof. Pezzullo, I have just interpreted his comments and filled in bits for my own future reference. Originally I could not figure out from any of the Poisson equations in the literature where v came from ! It was driving me crazy trying to manipulate the various Poisson equations to get a variable like v. But the answer was that it wasn't really from those Poisson equations. No it came as clever idea from Prof. Pezzullo !

I guess a mathematician would have spotted this 'trick' and why it was being employed here more quickly, I was just thinking why ''v', where has that come from, which Poisson equation, (or the related Chi squared or exponential relationships).

Anyway there you have it.

Regards,

John

ps And when the expression [see VBA version] (1 + vz) * v/(1 - v), or (1 + obs) * v/(1 - v) [my version] is sent from PoisHigh or PoisLow to the PoisP function or subroutine, he is actually sending lambda, as a bit of algebraic manipulation of his mapping relationship v = lambda/(1 + lambda + obs), shows, making lambda the subject of the equation, gives as required, lambda = (1 + obs)*v/(1 - v). So unlike other cases of mathematical substitution nothing else has to be altered, e.g. like the limits in an integration when making a trigonometric substitution. The 'chopping/bisection' is in v, but the Poisson probablitity computations use lambda.

and Special thanks to Prof. Pezullo for his patience and help on this !!

More info on file and errors on code ported from VBA/Excel