Quote:jimdempseyatthecove

sjl · ‎06-18-2017

Hi,

I am writing a Multigrid code and I have encountered a strange bug. Sorry this might be a little bit long, but I am seeking for help desperately. The structure of my code is as follows.

1. I need some information (Matrices, Mesh information etc.) to start the code. I can generate all the information using Matlab. Then I write those data in txt files. After that, I wrote a simple subroutine in fortran to read all these data. Say I have 5-6 variables containing all the information I need to start the computation.

2. Since I already have this code in Matlab, all I need to do is to translate those codes into fortran codes ( The reason I did this is that I wanna compute some 3D numerical experiments which Matlab on laptop cannot handle). So the other parts of the codes are the same as matlab codes.

3. The algorithm itself requires me to have a deferred size array. Because I need to keep track of the solution which transfer through different grids. That means the size of these arrays are changing all the time. The following is a part of my codes which does this job.

         ms=size(X(kk)%component)/3
         allocate(yy(ms))
         yy=0
         yy(Free(kk)%component)=uc             ! the size of yy is larger than uc
         deallocate(uc)

         allocate(ff(size(X(kk+1)%component)/3))   ! size of ff is larger than yy
         ff=0
         ff=smvpro(pro(kk)%component,yy)    ! this line change the size of yy
         uu=uu+ff(Free(kk+1)%component)
         deallocate(ff,yy)

So basically I allocate and deallocate a lot to achieve my goal.

4. I've ran the codes. The results are confusing. If the discretization of mesh become finer and finer, the results should be better and better. At first 7 levels the results are good and basically the same as the results of Matlab. But when I reach level 8, the results suddenly blew up. But the results of Matlab are still good.

5. My questions are, if I assume the algorithm is correct (Matlab gave the correct results), what could be the problem when I have a very fine mesh in fortran?

6. My guesses are:

Could the way of reading data be wrong? Because I export the data from Matlab and read the data in fortran. I checked this everything looks fine.

Could the way I allocate and deallocate the arrays cause a problem?

Could the way I pass arrays through subroutine be wrong? (It is a recursive code so I have a recursive subroutine)

Thank you.

mecej4 · ‎06-19-2017

Check for ALLOCATE statements that contain optional clauses such as STAT= and make sure that the program checks for successful allocation before using the variables that were just allocated (or failed to be allocated).

sjl · ‎08-04-2017

mecej4 wrote:

Check for ALLOCATE statements that contain optional clauses such as STAT= and make sure that the program checks for successful allocation before using the variables that were just allocated (or failed to be allocated).

Thanks for the reply. I checked all the allocate statements and also checked if anything like array out of boundaries happened. Everything seems ok but the code just doesn't work right beyond level 7.

I suspected there is a problem in the recursive routine. Are there common mistakes when using recursive subroutine? It's weird everything is fine until certain level.

mecej4 · ‎08-04-2017

(name withheld) wrote:

If the discretization of mesh become finer and finer, the results should be better and better.

In general, that is not true, as the following example will show. Consider approximating the derivative of tan(x) at x = 1 using the forward difference ratio [tan(x+δx) - tan(x)]/δx. Calculate the approximate derivative and the error in that approximate result for δx = 10^-4, 10^-5, ..., 10^-15. Is the error the least when the "mesh size" δx is least? (Ans.: No. As you decrease δx, the error decreases, reaches a minimum near δx = 10^-8, and increases again. Furthermore, if you decrease δx below a certain value, the approximate derivative becomes zero, which is unacceptable.)

If you want to use a different ID/name, open your IDZ profile and enter a name that does not contain your E-mail address.

sjl · ‎08-04-2017

mecej4 wrote:

Quote:

(name withheld) wrote:

If the discretization of mesh become finer and finer, the results should be better and better.

In general, that is not true, as the following example will show. Consider approximating the derivative of tan(x) at x = 1 using the forward difference ratio [tan(x+δx) - tan(x)]/δx. Calculate the approximate derivative and the error in that approximate result for δx = 10^-4, 10^-5, ..., 10^-15. Is the error the least when the "mesh size" δx is least? (Ans.: No. As you decrease δx, he error decreases, reaches a minimum near δx = 10^-8, and increases again. Furthermore, if you decrease δx below a certain value, the approximate derivative becomes zero, which is unacceptable.)

If you want to use a different ID/name, open your IDZ profile and enter a name that does not contain your E-mail address.

Thanks. I got your point. But the results of Matlab is good and what I expected. Somehow the results of fortran aren't the same as that of Matlab but basically they are the same code.

jimdempseyatthecove · ‎08-05-2017

>>Somehow the results of fortran aren't the same as that of Matlab but basically they are the same code

Is your Fortran code using REAL or REAL(8)?

Jim Dempsey

sjl · ‎08-05-2017

jimdempseyatthecove wrote:

>>Somehow the results of fortran aren't the same as that of Matlab but basically they are the same code

Is your Fortran code using REAL or REAL(8)?

Jim Dempsey

I am using real*8. Thanks.

jimdempseyatthecove · ‎08-05-2017

Another "gotcha" that can cause (precision difference) issues is literal floating point numbers default to REAL(4). This can introduce differences in calculations when the literal cannot be precisely specified as REAL(4) or REAL(8). e.g. 0.1 is different than 0.1_8 (or 0.1D0). Whole numbers or fractions that can be contained within the precision are also fine. e.g. 0.25

Jim Dempsey

Laura_S_3 · ‎08-05-2017

Is it possible that you aren't outputting enough significant figures from Matlab for your matrices and mesh? That could cause symptoms like you are seeing. since Matlab and Fortran would be working on different problems. (In my experience on the systems I have available, you can output binary from Matlab and read it into Fortran to avoid this issue. I don't think this always works, though, if you try this, check the data you read in to make sure it is correct. Also, special handling is required for complex data types, but it doesn't sound like that is an issue here.)

Yes, there are "gotchas" with recursion. Don't do things like "real :: x = 0.0" and expect X to get initialized every time you enter the routine. However, that particular gotcha probably wouldn't explain your bug.

Are there any intermediate results you could output to compare between the Fortran and Matlab to see where the deviation arises?

Good luck!

Laura

sjl · ‎08-05-2017

Laura S. wrote:

Is it possible that you aren't outputting enough significant figures from Matlab for your matrices and mesh? That could cause symptoms like you are seeing. since Matlab and Fortran would be working on different problems. (In my experience on the systems I have available, you can output binary from Matlab and read it into Fortran to avoid this issue. I don't think this always works, though, if you try this, check the data you read in to make sure it is correct. Also, special handling is required for complex data types, but it doesn't sound like that is an issue here.)

Yes, there are "gotchas" with recursion. Don't do things like "real :: x = 0.0" and expect X to get initialized every time you enter the routine. However, that particular gotcha probably wouldn't explain your bug.

Are there any intermediate results you could output to compare between the Fortran and Matlab to see where the deviation arises?

Good luck!

Laura

Thanks for the reply. I've already considered the suggestion you mention. But I think I can try the binary one you suggested. Other than that, it seems the data are the same.

Also there are two parts that I doubt the bug lies in . First one is the data which you just mentioned, second one is the recursive routine which now I believe is the problem. But I just cannot find a way to fix that.

I do have some intermediate results, I literally compared all the solutions from level 1 to level 7, the numbers are not exactly the same, but at least the behavior of the algorithm is the same. The error is also the same. But once I hit level 8, the results change drastically.

I think what Jim said is definitely possible. But I don't know how to fix this at this moment. Also the same bug happens in 3D case at level 5.

sjl · ‎08-05-2017

jimdempseyatthecove wrote:

Another "gotcha" that can cause (precision difference) issues is literal floating point numbers default to REAL(4). This can introduce differences in calculations when the literal cannot be precisely specified as REAL(4) or REAL(8). e.g. 0.1 is different than 0.1_8 (or 0.1D0). Whole numbers or fractions that can be contained within the precision are also fine. e.g. 0.25

Jim Dempsey

This is actually what I concerned. I changed all the real*8 to real before. The results were worse.

So this is definitely a issue here. But I don't know how to test this.

Thanks.

Laura_S_3 · ‎08-05-2017

The only other think I can think of (without seeing the actual code) is, are there any "integers" in your calculations? Matlab will automatically type-cast integers to float, but in Fortran follows a specific set of rules dictating when it will type-cast. So, in Matlab, (5/2) == 2.5, but in Fortran, (5/2) == 2.

Also, do you have "implicit none" at the tops of your subroutines and functions? Perhaps there is a miss-typed variable name and the magnitude of the problem caused by it increases at higher levels. (It doesn't seem likely, since you started with working Matlab code, but the bug seems rather unlikely too.)

One possibility I just thought of ... Fortran isn't case-sensitive. Is there any chance that there are two variables that are different in Matlab (like aB and Ab) but that the Fortran compiler treats as the same? (Again, seems unlikely, but I have written this sort of bug before.)

sjl · ‎08-07-2017

Laura S. wrote:

The only other think I can think of (without seeing the actual code) is, are there any "integers" in your calculations? Matlab will automatically type-cast integers to float, but in Fortran follows a specific set of rules dictating when it will type-cast. So, in Matlab, (5/2) == 2.5, but in Fortran, (5/2) == 2.

Also, do you have "implicit none" at the tops of your subroutines and functions? Perhaps there is a miss-typed variable name and the magnitude of the problem caused by it increases at higher levels. (It doesn't seem likely, since you started with working Matlab code, but the bug seems rather unlikely too.)

One possibility I just thought of ... Fortran isn't case-sensitive. Is there any chance that there are two variables that are different in Matlab (like aB and Ab) but that the Fortran compiler treats as the same? (Again, seems unlikely, but I have written this sort of bug before.)

Thanks for the advice. I've checked my code and found no such mistakes. I am trying to export the data from Matlab to binary files. I'll give the feedbacks later.

One more thing, does it matter if I use same input variable name when I define two functions? For example,

function f1(A,v) and function f2(A,v). Will this be a problem? Thank you.

mecej4 · ‎08-08-2017

Sharon Jl wrote:

One more thing, does it matter if I use same input variable name when I define two functions? For example,

function f1(A,v) and function f2(A,v). Will this be a problem?

The scope in which the names of the dummy arguments apply is within the boundaries of the function code. So, no this will not be a problem.

Error in multigrid code