Any danger in this sample program?

Yaqi_Wang · ‎04-28-2010

MODULE mytype1
  TYPE mytype
    INTEGER, POINTER, DIMENSION(:) :: ptr
  END TYPE mytype
CONTAINS
  ! rename a to b
  SUBROUTINE rename(a, b)
    IMPLICIT NONE
    TYPE(mytype) :: a
    TYPE(mytype) :: b
    b%ptr => a%ptr
    NULLIFY(a%ptr)
    RETURN
  END SUBROUTINE rename
END MODULE mytype1

PROGRAM test
  USE mytype1
  IMPLICIT NONE
  TYPE(mytype) :: a, b
  INTEGER :: i
  ! create a
  ALLOCATE(a%ptr(100))
  a%ptr = 1
  CALL rename(a, b)
  ! then we can use b
  DO i = 1, 100
    b%ptr(i) = i
  ENDDO
  STOP
END PROGRAM

-------------------------

I have a large code that does something similar. When I turn on /O3 with the Intel Visual Fortran compiler 11.1.054, the code crashes, and I am pretty sure it is the 'rename' subroutine that causes the problem, but I do not see anything wrong in it. Is there any danger in 'rename' that is not so obvious?

==================
Reply from Steve Lionel:
==================

It is evident that you retyped in the code as you showed it here, as it

has several syntax errors. When I correct the syntax errors and build

it with /O3 and Intel Visual Fortran 11.1.054 (and 11.1.065), it runs fine.

I conclude, then, that the code you showed here is not representative of

your actual code. Paraphrases and uncompilable excerpts do not help in

getting your problem resolved. Please provide an actual test case.

==============
My answer:
==============

The problem is that the actual code is so large and proprietary that I am sure no one really wants to look into it.

I added a few print statements to my actual code at the beginning and at the end of rename, something like:

SUBROUTINE rename(a, b)
  IMPLICIT NONE
  TYPE(mytype) :: a
  TYPE(mytype) :: b
  print *, associated(a%ptr), associated(b%ptr), size(a%ptr), size(b%ptr)
  b%ptr => a%ptr
  NULLIFY(a%ptr)
  print *, associated(a%ptr), associated(b%ptr), size(a%ptr), size(b%ptr)
  RETURN
END SUBROUTINE rename

The second print did not give me the expected output. Although a%ptr is not associated, its size is not zero. (I guess this is not specified in the Fortran reference.) This is tolerable. On the other hand, b%ptr is associated, but its size is not equal to the size of a%ptr in the first print statement. It gave me 1! If I run the code in Debug mode, everything is just fine.

I am sorry that I am not able to provide the actual code.

Yaqi_Wang · ‎04-28-2010

I am also pretty frustrated by the problem. I cannot reproduce it in the test program either. In the actual code I replaced 'rename' with the equivalent 'copy(a,b) + delete(a)' and the problem went away, but that extra copy should not really be necessary.

Actually in the real code, ptr is pointed to null() in the type definition.

The actual rename subroutine is also very simple. There is no way that I misunderstand the code.

My question is:

How is it possible for the size of a pointer to differ from the size of the pointer it was just pointed at?

Steven_L_Intel1 · ‎04-28-2010

As I suggested in the newsgroup, the actual problem is probably not what you think it is. Rather than trying to guess as to the problem based on the symptom, I recommend that you start cutting down your real application until you get it to the minimum that shows the error. When you finally get to something which, when removed, makes the error go away, look very carefully at that thing.

I doubt that the size is the issue. You are correct that if a pointer is not associated, its size is undefined. I think it is likely you are overwriting memory somewhere.

I'll be interested to see a test case that shows the problem.
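Since the size of a disassociated pointer is undefined, diagnostic prints are easier to interpret when SIZE is only evaluated on associated pointers. The following is a minimal sketch of that idea, not code from the actual application; the module name ptr_diag, the function name safe_size, and the -1 sentinel are all illustrative assumptions.

```fortran
module ptr_diag
  implicit none
contains
  ! Returns SIZE(p) when p is associated and -1 otherwise, so that a
  ! diagnostic print never evaluates SIZE on a disassociated pointer.
  ! (safe_size and the -1 sentinel are hypothetical, for illustration.)
  function safe_size(p) result(n)
    integer, pointer, intent(in) :: p(:)
    integer :: n
    if (associated(p)) then
      n = size(p)
    else
      n = -1
    end if
  end function safe_size
end module ptr_diag
```

With a helper like this, a statement such as print *, associated(a%ptr), safe_size(a%ptr) would report -1 for a nullified pointer instead of whatever stale value happens to be left in its descriptor.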

Yaqi_Wang · ‎04-28-2010

Thanks Steve.

The real type definition:

! Compressed Sparse Row format
TYPE CSR
  INTEGER :: n             ! number of rows
  INTEGER :: m             ! number of columns
  INTEGER :: nnz           ! number of non-zeros
  INTEGER :: nnz_capacity  ! capacity of non-zeros
  INTEGER :: mcount        ! memory count in bytes
  REAL(RKD), DIMENSION(:), POINTER :: a  => null()  ! (nnz_capacity) non-zeros
  INTEGER,   DIMENSION(:), POINTER :: ia => null()  ! (n+1) pointers to the beginning of each row of all non-zeros
  INTEGER,   DIMENSION(:), POINTER :: ja => null()  ! (nnz_capacity) column index of all non-zeros
END TYPE CSR

The real rename subroutine:

SUBROUTINE csr_rename(a, b)
  IMPLICIT NONE
  TYPE(CSR) :: a
  TYPE(CSR) :: b
  print *, a%n, a%m, a%nnz, a%nnz_capacity, &
           associated(a%a), associated(a%ia), associated(a%ja), &
           size(a%a), size(a%ja), size(a%ia)
  print *, b%n, b%m, b%nnz, b%nnz_capacity, &
           associated(b%a), associated(b%ia), associated(b%ja), &
           size(b%a), size(b%ja), size(b%ia)
  b%ia => a%ia
  b%ja => a%ja
  b%a => a%a
  NULLIFY(a%ia, a%ja, a%a)
  print *, a%n, a%m, a%nnz, a%nnz_capacity, &
           associated(a%a), associated(a%ia), associated(a%ja), &
           size(a%a), size(a%ja), size(a%ia)
  print *, b%n, b%m, b%nnz, b%nnz_capacity, &
           associated(b%a), associated(b%ia), associated(b%ja), &
           size(b%a), size(b%ja), size(b%ia)
  b%n = a%n
  b%m = a%m
  b%nnz = a%nnz
  b%nnz_capacity = a%nnz_capacity
  b%mcount = a%mcount
  a%mcount = 0
  RETURN
END SUBROUTINE csr_rename

And the output:

144 144 5216 5216 T T T 5216 5216 145

0 0 0 0 F F F 4192 4192 145

144 144 5216 5216 F F F 5216 5216 145

0 0 0 0 T T T 1 1 1
              ^ ^ ^
The last three 1s do not make sense. The print statement executes right after the pointer assignments. Overwriting memory does not explain this. If I use /check:all, the problem is gone.

Thanks.

Steven_L_Intel1 · ‎04-28-2010

Let me know when you have a test case I can build and run.

Yaqi_Wang · ‎04-29-2010

I have given up trying to create a test case that reproduces this error; I cannot get the error with the test program. Even in the real code the error does not always happen, but when it does it happens at a fixed point, and the symptom is always the same: pointer sizes wrongly reported as 1. (I know this because I replaced a few rename calls with copy+delete, which changed where the error occurs but not the symptom.)

So, on the compiler side, do you have any thoughts on how optimization could make this happen, so that I can narrow down where to look for the bug? If not, I will rewrite the rename subroutine with copy+delete operations.

Thanks.
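For readers unfamiliar with the copy+delete alternative mentioned above, here is a rough sketch of what it might look like. It is not the poster's code: the type vec is a hypothetical stand-in for CSR reduced to a single array, and the names vec_copy, vec_delete, and vec_rename_by_copy are assumptions. It also assumes every pointer component was obtained with ALLOCATE, so that DEALLOCATE is legal on it.

```fortran
module vec_move
  implicit none

  ! Hypothetical stand-in for the thread's CSR type, reduced to one array.
  type :: vec
    integer :: n = 0
    integer, pointer :: ia(:) => null()
  end type vec

contains

  ! Deep copy: b gets its own storage holding a's values.
  subroutine vec_copy(a, b)
    type(vec), intent(in)    :: a
    type(vec), intent(inout) :: b
    call vec_delete(b)                 ! release anything b already owns
    if (associated(a%ia)) then
      allocate(b%ia(size(a%ia)))
      b%ia = a%ia
    end if
    b%n = a%n
  end subroutine vec_copy

  ! Free the storage (if any) and reset the scalar field.
  subroutine vec_delete(a)
    type(vec), intent(inout) :: a
    if (associated(a%ia)) deallocate(a%ia)
    a%n = 0
  end subroutine vec_delete

  ! "Rename" expressed as a copy followed by a delete.
  subroutine vec_rename_by_copy(a, b)
    type(vec), intent(inout) :: a, b
    call vec_copy(a, b)
    call vec_delete(a)
  end subroutine vec_rename_by_copy

end module vec_move
```

Compared with the pointer-assignment version, this costs an extra allocation and an array copy per call, which is the overhead described in the thread as "really not necessary".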

Steven_L_Intel1 · ‎04-29-2010

Are you building this from within Visual Studio? If not, add the /warn:interface option to your build. Otherwise I would check the value of the pointer at various places in the program and see where it changes unexpectedly.

I saw that Richard Maine and Jim Xia gave you some good general advice in the newsgroup.

Yaqi_Wang · ‎04-29-2010

Yes, I am using VS.

It changes right after the pointer assignments and nullification. Is there a way we can check the optimized code?

Steven_L_Intel1 · ‎04-29-2010

Sorry to sound repetitious, but if you can provide a test case we'll be glad to look at it. What is "it" that changes? I would certainly expect a pointer to change when you assign to it.

Yaqi_Wang · ‎04-29-2010

As for Runtime Error Checking, if I turn on just 'check array and string bounds', the error is gone.

The command line is:
/nologo /O3 /Qip /fpp /I"C:\works\tt\projects3D" /I"C:\Program Files\MPICH2\include" /gen-interfaces /Qsave /module:"Release\" /object:"Release\" /check:bounds /libs:static /threads /c /Qinline-factor=1000

Yaqi_Wang · ‎04-29-2010

Sorry, I was replying to you. I mean that it changes unexpectedly. After the pointer assignment, the size of the pointer should be something other than 1, but the print statement right after the pointer assignment says otherwise. BTW, the pointer size does change after the assignment. See my posting at #3.

Steven_L_Intel1 · ‎04-29-2010

There may be a compiler bug, or you may have a bug in aliasing or something else. Debugging this is possible, but it is not something I can explain to you - especially without seeing the actual program.

I suggest you open an issue with Intel Premier Support and attach the actual program.
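One form an aliasing bug could take with a pointer-move routine is that the two arguments already share the same target, in which case nullifying the source can discard the only usable handle on the data. The sketch below shows a more defensive variant under stated assumptions; it is not a diagnosis of the problem in this thread. The type holder and the routine move_ptr are hypothetical names, and the code assumes that b owns whatever it points to on entry.

```fortran
module ptr_move
  implicit none

  ! Hypothetical stand-in for mytype / CSR with a single pointer component.
  type :: holder
    integer, pointer :: ptr(:) => null()
  end type holder

contains

  ! Move a's array to b without copying the data.
  subroutine move_ptr(a, b)
    type(holder), intent(inout) :: a, b

    ! If b already points at a's data, leave both arguments untouched.
    ! Note: the two-argument ASSOCIATED is false for zero-sized arrays,
    ! so this guard does not cover that case.
    if (associated(b%ptr, a%ptr)) return

    ! Assumption: b owns its current target, so free it to avoid a leak.
    if (associated(b%ptr)) deallocate(b%ptr)

    b%ptr => a%ptr
    nullify(a%ptr)
  end subroutine move_ptr

end module ptr_move
```

Declaring both arguments INTENT(INOUT) also documents that the routine modifies a as well as b, which the original rename leaves implicit.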

Yaqi_Wang · ‎04-29-2010

Thanks Steve. I know this is not good: all I can do is describe the symptom, and I cannot reproduce the error with a smaller test case. But believe me, it does exist. It could be entirely my own mistake somewhere in the code. I hope that if somebody else raises a similar problem in the future, I can get the answer. I like Fortran and want IVF to be better. Best wishes.