Hello
I have a piece of code in which an array that is a component of a derived type (DT) suddenly changes value. I have made the whole DT protected so that nothing outside its module file can change it. With a simple change, this odd behaviour does not occur.
    type father
        real, allocatable :: odd_array(:)
        type(son1),allocatable :: s1
        type(son2),allocatable :: s2
        type(son3),allocatable :: s3
    end type

    type son1
        real, allocatable :: ar1(:)
    end type

    type son2
        real, allocatable :: ar2(:)
    end type

    type son3
        real, allocatable :: ar3(:)
    end type

    interface son1
        module procedure son1_constructor
    end interface

    type(father), allocatable , protected :: test(:)

    (...)

    function son1_constructor(size_)
        implicit none
        integer, intent(in) :: size_
        type(son1) :: son1_constructor
        allocate(son1_constructor% ar1(size_) )
    end function

    (...)

    function father_constructor(size_1, size_2,size_3,size_odd)
        implicit none
        integer, intent(in) :: size_,size_1, size_2,size_3,size_odd
        type(father) :: father_constructor
        allocate(father% son1)
        allocate(father% son2)
        allocate(father% son3)
        allocate(father% odd_array(odd) ) ! this array's value change randomly (memory corrupted?)
        father% son1 = son1_constructor(size_1)
        father% son2 = son1_constructor(size_2)
        father% son3 = son1_constructor(size_3)
    end function

    subroutine SETUP (...)
        allocate(test(1000))
        test(1) = FATHER_CONSTRUCTOR(10,12,15,100)
        (...) ! for test(2) etc.
    end subroutine
As you will note, the array odd_array seems to suffer memory corruption. If I do NOT allocate it within father_constructor but instead allocate it in the subroutine that calls the constructor, all is well.
Also, if I change father_constructor to a subroutine, everything works.
So I have found a solution. HOWEVER, why is this happening?
I would really appreciate it if someone could spot an obvious mistake.
Can you post a full reproducer for this issue?
Unfortunately that will be rather difficult. I understand it can be quite hard to see the problem without one. However, is anything in the code/method shown illegal?
As I mentioned, changing father_constructor to a subroutine makes the problem go away.
In my setup routine I am doing
father% odd_array = another_allocatable_array ! that array goes out of scope after the SETUP subroutine, but that should not be a problem as it is a basic copy
I assume line 53 should have (size_odd) not (odd) for the allocate.
On line 63, won't the rhs create a (stack) temporary object, then copy the stack object to the lhs (then delete the rhs on the stack)? IOW you may need a copy operator for type father.
Jim Dempsey
Hello Jim,
Thanks for your reply. Yes, you are right about size_odd.
I am also assuming that I probably have some problem with the temporary objects created by the function calls.
However, what I don't see is what the problem is.
Having test(1) = function_call will copy the RHS to the LHS, and then the RHS will be deleted. Why would that create a problem, and why only for my odd_array?
I suspect the copy operator subroutine. You might experiment with making the copy operator elemental. Something like:
    interface assignment(=)
        elemental subroutine Father_Assign(lhs, rhs)
            use YourFatherModule
            type(father), intent(out) :: lhs
            type(father), intent(in) :: rhs
        end subroutine Father_Assign
    end interface
    ...
    elemental subroutine Father_Assign(lhs, rhs)
        use YourFatherModule
        type(father), intent(out) :: lhs
        type(father), intent(in) :: rhs
        call move_alloc(rhs, lhs)
    end subroutine Father_Assign
Jim Dempsey
Thanks Jim, I will try that just for the sake of curiosity. However, I must admit that right now I don't see why I shouldn't just change all constructors to Fortran subroutines and make the simple calls.
Because otherwise, shouldn't I also write an assignment operator for each "son"? That would eventually make the code quite verbose.
>>Shouldn't I also do the assignment operator for each "son"...
I'd try it without an assignment operator for "son". Reasoning: I suspect that MOVE_ALLOC of "father" takes care of this.
Inspect the aftermath of a test case to be sure.
Jim Dempsey
Hello Jim
Thanks for your reply. I will experiment with it. However, I am still struggling to understand why the assignment operator is needed.
What, in your opinion, makes it necessary in this case?
I have written constructors several times before, although for simpler derived types that only contain arrays and where the objects are not arrays themselves, i.e. not dt(..)% arr(..) as in the case shown. They did not require, or did not seem to require, any assignment operator.
Hello Jim,
Your suggestion does not seem to work. MOVE_ALLOC seems to require allocatable objects, and if I declare them allocatable with intent(in), wouldn't the elemental be violated?
So I removed the elemental, declared both arguments as allocatable, and then had to use intent(inout) for the rhs, i.e.
    interface assignment(=)
        subroutine Father_Assign(lhs, rhs)
            use YourFatherModule
            type(father), allocatable, intent(out) :: lhs
            type(father), allocatable, intent(inout) :: rhs
        end subroutine Father_Assign
    end interface
    ...
    subroutine Father_Assign(lhs, rhs)
        use YourFatherModule
        type(father), allocatable, intent(out) :: lhs
        type(father), allocatable, intent(inout) :: rhs
        call move_alloc(rhs, lhs)
    end subroutine Father_Assign
HOWEVER, I just realized that intent(inout) on the rhs is not accepted for Fortran's assignment operator.
Again, as in my comment in quote 10, I am still struggling to understand when a defined assignment is necessary: why it seems to solve the problem in this case, while in other cases it is not needed.
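For reference, a defined assignment must take its first argument as INTENT(OUT) or INTENT(INOUT) and its second as INTENT(IN), while MOVE_ALLOC needs both arguments to be allocatable and modifies its FROM argument, so the two cannot be combined in the way attempted above. A minimal sketch of a defined assignment that copies component by component instead is shown here. The module name, the procedure name father_assign and the cut-down father/son1 types are assumptions made for illustration, not the original code.

```fortran
module father_assign_mod
    implicit none

    ! Cut-down, hypothetical versions of the types from the first post.
    type :: son1
        real, allocatable :: ar1(:)
    end type son1

    type :: father
        real, allocatable :: odd_array(:)
        type(son1), allocatable :: s1
    end type father

    interface assignment(=)
        module procedure father_assign
    end interface

contains

    ! lhs is intent(out), so its allocatable components are deallocated
    ! on entry. rhs stays intent(in), as defined assignment requires.
    elemental subroutine father_assign(lhs, rhs)
        type(father), intent(out) :: lhs
        type(father), intent(in)  :: rhs

        if (allocated(rhs%odd_array)) then
            allocate(lhs%odd_array, source=rhs%odd_array)
        end if
        if (allocated(rhs%s1)) then
            allocate(lhs%s1)
            if (allocated(rhs%s1%ar1)) then
                allocate(lhs%s1%ar1, source=rhs%s1%ar1)
            end if
        end if
    end subroutine father_assign

end module father_assign_mod

program father_assign_demo
    use father_assign_mod
    implicit none
    type(father) :: a, b

    allocate(a%odd_array(3))
    a%odd_array = [1.0, 2.0, 3.0]
    allocate(a%s1)
    allocate(a%s1%ar1(2))
    a%s1%ar1 = [4.0, 5.0]

    b = a                 ! resolves to father_assign
    a%odd_array = 0.0     ! must not change b

    if (any(b%odd_array /= [1.0, 2.0, 3.0])) error stop "odd_array not copied"
    if (any(b%s1%ar1 /= [4.0, 5.0])) error stop "s1%ar1 not copied"

    print '(a)', "defined assignment ok"
end program father_assign_demo
```

Because the routine is elemental, the same procedure would also cover whole-array assignments of type(father). Whether it works around the corruption described in this thread is untested.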
>>Thanks for your reply. I will experiment with it. However, I am still struggling to understand why the assignment operator is needed.
I cannot say what the standard says should happen in this case. I am "simply" trying to offer a workaround. My (unfounded) thinking on this is with respect to:
test(1) = FATHER_CONSTRUCTOR(10,12,15,100)
The FATHER_CONSTRUCTOR on the rhs constructs a type(father) on the stack, as opposed to in the lvalue (C++ speak) of the object on the lhs (test(1)). After the stack construction, the stack object, inclusive of the array descriptors (but not that which they describe), is copied. IOW the stack temporary type(father)'s type(son1), type(son2), type(son3) array descriptors are copied to test(1), meaning the test(1) array descriptors now hold aliases to the stack temporary type(father). Then, when the statement goes out of scope, the stack temporary type(father) is deleted, inclusive of its type(son1), type(son2), type(son3) arrays, thus making the aliases held within test(1) undefined. *** This is my guess as to what is happening.
In reading Fortran 2003 Draft 04-007.pdf, 7.4.1.3 Interpretation of intrinsic assignments, page 141, starting at line id 18:
For an allocatable component the following sequence of operations is applied:
(1) If the component of variable is allocated, it is deallocated.
(2) If the component of expr is allocated, the corresponding component of variable is allocated with the same dynamic type and type parameters as the component of expr. If it is an array, it is allocated with the same bounds. The value of the component of expr is then assigned to the corresponding component of variable using defined assignment if the declared type of the component has a type-bound defined assignment consistent with the component, and intrinsic assignment for the dynamic type of that component otherwise.
The processor may perform the component-by-component assignment in any order or by any means that has the same effect.
My interpretation of the above is that you should not need to create an assignment operator; however, by doing so you may find a workaround for the assignment bug (assuming it is a bug).
Jim Dempsey
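The standard text quoted above implies that intrinsic assignment of a derived type gives the left-hand side its own copy of each allocatable component, rather than an alias of the right-hand side's data. A minimal sketch to check this on a given compiler is shown below. The type name holder and the variable names are hypothetical, chosen only for this illustration.

```fortran
program deep_copy_demo
    implicit none

    ! Hypothetical type for illustration; not taken from the original code.
    type :: holder
        real, allocatable :: values(:)
    end type holder

    type(holder) :: source_obj, copy_obj

    allocate(source_obj%values(3))
    source_obj%values = [1.0, 2.0, 3.0]

    ! Intrinsic assignment: copy_obj%values is allocated with the same
    ! bounds as source_obj%values and receives a copy of its elements.
    copy_obj = source_obj

    ! Changing and then releasing the source must not affect the copy.
    source_obj%values = -1.0
    deallocate(source_obj%values)

    if (.not. allocated(copy_obj%values)) error stop "copy lost its allocation"
    if (any(copy_obj%values /= [1.0, 2.0, 3.0])) error stop "copy was not independent"

    print '(a)', "deep copy ok"
end program deep_copy_demo
```

If a compiler fails one of these checks, that would point at a compiler defect of the kind speculated about above. If it passes, the simple case at least behaves as the standard describes.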
AT90 wrote: Unfortunately that will be rather difficult. I understand it can be quite difficult to see the problem without it. However, is something illegal in my code/method shown?
There are quite a few things that are "illegal" in the code as shown (for example type names being referenced as if they were objects). They might just be a consequence of creating the example, but they obscure the nature of the problem and make it difficult to comment further.
There have been (and perhaps are still) problems with structure constructors and ifort - perhaps those problems with ifort are causing you problems, perhaps it is something else. Hard to say without a compilable reproducer. The attached compilable variant of the original example works for me with current beta.
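The attachment is not reproduced here. As a rough idea of what a compilable variant of the original pattern might look like, here is a minimal sketch. It is an assumption-laden reconstruction rather than the attached file: the module name family_mod, the reduced set of components and the argument lists are all made up for this illustration.

```fortran
module family_mod
    implicit none
    private
    public :: son1, father, father_constructor

    type :: son1
        real, allocatable :: ar1(:)
    end type son1

    type :: father
        real, allocatable :: odd_array(:)
        type(son1), allocatable :: s1
    end type father

    ! Overload the structure constructor name with a user function.
    interface son1
        module procedure son1_constructor
    end interface son1

contains

    function son1_constructor(size_) result(res)
        integer, intent(in) :: size_
        type(son1) :: res
        allocate(res%ar1(size_))
        res%ar1 = 0.0
    end function son1_constructor

    function father_constructor(size_1, size_odd) result(res)
        integer, intent(in) :: size_1, size_odd
        type(father) :: res
        integer :: i
        ! Components are referenced through the result variable,
        ! not through the type name.
        res%odd_array = [(real(i), i = 1, size_odd)]
        allocate(res%s1)
        res%s1 = son1(size_1)
    end function father_constructor

end module family_mod

program family_demo
    use family_mod
    implicit none
    type(father), allocatable :: test(:)

    allocate(test(2))
    test(1) = father_constructor(10, 100)
    test(2) = father_constructor(5, 3)

    if (size(test(1)%odd_array) /= 100) error stop "wrong odd_array size"
    if (test(1)%odd_array(100) /= 100.0) error stop "wrong odd_array value"
    if (size(test(1)%s1%ar1) /= 10) error stop "wrong ar1 size"
    if (size(test(2)%odd_array) /= 3) error stop "second element wrong"

    print '(a)', "constructor ok"
end program family_demo
```

The automatic allocation of res%odd_array on assignment relies on current-standard semantics being enabled in the compiler.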
@Ian Thanks very much for your comment!
There are quite a few things that are "illegal" in the code as shown (for example type names being referenced as if they were objects). They might just be a consequence of creating the example, but they obscure the nature of the problem and make it difficult to comment further.
Yes, that is really my fault. You are totally right: the mistake was a consequence of creating this example. In my original code I don't have
"father_constructor%s1 = son1_constructor(size_1)" but RATHER "father_constructor%s1 = son1(size_1)". Was that what you referred to?
In your example, on line 59, you have:
father_constructor%odd_array = [(real(i), i = 1, size_odd)]
Does this implicitly allocate odd_array, and why does this differ from using a normal allocate statement?
One general thing worth mentioning: you are right that your example works. My original code also works, until a random point!
That is, I have a time simulation. Before it begins, I initialize my various arrays, including the "father-son" example. After this, it becomes a READONLY object. When the time simulation begins, the values in odd_array are correct until a random time step, where all of a sudden some elements change value. This does not happen if, as mentioned, I change the constructors to normal subroutine calls with intent(inout). I could imagine this being a bug, but frankly I have no idea how to reproduce the error in more manageable code.
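For comparison, the subroutine-style construction mentioned above could look like the following minimal sketch. The names father_init_mod and father_init and the single-component type are hypothetical. The sketch only shows the shape of the approach and does not by itself explain why the function version misbehaves.

```fortran
module father_init_mod
    implicit none

    ! Cut-down, hypothetical version of the type from the first post.
    type :: father
        real, allocatable :: odd_array(:)
    end type father

contains

    ! Fills the caller's object in place; no function result is involved.
    subroutine father_init(this, size_odd)
        type(father), intent(inout) :: this
        integer, intent(in) :: size_odd

        if (allocated(this%odd_array)) deallocate(this%odd_array)
        allocate(this%odd_array(size_odd))
        this%odd_array = 0.0
    end subroutine father_init

end module father_init_mod

program father_init_demo
    use father_init_mod
    implicit none
    type(father), allocatable :: test(:)

    allocate(test(1000))
    call father_init(test(1), 100)

    if (size(test(1)%odd_array) /= 100) error stop "wrong size"
    if (allocated(test(2)%odd_array)) error stop "untouched element was allocated"

    print '(a)', "in-place init ok"
end program father_init_demo
```

Here the object is built directly in test(1), so no temporary function result and no derived-type assignment take part in the construction.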
AT90 wrote:
In my original code I don't have "father_constructor%s1 = son1_constructor(size_1)" but RATHER "father_constructor%s1 = son1(size_1)". Was that what you referred to?
The original post has `father% son1 = son1_constructor(size_1)`. father is the type name.
There have been issues with structure constructor overloads over the years (i.e. references to the procedure son1_constructor via the name of the son1 type being constructed). I have only started using them in production code with relatively recent compilers.
Your example, line 59 you have:
father_constructor%odd_array = [(real(i), i = 1, size_odd)]
Does this implicitly allocate odd_array - and why does this differ from having a normal allocate command.
With recent compilers, or with older compilers and the -standard-semantics switch, this [re-]allocates and defines the value of father_constructor%odd_array, as per the rules of the current language standard. The corresponding allocate statement in the original example did not set the value. You could use a SOURCE= specifier in the allocate statement as an alternative (or an assignment statement separate to the allocate statement), but I find the single assignment statement less typing and easier to read.
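The three alternatives described above can be compared side by side in a minimal sketch. The variable names are made up for this illustration, and size_odd is fixed as a constant here only to keep the example self-contained.

```fortran
program allocate_vs_assign
    implicit none
    real, allocatable :: by_assignment(:), by_source(:), by_two_steps(:)
    integer :: i
    integer, parameter :: size_odd = 5

    ! (1) Assignment to an unallocated array allocates it to the shape
    !     of the right-hand side and defines its value.
    by_assignment = [(real(i), i = 1, size_odd)]

    ! (2) ALLOCATE with SOURCE= allocates and defines in one statement.
    allocate(by_source(size_odd), source=[(real(i), i = 1, size_odd)])

    ! (3) Plain ALLOCATE leaves the elements undefined until assigned.
    allocate(by_two_steps(size_odd))
    by_two_steps = [(real(i), i = 1, size_odd)]

    if (size(by_assignment) /= size_odd) error stop "wrong size after assignment"
    if (any(by_assignment /= by_source)) error stop "SOURCE= result differs"
    if (any(by_assignment /= by_two_steps)) error stop "two-step result differs"

    print '(a)', "all three agree"
end program allocate_vs_assign
```

Form (1) depends on the automatic (re)allocation semantics mentioned above being active. Forms (2) and (3) do not.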
General thing worth mentioning, you are right that your example works. In my original code it also work, until a random point!
That is, I have a time simulation, now before this begins I initialize my various array including the "father-son" example. After this becomes a READONLY object. When I time-simulation begins, the values in odd_array are correct until a random time-step where all of a sudden some elements change value. This does not happen if, as mentioned, I change the constructors to normal subroutine calls with intent(inout). I could imagine this being a bug, but frankly I have no idea how to reproduce this error in a more manageable code?
If the value of the components of the object are correct after the object has "become read only", then the problem sounds more like memory corruption due to something else going astray in the program. If you run your program under a debugger, then once you know the location in memory of the data for the component that is erroneously changing, you can set a breakpoint on write access to that memory. This can help identify the cause of the memory corruption.
Thanks very much for your help, IanH.
IanH (Blackbelt) wrote:
If the value of the components of the object are correct after the object has "become read only", then the problem sounds more like memory corruption due to something else going astray in the program. If you run your program under a debugger, then once you know the location in memory of the data for the component that is erroneously changing, you can set a breakpoint on write access to that memory. This can help identify the cause of the memory corruption.
Do you still believe it is memory corruption due to something else, given that the problem no longer occurs when I replace the constructors with subroutines and do the construction in a more Fortran 90 style with intent(inout) etc.?
Another question: how do you normally debug memory corruption and other more complicated bugs? Do you have any references, or somewhere I can learn about this? So far I have literally only used print statements, which is probably not viable in the longer term.
>> When the time simulation begins, the values in odd_array are correct until a random time step, where all of a sudden some elements change value.
This is symptomatic of either
a) the initial array descriptor on the lhs receiving a copy of the descriptor of the array temporary created on the stack (rhs), as opposed to receiving a copy of the data that the temporary points to/references (bug), or
b) some other uninitialized pointer or reference being used that happens to clash with the address of the data being altered.
For both cases, IanH's suggestion of setting a data-changed breakpoint is best.
Also note, for case a): just after the constructor completes, use the debugger to enter the location of the memory that gets corrupted (e.g. bla%foo(index)) into the Memory Window. Then look at the address the Memory Window references. Ignore the contents; look at the address. Then open the Registers window and look at rsp (the stack pointer) to see whether the address of what got corrupted is just below the stack pointer. If it is, this confirms that issue a) is the culprit.
Note that corruption of the data in situation a) can occur through correctly defined variables that co-reside with the improperly constructed array.
Corruption of the data in situation b) occurs through undefined (or out-of-scope) references.
Jim Dempsey
Hello Jim,
Thanks for the guide! I would like to go through this as an exercise for myself as well, since I am not experienced with debugging tools.
May I ask which software you use? Do you use the Intel Inspector debugger with the GUI?
Best regards,
Ali
On Linux I use the GDB debugger (with the build performed in the Eclipse IDE).
RE #18: a) would be a bug in the compiler, and b) would be a bug in your code.
Jim Dempsey