linked list memory usage

onurb · ‎03-03-2008

Hi,

I am trying to figure out the memory usage of a linked list in fortran 90. I use the following simple test code.

program test_sub
implicit none

type:: mytype
integer :: ij(2)
type(mytype), pointer :: next
end type mytype

integer :: k, n
type(mytype), pointer :: head, tail, item

n=1e7 --->length of the list
mem =0
nullify(head, tail)
do k=1,n
nullify(item)
allocate(item)
item%ij=1
mem=mem+sizeof(item)
if (k==1) then
head=>item
tail=>item
else
tail%next=>item
tail=>item
end if
end do

print*, sizeof(head) -> This gives 16
print*, sizeof(head%next) ->This gives 16
print*, sizeof(head%ij)->This gives8
print*, real(mem)/1024/1024 -> This gives 152.5879mb
stop
end program test_sub

Using the sizeof function I expect 16 byte per node and ~150mb memory usage in total. However when I check the usage through the system using 'top' command, I see 300mb. If I increase thelength of the list to 1e8, memory usage is 3gb while I expect it to be1.5gb. Am I doing something stupid in this code? Or my calculations are wrong? I am using intel compiler version 9.1 on a linux 64 bit machine. I forgot to add: I compiled the code without any flags.

Thank you for your help

Onur Bakir

onurb · ‎03-04-2008

Can it be explained like this: There is an overhead when using derived-types (there is right?) something like an array descriptor. If Iallocate a continuousarray of derived types (defining thenodes of the linked list), I have only one of these array descriptors. But when using a linked-list structure I allocate each node seperately,then theoverhead is repeated for each node. As far as I observed the there is an almost constant scale between expected and actual memory usage. If this is true linked-list would be a very unfortunate choice for large data sets right? Oh boy I am doomed. Can anybody help me out here?

Onur

Steven_L_Intel1 · ‎03-05-2008

There is no overhead using derived types. Array descriptors do take more space, but you don't have any in this program. Pointers to non-array types are just addresses.

albert · ‎03-05-2008

Did a quick test on the program and printed the addresses of the items.
The difference between 2 allocation is in this case 32 byte (so an overhead of 16 bytes), resulting in the observations of Onur. Changing the size of the type results in other overhead values.
(Used Intel 10.1 on a 64-bit system)

Albert

Ron_Green · ‎03-05-2008

what option are you using (or not using) for -align ?

ron

onurb · ‎03-05-2008

I don't use any options. Just plain:

ifort filename.f90

Steven_L_Intel1 · ‎03-06-2008

The difference in addresses between two allocations does not imply that the size of the allocation increased. Allocations automatically align, though I am not sure what the boundary is.

An easy way to check this is to declare an array of the structures and look at the size of the array divided by the number of elements. Remember that padding will be added by default to natural boundaries unless you say SEQUENCE in the type.

albert · ‎03-06-2008

I agree about the amount of memory allocated. But for the total memory used by the process is increased by the allocated memory and the "padding" (so in total with the number of bytes between 2 calls). That is correct ?

Albert

onurb · ‎03-06-2008

I check the allocation through the system, and that space is allocated (or padded but not usable, whatever you call it). I cannot use sequence in this type because the type statement includes a recursive pointer in itself. Compiler returns a warning that the structure is misaligned.

onurb · ‎03-06-2008

I think I was too optimistic about the size of the pointers. I have learned that another compiler is using around 16 bytes for a pointer to a scalar structure not 8, they told me that fortran pointers are not just memory addresses (I did not know that before).

Also, I forgot to add one more observation: IfIchange the length of the integers %ij from 1-4, memory usage does not change at all (it increases after 4).This structure is taking up 32 bytes of memory-space, with padding or no padding. Therefore it is extremely inefficient if the structure contains very small amount of data.

Steven_L_Intel1 · ‎03-06-2008

What another compiler does is not necessarily relevant to Intel Fortran. For non-array types, Intel Fortran represents a pointer as an address only.

In my test, each allocation is of size 16 (on x64), but there is extra space that the memory allocator uses to record information about the allocation so that it can properly deallocate the space.

If you are doing lots of allocations of small objects, this is going to add up. I'd suggest instead that you add a layer of your own memory manager that allocates, say, 4096 instances of "item" at a time and then doles them out as needed.