Hello,
I’m new to defined input/output procedures and I find that the one I have written is eating up 80% of my runtime. I’m trying to save a lot of data to disk, but this still doesn’t seem correct to me: I am generating and processing all this data in the program as well, so having the save to disk alone take 80% of the runtime seems disproportionate. I can see why my I/O procedure would induce a lot of loops and be slow; however, being new to defined I/O procedures, I’m not sure what I can do about it. Any suggestions would be greatly appreciated. The defined type I am trying to save, along with its routines, is below.
Thanks,
Jamie
module Policy
  implicit none

  type sparseCOOType
    integer :: col
    integer :: row
    real :: val
  end type sparseCOOType

  type policyType
    type(sparseCOOType), allocatable :: COO(:)
  contains
    procedure :: write_sample => write_container_sample_impl
    procedure :: read_sample => read_container_sample_impl
    generic :: write(unformatted) => write_sample
    generic :: read(unformatted) => read_sample
  end type policyType

contains

  subroutine write_container_sample_impl(this, unit, iostat, iomsg)
    class(policyType), intent(in) :: this
    integer, intent(in) :: unit
    integer, intent(out) :: iostat
    character(*), intent(inout) :: iomsg
    integer :: i
    write(unit, iostat=iostat, iomsg=iomsg) size(this%COO)
    do i=1,size(this%COO)
      write(unit, iostat=iostat, iomsg=iomsg) this%COO(i)%col
      write(unit, iostat=iostat, iomsg=iomsg) this%COO(i)%row
      write(unit, iostat=iostat, iomsg=iomsg) this%COO(i)%val
    end do
  end subroutine write_container_sample_impl

  subroutine read_container_sample_impl(this, unit, iostat, iomsg)
    class(policyType), intent(inout) :: this
    integer, intent(in) :: unit
    integer, intent(out) :: iostat
    character(*), intent(inout) :: iomsg
    integer :: i, sizeCOO
    read(unit, iostat=iostat, iomsg=iomsg) sizeCOO
    allocate(this%COO(sizeCOO))
    do i=1,sizeCOO
      read(unit, iostat=iostat, iomsg=iomsg) this%COO(i)%col
      read(unit, iostat=iostat, iomsg=iomsg) this%COO(i)%row
      read(unit, iostat=iostat, iomsg=iomsg) this%COO(i)%val
    end do
  end subroutine read_container_sample_impl

end module Policy
Your current procedure writes and reads all the data elements in turn. But the sparse matrix COO is an array of a simple derived type that just contains three scalars. Such a type can be written directly via the default facilities. So, you could write your policy object "this" via:
write(lun) this%COO
That will make the writing (and similarly the reading) much faster
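A minimal sketch of what that suggestion might look like once the read side is included. It keeps the element count in front of the array so the reader knows how much to allocate, which is an assumption carried over from the original routine rather than something the reply spells out. The module name policy_mod, the procedure names, the file name policy_demo.bin and the test data are made up for illustration.

```fortran
module policy_mod
  implicit none

  type :: sparseCOOType
    integer :: col
    integer :: row
    real    :: val
  end type sparseCOOType

  type :: policyType
    type(sparseCOOType), allocatable :: COO(:)
  contains
    procedure :: write_sample
    procedure :: read_sample
    generic :: write(unformatted) => write_sample
    generic :: read(unformatted)  => read_sample
  end type policyType

contains

  subroutine write_sample(this, unit, iostat, iomsg)
    class(policyType), intent(in)    :: this
    integer,           intent(in)    :: unit
    integer,           intent(out)   :: iostat
    character(*),      intent(inout) :: iomsg
    ! Assumes this%COO is allocated. The count goes first so the
    ! reader can allocate before reading the data.
    write(unit, iostat=iostat, iomsg=iomsg) size(this%COO)
    if (iostat /= 0) return
    ! One transfer for the whole array instead of three per element.
    write(unit, iostat=iostat, iomsg=iomsg) this%COO
  end subroutine write_sample

  subroutine read_sample(this, unit, iostat, iomsg)
    class(policyType), intent(inout) :: this
    integer,           intent(in)    :: unit
    integer,           intent(out)   :: iostat
    character(*),      intent(inout) :: iomsg
    integer :: n
    read(unit, iostat=iostat, iomsg=iomsg) n
    if (iostat /= 0) return
    if (allocated(this%COO)) deallocate(this%COO)
    allocate(this%COO(n))
    read(unit, iostat=iostat, iomsg=iomsg) this%COO
  end subroutine read_sample

end module policy_mod

program demo
  use policy_mod
  implicit none
  type(policyType) :: a(2), b(2)   ! hypothetical test data
  integer :: u

  allocate(a(1)%COO(2), a(2)%COO(1))
  a(1)%COO(1) = sparseCOOType(1, 1, 1.0)
  a(1)%COO(2) = sparseCOOType(2, 3, -2.0)
  a(2)%COO(1) = sparseCOOType(4, 2, 5.0)

  open(newunit=u, file='policy_demo.bin', form='unformatted', &
       status='replace', action='write')
  write(u) a          ! the defined WRITE runs once per array element
  close(u)

  open(newunit=u, file='policy_demo.bin', form='unformatted', &
       status='old', action='readwrite')
  read(u) b
  close(u, status='delete')

  if (size(b(1)%COO) /= 2 .or. size(b(2)%COO) /= 1) error stop 'size mismatch'
  if (b(1)%COO(2)%col /= 2 .or. b(2)%COO(1)%val /= 5.0) error stop 'value mismatch'
  print *, 'round trip ok'
end program demo
```

The defined procedure is still called once per element of the policy array, but each call now issues two transfers rather than three per stored triplet.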
When it comes to performance questions, one is able to provide useful guidance only in the context of what_actions/how_often/in_what_way. You showed the code for the module, but provided little information regarding how that module is used, how much data is generated and written, etc.
If your program did nothing more than store and restore the row, col and val arrays using your defined I/O procedures, it would be quite reasonable for the I/O to take most of the run time. It is only when we compare the I/O time with the rest of the time spent (in doing other useful things) that we can decide whether the I/O is taking up more than a reasonable fraction of the run time.
The I/O procedure is called 16 times to save large allocatable arrays of type policyType, with dimensions ranging from (30,8,1,1,10) to (30,8,3003,11,10). I have included below the open, write, close snippet that makes those 16 calls.
Beyond that, I don’t see how any additional information is relevant to diagnosing the efficiency of the I/O procedure. All the code for the I/O procedure is posted, along with all the information about the calls to it. When I gave the 80% figure it was meant to be indicative, and I thought you would assume some average level of computational complexity behind the data I am trying to save, not that I might be generating large amounts of random numbers and saving them to disk. Also, whether I am right or wrong about 80% of runtime being very high for saving data, the question of whether there is a way to make the code more efficient is a separate issue that can be addressed without this information. I did not post additional information about the program in order to keep my issue as clear as possible. That said, if you think it may be useful, here is an attempt at a summary of how the data is generated. Each entry in the allocatable array of type policyType is the solution to a sequential quadratic programming problem, the input function values to which have to be found by fixed point iteration, and which is itself embedded in an outer fixed point iteration.
Thanks,
Jamie
open (unit=201,form="unformatted", file=outfile, status='unknown', action='write')
write (201) modelObjects%policy
close( unit=201)
I suspect that there are two reasons for the I/O taking up so much CPU time.
The first is that you have a DO loop with an iteration count as large as 30*8*3003*11*10.
Each iteration of the loop writes or reads 12 bytes of "payload" as 3 records plus 24 bytes of record length markers. For each such record, your defined I/O method gets called, with 4+ arguments on the stack. That is a lot of overhead.
In addition, you include the optional arguments IOSTAT= and IOMSG=. Finding and stuffing in the values to return, even the zero and blank that are normally expected, is an additional overhead.
All this is done 16 times, as you wrote.
Is it possible to restructure the work so that the entire col array is written with one WRITE? Similarly for the row and val arrays?
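To get a feel for the per-record cost described above, here is a rough sketch that writes the same data with ordinary unformatted sequential WRITE statements in two ways and compares the resulting file sizes. It only measures standalone WRITE statements; whether WRITE statements inside a defined I/O procedure add record markers of their own is not something this sketch tests. The file names and the array size n are arbitrary choices, and the size of a record marker depends on the compiler.

```fortran
program record_overhead
  implicit none
  integer, parameter :: n = 1000      ! arbitrary number of triplets
  integer :: col(n), row(n)
  real    :: val(n)
  integer :: u, i, size_per_element, size_whole

  col = 1
  row = 2
  val = 3.0

  ! Variant A: three tiny records per triplet.
  open(newunit=u, file='per_element.bin', form='unformatted', &
       status='replace', action='write')
  do i = 1, n
    write(u) col(i)
    write(u) row(i)
    write(u) val(i)
  end do
  close(u)

  ! Variant B: three large records in total.
  open(newunit=u, file='whole_array.bin', form='unformatted', &
       status='replace', action='write')
  write(u) col
  write(u) row
  write(u) val
  close(u)

  ! File sizes in bytes; the difference is record-marker overhead.
  inquire(file='per_element.bin', size=size_per_element)
  inquire(file='whole_array.bin', size=size_whole)
  print '(a,i0)', 'per-element file size (bytes): ', size_per_element
  print '(a,i0)', 'whole-array file size (bytes): ', size_whole

  ! Remove the scratch files again.
  open(newunit=u, file='per_element.bin', form='unformatted', status='old')
  close(u, status='delete')
  open(newunit=u, file='whole_array.bin', form='unformatted', status='old')
  close(u, status='delete')
end program record_overhead
```

If each record carries a 4-byte marker at both ends and default integers and reals are 4 bytes, variant A comes to 3000 records of 12 bytes each, and variant B to three records of 4008 bytes each. The run-time difference is usually more significant than the size difference, since every small WRITE also pays the fixed cost of an I/O statement.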
Great, thanks. I will remove IOSTAT= and IOMSG= as a start. I had wondered if removing these might help a bit but hadn’t tried yet. It’s good to have that confirmed.
“Is it possible to restructure the work so that the entire col array is written with one WRITE? Similarly for the row and val arrays?” I don’t know; that was the kind of question I was trying to get at when I said I could see why my code would imply lots of loops but didn’t know how to change it. This is my first user defined I/O procedure, so I don’t know what options are available. As I have written it, it seems that there is one call for each entry in the array of type policyType, so I have access to just the object COO. Are there other options for the read/write routine that would pass me the whole object?
If there aren’t other options with the I/O procedure itself, I can’t see how to reorganise the data to be able to write en masse. COO represents a sparse matrix, so it will have a different size for each entry in the array of type policyType, and I need to make sure the values of each COO for each entry in the main array are kept separate. Previously I didn’t use a user defined type and it was just an allocatable array with an extra couple of dimensions (i.e. I wasn’t taking advantage of the sparsity of the matrix), which was much easier to write to disk but exhausted my memory for the largest arrays.
You have selected the (row, col, val) triplet as your basic entity. It gives you great flexibility in the parts of the code where you may visit row and col values in arbitrary order, computing the corresponding val. On the other hand, there is no way of storing information regarding the relation(s) of one triplet to the myriads of other triplets.
We could, instead, have selected as basic entity a sparse matrix type, with (n, nnz, row(:), col(:), val(:)) as our basic entity. In that case, the I/O would be done without any defined I/O procedures, and we would process a moderate number of large records, instead of processing a huge number of tiny records using defined I/O procedures, as you are doing now.
That brings me back to asking the kind of questions that you may not like: What advantages does type sparseCOOType give you in the portions of the code that you have not shown but have described rather tersely? How difficult would it be to define a sparse matrix type of the type that I mentioned, and use that instead of the disjointed triplets in those portions?
I have only followed the discussion from afar, but would it be a solution to gather the triplets into arrays as suggested by mecej4 and then write them to file (and on input revert the process)? That would reduce the number of reads/writes, while increasing the memory usage.
Another possibility: write to a memory-based file first and then dump its contents to a file on disk.
Hello Arjen,
Increasing memory usage is a no-go, as I'm pretty much at the limit there. I don't follow the memory-based file suggestion: is the idea just to work with a binary stream inside the program? If so, that sounds really messy.
Thanks,
Jamie
The sparseCOOType gives no real advantage in the very long, very messy section of the code that I have described tersely; that’s why I described it tersely. I just don’t know what alternative basic entity structures fit my data and would be more efficient for I/O. This is probably at least partly because I do not clearly understand when a user defined I/O procedure is required and when it is not. What I am struggling with is that storing a sparse array needs a variable size, which from my limited understanding means I need a user defined I/O procedure.
“We could, instead, have selected as basic entity a sparse matrix type, with (n, nnz, row(:), col(:), val(:)) as our basic entity. In that case, the I/O would be done without any defined I/O procedures, and we would process a moderate number of large records, instead of processing a huge number of tiny records using defined I/O procedures, as you are doing now.” That sounds great, but I’m not sure I follow. What are n and nnz here? Are you suggesting I use compressed sparse row (or column) storage, where nnz is the number of non-zeros and n collapses all the other dimensions I have in the array of type policyType? If so, I have coded up below what I understand you to be saying, and it gives me the error “error #5514: A derived type I/O list item that contains a pointer or an allocatable component (ROW) requires a user-defined derived-type input/output procedure.” So it doesn’t appear to be saveable without a defined I/O procedure, but maybe this wasn’t what you were suggesting.
program scratch
  implicit none
  type policyType
    integer :: n
    integer :: nnz
    integer, allocatable :: row(:)
    integer, allocatable :: col(:)
    real, allocatable :: val(:)
  end type policyType
  type (policyType):: policy
  open (unit=201,form="unformatted", file="outputFile", status='unknown', action='write')
  write (201) policy
  close( unit=201)
end program
EDIT: What I wrote here is wrong but I'll leave for posterity.
I just thought that, even if I hadn't misunderstood you and I can't save without a user defined routine, this format seems like it would be much more efficient than the one I have now, as I would only have one call to the user defined I/O. That's great, thanks. Being able to save without user defined I/O would be even better!
Assuming that the arrays have been allocated and values filled in properly, just use
write (201) policy%n,policy%nnz,policy%row,policy%col,policy%val
instead of
write (201) policy
Depending on how the work is structured, you may have to write (and read) two records per matrix:
write (201) policy%n,policy%nnz
write (201) policy%row,policy%col,policy%val
for facilitating the subsequent READ, with allocation of arrays between the two READs.
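The read side of the two-record layout could be sketched roughly as follows: the first READ retrieves the sizes, the arrays are then allocated, and the second READ fills them. The variable names out_mat and in_mat, the file name and the small test matrix are hypothetical; the type follows the one posted above.

```fortran
program read_two_records
  implicit none

  type :: policyType
    integer :: n   = 0
    integer :: nnz = 0
    integer, allocatable :: row(:)
    integer, allocatable :: col(:)
    real,    allocatable :: val(:)
  end type policyType

  type(policyType) :: out_mat, in_mat   ! hypothetical names
  integer :: u

  ! Small made-up 3x3 matrix with 4 non-zero entries.
  out_mat%n   = 3
  out_mat%nnz = 4
  allocate(out_mat%row(4), out_mat%col(4), out_mat%val(4))
  out_mat%row = [1, 2, 3, 3]
  out_mat%col = [1, 2, 1, 3]
  out_mat%val = [1.0, -2.0, 4.0, 0.5]

  open(newunit=u, file='two_records.bin', form='unformatted', &
       status='replace', action='write')
  write(u) out_mat%n, out_mat%nnz                   ! record 1: sizes
  write(u) out_mat%row, out_mat%col, out_mat%val    ! record 2: data
  close(u)

  open(newunit=u, file='two_records.bin', form='unformatted', &
       status='old', action='readwrite')
  read(u) in_mat%n, in_mat%nnz
  ! The sizes are known only after the first READ, so allocate here.
  allocate(in_mat%row(in_mat%nnz), in_mat%col(in_mat%nnz), in_mat%val(in_mat%nnz))
  read(u) in_mat%row, in_mat%col, in_mat%val
  close(u, status='delete')

  if (in_mat%n /= out_mat%n .or. in_mat%nnz /= out_mat%nnz) error stop 'size mismatch'
  if (any(in_mat%row /= out_mat%row) .or. any(in_mat%col /= out_mat%col)) &
    error stop 'index mismatch'
  if (any(in_mat%val /= out_mat%val)) error stop 'value mismatch'
  print *, 'round trip ok'
end program read_two_records
```

Writing everything as a single record would not allow this, because the array sizes would have to be known before the only READ statement is executed.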
Sorry, I'm still unclear about how you are suggesting I store the data in this solution. Currently I have an array policy(:), and for each entry in policy I have a sparse matrix saved as COO. With the code I wrote interpreting your suggestion, I need one object of type policyType for each of the entries in the previous array; it's just that now they are stored as CSR. So there are still multiple calls to write (although now they don't need a user defined routine), and I no longer understand what the point of n is.
Perhaps this sketch of a program to form and dump two square matrices will help.
n = number of rows = number of columns
nnz = number of non-zero entries in matrix
program scratch
  implicit none
  type spMatType
    integer :: n
    integer :: nnz
    integer, allocatable :: row(:)
    integer, allocatable :: col(:)
    real, allocatable :: val(:)
  end type spMatType
  type (spMatType):: mat1, mat2
  mat1%n = 5
  mat1%nnz = 13
  allocate(mat1%row(mat1%nnz),mat1%col(mat1%nnz),mat1%val(mat1%nnz))
  mat1%row = [1,1,1, 2,2, 3,3,3, 4,4,4, 5,5]
  mat1%col = [1,2,3, 1,2, 3,4,5, 1,3,4, 2,5]
  mat1%val = [1.0,-1.0,-3.0, -2.0,5.0, 4.0,6.0,4.0, -4.0,2.0,7.0, 8.0,-5.0]
  ! ...
  ! similar code to assign values for sparse matrix mat2
  ! ...
  open (unit=201,form="unformatted", file="outputFile", status='unknown', action='write')
  write (201) mat1%n, mat1%nnz
  write (201) mat1%row, mat1%col, mat1%val
  write (201) mat2%n, mat2%nnz
  write (201) mat2%row, mat2%col, mat2%val
  !
  ! pairs of WRITE statements for mat3, etc.
  !
  close( unit=201)
end program
OK, understood. I thought you were saying I could have one single object of type spMatType for all my sparse matrices, but I need one object for each sparse matrix. In that case I understand, but the number of objects I will need and the number of write statements will still be very large. The advantage is in removing the loop at the level of the COO object. Does that seem right? Thanks
Instead of variables mat1, mat2, etc., you can declare an array mat(5), say, and use mat(1) in place of mat1, mat(2) in place of mat2, and so on.
Yes, I understood that, but that is still as many write statements as there are elements in mat(:), which would be just as many calls as I currently have to my user defined routine. This makes me think the gains would be pretty much the same as with the suggestion of replacing the loop in my I/O routine with write(lun) this%COO, but that the amount of code rewriting would be considerable.
Originally, you were writing three I/O records for each element of each matrix. Instead, now you could be writing three I/O records for each matrix.
You may also be able to write code such as
do i = 1, n_mat
  write (201) mat(i)%n, mat(i)%nnz
  write (201) mat(i)%row, mat(i)%col, mat(i)%val
end do
if all the information is formed and collected before dumping to file.
Regarding your first point: I understand the performance advantage over my original solution, but I don’t understand whether there is one over simply replacing my original I/O routine with:
subroutine write_container_sample_impl(this, unit, iostat, iomsg)
  class(policyType), intent(in) :: this
  integer, intent(in) :: unit
  integer, intent(out) :: iostat
  character(*), intent(inout) :: iomsg
  integer :: i
  write(unit, iostat=iostat, iomsg=iomsg) this%COO
end subroutine write_container_sample_impl
as suggested above. My best guess is that there isn’t, as this seems to be a single I/O record per matrix.
Regarding the second point, are you saying that the compiler would be better able to optimize the loop in
do i = 1, n_mat
  write (201) mat(i)%n, mat(i)%nnz
  write (201) mat(i)%row, mat(i)%col, mat(i)%val
end do
than
write (201) policy
where policy is an array of size n_mat with the I/O as above? If so, couldn't I get the same advantage from doing
do i1=1,n1
  do i2=1,n2
    do i3=1,n3
      do i4=1,n4
        do i5=1,n5
          write (201) policy(i1,i2,i3,i4,i5)%coo
        end do
      end do
    end do
  end do
end do
which requires less rewriting of code?
do i1=1,n1
  do i2=1,n2
    do i3=1,n3
      do i4=1,n4
        do i5=1,n5
          write (201) policy(i1,i2,i3,i4,i5)%coo
        end do
      end do
    end do
  end do
end do
At a pinch, when you are developing a new program, you might do this. But on any reasonable assumption about the rest of the code, writing will consume a lot of time. The real question is why you are writing at all: is there a better way to use the data within the program, or is this a permanent write to an output file?
If I were doing something like this, I would open a SQL database and shove the whole thing in as a blob.
Hi John,
This is being saved as unformatted, so it is already a binary blob; surely formatting it as an SQL database would only add overhead. Given that I have no need for this data to be in a database, I don't see the advantage.
The only reason to save to disk is to get around insufficient RAM (pretty much every machine I might run this on will have an order of magnitude more hard drive space than RAM, so it is not a question of going to a better machine). I then read the files one by one back into the same program to run simulations on the results, and delete the files. So there is no reason for SQL, and I want the files to be in the format that Intel Fortran will work fastest with, which presumably is its own unformatted binary.
Thanks,
Jamie
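For the read-back-and-delete workflow described in the post above, one possible shape is sketched below: each temporary file is opened, read in whole-array records, and removed when it is closed with STATUS='DELETE'. The file names, the record layout (a count followed by one array) and the first loop that creates the files are all invented for the example.

```fortran
program read_then_delete
  implicit none
  ! Hypothetical temporary file names.
  character(len=*), parameter :: files(2) = ['chunk_1.bin', 'chunk_2.bin']
  integer :: u, k, n
  real, allocatable :: buf(:)

  ! Stand-in for the expensive computation: write two small files.
  do k = 1, size(files)
    open(newunit=u, file=files(k), form='unformatted', &
         status='replace', action='write')
    write(u) 3                      ! record 1: number of values
    write(u) real([1, 2, 3]) * k    ! record 2: the values
    close(u)
  end do

  ! Read the files back one at a time and delete each on close.
  do k = 1, size(files)
    ! Opened read-write rather than read-only so that the delete on
    ! close is permitted by the compiler.
    open(newunit=u, file=files(k), form='unformatted', &
         status='old', action='readwrite')
    read(u) n
    allocate(buf(n))
    read(u) buf
    close(u, status='delete')
    print *, trim(files(k)), ' sum = ', sum(buf)
    deallocate(buf)
  end do
end program read_then_delete
```

Only one file's worth of data is held in memory at a time, which fits the stated goal of staying within the available RAM.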
