Intel® Fortran Compiler

Thread Safe, Recursive Subroutines and Speed Issue

Alireza_Forghani
Beginner

I'm working on making our subroutines, which are called by a commercial FE software package, thread safe. To do so, I have declared all the subroutines as recursive, and that has led to a significant slowdown of the code. I wonder whether this is expected and, if so, whether there is any remedy for it.

Thanks,

Alireza

jimdempseyatthecove
Honored Contributor III

If your threading uses OpenMP and you compile with -qopenmp (-openmp on older compiler versions), then all functions and subroutines are recursive by default. Recursive routines are not inherently slower.

You may, however, see a performance hit if small local arrays end up as heap arrays instead of automatic (stack) arrays.

Jim Dempsey
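
To make the stack-versus-heap distinction concrete, here is a minimal sketch. The module and function names are invented for illustration and do not come from the original code. Both functions compute the same value. The first uses an automatic array, which normally lives on the stack. The second allocates its work array on the heap on every call, which is the pattern that can get expensive for small arrays in a hot routine.

```fortran
module local_array_demo
  implicit none
contains

  ! Automatic array: sized from a dummy argument, created on entry and
  ! released on exit. Compilers normally place it on the stack.
  recursive function sum_squares_automatic(n) result(total)
    integer, intent(in) :: n
    real :: total
    real :: work(n)               ! automatic array
    integer :: i
    do i = 1, n
      work(i) = real(i)**2
    end do
    total = sum(work)
  end function sum_squares_automatic

  ! Allocatable array: explicitly allocated on the heap at every call.
  recursive function sum_squares_heap(n) result(total)
    integer, intent(in) :: n
    real :: total
    real, allocatable :: work(:)  ! heap array
    integer :: i
    allocate(work(n))
    do i = 1, n
      work(i) = real(i)**2
    end do
    total = sum(work)
    deallocate(work)
  end function sum_squares_heap

end module local_array_demo

program demo
  use local_array_demo
  implicit none
  ! Both calls return the same sum; only the storage of work differs.
  print *, sum_squares_automatic(4), sum_squares_heap(4)
end program demo
```

With Intel Fortran, the -heap-arrays option moves automatic arrays and compiler temporaries to the heap, so it is worth checking whether the build uses it.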

Alireza_Forghani
Beginner

Thanks, Jim, for the response. In this code, I'm just writing some subroutines that are called by the main software. I'm not sure how the threading is done in the main program, since it's a black box to me.

Initially our code was not thread safe: running the main program with more than 1 CPU caused segmentation faults. We modified the code to get rid of read-write variables in the module and added recursive to the beginning of the subroutines. These modifications made the subroutines capable of running when the main program uses more than 1 CPU, but resulted in significant slowdowns (3-4 times slower).

Regarding your comment, how can I make sure that the arrays are automatic (stack) arrays and not heap arrays?

Thanks,

Alireza

 

Alireza_Forghani
Beginner

According to perf, this is the subroutine that takes most of the runtime:

recursive subroutine GETMATDETAILS(nUMAT, MATRIX)
   use data_container
   integer, intent(IN) :: nUMAT
   type(t_material) :: MATRIX
   MATRIX=matrices%matrix(nUMAT)

end subroutine

data_container is a module that contains the data, and I have these data_interface subroutines that read the data from the container and return it.

Before, instead of calling GETMATDETAILS, the subroutines accessed the data stored in data_container directly, and that was much quicker.

Thanks

jimdempseyatthecove
Honored Contributor III

What you have above is not related to RECURSIVE. As you explained yourself, you are now copying the data on every call. Consider using

function GETMATDETAILS(nUMAT)
  use data_container
  integer, intent(in) :: nUMAT
  type(t_material), pointer :: GETMATDETAILS
  GETMATDETAILS => matrices%matrix(nUMAT)
end function GETMATDETAILS

The above returns a pointer to the object.

Note: if you want a copy of the object, then either a) make an explicit copy, or b) have two routines, one for the pointer and one for the copy. The copy will be slower.

Jim Dempsey
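
A minimal sketch of how the caller side might look, under some loud assumptions: the layouts of t_material and of the container type (called t_matrices here) are invented, since the thread never shows them, and the function is placed inside the module so that callers get the explicit interface a pointer-valued function requires. The point of the sketch is that the caller must use pointer assignment (=>). An ordinary assignment (=) from the same function still copies the whole object.

```fortran
module data_container
  implicit none

  ! Hypothetical layout; the real t_material is not shown in the thread.
  type :: t_material
    real :: props(1000) = 0.0
  end type t_material

  ! Hypothetical container type holding the array of materials.
  type :: t_matrices
    type(t_material), allocatable :: matrix(:)
  end type t_matrices

  ! TARGET is required so that pointers may refer to parts of matrices.
  type(t_matrices), target :: matrices

contains

  function GETMATDETAILS(nUMAT)
    integer, intent(in) :: nUMAT
    type(t_material), pointer :: GETMATDETAILS
    GETMATDETAILS => matrices%matrix(nUMAT)
  end function GETMATDETAILS

end module data_container

program caller
  use data_container
  implicit none
  type(t_material), pointer :: mat_ptr
  type(t_material) :: mat_copy

  allocate(matrices%matrix(2))
  matrices%matrix(2)%props(1) = 3.5

  mat_ptr => GETMATDETAILS(2)   ! pointer assignment: nothing is copied
  mat_copy = GETMATDETAILS(2)   ! ordinary assignment: copies all of props

  matrices%matrix(2)%props(1) = 9.0
  print *, mat_ptr%props(1)     ! follows the stored object, sees the update
  print *, mat_copy%props(1)    ! independent copy, keeps the earlier value
end program caller
```

Because the pointer refers to the shared data, this only stays thread safe as long as the threads read through it and never write.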

Alireza_Forghani
Beginner

Hi Jim,

I tried your code but it returns this error:

 error #6796: The variable must have the TARGET attribute or be a subobject of an object with the TARGET attribute, or it must have the POINTER attribute.   

I wonder if I need to change the declaration of the matrices%matrix array in my data_container to type(...), dimension(...), target to allow pointing to it.

Thanks,

Alireza

TimP
Honored Contributor III

If your critical local arrays have content that is the same in all threads (never updated by individual threads), you may be able to recover performance by declaring them with SAVE.

Yes, an object must be declared with TARGET in order to set a pointer to it.
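
The SAVE suggestion above could look roughly like the following sketch. The module, function, and array names are hypothetical. The table is initialized once and shared by every call, instead of being set up again on each entry to the routine. That is only safe with multiple threads because nothing ever writes to it.

```fortran
module shape_functions
  implicit none
contains

  recursive function weighted_sum(values) result(total)
    real, intent(in) :: values(3)
    real :: total
    ! SAVE gives one shared, statically stored copy of the table.
    ! It must be treated as read-only when several threads call this routine.
    real, save :: weights(3) = [0.25, 0.5, 0.25]
    total = sum(weights * values)
  end function weighted_sum

end module shape_functions

program save_demo
  use shape_functions
  implicit none
  print *, weighted_sum([1.0, 2.0, 3.0])
end program save_demo
```

If the values are true constants known at compile time, declaring the array with PARAMETER instead of SAVE is another option.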

jimdempseyatthecove
Honored Contributor III

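The array matrix, within the object matrices, must be reachable through an object that has the TARGET attribute. If matrix is declared as a variable in its own right, give it TARGET directly:

...

type(t_material), allocatable, target :: matrix(:)

or

type(t_material), target :: matrix(YourNumberOfMatricesParameter)

If matrix is a component of a derived type, the component itself cannot carry TARGET. In that case put the attribute on the declaration of the containing variable (YourContainerType is a placeholder for your own type name):

type(YourContainerType), target :: matrices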

Jim Dempsey
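
As a rough, self-contained check of where TARGET can go, the sketch below uses invented type and variable names (t_container, standalone, density); only t_material, matrices, and matrix appear in the thread. It points first into a component of a TARGET container and then into a standalone TARGET array.

```fortran
program target_demo
  implicit none

  ! Hypothetical material type with a single field.
  type :: t_material
    real :: density = 0.0
  end type t_material

  ! Components cannot be declared TARGET, so none appears here.
  type :: t_container
    type(t_material), allocatable :: matrix(:)
  end type t_container

  ! TARGET on the whole object covers all of its subobjects.
  type(t_container), target :: matrices

  ! A standalone array takes TARGET on its own declaration.
  type(t_material), allocatable, target :: standalone(:)

  type(t_material), pointer :: p

  allocate(matrices%matrix(3), standalone(3))

  p => matrices%matrix(2)          ! legal: subobject of a TARGET object
  p%density = 7.8                  ! writes through to the stored element
  print *, matrices%matrix(2)%density

  p => standalone(1)               ! legal: element of a TARGET array
  print *, associated(p, standalone(1))
end program target_demo
```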
