Hi all,
this code implements a random number generator class and a test class. An object of the test class contains a pointer to an object of the random number class, which can either be allocated individually or point to a global object. Several thousand objects of the test class may be created. The question is whether this code for random number generation is thread safe:
!! assumes include 'mkl_vsl.f90' appears before this module
module Mod_Ran
Use MKL_VSL_TYPE
Use MKL_VSL
Implicit None
Private
Type, Public :: Ran
Integer, Private :: Error=0
Integer(kind=8), allocatable :: Seed
Character(:), allocatable :: CSMSG
Type(VSL_STREAM_STATE), allocatable :: TSS
contains
Private
Procedure, PAss, Public :: Init => SubInit
Procedure, Pass :: SubGetUniformVector !specific binding required by the generic below
Generic, Public :: GetUniform => SubGetUniformVector
End type Ran
contains
Subroutine SubInit(this)
Implicit None
CLass(Ran), Intent(InOut) :: this
Integer(Kind=8) :: brng
Brng=VSL_BRNG_MCG59 !1.20
if(allocated(this%TSS)) Deallocate(this%TSS)
Allocate(this%TSS)
if(.not.allocated(this%Seed)) Then
Allocate(this%Seed,source=12345_8)
End if
this%Error=vslnewstream(this%TSS,brng,this%Seed)
End Subroutine SubInit
Subroutine SubGetUniformvector(this,InOut,lb,rb)
Implicit None
CLass(Ran), Intent(InOut) :: this
Real(Kind=8), Intent(In) :: rb, lb
Real(Kind=8), Intent(InOut) :: InOut(:)
this%error=vdrnguniform(&
&method=VSL_RNG_METHOD_UNIFORM_STD_ACCURATE,&
&stream=this%TSS,&
&n=size(InOUt,1),&
&r=InOut,&
&a=lb,&
&b=rb)
End Subroutine SubGetUniformVector
End module Mod_Ran
!!@@@@@@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@@@@@@
Module Mod_Type
use Mod_Ran
private
Type, Public :: TT
integer(kind=8), allocatable :: seed
Type(Ran), Pointer :: TSR=>Null()
Real(kind=8), allocatable :: tmp(:)
contains
Procedure, Pass :: fill => subFill
End type TT
contains
Subroutine SubFill(this)
Implicit None
real(kind=8) :: rsr
Class(TT), Intent(InOut) :: this
if(.not.allocated(this%tmp)) Then
allocate(this%tmp(100))
End if
!!@@@@@@@@@@@@@@
!!check whether a stream exists, if not create one
if(.not.associated(this%tsr)) Then
if(.not.allocated(this%Seed)) Then
call random_number(rsr)
allocate(this%seed,source=int(rsr*100000.0D0,kind=8))
End if
Allocate(this%tsr);Allocate(this%tsr%seed,source=this%Seed)
call this%tsr%init() !create the VSL stream before it is used
End if
call this%tsr%getuniform(inout=this%tmp,lb=0.0D0,rb=1.0D0)
End Subroutine SubFill
End Module Mod_Type
!!@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@
Program Test
use Mod_Type
use Mod_Ran
Implicit none
Type(TT), allocatable :: TVT(:)
Type(Ran), allocatable, Target :: TSR
Integer :: i
Allocate(TVT(10000))
!!@@@@@@@@@@@@@@@@
!!option 1: everybody gets its own stream
!$OMP PARALLEL DO
Do i=1,size(TVT)
call tvt(i)%fill()
End Do
!$OMP END PARALLEL DO
!!@@@@@@@@@@@@@@@@
!!option 2: everybody uses the same stream
Allocate(TSR); Allocate(TSR%Seed,source=800466_8); call TSR%init()
!$OMP PARALLEL DO SHARED(TSR)
Do i=1,size(TVT)
tvt(i)%TSR=>TSR
call tvt(i)%fill()
End Do
!$OMP END PARALLEL DO
end Program Test
From what I understood from the MKL manual, option 1 should be thread safe, option 2 not (in terms of correlations). Is that right?
Thanks a lot
karl
2 Replies
Hello!
Both of your options can be considered thread-safe, given the assumption that the first option relies on an atomic call to the generator. However, we recommend the standard parallelization techniques for use with Intel MKL RNGs, such as:
• Skipping ahead
• Leapfrogging
• Using different parameter sets
Please see the documentation for additional details:
https://software.intel.com/en-us/mkl-vsnotes-independent-streams-leapfrogging-and-block-splitting
The choice of generator, as well as of parallelization technique, is driven by the requirements of the application.
Please let us know if this answers your question.
Best regards,
Pavel.
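To make the first two techniques in the list above concrete, here is a minimal C sketch built on a toy multiplicative congruential generator. It is not MKL code: `toy_stream`, `toy_next`, `toy_skip_ahead` and `toy_leapfrog` are hypothetical names used only for illustration, and the constants (multiplier 13^13, modulus 2^59) are assumed to mirror the MCG59 definition. In MKL itself the corresponding service routines are `vslSkipAheadStream` and `vslLeapfrogStream`.

```c
#include <assert.h>
#include <stdint.h>

/* Toy generator x[n+1] = A * x[n] mod 2^59 (illustration only, not MKL's implementation). */
#define TOY_MASK ((UINT64_C(1) << 59) - 1)
#define TOY_A    UINT64_C(302875106592253)   /* 13^13, assumed MCG59 multiplier */

typedef struct {
    uint64_t x;     /* next value to be returned */
    uint64_t mult;  /* per-step multiplier (A, or A^n after leapfrogging) */
} toy_stream;

/* a^k mod 2^59 by square-and-multiply; unsigned wraparound mod 2^64 is harmless here. */
static uint64_t toy_pow(uint64_t a, uint64_t k) {
    uint64_t result = 1;
    while (k) {
        if (k & 1) result = (result * a) & TOY_MASK;
        a = (a * a) & TOY_MASK;
        k >>= 1;
    }
    return result;
}

static toy_stream toy_new(uint64_t seed) {
    toy_stream s = { seed & TOY_MASK, TOY_A };
    return s;
}

/* Return the current element, then advance the state by one step. */
static uint64_t toy_next(toy_stream *s) {
    uint64_t out = s->x;
    s->x = (s->x * s->mult) & TOY_MASK;
    return out;
}

/* Skip-ahead: jump k steps in O(log k) instead of generating k values. */
static void toy_skip_ahead(toy_stream *s, uint64_t k) {
    s->x = (s->x * toy_pow(s->mult, k)) & TOY_MASK;
}

/* Leapfrog: stream k of n then yields elements k, k+n, k+2n, ... of the original sequence. */
static void toy_leapfrog(toy_stream *s, uint64_t k, uint64_t n) {
    assert(n > 0 && k < n);
    toy_skip_ahead(s, k);
    s->mult = toy_pow(s->mult, n);
}
```

With block-splitting, thread `k` would call `toy_skip_ahead(&s, k * block)` and draw at most `block` values; with leapfrogging, thread `k` of `n` would call `toy_leapfrog(&s, k, n)`. In both cases every thread owns its stream object, so no locking is needed, and the union of all threads' output is exactly the original sequence.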
Hi,
I am trying to implement a parallel Monte Carlo simulation using the MKL random number generator.
A test program for it looks like the following.
The program runs, but it is incredibly slow (some 10 times slower than the single-threaded version obtained by commenting out the #pragma line).
I guess the way my program initializes the RNG system is not good, but I could not work out the correct use of the initialization functions.
Could anyone give me advice on how to improve this?
Also, I have no idea what value "nskip" should have, and I could not find the correct value in Intel's documents, including vslnotes.pdf, because those documents only show pseudocode examples and do not mention the actual values to be set.
Thanks a lot in advance.
#include <stdio.h>
#include <stdlib.h>
#include "mkl_vsl.h"
#define N 1000*1000
#define M 10
int main()
{
double *r; /* buffer for random numbers */
double s; /* average */
VSLStreamStatePtr stream;
int i, j, status, nskip;
nskip = 2*N;
r = (double*) malloc(N*sizeof(double));
#pragma omp parallel for
for( i=0; i<M; i++ ) {
status = vslNewStream( &stream, VSL_BRNG_MCG59, 777 );
status = vslSkipAheadStream( stream, i*nskip );
vdRngGaussian( VSL_RNG_METHOD_GAUSSIAN_ICDF, stream, N, r, 0.0, 1.0 );
s = 0.0;
for ( j=0; j<N; j++ ) s += r[j];
printf("Sum is %.4lf\n", s);
vslDeleteStream( &stream );
}
free(r);
return 0;
}
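One thing that stands out in the program above is that `r`, `stream`, `s` and `j` are declared outside the parallel loop, so all threads share them and write to the same buffer at the same time, which is a data race. The sketch below shows one possible loop structure in which each iteration has its own buffer and its own starting offset. It uses a toy generator rather than MKL so that it is self-contained: `fill_block` and `block_means` are hypothetical helpers, and the constants are assumed to mirror MCG59. In the MKL version the call to `fill_block` would correspond to `vslNewStream`, `vslSkipAheadStream( stream, i*nskip )` and the generator call on a stream local to the iteration. For `nskip`, the sketch uses exactly the number of values one block consumes, on the assumption that each output draws one element of the underlying sequence; any larger value, such as the `2*N` above, also keeps the blocks from overlapping.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MASK59 ((UINT64_C(1) << 59) - 1)
#define MULT   UINT64_C(302875106592253)   /* 13^13, assumed MCG59 multiplier */

/* MULT^k mod 2^59 by square-and-multiply. */
static uint64_t mult_pow(uint64_t k) {
    uint64_t a = MULT, result = 1;
    while (k) {
        if (k & 1) result = (result * a) & MASK59;
        a = (a * a) & MASK59;
        k >>= 1;
    }
    return result;
}

/* Fill r[0..n-1] with uniforms in [0,1), starting at position `offset`
   of the toy sequence seeded by `seed` (skip-ahead, then generate). */
static void fill_block(uint64_t seed, uint64_t offset, size_t n, double *r) {
    uint64_t x = ((seed & MASK59) * mult_pow(offset)) & MASK59;
    for (size_t j = 0; j < n; j++) {
        r[j] = (double)(x >> 6) * (1.0 / 9007199254740992.0);  /* top 53 bits / 2^53 */
        x = (x * MULT) & MASK59;
    }
}

/* Compute m block means; block i uses sequence positions [i*n, (i+1)*n).
   Returns 0 on success, -1 if any allocation failed. */
static int block_means(uint64_t seed, int m, size_t n, double *means) {
    int failed = 0;
    assert(m > 0 && n > 0 && means != NULL);
    #pragma omp parallel for reduction(|:failed)
    for (int i = 0; i < m; i++) {
        double *r = malloc(n * sizeof *r);       /* buffer private to this iteration */
        if (!r) { failed = 1; continue; }
        fill_block(seed, (uint64_t)i * n, n, r); /* nskip == n: blocks do not overlap */
        double s = 0.0;                          /* accumulator private to this iteration */
        for (size_t j = 0; j < n; j++) s += r[j];
        means[i] = s / (double)n;
        free(r);
    }
    return failed ? -1 : 0;
}
```

Because every block starts at a fixed offset, the results do not depend on the number of threads or on the order in which iterations run, which makes a parallel run easy to compare against a single-threaded one.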