Hi all,
This code implements a random number generator class and a test class. An object of the test class contains a pointer to an object of the random number class, which can either be allocated individually or point to a global object. Several thousand objects of the test class may be created. The question is whether this random number generation code is thread safe:
module Mod_Ran
  use MKL_VSL_TYPE
  use MKL_VSL
  Implicit None
  Private
  Type, Public :: Ran
    Integer, Private :: Error=0
    Integer(kind=8), allocatable :: Seed
    Character(:), allocatable :: CSMSG
    Type(VSL_STREAM_STATE), allocatable :: TSS
  contains
    Private
    Procedure, Pass, Public :: Init => SubInit
    Procedure, Pass :: SubGetUniformVector
    Generic, Public :: GetUniform => SubGetUniformVector
  End type Ran
contains
  Subroutine SubInit(this)
    Implicit None
    Class(Ran), Intent(InOut) :: this
    Integer(Kind=8) :: brng
    brng=VSL_BRNG_MCG59 !1.20
    if(allocated(this%TSS)) Deallocate(this%TSS)
    Allocate(this%TSS)
    !!fall back to a default seed if none was set
    if(.not.allocated(this%Seed)) Then
      Allocate(this%Seed,source=12345_8)
    End if
    this%Error=vslnewstream(this%TSS,brng,this%Seed)
  End Subroutine SubInit
  Subroutine SubGetUniformVector(this,InOut,lb,rb)
    Implicit None
    Class(Ran), Intent(InOut) :: this
    Real(Kind=8), Intent(In) :: rb, lb
    Real(Kind=8), Intent(InOut) :: InOut(:)
    this%Error=vdrnguniform(&
      &method=VSL_RNG_METHOD_UNIFORM_STD_ACCURATE,&
      &stream=this%TSS,&
      &n=size(InOut,1),&
      &r=InOut,&
      &a=lb,&
      &b=rb)
  End Subroutine SubGetUniformVector
End module Mod_Ran
!!@@@@@@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@@@@@@
Module Mod_Type
  use Mod_Ran
  Implicit None
  private
  Type, Public :: TT
    integer(kind=8), allocatable :: seed
    Type(Ran), Pointer :: TSR=>Null()
    Real(kind=8), allocatable :: tmp(:)
  contains
    Procedure, Pass :: fill => subFill
  End type TT
contains
  Subroutine SubFill(this)
    Implicit None
    real(kind=8) :: rsr
    Class(TT), Intent(InOut) :: this
    if(.not.allocated(this%tmp)) Then
      allocate(this%tmp(100))
    End if
    !!@@@@@@@@@@@@@@
    !!check whether a stream exists, if not create one
    if(.not.associated(this%tsr)) Then
      if(.not.allocated(this%Seed)) Then
        call random_number(rsr)
        allocate(this%seed,source=int(rsr*100000.0D0,kind=8))
      End if
      Allocate(this%tsr);Allocate(this%tsr%seed,source=this%Seed)
      !!create the VSL stream for the newly allocated generator
      call this%tsr%Init()
    End if
    call this%tsr%getuniform(inout=this%tmp,lb=0.0D0,rb=1.0D0)
  End Subroutine SubFill
End Module Mod_Type
!!@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@
!!@@@@@@@@@@@@@@@@@@@@@
Program Test
  use Mod_Type
  use Mod_Ran
  Implicit none
  Type(TT), allocatable :: TVT(:)
  Type(Ran), allocatable, Target :: TSR
  Integer :: i
  Allocate(TVT(10000))
  !!@@@@@@@@@@@@@@@@
  !!option 1: everybody gets its own stream
  !$OMP PARALLEL DO
  Do i=1,size(TVT)
    call tvt(i)%fill()
  End Do
  !$OMP END PARALLEL DO
  !!@@@@@@@@@@@@@@@@
  !!option 2: everybody uses the same stream
  Allocate(TSR); Allocate(TSR%Seed,source=800466_8); call TSR%Init()
  !$OMP PARALLEL DO SHARED(TSR)
  Do i=1,size(TVT)
    tvt(i)%TSR=>TSR
    call tvt(i)%fill()
  End Do
  !$OMP END PARALLEL DO
end Program Test
From what I understood from the MKL manual, option 1 should be thread safe, option 2 not (in terms of correlations). Is that right?
Thanks a lot
karl
2 Replies
Hello!
Both of your options can be considered thread-safe, assuming that the first option relies on an atomic call to the generator. Even so, we recommend the standard parallelization techniques for Intel MKL RNGs, such as:
• Skipping-ahead
• Leapfrogging
• Using different parameter sets
Please see the documentation for additional details:
https://software.intel.com/en-us/mkl-vsnotes-independent-streams-leapfrogging-and-block-splitting
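To make the first two techniques concrete, here is a minimal sketch in plain C. It is not MKL code: it uses a hand-written multiplicative congruential generator with multiplier 13^13 and modulus 2^59, which I am assuming matches the MCG59 description in the VS notes, and the names `mcg_next`, `mcg_pow`, `mcg_skip` and `check_streams` are made up for illustration. With skip-ahead (block splitting), stream t gets a contiguous block starting at element t*BLOCK of the serial sequence; with leapfrogging, stream t gets elements t, t+NSTREAMS, t+2*NSTREAMS, and so on.

```c
#include <assert.h>
#include <stdint.h>

#define MCG_A    302875106592253ULL   /* 13^13, assumed MCG59 multiplier */
#define MCG_MASK ((1ULL << 59) - 1)   /* masking reduces modulo 2^59 */

/* One step of the toy generator: x -> A*x mod 2^59. */
uint64_t mcg_next(uint64_t x) { return (MCG_A * x) & MCG_MASK; }

/* A^k mod 2^59 by square-and-multiply, O(log k) multiplications. */
uint64_t mcg_pow(uint64_t k) {
    uint64_t result = 1, base = MCG_A;
    while (k) {
        if (k & 1) result = (result * base) & MCG_MASK;
        base = (base * base) & MCG_MASK;
        k >>= 1;
    }
    return result;
}

/* Skip-ahead: the state reached from x after k steps, without taking them. */
uint64_t mcg_skip(uint64_t x, uint64_t k) { return (mcg_pow(k) * x) & MCG_MASK; }

/* Returns 1 if the block-split and leapfrogged streams both reproduce the
   serial sequence started from `seed`, and 0 otherwise. */
int check_streams(uint64_t seed) {
    enum { NSTREAMS = 4, BLOCK = 1000 };
    uint64_t ref[NSTREAMS * BLOCK];
    uint64_t x = seed & MCG_MASK;
    int ok = 1;

    assert(seed & 1);   /* a power-of-two modulus needs an odd seed */

    /* Reference: the serial sequence, one element at a time. */
    for (int i = 0; i < NSTREAMS * BLOCK; i++) { ref[i] = x; x = mcg_next(x); }

    const uint64_t stride = mcg_pow(NSTREAMS);   /* leapfrog multiplier A^NSTREAMS */
    for (int t = 0; t < NSTREAMS; t++) {
        uint64_t s = mcg_skip(seed, (uint64_t)t * BLOCK);   /* block splitting */
        uint64_t l = mcg_skip(seed, (uint64_t)t);           /* leapfrog start  */
        for (int j = 0; j < BLOCK; j++) {
            if (s != ref[t * BLOCK + j])    ok = 0;
            if (l != ref[t + j * NSTREAMS]) ok = 0;
            s = mcg_next(s);
            l = (stride * l) & MCG_MASK;
        }
    }
    return ok;
}
```

Calling `check_streams(777)` from a `main` should return 1: each sub-stream is a slice of one underlying sequence, so the streams never overlap as long as no stream runs past its slice. In MKL the corresponding service routines are `vslSkipAheadStream` and `vslLeapfrogStream`.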
The choice of generator and of parallelization technique is driven by the requirements of the application.
Please let us know if this answers your question.
Best regards,
Pavel.
Hi,
I am trying to implement a parallel Monte Carlo simulation using the MKL random number generators.
My test program is shown below.
The program runs, but it is incredibly slow. (It is about 10 times slower than the single-threaded version obtained by commenting out the #pragma line.)
I suspect that the way I initialize the RNG is wrong, but I could not work out the correct use of the initialization functions.
Can anyone give me advice on how to improve this?
Also, I have no idea how large "nskip" should be. I could not find a suitable value in Intel's documents, including vslnotes.pdf, because they only show pseudocode examples and do not mention the actual values to set.
Thanks a lot in advance.
#include <stdio.h>
#include <stdlib.h>
#include "mkl_vsl.h"
#define N 1000*1000
#define M 10
int main()
{
    double *r; /* buffer for random numbers */
    double s; /* average */
    VSLStreamStatePtr stream;
    int i, j, status, nskip;
    nskip = 2*N;
    r = (double*) malloc(N*sizeof(double));
    #pragma omp parallel for
    for( i=0; i<M; i++ ) {
        status = vslNewStream( &stream, VSL_BRNG_MCG59, 777 );
        status = vslSkipAheadStream( stream, i*nskip );
        vdRngGaussian( VSL_RNG_METHOD_GAUSSIAN_ICDF, stream, N, r, 0.0, 1.0 );
        s = 0.0;
        for ( j=0; j<N; j++ ) s += r[j];
        printf("Sum is %.4lf\n", s);
        vslDeleteStream( &stream );
    }
    free(r);
    return 0;
}
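A likely cause of the slowdown, and of unreliable sums, is that `r`, `s`, `stream` and `j` are declared outside the loop, so OpenMP shares them between all threads. Below is a minimal sketch of a layout in which every iteration owns its buffer, its accumulator and its starting offset. It is plain C so that it builds without MKL: `toy_value` and `fill_block` are hypothetical stand-ins for the `vslNewStream` / `vslSkipAheadStream` / `vdRngGaussian` sequence, they produce uniform rather than Gaussian values, and N is reduced for a quick run. On nskip: it needs to be at least the number of underlying values each block consumes. I am assuming the ICDF method uses one uniform per output, in which case nskip = N is enough, and M*nskip also has to stay below the generator's period.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define N 100000   /* numbers per block (smaller than in the post) */
#define M 10       /* number of independent blocks */

/* Toy counter-based generator (SplitMix64-style mixing): element k of the
   sequence for a given seed, mapped to [0,1). A stand-in, not an MKL routine. */
double toy_value(uint64_t seed, uint64_t k) {
    uint64_t z = seed + (k + 1) * 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    return (double)(z >> 11) / 9007199254740992.0;   /* divide by 2^53 */
}

/* Stand-in for "new stream + skip ahead by offset + generate n numbers". */
void fill_block(uint64_t seed, uint64_t offset, int n, double *buf) {
    for (int j = 0; j < n; j++) buf[j] = toy_value(seed, offset + (uint64_t)j);
}

/* Computes one sum per block into sums[0..M-1]; returns the number of
   blocks that failed to allocate their buffer. */
int block_sums(uint64_t seed, double *sums) {
    const uint64_t nskip = N;   /* assumed: one underlying value per output */
    int failed = 0;

    assert(sums != NULL);

    #pragma omp parallel for reduction(+:failed)
    for (int i = 0; i < M; i++) {
        /* Declared inside the loop body, so each iteration has its own copy. */
        double *r = malloc(N * sizeof *r);
        if (r == NULL) { failed++; continue; }

        fill_block(seed, (uint64_t)i * nskip, N, r);

        double s = 0.0;
        for (int j = 0; j < N; j++) s += r[j];
        sums[i] = s;            /* distinct slot per iteration, no sharing */

        free(r);
    }
    return failed;
}
```

Because every block starts at a fixed offset, the sums do not depend on how many threads run the loop. In the MKL version the same structure would apply with the stream created, skipped ahead and deleted inside the loop body, as in the original program, but with `stream`, `r`, `s` and `j` declared there as well.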
