Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

random numbers with vdrnguniform

Marios_G_
Beginner
1,253 Views

Hello,

I want to generate random number with the SFMT19937 generator. Everything works fine as long as I store the random numbers to a vector, e.g

.... 
      brng=VSL_BRNG_SFMT19937

      method=VSL_RNG_METHOD_UNIFORM_STD
      seed=1777

      errcode=vslnewstream( stream, brng,  seed )

      do j = 1,1000
       
         errcode=vdrnguniform( method, stream, n, r, lb, ub )
         write(1,*) r(j)
       end do

with this I fill r(1000) with double real random numbers, since the argument r in

vdrnguniform

is a vector.

Is there a way just to generate them without storing them to a vector? 

I want to generate several trillions of random numbers for a monte-carlo simulation.

Storing them to an array doesn't seem like a good idea....

0 Kudos
1 Solution
Andrey_N_Intel
Employee
1,253 Views

Hi Marios,

In large scale Monte Carlo simulations that require significant amount of random numbers you generally do not need to store all of them in memory. As Mecej4 mentions above, random numbers can be generated as blocks of the fixed size, and generation of random numbers can be followed by their immediate processing  as shown below

do i = 1,Nexperiments
! size of array n can be aligned with size of (L1) cache of your CPU
errcode=vdrnguniform( method, stream, n, r, 0.0, 1.0 ) 

! process random numbers

end do

This blocked approach will help you to improve data locality and sets the base for parallelization. On the other hand, we do not recommend to call Intel MKL RNGs on vector size 1 as in this case you would likely to see degradation of your application, please see the chart available at https://software.intel.com/sites/products/documentation/doclib/mkl_sa/111/vsl/functions/uniform_cont.htm. Depending on type of basic random number generator, it makes sense to set the vector size to at least a few hundred elements.

Please, let me know, if it answers your question, or you need more details/clarification.

Thanks,

Andrey

 

View solution in original post

0 Kudos
16 Replies
mecej4
Honored Contributor III
1,254 Views

There is no requirement that the fourth argument to vdrnguniform, a, be a vector. If you enable interface checking, and you wish to retrieve a single random number with a call to the generator, declare the array as a(1). Alternatively, leave out the USE MKL_VSL statement, to bypass the argument type checking.

You have a great deal of flexibility in whether to generate batches of random numbers into a buffer, to be used and replenished as required, or to generate a single random number with each call.

0 Kudos
Marios_G_
Beginner
1,254 Views

If I declare the array as r(1) then the following code:

s=0.0d0 
n=20000
 do j = 1, n
              
  errcode=vdrnguniform( method, stream, n, r, lb, ub )
               s = s + r(1)
                write(1,*) r(1)

    end do

produces an access violation run-time error. Specifically I get: forrtl: severe (157): Program Exception - access violation

0 Kudos
Marios_G_
Beginner
1,254 Views

If I leave out the USE MKL_VSL statement I get compilation warnings:

\Source1.f90(118): warning #6717: This name has not been given an explicit type.   [VSLNEWSTREAM]
\Source1.f90(127): warning #6717: This name has not been given an explicit type.   [VDRNGUNIFORM]
\Source1.f90(139): warning #6717: This name has not been given an explicit type.   [VSLSAVESTREAMF]
\Source1.f90(143): warning #6717: This name has not been given an explicit type.   [VSLDELETESTREAM]

and the code:

              
              do j = 1, n        
               errcode=vdrnguniform( method, stream, n, r1, lb, ub )
               s = s + r1
                write(1,*) r1  
              end do

where r1 this time is just a double real variable again produces forrtl: severe (157): Program Exception - access violation at run-time

 

0 Kudos
Andrey_N_Intel
Employee
1,254 Views

Hi Marios,

In large scale Monte Carlo simulations that require significant amount of random numbers you generally do not need to store all of them in memory. As Mecej4 mentions above, random numbers can be generated as blocks of the fixed size, and generation of random numbers can be followed by their immediate processing  as shown below

do i = 1,Nexperiments
! size of array n can be aligned with size of (L1) cache of your CPU
errcode=vdrnguniform( method, stream, n, r, 0.0, 1.0 ) 

! process random numbers

end do

This blocked approach will help you to improve data locality and sets the base for parallelization. On the other hand, we do not recommend to call Intel MKL RNGs on vector size 1 as in this case you would likely to see degradation of your application, please see the chart available at https://software.intel.com/sites/products/documentation/doclib/mkl_sa/111/vsl/functions/uniform_cont.htm. Depending on type of basic random number generator, it makes sense to set the vector size to at least a few hundred elements.

Please, let me know, if it answers your question, or you need more details/clarification.

Thanks,

Andrey

 

0 Kudos
mecej4
Honored Contributor III
1,254 Views

It may help, Marios, to see a short example that illustrates the ideas mentioned in this thread.

The example generates 100 blocks of 100 random numbers each, and computes the mean of the 10,000 random numbers generated.

0 Kudos
Marios_G_
Beginner
1,254 Views

Dear Andrey,

 

thanks for your reply. I solved the problem during the weekend doing the following:

...
do k = 1, mcsweeps
           errcode=vdrnguniform( method, stream, size, r, lb, ub )
     
           do j = 1, NY
           do i = 1, NX
               s = s + r((j-1)*NX+i)
                write(1,*) r((j-1)*NX+i)

           end do
           end do
      end do 

...

where r(NX*NY) the size of the lattice. This way I get random numbers for all the lattice sites and all the mcsweeps (which is  Nexperiments in your example) without having to store them in an array of NX*NY*mcsweeps elements. 

Actually you suggest the same thing if I understand correctly, right?

Best Regards,

Marios

 

0 Kudos
Andrey_N_Intel
Employee
1,254 Views

Hi Marious,

yes, I suggest the same approach. However, If NX x NY is big enough, say, has order of million(s), it would make sense to make the generation more granular, that is to generate the same amount of numbers NX x NY in chunks of the smaller size and apply processing to each chunk.

While I' not aware about all the details of your application, the loop over k appears to be the candidate for parallelization. Do you consider the option to parallelize that loop using OpenMP or other technology? If so, another BRNG such as MT2203 could be used in this case.

Thanks,

Andrey

0 Kudos
Marios_G_
Beginner
1,254 Views

Hi Andrey,

 

NX*NY is the lattice size so it's not big (less than 100x100). The code above is an example, actually it's more like that:

do k = 1, mcsweeps
	           errcode=vdrnguniform( method, stream, size, r, lb, ub )
	      
	           do j = 1, NY
	           do i = 1, NX
	        
!                   calculate the spin -flip probability U
                     ......
                      if (r((j-1)*NX+i) < U) S(i,j)=-1.0d0
                      else 
                       S(i,j)=1.0d0
	              end if  
	           end do
	           end do

	      end do


I am worried about the summation between the candidate spin and it's nearest neighbors in order to parallelize the double loop over NX, NY. 

If I manage it do I have to use another RNG? Doesn't each thread just pick an element of r to do it's job? Or each thread/core needs it's own random number sequence?

Thanks again,

Marios

 

0 Kudos
Andrey_N_Intel
Employee
1,254 Views

Hi Marios,

If iterations of the outer loop over mcsweeps depend on each other in a simple way (for example, say, you need summation of the results obtained for each iteration k), you could do the blocks of iterations in parallel, calculate an save the intermediate results into additional memory, and then  combine them into the final result in the serial part of the function, after the parallel loop is done. Amount of the additional memory is defined by the number of threads used in the code.

To run parallel Monte Carlo simulations you would need one of MKL basic random number generators suitable for goals of your application. RNGs in Intel MKL support parallel simulations via several techniques such as leap-frog, skip-ahead, and a family of BRNGs; each thread would produce  it's own sequence. High level overview of the techniques is available in VSL Training materials at https://software.intel.com/en-us/articles/intel-mkl-vsl-training-material, slides 9-13. Additional details are available in VSL RNG Notes at https://software.intel.com/en-us/mkl_11.1_vslnotes, section "Random Streams and RNGs in parallel computations"; for each basic random number generator the document describes supported parallelization techniques.

Please, let me know, if you need more details.

Thanks,

Andrey

0 Kudos
Marios_G_
Beginner
1,254 Views

hello again,

I am trying to use the SFMT19937 RNG in a program that I run in a cluster but I found out that the MKL version the cluster has installed has only the mkl_vsl.f77 version not the f90.

Where can I find documentation on how to generate random numbers using the f77 version of mkl_vsl?

This is really frustrating...

0 Kudos
mecej4
Honored Contributor III
1,254 Views

Try the following: create a file containing these lines:

      module mkl_vsl
      include 'mkl_vsl.f77'
      end module mkl_vsl

Compile this source file to produce the file mkl_vsl.mod. Then, add "USE mkl_vsl" to your source code from which you wish to callMKL VSL routines.

Whether this suggestion will work will depend on the version of MKL on your cluster.

0 Kudos
Marios_G_
Beginner
1,254 Views

I modified the first lines of the code as:

PROGRAM test

 USE IFPORT
   
IMPLICIT NONE

  include 'mkl_vsl.f77'  
 
...

and it compiled and run correctly on my pc but when I tried it at the cluster I got:

 

 ifort  -O3 -warn all -xSSE4.2 -parallel -par-report1 -openmp -openmp-report1 -vec-report1 -heap-arrays 0 -mkl -o run.out Long2.F90


/opt/intel/Compiler/11.1/072/mkl/include/mkl_vsl.f77(295): error #6404: This name does not have a type, and must have an explicit type.   [VSL_USER_INIT_DIRECTION_NUMBERS]
      PARAMETER (VSL_USER_INIT_DIRECTION_NUMBERS = 2)
-----------------^
Long2.F90(48): error #6457: This derived type name has not been declared.   [VSL_STREAM_STATE]
 TYPE (VSL_STREAM_STATE) :: stream
-------^
Long2.F90(109): error #6404: This name does not have a type, and must have an explicit type.   [VSL_BRNG_SFMT19937]
    brng=VSL_BRNG_SFMT19937
---------^
Long2.F90(110): error #6404: This name does not have a type, and must have an explicit type.   [VSL_RNG_METHOD_UNIFORM_STD]
    method=VSL_RNG_METHOD_UNIFORM_STD
-----------^
Long2.F90(151): error #6404: This name does not have a type, and must have an explicit type.   [STREAM]
    errcode=vslnewstream( stream, brng,  seed1 )   ! new stream for the RNG
--------------------------^
/opt/intel/Compiler/11.1/072/mkl/include/mkl_vsl.f77(201): remark #7712: This variable has not been used.   [VSL_BRNG_SHIFT]
      INTEGER*4 VSL_BRNG_SHIFT
----------------^

The good news is that this time the linking to mkl_vsl.f77 is done but,

it seems that  the compiler still doesn't "understand" the mkl_vsl.f77

I still don't understand how it's possible that the cluster's mkl version doesn't have the mkl_vsl.f90. As I see from the first error:

/opt/intel/Compiler/11.1/072/mkl/include/mkl_vsl.f77(295): error #6404: This name does not have a type, and must have an explicit type.   [VSL_USER_INIT_DIRECTION_NUMBERS]
      PARAMETER (VSL_USER_INIT_DIRECTION_NUMBERS = 2)

the compiler version is 11.1so how old can the mkl version be?

Any ideas? I am really frustrated with this situation

 

 

 

0 Kudos
Marios_G_
Beginner
1,254 Views

hi mecej4,

I also tried your suggestion:

PROGRAM HEATBATH

  USE mkl_vsl
 
IMPLICIT NONE

 ....

I compiled first the file containing the module mkl_vsl

 ifort -c -O3  -parallel -par-report1 -mkl filemod.f90

and then:

ifort ewald.o filemod.o -O3 -warn all -xSSE4.2 -parallel -par-report1 -openmp -openmp-report1 -vec-report1 -heap-arrays 0 -mkl -o run.out Long2.F90

and I got almost the same as in my comment above:

Long2.F90(48): error #6457: This derived type name has not been declared.   [VSL_STREAM_STATE]
 TYPE (VSL_STREAM_STATE) :: stream
-------^
Long2.F90(109): error #6404: This name does not have a type, and must have an explicit type.   [VSL_BRNG_SFMT19937]
    brng=VSL_BRNG_SFMT19937
---------^
Long2.F90(110): error #6404: This name does not have a type, and must have an explicit type.   [VSL_RNG_METHOD_UNIFORM_STD]
    method=VSL_RNG_METHOD_UNIFORM_STD
-----------^
Long2.F90(151): error #6404: This name does not have a type, and must have an explicit type.   [STREAM]
    errcode=vslnewstream( stream, brng,  seed1 )   ! new stream for the RNG
--------------------------^

Any ideas?

 

0 Kudos
mecej4
Honored Contributor III
1,254 Views

I do not see a definition of VSL_STREAM_STATE in mkl_vsl.f77 under IFort 11.1.070-Windows. Same problem with IFort 2011 SP1 on Windows.

I do see the definition in the version of mkl_vsl.f77 that came with IFort 2013 and later versions.

I think, therefore, that you need to find out the latest version of MKL available on the cluster and write programs that are compatible with that version, unless you can ask the managers of the cluster to update to a more recent version of MKL.

 

0 Kudos
Marios_G_
Beginner
1,254 Views

thanks mecej4,

 

the cluster uses unix (linux) so I guess the same problem exists there too.

if that's the case where can I find documentation on how to generate properly random numbers using ifort 11.1.070?

 

0 Kudos
mecej4
Honored Contributor III
1,254 Views

You will need to ascertain the version of MKL on your cluster. It is probably 10.2 Update 7, but I am not sure -- especially if your cluster uses Itanium CPUs.

You can then search for the MKL documentation on the Intel site, but you may find it hard to locate older documentation. The documentation may be stored on the cluster login nodes or a NFS fileshare. You can also find older manuals at third party sites, such as http://scc.qibebt.cas.cn/docs/compiler/intel/11.1/Intel%20MKL/mklman.pdf .

 

0 Kudos
Reply