Solved: How to generate the same random sequence with random and random_intel

Gang_Z_ · 10-26-2017

Currently I have:

rg = numpy.random.mtrand.RandomState(1234)

ri = numpy.random_intel.RandomState(1234)

However, these two instances generate different sequences.

rg.get_state() returns a tuple, while ri.get_state() returns bytes.

How can I transfer rg's state to ri?

Could they generate the same random sequences?

Thanks!

Oleksandr_P_Intel · 10-27-2017

Hi,

The class instance rg implements the MT19937 algorithm, which generates a pseudo-random sequence of 32-bit unsigned integers. All samplers within that class use this sequence to generate random samples from their distributions.

The class instance ri also uses MT19937, as implemented in MKL. If identically initialized, both would generate the same stream of unsigned integers.

The paper by Matsumoto, in which the algorithm was proposed, gave two initialization algorithms: one that takes a single seed (implemented in numpy.random.seed, but absent in MKL and hence in numpy.random_intel.seed), and one that takes a vector of numbers, implemented in both numpy.random and numpy.random_intel.

In [1]: import numpy as np, numpy.random as vrng, numpy.random_intel as irng

In [2]: rg = vrng.RandomState([1234, 5678])

In [3]: ri = irng.RandomState([1234, 5678])

In [4]: rg.randint(0,2**32, size=10, dtype=np.uint32)
Out[4]: 
array([1880837566, 2718258354, 4147529071, 3731544169, 4197473601,
       1173020486, 1638125687, 3756842864,  909180233, 3754665251], dtype=uint32)

In [5]: ri.randint(0,2**32, size=10, dtype=np.uint32)
Out[5]: 
array([1880837566, 2718258354, 4147529071, 3731544169, 4197473601,
       1173020486, 1638125687, 3756842864,  909180233, 3754665251], dtype=uint32)

Initializing numpy.random_intel with a single integer seed simply runs the vector algorithm on a length-one vector containing that seed value:

In [6]: ri = irng.RandomState([1234])

In [7]: ri.randint(0,2**32, size=10, dtype=np.uint32)
Out[7]: 
array([4150886329, 3342196574, 1892932127,  501869158,   32175636,
        389311301, 3912611952, 4048155970, 4034129617, 3466048957], dtype=uint32)

In [8]: ri = irng.RandomState(1234)

In [9]: ri.randint(0,2**32, size=10, dtype=np.uint32)
Out[9]: 
array([4150886329, 3342196574, 1892932127,  501869158,   32175636,
        389311301, 3912611952, 4048155970, 4034129617, 3466048957], dtype=uint32)
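For comparison, here is a small sketch that needs only stock NumPy (no numpy.random_intel). It checks that in numpy.random the scalar seed 1234 and the vector seed [1234] go through different initializations and so give different streams. The helper name first_words is made up for illustration. If the explanation above holds, the vector-seeded stream is the one to compare against Out[9]; no expected output is listed here because it was not run as part of this thread.

```python
import numpy as np

def first_words(seed, n=10):
    """Return the first n raw uint32 outputs of a stock NumPy RandomState.

    Hypothetical helper, for illustration only.
    """
    rs = np.random.RandomState(seed)
    return rs.randint(0, 2**32, size=n, dtype=np.uint32)

scalar_seeded = first_words(1234)     # single-seed initialization
vector_seeded = first_words([1234])   # vector initialization, length-one vector

# In stock NumPy these two are different streams.
print(np.array_equal(scalar_seeded, vector_seeded))

# Candidate to compare with irng.RandomState(1234) on an Intel build.
print(vector_seeded)
```

So a practical way to line the two libraries up is probably to pass a list (even a one-element list) as the seed on both sides.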

Even with the same underlying pseudo-random sequence of uint32 values, numpy.random and numpy.random_intel use different algorithms to turn it into floating-point numbers, including doubles:

In [13]: rg = vrng.RandomState([1234, 5678])

In [14]: ri = irng.RandomState([1234, 5678])

In [15]: rg.rand(10)
Out[15]: 
array([ 0.43791662,  0.96567187,  0.97730048,  0.38140586,  0.21168502,
        0.36981215,  0.75081243,  0.99305566,  0.24318839,  0.09435813])

In [16]: ri.rand(10)
Out[16]: 
array([ 0.43791662,  0.63289384,  0.96567186,  0.86881783,  0.97730048,
        0.27311511,  0.38140586,  0.87470814,  0.21168502,  0.87420113])
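The two float outputs above are consistent with a simple reading. NumPy appears to build each double from two consecutive 32-bit words (the 53-bit recipe from the reference MT19937 code), while the random_intel values match one 32-bit word per double, scaled by 2**-32. The second mapping is only inferred from the numbers posted here, not taken from MKL documentation, and both function names are made up for illustration. A minimal sketch using stock NumPy and the ten uint32 values from Out[4]:

```python
import numpy as np

# The ten uint32 words posted in Out[4] for the vector seed [1234, 5678].
U32 = np.array([1880837566, 2718258354, 4147529071, 3731544169, 4197473601,
                1173020486, 1638125687, 3756842864,  909180233, 3754665251],
               dtype=np.uint32)

def doubles_two_words(u32):
    """One double per PAIR of words: 27 bits from the first, 26 from the second."""
    a = (u32[0::2] >> 5).astype(np.float64)   # top 27 bits of the first word
    b = (u32[1::2] >> 6).astype(np.float64)   # top 26 bits of the second word
    return (a * 67108864.0 + b) / 9007199254740992.0   # (a * 2**26 + b) / 2**53

def doubles_one_word(u32):
    """ASSUMPTION (inferred from Out[16]): one word per double, scaled by 2**-32."""
    return u32.astype(np.float64) / 4294967296.0

print(doubles_two_words(U32))   # 5 values, compare with Out[15]
print(doubles_one_word(U32))    # 10 values, compare with Out[16]
```

This would also explain why every other value in Out[16] nearly coincides with a value in Out[15]: both start from the same word, and the two-word recipe only uses the second word for the low-order bits.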

In Monte-Carlo-type computations this should not matter, as the value obtained by the algorithms should be interpreted as a realization of a random variable, and repeating the computation with a different seed is expected to produce a different value, another sample from the same distribution.
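To make the point above concrete, here is a toy sketch with stock NumPy only. It estimates E[max(Z, 0)] for a standard normal Z (exact value 1/sqrt(2*pi)) from two unrelated seeds. The quantity, the seeds, and the helper name estimate are arbitrary choices for illustration, not anything from the original question. The two estimates differ, but each should sit within a few standard errors of the exact value.

```python
import numpy as np

def estimate(seed, n=100_000):
    """Monte Carlo estimate of E[max(Z, 0)] and its standard error."""
    rs = np.random.RandomState(seed)
    payoff = np.maximum(rs.standard_normal(n), 0.0)
    return payoff.mean(), payoff.std(ddof=1) / np.sqrt(n)

exact = 1.0 / np.sqrt(2.0 * np.pi)    # E[max(Z, 0)] for Z ~ N(0, 1)

est_a, se_a = estimate([1234, 5678])  # one stream
est_b, se_b = estimate([8765, 4321])  # a different, unrelated stream

for name, est, se in (("a", est_a, se_a), ("b", est_b, se_b)):
    # Report the deviation from the exact value in units of the standard error.
    print(name, est, "+/-", se, "deviation in SE:", abs(est - exact) / se)
```

Switching from numpy.random to numpy.random_intel is, in this sense, comparable to switching seeds: the numbers change, the distribution of the estimate should not.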

Gang_Z_ · 10-29-2017

@Oleksandr P.

Thanks for your clear and detailed explanation. To summarize:

Intel does not implement single-seed initialization (it converts the seed to a one-element vector), so the result differs from NumPy's.
With the same vector seed, Intel is consistent with NumPy for integer random sequences.
The algorithms for generating float/double sequences differ between Intel and NumPy, so those results differ as well.

In my project, the randn call generates different float/double sequences. The application is risk computation (Monte Carlo method). Contrary to my expectation, with a different normally distributed random matrix, the results are different...

I think it really helps. Thank you again.