Random Numbers

JohnNichols · ‎09-17-2022

The sample program vdRngGaussian Example Program Text has a seed of 777. The program if executed on a Dell Precision running VS 2019 and the latest ONEAPI MKL returns for a sample of 10000

The first number is -4.95 -- the program is seeking a mean of 0 and a variance and hence SD of 1.

The probability that a 4.95 will occur in a set of 10000 random numbers with the mean and variance is 0.07% according to a standard EXCEL calculation.

The mean is 0.012, the variance is 1.005, the skew is -0.00792 and the kurt is -0.00055

The count about the mean is

-5	1
-4	8
-3	227
-2	1331
-1	3374
1	3408
2	1408
3	228
4	15

The results file is attached. I look at the statisitical properties of thermal vibration data, a typical set has 400000 FFT points, I look at the ranges, we very rarely, as in almost never, see a 4.95 standard deviations from the mean. We colour code the plots so if it came out we would see it and I would want to look at it closely.

I appreciate the limitations, but I still find this interesting.

JohnNichols · ‎09-18-2022

The first entry in the results matrix is determined directly from seed, using an equation that has a quadratic element, a sgn term and a ln term.

The seed can theoretically be back determined from the first entry, although for accuracy you would need the first two.

The second term is functionally related to the first term, I did not seek further. The FFT generates a similar result for the phase angle, although more random than this method.

In a Gaussian distribution the likely first term has a probability of being within one SD from the mean of 68% which suggests that the first term should lie in this range for most of the seed. The seed needs to be half the size of the largest integer(4) to be about 0.

No one in their right mind is going to use a seed of this size, it is just not a human number.

The example does not include implicit none.

About 1 in 1250 of the Gaussian sets described by the program are non-Guassian - they appear from a simple test to be about evenly spaced, I could be wrong as I am not going to investigate the several billion sets available to that level. Using your Gaussian test.

The numbers are not random in any sense if they can be recovered given the first two numbers to a reasonable level of accuracy. They are at best Gaussian most of the time, ok some of the time.

I have read the original Box Muller work from Princeton, I understand what they are doing, it is just not random, sometimes it is Gaussian depending on how you define Gaussian. The seed limits are interesting and they are probably ok for the 1970s, but one suspects that the huge data bases of today generate SD that is greater than six occasionally.

The SOUP program from Recreational Mathematics converged nicely with the BASIC random number, the program gets stuck using the Fortran Random number generator. This is an observation.

The seed and its impact should be described in your notes.

JohnNichols · ‎09-18-2022

!     ***** Sample moments *****
    sum=0.0
    sum2=0.0
    do i=1,n
        sum=sum+r(i)
        sum2=sum2+r(i)*r(i)
    end do
    sM=sum/n
    sD=sum2/n-sM*sM

    !     ***** Comparison of theoretical and sample moments *****
    tD2=tD*tD
    s=((tQ-tD2)/n)-(2*(tQ-2*tD2)/(n*n))+((tQ-3*tD2)/(n*n*n))
    DeltaM=(tM-sM)/sqrt(tD/n)
    DeltaD=(tD-sD)/sqrt(s)

    if (abs(DeltaM)>3.0 .OR. abs(DeltaD)>3.0) then
        print 12,"Error: sample moments (mean=",                        &
            &    sM,", variance=",sD,                                          &
            &    ") disagree with theory (mean=",                              &
            &    tM,", variance=",tD,")."
        !stop 1
    else
        print 12,"Sample moments (mean=",sM,                            &
            &    ", variance=",sD,") agree with theory (mean=",                &
            &    tM,", variance=",tD,")."
    end if

In practical terms your limit on Gaussian means a mean of 0.3 for a N(0,1) set is called non-Gaussian. Any idea why this number set was selected?

I am using a slightly older MKL example, although it appears consistent with your current one, other than a allocate statement and OMP.

From your notes:

--------------------------------------------------------------------------------------------------

Random Number Generator Functions (RNG)
Use common pseudorandom, quasi-random, and nondeterministic random number engines to solve continuous and discrete distributions.

---------------------------------------------------------------------------------------------------

From the dictionary nondeterministic = Non-predictive. Referring to the inability to objectively predict an outcome or result of a process due to lack of knowledge of a cause and effect relationship or the inability to know initial conditions.

----------------------------------------------------------------------------------------------------

I am having some trouble accepting this is an accurate description of your RNGs.


    Module MCRandom

    USE MKL_VSL_TYPE
    USE MKL_VSL
    implicit none


    INTEGER, PARAMETER :: dp = selected_real_kind(15, 307)


    contains

    SUBROUTINE seekRandom(n, r)

    implicit none

    integer n, NTB
    integer(kind=4) i,nn, j

    integer(kind=4) errcode

    real(kind=8) a,sigma
    real(kind=8) r(n)
    integer brng,method,seed

    real(kind=8) tM,tD,tQ,tD2
    real(kind=8) sM,sD
    real(kind=8) sum, sum2
    real(kind=8) s
    real(kind=8) DeltaM,DeltaD,ab, ac

    TYPE (VSL_STREAM_STATE) :: stream


    brng=VSL_BRNG_MCG31
    method=VSL_RNG_METHOD_GAUSSIAN_ICDF
    seed=1000000000

    a=0.0
    sigma=1.0

    !     ***** Initialize *****
    errcode=vslnewstream( stream, brng,  seed )
    call CheckVslError(errcode)

    !     ***** Call RNG *****
    errcode=vdrnggaussian( method, stream, n, r, a, sigma)
    call CheckVslError(errcode)

    !     ***** Theoretical moments *****
    tM=a
    tD=sigma*sigma
    tQ=720.0*sigma*sigma*sigma*sigma

    !     ***** Sample moments *****
    sum=0.0
    sum2=0.0
    do i=1,n
        sum=sum+r(i)
        sum2=sum2+r(i)*r(i)
    end do
    sM=sum/n
    sD=sum2/n-sM*sM

    !     ***** Comparison of theoretical and sample moments *****
    tD2=tD*tD
    s=((tQ-tD2)/n)-(2*(tQ-2*tD2)/(n*n))+((tQ-3*tD2)/(n*n*n))
    DeltaM=(tM-sM)/sqrt(tD/n)
    DeltaD=(tD-sD)/sqrt(s)

    if (abs(DeltaM)>3.0 .OR. abs(DeltaD)>3.0) then
        print 12,"Error: sample moments (mean=",                        &
            &    sM,", variance=",sD,                                          &
            &    ") disagree with theory (mean=",                              &
            &    tM,", variance=",tD,")."
        !stop 1
    else
        print 12,"Sample moments (mean=",sM,                            &
            &    ", variance=",sD,") agree with theory (mean=",                &
            &    tM,", variance=",tD,")."
    end if

    !     ***** Deinitialize *****
    errcode=vsldeletestream( stream )
    call CheckVslError(errcode)

10  format(F7.3)
11  format(A,F5.3)
14  format(A,I6)
12  format(A,F5.2,A,F5.2,A,F5.2,A,F5.2,A)


    RETURN
    END     SUBROUTINE seekRandom

    end module MCRandom

JohnNichols · ‎09-20-2022

Unless I am mistaken in your sample the equation on line 74 is in error.

!===============================================================================
! Copyright 2003-2022 Intel Corporation.
!
! This software and the related documents are Intel copyrighted  materials,  and
! your use of  them is  governed by the  express license  under which  they were
! provided to you (License).  Unless the License provides otherwise, you may not
! use, modify, copy, publish, distribute,  disclose or transmit this software or
! the related documents without Intel's prior written permission.
!
! This software and the related documents  are provided as  is,  with no express
! or implied  warranties,  other  than those  that are  expressly stated  in the
! License.
!===============================================================================

!  Content:
!    vdRngGaussian  Example Program Text
!*******************************************************************************

      include 'mkl_vsl.f90'
      include "errcheck.inc"

      program MKL_VSL_TEST

      USE MKL_VSL_TYPE
      USE MKL_VSL

      integer(kind=4) i,nn
      integer n
      integer(kind=4) errcode

      real(kind=8) a,sigma
      real(kind=8) r(1000)
      integer brng,method,seed

      real(kind=8) tM,tD,tQ,tD2
      real(kind=8) sM,sD
      real(kind=8) sum, sum2
      real(kind=8) s
      real(kind=8) DeltaM,DeltaD

      TYPE (VSL_STREAM_STATE) :: stream

      n=1000
      nn=10

      brng=VSL_BRNG_MCG31
      method=VSL_RNG_METHOD_GAUSSIAN_ICDF
      seed=777

      a=0.0
      sigma=1.0

!     ***** Initialize *****
      errcode=vslnewstream( stream, brng,  seed )
      call CheckVslError(errcode)

!     ***** Call RNG *****
      errcode=vdrnggaussian( method, stream, n, r, a, sigma)
      call CheckVslError(errcode)

!     ***** Theoretical moments *****
      tM=a
      tD=sigma*sigma
      tQ=720.0*sigma*sigma*sigma*sigma

!     ***** Sample moments *****
      sum=0.0
      sum2=0.0
      do i=1,n
        sum=sum+r(i)
        sum2=sum2+r(i)*r(i)
      end do
      sM=sum/n
      sD=sum2/n-sM*sM

JohnNichols · ‎09-20-2022

!     ***** Comparison of theoretical and sample moments *****
        tD2=tD*tD
        s=((tQ-tD2)/n)-(2*(tQ-2*tD2)/(n*n))+((tQ-3*tD2)/(n*n*n))
        DeltaM=(tM-sM)/sqrt(tD/n)
        DeltaD=(tD-sD)/sqrt(s)

The equation on line 4 is a single sample t test to check if the means are the same, Intel postulates that the mean and the variance are the same for the analysis, so one would prefer to use the two sample test as Intel specifies, N(0,1). The two results are related by the number -0.71, but the second alternative provides a way to directly use the t Test table, which for a reasonable number of samples yields 2.

A small explanation would help as well.

JohnNichols · ‎09-20-2022

A suggestion that adds the skew and the t test for your assumptions of N(0,1).

 !===============================================================================
    ! Copyright 2003-2019 Intel Corporation.
    !
    ! This software and the related documents are Intel copyrighted  materials,  and
    ! your use of  them is  governed by the  express license  under which  they were
    ! provided to you (License).  Unless the License provides otherwise, you may not
    ! use, modify, copy, publish, distribute,  disclose or transmit this software or
    ! the related documents without Intel's prior written permission.
    !
    ! This software and the related documents  are provided as  is,  with no express
    ! or implied  warranties,  other  than those  that are  expressly stated  in the
    ! License.
    !===============================================================================

    !  Content:
    !    vsRngGaussian  Example Program Text
    !*******************************************************************************

    subroutine MKL_VSL_TEST(mul,r,a,sigma,n)

    USE MKL_VSL_TYPE
    USE MKL_VSL

    integer(kind=4) i,nn,j,mul
    integer(4) n
    integer(kind=4) errcode

    real(kind=4) a,sigma
    real(kind=4) r(n)
    integer(4) brng,method,seed

    real(kind=8) tM,tD,tQ,tD2
    real(kind=8) sM,sD,sS
    real(kind=8) sum, sum2, sum3
    real(kind=8) s, ttest, dum
    real(kind=8) DeltaM,DeltaD


    TYPE (VSL_STREAM_STATE) :: stream

    nn=10

    do j = 1, 1

        brng =   VSL_BRNG_MCG31
       ! method = VSL_RNG_METHOD_GAUSSIAN_ICDF 
        method = VSL_RNG_METHOD_GAUSSIAN_BOXMULLER

        seed=2000*(100-mul) + 100000000*j*mul

        !a=0.0

        !     ***** Initialize *****
        errcode=vslnewstream( stream, brng,  seed )
        call CheckVslError(errcode)

        !     ***** Call RNG *****
        errcode=vsrnggaussian( method, stream, n, r, a, sigma)
        call CheckVslError(errcode)

        !     ***** Theoretical moments *****
        tM=a
        tD=sigma*sigma
        tQ=720.0*sigma*sigma*sigma*sigma

        !     ***** Sample moments *****
        sum=0.0
        sum2=0.0
        sum3 = 0.0
        do i=1,n
            sum=sum+r(i)
            sum2=sum2+r(i)*r(i)
            sum3=sum3+r(i)*r(i)*r(i)
        end do
        sM=sum/n
        sD=sum2/(n-1)
        sS = sum3/((n-1)*sD*sqrt(sD))

        !     ***** Comparison of theoretical and sample moments *****
        tD2=tD*tD
        s=((tQ-tD2)/n)-(2*(tQ-2*tD2)/(n*n))+((tQ-3*tD2)/(n*n*n))
        DeltaM=(tM-sM)/sqrt(tD/n)
        DeltaD=(tD-sD)/sqrt(s)
        
        dum = 2.0/float(n)
        ttest = (sM - tM)/(sqrt(dum)*sqrt((sD + tD)/2.0))

        !     ***** Printing results *****
        print *,"           Method :: vsRng Gaussian Box Mueller."
        print *,"           ------------------------------------"
        print *,"           seed=2000*(100-mul) + 100000000*j*mul"
        print *,""
        print *,"           Parameters:"
        print 11,"            a        = ",a
        print 11,"            sigma    = ",sigma
        print 11,"            Variance = ",sigma*sigma
        print 15,"            Mult     = ",mul
        print 15,"            seed     = ",seed
        
        write(2,221)a,sigma, sigma*sigma,mul,seed
221     format("           Method :: vsRng Gaussian Box Mueller.",/,&
               "           -------------------------------------",/,&    
               "           seed=2000*(100-mul) + 100000000*j*mul",//,&
               "           Parameters:",/,&
               "            a        = ",f12.3,/,&
               "            sigma    = ",f12.3,/,&
               "            Variance = ",f12.3,/,&
               "            Mult     = ",I12,/,&
               "            seed     = ",I12,/)
               

        !print *,""
       ! print *,"Results (first 10 of 1000):"
       ! print *,"---------------------------"
       ! do i=1,nn
         !   print 10,i,r(i)
        !end do

      !  print *,""
        if (abs(DeltaM)>3.0 .OR. abs(DeltaD)>3.0) then
            print 12,"Error: sample moments (mean=",                        &
                &    sM,", variance=",sD,                                          &
                &    ") disagree with theory (mean=",                              &
                &    tM,", variance=",tD,")."
            stop 1
        else
           
            write(2,223)sM,sD,tM,tD,sS,ttest,DeltaM,DeltaD
            write(*,223)sM,sD,tM,tD,sS,ttest,DeltaM,DeltaD,(ttest/DeltaM)
223         Format(/,"            Sample moments are (mean=",F10.3, ", variance=",F10.3,")",/"            agree with theory  (mean=",F10.3,", variance=",F10.3,").",/,"            Skew ",F10.3,"  t Test :: ",F10.3,/,"                  ",3(F10.3,3x)/,"           --------------------------------")
        end if

        !     ***** Deinitialize *****
        errcode=vsldeletestream( stream )
        call CheckVslError(errcode)
    end do

10  format(i6,2x,F7.3)
11  format(A,2x,F12.3)
12  format(A,F8.2,A,F8.2,A,F8.2,A,F8.2,A,F8.2,3x,F8.2)
15  format(A,2x,I12)
    return
    end

Sets a reasonable seed value each time. Well mostly.


     !---------------------------------------------------------------------------
    !
    !
    !
    !---------------------------------------------------------------------------

    subroutine get_seed(mul)
    ! Program to demonstrate GETDAT and GETTIM
    USE IFPORT
    integer(4) mul
    INTEGER(4) tmpday, tmpmonth, tmpyear
    INTEGER(4) tmphour, tmpminute, tmpsecond, tmphund
    CHARACTER(1) mer
    CALL GETDAT(tmpyear, tmpmonth, tmpday)
    CALL GETTIM(tmphour, tmpminute, tmpsecond, tmphund)
    IF (tmphour .GT. 12) THEN
        mer = 'p'
        tmphour = tmphour - 12
    ELSE
        mer = 'a'
    END IF
    WRITE (*, 900) tmpmonth, tmpday, tmpyear
    WRITE (2, 900) tmpmonth, tmpday, tmpyear
900 FORMAT("            Date :: ",I2, '/', I2.2, '/', I4.4)
    WRITE (*, 901) tmphour, tmpminute, tmpsecond, tmphund, mer
    WRITE (2, 901) tmphour, tmpminute, tmpsecond, tmphund, mer
901 FORMAT("            Time :: ",I2, ':', I2.2, ':', I2.2, ':', I2.2, ' ', A, 'm',/"            Seed value determined from time!",/)
    mul = tmphund * rand(0)

    return
    END

ShanmukhS_Intel · ‎09-21-2022

Hi John Nichols,

Thank you for your feedback and posting on Intel communities and for elaborating on your details.

We have provided your feedback to the relevant team. At this moment there is no visibility when it will be implemented and available for use. Will get back to you with an update. Meanwhile, Could you please confirm the software version of oneMKL being used by you? We are discussing the details mentioned internally.

Best Regards,

Shanmukh.SS

Pavel_D_Intel1 · ‎09-22-2022

Hi John,

Thanks a lot for the great questions and suggestions.

I completely agree with your statement that the first value deviates from the rest of the generated sequence.

The reason is that MCG31 engine is used in the current example (it is a known behavior for it).

To mitigate this effect I suggest you to change the engine to the another one (e.g. VSL_BRNG_PHILOX4X32X10 / VSL_BRNG_MT19937 / other) or to change the seed (as you have already done).

I don't understand a little bit the question about nondeterministic engine. Nondeterministic engine is RDRAND() based and produces nonrepeatable from run to run results. It makes impossible to use this engine in Monte Carlo simulations but it can be used in any other applications where such behavior is required. Note: to use nondeterministic engine you just need to change VSL_BRNG_MCG31 to VSL_BRNG_NONDETERM.

Regarding adding skewness and t-test - thank you! We will investigate this question more deeply. Note: this code is just an example that demonstrates how to use APIs of Gaussian distribution. But anyway thank you for the insights how we can improve it!

And regarding the last suggestion about adding the function to generate the seed value: the main aim of this code is to show users how to call Gaussian distribution (including Monte Carlo simulations use-case) so that the way with the constant seed and reproducible from run to run results was selected for this example.

Best regards,
Pavel

ShanmukhS_Intel · ‎09-29-2022

Hi,

Reminder:

Has the information provided helped? Kindly let us know if this resolves your issue.

Best Regards,

Shanmukh.SS

ShanmukhS_Intel · ‎11-03-2022

Hi John,

As we have provided your feedback to the relevant team, we are closing this thread for now. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.

Best Regards,

Shanmukh.SS