- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

I would expect it to compute 2.0000000000000000

as does the by hand implementation below executing on the same data in the same place.

--------

double testMean = 0.0;

for ( UINT idx = 0; idx < m_data.size(); idx++ ) {

testMean += m_data[idx];

}

testMean /= m_data.size();

Link Copied

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

MKL Summary Statistics algorithm for computation of mean is based onone-pass/on-line schemeM(n+1) = 1/(n+1) * (n * M(n) + X(n+1)), where X(n) - n-th observation, M(n) - mean estimate for dataset X(1),..., X(n),... Generally, the parameters in this formula are not exactly represented in FP arithmetic (like coefficient 1/(n+1) = 1/3 for n=2),thus, you see the result which differs from the ideal one by ~10^(-15) (what isaligned with double FP error).On the other hand, the intention behind the algorithm - have more stable resultsfor various types of the datasets, not just the the absolutely exact for thissimple dataset.

We also need to pay attention to the following important fact: in some (not all) applications statistical error dominates numerical error; if we estimate the mean fora realistic dataset (which contains non constant "signal"), stat errorwould have theorder ~1/sqrt(500)=~0.044 whichmakesnumerical error 10^(-15) "absolutely negligible".

Does itanswer your question?

Thanks,

Andrey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hi Andrey,

Look, here is a modifiedtest for '**float**' ( single-precision )and '**double**' ( double-precision )data types:

/////////////////////////////////////////////////////////////////////////////////

float fData[500] = { 0.0f };

double dData[500] = { 0.0L };

float fTestMean = 0.0f;

double dTestMean = 0.0L;

UINT idx;

for( idx = 0; idx < 500; idx++ )

{

fData[idx] = 2.0f;

dData[idx] = 2.0L;

}

for( idx = 0; idx < 500; idx++ )

{

fTestMean += fData[idx];

}

fTestMean /= ( float )500;

for( idx = 0; idx < 500; idx++ )

{

dTestMean += dData[idx];

}

dTestMean /= ( double )500;

printf( "Mean for 'Float test-case' = %.32f\n", fTestMean );

printf( "Mean for 'Double test-case' = %.32f\n", dTestMean );

/////////////////////////////////////////////////////////////////////////////////

Mean for 'Float test-case' = 2.00000000000000000000000000000000

Mean for 'Double test-case' = 2.00000000000000000000000000000000

/////////////////////////////////////////////////////////////////////////////////

As you see resultsfrom **single-precision** ( float \24-bit ) test-case and **double-precision** ( double \53-bit ) test-case are identical! I'm absolutely not convinced with your explanation:

>>...~10^(-15) (**what isaligned with double FP error**)...

There are no miracles in my test-casebecause in **IEEE 754 standard** number **2.0** hasexact binary representations:**For single-precision:**

**2**(Base10) =

**0x40000000**(Base16) =>

**0 10000000 00000000000000000000000**(Base2\IEEE754)

and

**For double-precision:**

**2**(Base10) =

**0x4000000000000000**(Base16) =>

**0 10000000000 0000000000000000000000000000000000000000000000000000**(Base2\IEEE754)

Best regards,

Sergey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Thanks for your inputs.

My comments above are related not to the straightforward summation followed by division but to the one-pass scheme for computation of the mean M(n+1) = 1/(n+1) * (n * M(n) + X(n+1)), n=1,...,N(N=500) which is a different algorithm and whichgenerally can return the result slightly different from 2.0 in that particular case; this is what Scott sees with MKL Summary Stats.

In your test case you just accumulate sum of '2' (which has exactrepresentation)and then normalize it by N=500. The resulthas the exact representation until the number of summands is less or equal to the specifc threshold defined by data type - for single precision it is equal to 2^24. Once you exceed the threshold the estimate for the mean should degrade: if N=2^25 and the computations are done in single precision, the output of summation followed by the division is 1.0 and outputof the one pass algorithm is ~2.0.

Does it explain the difference?

Thanks,

Andrey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Sergey, you seem to have overlooked the main point of Andrey's explanation.

The MKL algorithm, as he explained it, is a **one-pass** algorithm for computing not only the mean but other statistics such as the standard deviation.

Another way of looking at the MKL calculation of the mean is as an **update** algorithm for use with streaming data. As new data become available, the statistics are updated. As a side effect of how this calculation is done, as early as after taking in the third data point, the result is no longer exact, since 1/3 is not exact in floating point.

For real data, again as Andrey explained, the errors arising from the finite precision of floating point are negligible compared to the noise in the data. Therefore, the failure of MKL to deliver an exact result for a case where a different algorithm can yield an exact result is not a significant deficiency.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Thanks for your explanation. I understand that the discrepancy is unavoidable for the single pass method.

Of course, when you display results for special cases like this to unsophisticated users you need to give them what they expect. In this case, formatting the result to 14 decimal places does the job.

The reason why I noticed is that the slight discrepancy in the mean messed up the workaround code described in my previous thread for the special case of constant data.

This code below results in skewness_est: -1.#IND and kurtosis: kurtosis_est = 21530895567645716.

However, for this case both skewness_est and kurtosis_est should be: -1.#IND

And, this is in fact the result when I replace the mean_est with the exact mean which gives data of all zeros

None of this is a real problem because I can just interpret huge values of kurtosis as an error return.

thanks for your help, Scott

----------------------------

if ( cen2mom == cen2mom && (cen2mom < 1.0e-14 || abs(trueMean) > 1.0e8 * cen2mom) ) {

vector

for ( int idx = 0; idx < m_count; idx++ ) {

data.push_back( m_data[idx] - trueMean );

}

status = vsldSSNewTask( &task, &numCols, &numRows, &xstorage, &(data[0]), 0, indices );

// Setup moment computation. Computing kurtosis requires all the other moments.

status = vsldSSEditTask(task, VSL_SS_ED_SKEWNESS, &skewness_est);

status = vsldSSEditTask(task, VSL_SS_ED_KURTOSIS, &kurtosis_est);

status = vsldSSEditMoments( task, &mean_est, &raw2mom, &raw3mom, &raw4mom, &cen2mom, &cen3mom, &cen4mom );

// Request moment computation

mask = VSL_SS_MEAN | VSL_SS_KURTOSIS | VSL_SS_SKEWNESS |

VSL_SS_2R_MOM | VSL_SS_3R_MOM | VSL_SS_4R_MOM |

VSL_SS_2C_MOM | VSL_SS_3C_MOM | VSL_SS_4C_MOM;

status = vsldSSCompute(task, mask, VSL_SS_METHOD_FAST);

// Remove the task

status = vslSSDeleteTask( &task );

}

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Do you calculate it as Biased or Unbiased? Could I specify it?

Best regards,

Sergey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hi Sergey,

As specified in Intel MKL Manual (Chapter Statistical Functions/Summary Statistics, section Math Notations and Definitions) and in Summary Statistics Application Notes (Chapter Algorithms and Interfaces in Summary Statistics, Section Calculating Multiple Estimates) the library provides unbiased estimates for expectation, variance, variance-covariance matrix, and biased estimates for central moments of 3d and 4th order.

Thanks,

Andrey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

I believe the vsldSS task allows us to request the second central moment

which is the population variance or the biased sample variance.

But, you can easily convert to the unbiased sample variance by multiplying by N/(N-1)

where N is sample size.

Scott

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Just to make sure that we are on the same page: Intel MKL Summary Stats returns unbiased variance.

If for some reasonswe need the biasedestimate, we can get itby multiplying by (N-1)/N.

Thanks,

Andrey

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

I just tested and you are absolutely right.

Requesting VSL_SS_2C_MOM, I get the unbiased sample variance.

The 2C_MOM in the name led me to think second central moment which usually refers to the biased sample variance.

thanks for straightening me out, Scott

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

I've spentextra time on the issue and I wanted to reproducea rounding problem ( or precision loss ) with the Meanin a simple C test-case.

On http://en.wikipedia.org/wiki/Algorithms_for_calculating_variancethere are 5algorithmswith pseudo-codesto calculate the Variance.I've implementedfour of them in C to calculate theStandard Deviation.

The most interesting thing isthat in case of One-Pass\Online algorithm I could not reproduce the problem!

The Mean for a data set with 500values of 2.0 was 2.0.

Take a look at my test-case for One-Pass algorithm and let me know if you're interested to get a complete test-case with 4 algorithms ( Naive, Two-Pass, Compensated and One-Pass )

...

/*

**Sources forNaive, Two-Pass, Compensated algorithmsare not included***/

...

//#define _RTTYPEfloat

#define _RTTYPEdouble

//#define _RTTYPElong double

RTint iN = 0;

RTint i = 0;

_RTTYPE dX[500] = { 0.0 };// Data Set 8

iN = 500;

for( i = 0; i < iN; i++ )

dX* = 2.0;*

RTint iB = 0; // 0 - UnBiased\1 - Biased

...

// **Sub-Test 5 - One-Pass Variance & StdDev**

{

/*

Def **One_Pass_Variance**( Data ):

N = 0

Mean = 0

M2 = 0

Delta = 0

for x in data:

N = N + 1

Delta = x - Mean

Mean = Mean + ( Delta / N )

M2 = M2 + Delta * ( x - Mean )# This expression uses the new value of Mean

Variance = M2 / ( N - B )

return Variance

*/

///*

_RTTYPE dVariance = 0.0;

_RTTYPE dStdDev = 0.0;

_RTTYPE dMean = 0.0;

_RTTYPE dM2 = 0.0;

_RTTYPE dDelta = 0.0;

for( i = 0; i < iN; i++ )

{

dDelta = dX* - dMean;dMean = dMean + ( dDelta / ( i+1 ) );dM2 = dM2 + dDelta * ( dX*

*- dMean );}dVariance = dM2 / ( iN - iB );dStdDev = CrtSqrt( dVariance );*

CrtPrintf( RTU("One-Pass StdDev: %.24f\n"), ( _RTTYPE )dStdDev );

CrtPrintf( RTU(" Mean: %.24f\n"), ( _RTTYPE )dMean );

CrtPrintf( RTU("\n") );

//*/

}

...**Results:**

Naive StdDev: 0.000000000000000000000000

Mean: 2.000000000000000000000000

Two-Pass StdDev: 0.000000000000000000000000

Mean: 2.000000000000000000000000

Compensated StdDev: 0.000000000000000000000000

Mean: 2.000000000000000000000000

One-Pass StdDev: 0.000000000000000000000000

Mean: 2.000000000000000000000000

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

@Sergey

Sorry but what would you expect from your code? You are summing TWO! Try with normally distributed random number and as Andrey said your error in computing mean and variance will scale as sqrt(nsamples).

I'm running Montecarlo simulations and the issue about normal RNG (mean and variance accuracy) can be destructive on fine structure of the physics. The solution of increasing sample space can be viable but sometime (depending on the physics of the system) can be even worst. A better choice would be to reject a random realization based on a threshold on its mean and/or variance, although VERY computationally expensive.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page