Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library
- VSL Summary Statistics skewness and kurtosis are incorrect

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

tennican

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-01-2011
06:38 PM

105 Views

VSL Summary Statistics skewness and kurtosis are incorrect

The mean was correct and the variance was just a little off.

But, the MKL estimates for skewness and kurtosis totally disagree with the R moments package.

R says:

> skewness(X)

[1] 2.400606

> kurtosis(X)

[1] 14.15447

MKL says:

skewness_est = -3.9303471475184684

kurtosis_est = 5.7271436907737581

Can anyone see what I did wrong in the code below:

int status;

VSLSSTaskPtr task;

MKL_INT xstorage = VSL_SS_MATRIX_STORAGE_ROWS;

MKL_INT numCols = 1;

MKL_INT numRows = 1419;

int indices[1] = {1};

status = vsldSSNewTask( &task, &numCols, &numRows, &xstorage, &(X[0]), 0, indices );

double mean_est = X[0];

double raw2mom = X[0];

double raw3mom = X[0];

double raw4mom = X[0];

double variance_est = X[0];

double cen3mom = X[0];

double cen4mom = X[0];

double skewness_est = X[0];

double kurtosis_est = X[0];

status = vsldSSEditTask(task, VSL_SS_ED_SKEWNESS, &skewness_est);

status = vsldSSEditTask(task, VSL_SS_ED_KURTOSIS, &kurtosis_est);

status = vsldSSEditMoments( task, &mean_est, &raw2mom, &raw3mom, &raw4mom,

&variance_est, &cen3mom, &cen4mom );

status = vsldSSCompute(task, VSL_SS_MEAN|VSL_SS_2C_MOM|VSL_SS_SKEWNESS|VSL_SS_KURTOSIS,

VSL_SS_METHOD_FAST);

X is composed of the attached data

1 Solution

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-02-2011
03:07 AM

105 Views

I believe that there is a typographical error in the value reported to be given by R for the kurtosis.

Here is a complete program to read the data file and print the statictics (

[cpp]#include#include int main(int argc,char *argv[]){ int status; double *X; int i; VSLSSTaskPtr task; MKL_INT xstorage = VSL_SS_MATRIX_STORAGE_ROWS; MKL_INT numCols = 1, numRows = 1419; int indices[1] = {1}; unsigned MKL_INT64 mask = 0; X=(double *)malloc(numRows*sizeof(double)); for(i=0; i With this program, I get [bash]s:lang>icl /Qmkl /Ot stats.cpp s:lang>stats < dat mean = 1.0505e+004, var = 1.0115e+000 skew = 2.3981e+000, kurt = 1.1135e+001

[/bash]

Link Copied

10 Replies

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-02-2011
03:07 AM

106 Views

I believe that there is a typographical error in the value reported to be given by R for the kurtosis.

Here is a complete program to read the data file and print the statictics (

[cpp]#include#include int main(int argc,char *argv[]){ int status; double *X; int i; VSLSSTaskPtr task; MKL_INT xstorage = VSL_SS_MATRIX_STORAGE_ROWS; MKL_INT numCols = 1, numRows = 1419; int indices[1] = {1}; unsigned MKL_INT64 mask = 0; X=(double *)malloc(numRows*sizeof(double)); for(i=0; i With this program, I get [bash]s:lang>icl /Qmkl /Ot stats.cpp s:lang>stats < dat mean = 1.0505e+004, var = 1.0115e+000 skew = 2.3981e+000, kurt = 1.1135e+001

[/bash]

Andrey_N_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-02-2011
07:45 AM

105 Views

Thanks for the communication.

Please, give us some time to reproduce your results;we will then return back to you.

Best,

Andrey

Andrey_N_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-04-2011
10:39 PM

105 Views

We reproduced your results and compared MKLversion of simple statistics against R package "Moments". Below is the summary of the observations and comments:

1.In computation of variance R package uses normalization n, where n is size of dataset x = (x(i)), i=1,..,n. MKL uses normalization n-1 which provides unbiasedness of the estimate. To be more more precise, MKL uses B=W-sum (w(i)^2) / W, W= sum (w(i)) where w(i) is weight assigned to observation x(i). With all weights w(i) set to 1 B is equal to n-1.

2.MKLversion of kurtotis is based on the algorithm K(x) = C4(x) / V(x) ^ 2 - 3where C4 is the central moment of 4th order, V(x) - variance. R package basically uses the same formula but without term "-3". Term "-3" is set to have thisestimate equal to zero for Gaussian distribution.

MKL versions ofcentral moment of 4th order and variance use normalizations n and n-1, correspondingly, while R uses normalization n in both cases.This can contribute insome numerical differences between R and MKLestimates of kurtosis.

Please, see Section "Mathematical Notation and Definitions" of Chapter Statistical Functions/ Summary Statistics in Intel MKL Manual for the additional detailsbehind the algorithms of Summary Stats in MKL.

3.Theimplementation ofkurtosis in MKL contains the bug which appears in the settings of the test case in the first post. Welocalized and fixed it. The fix will be available inone of future releases of the library. Currently, the work-around for thebug is to request "simultaneous" computation of kurtosis and central moments of 2nd and 4th order.

Please, let me know if this answers yourquestions.

Please, feel free to submit morequestions/feedbacks on Statistical Component of MKL,and we will help.

Best,

Andrey

tennican

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-05-2011
11:49 AM

105 Views

I think the main problem was that I hadn't requested the computation of all the precursors to skewness and kurtosis which you do correctly in your program and Audrey mentions is necessary below.

And R reports kurtosis rather than excess kurtosis as Audrey says.

Also, when I export the correct data to validate in R, sd(sales) in R equals sqrt(cen2mom) from MKL.

The text below is crossed out to avoid confusion:

I'm not so sure about the variance being different from the 2nd central moment.

I think you thought I wanted to compute the coefficient of variation which is quite different from variance.

There are small differences in the estimates of R var and MLK cen2mom.

However, R does report the unbiased not the biased estimate of variance.

So, the difference is not due to that.

I computed the biased and unbiased variance by hand below and the unbiased one is exactly var(X).

Also, I computed the variance from the coefficient of variation given by MKL and got a number

quite close to the cen2mom computed by MKL as you would expect.

Anyway, the estimates are close enough for my current purpose. So, I'm happy.

> var(X)

[1] 112904318

> N = length(X)

> myvar = sum((X-mean(X))^2)/N

> myvar

[1] 112824752

> myvar*N/(N-1)

[1] 112904318

>

> cen2mom = 112904462.89915065

> variation_est = 1.0114540252081452

> (variation_est * abs(mean(X)))^2

[1] 112904489

Andrey_N_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-06-2011
07:52 AM

105 Views

We ranthe test case with your data using double version of Summary Statistics functions on Nehalem CPU, Linux, in both IA-32 and Intel 64modesand obtained the followingoutput for mean and variance with MKL 10.3 (I do not provide the numbers forother estimates):

Mean = 10505.324172 Variance = 112904318.274247

That is, the value of variance is aligned with R output.

On the other hand, you communicate a different value: cen2mom = 112904462.89915065.

It might make sense to sync-up the results obtained on your and our sides.

Can you please clarify your environment:

1. MKL version

2. OS

3. CPU

4. Mode, IA-32 or Intel 64

5. Build/link line

6. Precision, double of float

Thanks,

Andrey

tennican

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-06-2011
11:11 AM

105 Views

I'm currently building within a larger project.

I did briefly try to build this example standalone but didn't want to waste time figuring out how to get around the fact that I don't want to install libiomp5md.dll on my computer.

### Environment

Using MKL 10.3 - ia32 and double precision 'd' functions

In C++ in MSVS 2010

Windows 7 Professional SP1 - 64-bit

Intel Xeon CPU dual-proc E5520 @ 2.27GHz

### To build

### I copied libiomp5md.dll and libiomp5md.lib into the mkl/lib/ia32 directory and used:

$(mkl-include)%(AdditionalIncludeDirectories)

$(IntelMklLib);%(AdditionalDependencies)

$(mkl-lib)%(AdditionalLibraryDirectories)

thanks for your help, Scott

tennican

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-06-2011
12:53 PM

105 Views

The discrepancy is due to an data export error.

I just realized that the data I was exporting to validate in R was rounded to dollars.

Now that I have fixed that, I get exactly the same variance in R as with MKL.

sorry for the confusion, Scott

Andrey_N_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-06-2011
09:57 PM

105 Views

It's good to know that we resolved the issue.

Please, feel free to askquestions on Statistical Component of Intel MKL, weare ready to help.

Best,

Andrey

mecej4

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-07-2011
06:18 AM

105 Views

Thank you.

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

09-07-2011
01:59 AM

105 Views

The problem has been already fixed in the latest version MKL 10.3.Update 6 available now for download.

The official anounce of this release you can find here.

--Gennady

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.