Intel® oneAPI Data Analytics Library

wrong pca result?

Lingzi_P_
Beginner

Hi,

I attached my code for PCA analysis below, and I don't think it's giving the correct result. I also found that no matter how I change the values in "data", it always returns 0 as the last eigenvalue. Am I doing something wrong here?

data:

2.000     0.000     0.000
0.000     3.000     0.000
0.000     0.000     9.000

eigenvalues:
1.500     1.500     0.000

eigenvectors:
-0.816    0.408     0.408
-0.000    -0.707    0.707
0.577     0.577     0.577

Code:

    // Batch PCA in double precision, using the SVD-based method.
    pca::Batch<double, pca::svdDense> algorithm;

    // Wrap the raw array as a table with nFeatures columns and nObservations rows.
    HomogenNumericTable<double> *dataTable = new HomogenNumericTable<double>(data, nFeatures, nObservations);

    services::SharedPtr<HomogenNumericTable<double> > dataTablePtr(dataTable);
    printNumericTable(dataTablePtr);
    algorithm.input.set(pca::data, dataTablePtr);

    algorithm.compute();

    services::SharedPtr<pca::Result> result = algorithm.getResult();

    // Print the computed eigenvalues, then the eigenvectors.
    printNumericTable(result->get(pca::eigenvalues));
    printNumericTable(result->get(pca::eigenvectors));


Olga_R_Intel
Employee

Hi Lingzi,

The Intel DAAL implementation of PCA normalizes the data before computing the eigenvectors and eigenvalues. Thus, for any 3 x 3 diagonal matrix with positive diagonal entries, such as your data, the intermediate normalized representation is as follows:
1.155 -0.577 -0.577
-0.577 1.155 -0.577
-0.577 -0.577 1.155
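
The normalized matrix above can be reproduced by hand. Below is a minimal sketch in plain C++ (not the Intel DAAL API). It assumes the normalization is a per-column z-score using the sample (n - 1) standard deviation, which is what the quoted numbers are consistent with. The names `Matrix` and `zscoreColumns` are hypothetical and introduced only for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical type alias: rows are observations, columns are features.
using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper, not part of Intel DAAL: centers every column and
// scales it by its sample standard deviation (divisor n - 1).
// Assumes at least two rows and no constant column.
Matrix zscoreColumns(const Matrix& x)
{
    const std::size_t n = x.size();
    const std::size_t p = x.front().size();
    Matrix z(n, std::vector<double>(p, 0.0));

    for (std::size_t j = 0; j < p; ++j) {
        double mean = 0.0;
        for (std::size_t i = 0; i < n; ++i) mean += x[i][j];
        mean /= static_cast<double>(n);

        double ss = 0.0;  // sum of squared deviations from the column mean
        for (std::size_t i = 0; i < n; ++i) {
            const double d = x[i][j] - mean;
            ss += d * d;
        }
        const double sd = std::sqrt(ss / static_cast<double>(n - 1));

        for (std::size_t i = 0; i < n; ++i) z[i][j] = (x[i][j] - mean) / sd;
    }
    return z;
}
```

Calling `zscoreColumns` on the rows `{2, 0, 0}`, `{0, 3, 0}`, `{0, 0, 9}` should give 1.155 on the diagonal and -0.577 elsewhere. The diagonal values themselves cancel out, which is why changing them does not change the PCA output.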

We additionally validated the Intel DAAL PCA results against R*; the two sets of results are identical up to numerical error.

Regarding the zero value of the last eigenvalue:
An n x p data set represents n feature vectors in p-dimensional space. In particular, 3 feature vectors of size 3 always lie in a common plane, and once the data is centered that plane is spanned by 2 vectors. PCA computes those two vectors, so the third eigenvalue is zero.
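
The rank argument can be written out explicitly. This is a sketch, assuming (as the numbers in this thread suggest) that each column is centered and scaled by its sample standard deviation. The symbols $Z$ (the normalized $n \times p$ table), $R$ (its correlation matrix), and $\mathbf{1}$ (a vector of ones) are introduced here for illustration.

$$
\mathbf{1}_n^\top Z = 0
\;\Longrightarrow\;
\operatorname{rank}(Z) \le \min(n-1,\, p),
\qquad
R = \frac{1}{n-1} Z^\top Z,
\qquad
\operatorname{rank}(R) = \operatorname{rank}(Z) \le n-1 .
$$

With $n = p = 3$ this gives $\operatorname{rank}(R) \le 2$, so at least one eigenvalue is zero for any data. For the diagonal data in the question,

$$
R = \begin{pmatrix} 1 & -\tfrac12 & -\tfrac12 \\ -\tfrac12 & 1 & -\tfrac12 \\ -\tfrac12 & -\tfrac12 & 1 \end{pmatrix}
= \tfrac32 I - \tfrac12\, \mathbf{1}_3 \mathbf{1}_3^\top,
\qquad
R\,\mathbf{1}_3 = \left(\tfrac32 - \tfrac32\right)\mathbf{1}_3 = 0,
\qquad
R\,v = \tfrac32\, v \ \text{ for } v \perp \mathbf{1}_3 .
$$

The eigenvalues are therefore $1.5,\ 1.5,\ 0$, and the eigenvector for the zero eigenvalue is $\mathbf{1}_3/\sqrt{3} \approx (0.577,\ 0.577,\ 0.577)$, which matches the last row of the eigenvectors printed in the question.
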
For a different data set with n > p, for example

double data[] =
{
    2, 1, 0,
    0, 3, 3,
    -1, 3, 7,
    0, 0, 2
};

Intel DAAL PCA returns the following eigenvalues: 2.391     0.542     0.067
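
Both sets of eigenvalues in this thread can be cross-checked without Intel DAAL. Below is a minimal sketch in plain C++, assuming the reported eigenvalues are those of the correlation matrix of the data (the quoted values are consistent with this, since each set sums to 3). It uses the standard closed-form trigonometric method for symmetric 3 x 3 matrices, not the SVD method that `pca::svdDense` selects. The names `Mat3`, `correlation3`, and `symmetricEigenvalues3` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Hypothetical helper: correlation matrix of an n x 3 table (rows = observations).
// Assumes at least two rows and no constant column.
Mat3 correlation3(const std::vector<std::array<double, 3>>& x)
{
    const double n = static_cast<double>(x.size());
    std::array<double, 3> mean{};  // zero-initialized
    for (const auto& row : x)
        for (std::size_t j = 0; j < 3; ++j) mean[j] += row[j] / n;

    Mat3 s{};  // sums of products of deviations from the column means
    for (const auto& row : x)
        for (std::size_t j = 0; j < 3; ++j)
            for (std::size_t k = 0; k < 3; ++k)
                s[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]);

    Mat3 r{};
    for (std::size_t j = 0; j < 3; ++j)
        for (std::size_t k = 0; k < 3; ++k)
            r[j][k] = s[j][k] / std::sqrt(s[j][j] * s[k][k]);
    return r;
}

// Hypothetical helper: eigenvalues of a symmetric 3 x 3 matrix, largest first.
std::array<double, 3> symmetricEigenvalues3(const Mat3& a)
{
    const double p1 = a[0][1] * a[0][1] + a[0][2] * a[0][2] + a[1][2] * a[1][2];
    if (p1 == 0.0) {  // already diagonal: the eigenvalues are the diagonal entries
        std::array<double, 3> d = {{a[0][0], a[1][1], a[2][2]}};
        std::sort(d.begin(), d.end(), std::greater<double>());
        return d;
    }
    const double q = (a[0][0] + a[1][1] + a[2][2]) / 3.0;
    const double p2 = (a[0][0] - q) * (a[0][0] - q) + (a[1][1] - q) * (a[1][1] - q)
                    + (a[2][2] - q) * (a[2][2] - q) + 2.0 * p1;
    const double p = std::sqrt(p2 / 6.0);

    Mat3 b{};  // b = (a - q * I) / p
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            b[i][j] = (a[i][j] - (i == j ? q : 0.0)) / p;

    const double detB = b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
                      - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
                      + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]);
    // Clamp against rounding so that acos stays defined.
    const double r = std::max(-1.0, std::min(1.0, detB / 2.0));
    const double phi = std::acos(r) / 3.0;
    const double twoPiOver3 = 2.0 * std::acos(-1.0) / 3.0;

    const double e1 = q + 2.0 * p * std::cos(phi);               // largest
    const double e3 = q + 2.0 * p * std::cos(phi + twoPiOver3);  // smallest
    const double e2 = 3.0 * q - e1 - e3;                         // the trace is preserved
    return std::array<double, 3>{{e1, e2, e3}};
}
```

For the 3 x 3 diagonal data in the question, `symmetricEigenvalues3(correlation3(...))` should return 1.5, 1.5, and 0 up to rounding. For the 4 x 3 data set above it should agree with 2.391, 0.542, and 0.067 to the printed precision, with no zero eigenvalue because n > p.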
