I ran the examples provided in the documentation, but I can only get the eigenvalues and the eigenvectors. I want the transformed values after online PCA, because I only want to use the final result and compare it with IncrementalPCA in sklearn. Obviously, the two results are not directly comparable, so I want to know how to use the eigenvalues and the eigenvectors to get the transformed values in Python.
Hi,
You need to use the "daal.algorithms.pca.transform" module to transform the data. I've updated the "pca_svd_dense_online.py" example to implement the data transformation using the eigenvalues and eigenvectors. Hope that helps.
import os
import sys

from daal.algorithms import pca
import daal.algorithms.pca.transform as pca_transform
from daal.data_management import FileDataSource, DataSourceIface

utils_folder = os.path.realpath(os.path.abspath(os.path.dirname(os.path.dirname(__file__))))
if utils_folder not in sys.path:
    sys.path.insert(0, utils_folder)
from utils import printNumericTable

DAAL_PREFIX = os.path.join('..', 'data')

# Input data set parameters
nVectorsInBlock = 250
dataFileName = os.path.join(DAAL_PREFIX, 'online', 'pca_normalized.csv')

if __name__ == "__main__":

    # Initialize FileDataSource<CSVFeatureManager> to retrieve the input data from a .csv file
    dataSource = FileDataSource(
        dataFileName,
        DataSourceIface.doAllocateNumericTable,
        DataSourceIface.doDictionaryFromContext
    )

    # Create an algorithm for principal component analysis using the SVD method
    algorithm = pca.Online(method=pca.svdDense)

    while(dataSource.loadDataBlock(nVectorsInBlock) == nVectorsInBlock):
        # Set the input data to the algorithm
        algorithm.input.setDataset(pca.data, dataSource.getNumericTable())

        # Update PCA decomposition
        algorithm.compute()

    # Finalize computations
    result = algorithm.finalizeCompute()

    # Print the results
    printNumericTable(result.get(pca.eigenvalues), "Eigenvalues:")
    printNumericTable(result.get(pca.eigenvectors), "Eigenvectors:")

    # Data transformation
    tralgorithm = pca_transform.Batch()

    # Set the number of principal components to keep
    tralgorithm.parameter.nComponents = 2

    # Reload the whole data set so it can be transformed in one batch
    dataSource = FileDataSource(
        dataFileName,
        DataSourceIface.doAllocateNumericTable,
        DataSourceIface.doDictionaryFromContext
    )
    dataSource.loadDataBlock()

    # Set an input object for the algorithm
    tralgorithm.input.setTable(pca_transform.data, dataSource.getNumericTable())

    # Set an input object for the eigenvectors
    tralgorithm.input.setTable(pca_transform.eigenvectors, result.get(pca.eigenvectors))

    # Set the additional transformation data taken from the PCA result
    tralgorithm.input.setCollection(pca_transform.dataForTransform,
                                    result.getCollection(pca.dataForTransform))

    # Compute PCA transformation function
    trres = tralgorithm.compute()

    printNumericTable(dataSource.getNumericTable(), "First rows of the input data:", 4)
    printNumericTable(trres.get(pca_transform.transformedData), "First rows of the PCA transformation result:", 4)
See here to get a detailed explanation on the library's usage
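If the transform module is not available, the projection itself can be sketched in plain NumPy. This is only an illustration, not the library's implementation: `project` and `align_signs` are hypothetical helper names, and the sketch assumes the eigenvectors are stored one per row, that the data should be centered with the column means, and that dividing by the square root of the eigenvalues (whitening) is optional. Whether the library's own transform centers, scales, or whitens should be checked against its documentation. The second helper is there because principal components are only defined up to sign, so two implementations can return columns that differ by a factor of -1.

```python
import numpy as np


def project(data, eigenvectors, n_components, mean=None, eigenvalues=None):
    """Project rows of `data` onto the leading principal components.

    Assumption: `eigenvectors` has shape (n_vectors, n_features), one
    eigenvector per row, sorted by decreasing eigenvalue.
    """
    data = np.asarray(data, dtype=float)
    if mean is None:
        # Assumption: center with the column means of the data itself.
        mean = data.mean(axis=0)
    components = np.asarray(eigenvectors, dtype=float)[:n_components]
    scores = (data - mean) @ components.T
    if eigenvalues is not None:
        # Optional whitening: scale each component to unit variance.
        scores = scores / np.sqrt(np.asarray(eigenvalues, dtype=float)[:n_components])
    return scores


def align_signs(scores, reference):
    """Flip columns of `scores` so each one correlates positively with `reference`."""
    signs = np.sign(np.sum(scores * reference, axis=0))
    signs[signs == 0] = 1.0
    return scores * signs


# Synthetic demonstration; the eigen-decomposition below stands in for the
# eigenvalues and eigenvectors returned by a PCA routine.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(vals)[::-1]
eigenvalues = vals[order]
eigenvectors = vecs[:, order].T  # one eigenvector per row

scores = project(X, eigenvectors, n_components=2)
flipped = scores * np.array([1.0, -1.0])  # same projection, second sign flipped
print(np.allclose(align_signs(flipped, scores), scores))
```

To compare against sklearn's IncrementalPCA, the same idea would apply: compute both projections on the same rows, align the signs of one against the other, and then compare with a tolerance rather than for exact equality.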
When I follow your advice, I get this error:
ImportError: No module named 'daal.algorithms.pca.transform'; 'daal.algorithms.pca' is not a package
What version of daal are you using? This module is available in version 2018.0.1. You can check your version with the following commands:
import daal
daal.__version__
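The check can also be scripted. The following is a rough sketch, assuming the version string is dotted and starts with numeric parts (for example '2017.0.3.20170414'); `version_tuple` and `can_import` are hypothetical helper names, not part of daal.

```python
import importlib


def version_tuple(version, parts=3):
    """Turn a dotted string such as '2017.0.3.20170414' into (2017, 0, 3)."""
    return tuple(int(piece) for piece in version.split(".")[:parts])


def can_import(module_name):
    """Return True if the module (including its parent packages) imports cleanly."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


REQUIRED = (2018, 0, 1)  # the version mentioned in this thread

if can_import("daal"):
    import daal
    installed = getattr(daal, "__version__", None)
    print("daal version:", installed)
    if installed is not None:
        print("new enough:", version_tuple(installed) >= REQUIRED)
    print("transform module present:", can_import("daal.algorithms.pca.transform"))
else:
    print("daal is not installed in this environment")
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.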
How can I update daal? I tried
pip install daal --upgrade
and
pip uninstall daal
pip install daal
but the version of daal is still '2017.0.3.20170414'.
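When an upgrade appears to have no effect, one thing worth ruling out (this is only a guess about the cause, not a diagnosis) is that pip installed into a different Python environment than the one that imports daal. The sketch below prints which interpreter is running and where the imported package lives; `locate` is a hypothetical helper name.

```python
import importlib
import sys


def locate(module_name):
    """Return (version, file path) for an importable module, or None if it is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None), getattr(module, "__file__", None)


print("interpreter:", sys.executable)
print("daal:", locate("daal"))
```

If the path printed for daal does not belong to the interpreter shown, running pip through that interpreter (`python -m pip install --upgrade daal`) targets the right environment. If the paths match, the package index may simply not carry a newer release.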
You may also try the latest version of DAAL. The library is a free download from this resource: https://software.intel.com/en-us/performance-libraries
I'm not completely sure whether the latest version of daal is available on PyPI. It is recommended to use Anaconda Cloud to install the latest versions of Intel-related Python packages.
conda install daal -c intel
I assume you are using the Intel Distribution for Python. If not, the instructions are here. It's pretty simple.