In kmeans_types.h, kmeans::Result allocates instances of HomogenNumericTable to hold the results.
Is there an example showing how to get the algorithm to store its result in SOANumericTable instances?
Thanks.
ACS
Seems to be working now.
Just needed to add some extra code to populate the SOANumericTable properly.
template<typename T>
SOANumericTablePtr allocateSoaNumericTable(size_t nColumns,size_t nRows)
{
    SOANumericTablePtr t(new SOANumericTable(nColumns,nRows));
    NumericTableDictionary *d(t->getDictionary());
    // Register the element type of every column before allocating storage.
    for(size_t i=0;i<nColumns;++i){
        d->addFeature<T>(i);
    }
    t->allocateDataMemory();
    return t;
}
Let me know if I'm missing anything.
Thanks!
ACS
Hi Alvin,
In Intel(R) DAAL, by default, the compute() method of the algorithm classes (including k-means) allocates memory for results using the allocate() method of the respective Result class. You also have the option to store the results in memory that you allocate yourself, which can be more memory-efficient in specific cases. To do that, follow these steps:
- construct the object of the respective (for example, k-means) Result class
- allocate memory/Numeric Tables for the results. In the case of k-means, those are the centroids, the value of the goal function, the number of iterations, and, optionally, the assignments
- register the memory (more exactly, a shared pointer to the memory) in the Result using the set() method. Make sure your Numeric Tables derive from the Intel DAAL NumericTable type and implement all necessary methods
- register the Result object in the algorithm object using the setResult() method
- run computations.
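The steps above can be sketched in plain C++. This is only an illustration of the caller-owned-result pattern: ToyTable, ToyResult and ToyAlgorithm are hypothetical stand-ins invented for this sketch, not DAAL classes, and the "computation" is a placeholder that writes each row's index into the table.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins, NOT the DAAL classes: a table is a flat,
// row-major buffer of doubles with a known shape.
struct ToyTable {
    std::size_t rows, cols;
    std::vector<double> values;
    ToyTable(std::size_t r, std::size_t c) : rows(r), cols(c), values(r * c, 0.0) {}
};

// Holds a shared pointer to the table the caller registered via set().
struct ToyResult {
    std::shared_ptr<ToyTable> centroids;
    void set(std::shared_ptr<ToyTable> t) { centroids = std::move(t); }
};

class ToyAlgorithm {
public:
    ToyAlgorithm(std::size_t nClusters, std::size_t nFeatures)
        : nClusters_(nClusters), nFeatures_(nFeatures) {}

    void setResult(std::shared_ptr<ToyResult> r) { result_ = std::move(r); }
    std::shared_ptr<ToyResult> getResult() const { return result_; }

    void compute() {
        // Default path: allocate storage only if the caller registered none.
        if (!result_) result_ = std::make_shared<ToyResult>();
        if (!result_->centroids)
            result_->centroids = std::make_shared<ToyTable>(nClusters_, nFeatures_);

        ToyTable& out = *result_->centroids;
        // A caller-supplied table must have the shape the algorithm expects.
        assert(out.rows == nClusters_ && out.cols == nFeatures_);

        // Placeholder computation: every entry of row r is set to r.
        for (std::size_t r = 0; r < out.rows; ++r)
            for (std::size_t c = 0; c < out.cols; ++c)
                out.values[r * out.cols + c] = static_cast<double>(r);
    }

private:
    std::size_t nClusters_, nFeatures_;
    std::shared_ptr<ToyResult> result_;
};
```

In this sketch the caller would build a ToyTable, register it with ToyResult::set(), pass the result to setResult(), and call compute(); afterwards getResult() hands back the very same table object, now filled in.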
We plan to provide examples that demonstrate this use of the library in future releases.
You also might want to have a look at the examples in the folder examples\cpp\source\datasource of the library installation directory which demonstrate use of AOS/SOA/Homogeneous Numeric Tables.
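As a rough picture of how the homogeneous and SOA layouts differ, here is a minimal sketch in plain C++. HomogenLayout, SoaLayout and toSoa are hypothetical names made up for this illustration and are not the DAAL table classes; the sketch assumes a homogeneous table is one contiguous row-major block and an SOA table keeps one separate array per column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Homogeneous layout (assumed): one contiguous row-major block,
// element (r, c) lives at index r * cols + c.
struct HomogenLayout {
    std::size_t rows, cols;
    std::vector<double> block;
    HomogenLayout(std::size_t r, std::size_t c) : rows(r), cols(c), block(r * c, 0.0) {}
    double get(std::size_t r, std::size_t c) const {
        assert(r < rows && c < cols);
        return block[r * cols + c];
    }
    void set(std::size_t r, std::size_t c, double v) {
        assert(r < rows && c < cols);
        block[r * cols + c] = v;
    }
};

// SOA layout (assumed): one separately allocated array per column.
struct SoaLayout {
    std::size_t rows;
    std::vector<std::vector<double> > columns;
    SoaLayout(std::size_t r, std::size_t c)
        : rows(r), columns(c, std::vector<double>(r, 0.0)) {}
    double get(std::size_t r, std::size_t c) const {
        assert(r < rows && c < columns.size());
        return columns[c][r];
    }
    void set(std::size_t r, std::size_t c, double v) {
        assert(r < rows && c < columns.size());
        columns[c][r] = v;
    }
};

// Copies a homogeneous table into SOA form; the logical (row, column)
// values are preserved, only the memory layout changes.
SoaLayout toSoa(const HomogenLayout& h) {
    SoaLayout s(h.rows, h.cols);
    for (std::size_t r = 0; r < h.rows; ++r)
        for (std::size_t c = 0; c < h.cols; ++c)
            s.set(r, c, h.get(r, c));
    return s;
}
```

The point of the sketch is that code reading a table through get(r, c) sees the same values in both layouts, while code that assumes one contiguous block would break when handed per-column arrays.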
Please let us know if you have more questions.
Thanks,
Andrey
Turns out it's not working after all. Here's basically what I have:
kmeans::Batch<> algorithm(nClusters, nIterations);
algorithm.input.set(kmeans::data, numericTablePtr);

size_t nColumns = numericTablePtr->getNumberOfColumns();
size_t nRows = numericTablePtr->getNumberOfRows();

SOANumericTablePtr assignmentsResult(allocateSoaNumericTable<double>(1,nRows));
SOANumericTablePtr centroidsResult(allocateSoaNumericTable<double>(nColumns,nClusters));

// Create our own Result object and allocate SOANumericTables to hold results.
SharedPtr<kmeans::Result> result(new kmeans::Result());
result->set(kmeans::assignments,assignmentsResult);
result->set(kmeans::centroids,centroidsResult);
algorithm.setResult(result); // Use my Result object.

algorithm.compute();

NumericTablePtr assignments1(algorithm.getResult()->get(kmeans::assignments));
NumericTablePtr centroids1(algorithm.getResult()->get(kmeans::centroids));
Unfortunately, this gives incorrect results when used with the same parameters as kmeans_batch.cpp and using kmeans.csv example data.
kMeans(...) assignments:
0.000 4.000 2.000 4.000 1.000 1.000 1.000 0.000 1.000 0.000 0.000 5.000 0.000 1.000 2.000 2.000 0.000 3.000 2.000 0.000
kMeans(...) centroids:
4.118 26.727 -14.120 57.874 -9.685 28.675 30.317 -6.586 14.098 10.541 20.676 0.801 47.162 1.496 -18.233 -43.514 4.475 5.384 51.285 54.874 -2.336 -3.688 -3.691 -25.112 -17.490 -6.370 -45.389 -50.949 -37.732 -25.020 0.780 70.679 -52.177 7.757 7.493 -91.890 28.577 49.136 -53.561 16.029 18.887 -32.584 39.780 15.388 18.220 52.567 -35.363 15.762 8.102 7.649
(every remaining centroid value is printed as 0.000)
The assignments don't match at all and the rows of zeroes are unexpected.
If I comment out the algorithm.setResult(result) line, it's basically the same as the example program and gets a matching result.
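A quick way to make this kind of comparison mechanical is to copy both results into flat arrays and diff them. The following is a minimal sketch in plain C++; allZeroRows and firstMismatch are hypothetical helpers written for this illustration (they are not part of DAAL), and the tables are assumed to be flattened row-major into std::vector<double>.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the indices of rows whose entries are all exactly zero.
// `table` is a row-major buffer with `cols` values per row.
std::vector<std::size_t> allZeroRows(const std::vector<double>& table, std::size_t cols) {
    assert(cols > 0 && table.size() % cols == 0);
    std::vector<std::size_t> zeroRows;
    for (std::size_t r = 0; r < table.size() / cols; ++r) {
        bool allZero = true;
        for (std::size_t c = 0; c < cols; ++c) {
            if (table[r * cols + c] != 0.0) { allZero = false; break; }
        }
        if (allZero) zeroRows.push_back(r);
    }
    return zeroRows;
}

// Returns the index of the first element where a and b differ by more
// than tol, or a.size() if the two buffers agree everywhere.
std::size_t firstMismatch(const std::vector<double>& a,
                          const std::vector<double>& b,
                          double tol) {
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::fabs(a[i] - b[i]) > tol) return i;
    }
    return a.size();
}
```

Running firstMismatch on the default and the hand-rolled centroids shows where the two runs first diverge, and allZeroRows lists the centroid rows that were never written.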
Using intel-daal-common-056-2016.0-056.noarch if that makes any difference.
Thanks.
ACS
I figured I'd add a call to check my hand-rolled result before calling algorithm.setResult(result):
result->check(&algorithm.input,&algorithm.parameter,kmeans::lloydDense);
// result->check(&algorithm.input, &algorithm.parameter, algorithm.getMethod());
I can't use the second form since algorithm.getMethod() is protected. Is check(...) not supposed to be called from the outside?
Thanks.
ACS
Hi Alvin,
I was not able to reproduce your k-means output when SOA tables are used to store the results of the computation. On my side the results matched those of the default example, so it makes sense to sync up on the details. I used your function allocateSoaNumericTable(size_t nColumns, size_t nRows) to allocate memory; in my case the function returns a raw SOANumericTable* pointer, which is used to initialize the respective shared pointer:
SharedPtr<data_management::SerializationIface> centroids(allocateSoaNumericTable<double>( nColumns, nClusters ) );
...
SharedPtr<kmeans::Result> result(new kmeans::Result());
...
result->set(kmeans::centroids,centroids);
Did you do the same initialization?
Also, which OS and which version of the library (32- or 64-bit) do you use, which type of linking (static or dynamic), and the threaded or the sequential library?
To answer your second question: the check() method of the Result object is expected to be used from the outside. We plan to make the getMethod() method public in the next release. Thank you for this feedback.
Andrey
Here's the hacked version of the example program:
/* file: kmeans_batch.cpp */
/*******************************************************************************
!  Copyright(C) 2014-2015 Intel Corporation. All Rights Reserved.
!*******************************************************************************
!  Content:
!    K-means clustering example program text.
!******************************************************************************/

/**
 * <a name="DAAL-EXAMPLE-CPP-KMEANS_BATCH"></a>
 * \example kmeans_batch.cpp
 */

#include "daal.h"
#include "service.h"

using namespace std;
using namespace daal;
using namespace daal::algorithms;

/* Input data set parameters */
string datasetFileName = "rpm/intel-daal-common-056-2016.0-056.noarch/opt/intel/compilers_and_libraries_2016.0.056/linux/daal/examples/data/batch/kmeans.csv";
const size_t nObservations = 10000;

/* KMeans algorithm parameters */
const size_t nClusters = 20;
const size_t nIterations = 5;

typedef SharedPtr<NumericTable> NumericTablePtr;
typedef SharedPtr<SOANumericTable> SOANumericTablePtr;

namespace{
template<typename T>
SOANumericTablePtr allocateSoaNumericTable(size_t nColumns,size_t nRows)
{
    SOANumericTablePtr t(new SOANumericTable(nColumns,nRows));
    NumericTableDictionary *d(t->getDictionary());
    for(size_t i=0;i<nColumns;++i){
        d->addFeature<T>(i);
    }
    t->allocateDataMemory();
    return t;
}
}

int main(int argc, char *argv[])
{
    checkArguments(argc, argv, 1, &datasetFileName);

    /* Initialize FileDataSource to retrieve input data from .csv file */
    FileDataSource<CSVFeatureManager> dataSource(datasetFileName,
                                                 DataSource::doAllocateNumericTable,
                                                 DataSource::doDictionaryFromContext);

    /* Retrieve the data from input file */
    dataSource.loadDataBlock(nObservations);

    /* Create algorithm object for KMeans algorithm */
    kmeans::Batch<> algorithm(nClusters, nIterations);
    NumericTablePtr numericTablePtr(dataSource.getNumericTable());
    algorithm.input.set(kmeans::data, numericTablePtr);

    size_t nColumns = numericTablePtr->getNumberOfColumns();
    size_t nRows = numericTablePtr->getNumberOfRows();
    /* Debug output: print the table dimensions that the labels name */
    std::cerr<<"nColumns="<<nColumns<<std::endl;
    std::cerr<<"nRows="<<nRows<<std::endl;

    NumericTablePtr assignmentsResult(allocateSoaNumericTable<int>(1,nRows));

    // works
    NumericTablePtr centroidsResult(new data_management::HomogenNumericTable<double>(nColumns, nClusters, data_management::NumericTable::doAllocate));
    // doesn't work
    // NumericTablePtr centroidsResult(allocateSoaNumericTable<double>(nColumns,nClusters));

    // Create our own Result object and allocate SOANumericTables to hold results.
    SharedPtr<kmeans::Result> result(new kmeans::Result());
    result->set(kmeans::assignments,assignmentsResult);
    result->set(kmeans::centroids,centroidsResult);

    // result->check(&algorithm.input, &algorithm.parameter, algorithm.getMethod());
    result->check(&algorithm.input,&algorithm.parameter,kmeans::lloydDense);
    algorithm.setResult(result);

    algorithm.compute();

    /* Print clusterization results */
    printNumericTable(algorithm.getResult()->get(kmeans::assignments), "First 20 cluster assignments:", 20);
    printNumericTable(algorithm.getResult()->get(kmeans::centroids ), "First 10 dimensions of centroids:", 20, 10);

    return 0;
}
ACS
Host is an IBM dx360 M3. Uname info:
Linux myhost 2.6.18-308.13.1.el5 #1 SMP Thu Jul 26 05:45:09 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
Here's how I compile the thing:
gcc/4.4.6/bin/g++ \
    kmeans_batch.cpp \
    -o kmeans_batch.o \
    -c \
    -g \
    -m64 \
    -I rpm/intel-daal-common-056-2016.0-056.noarch/opt/intel/compilers_and_libraries_2016.0.056/linux/daal/include \
    -I rpm/intel-daal-common-056-2016.0-056.noarch/opt/intel/compilers_and_libraries_2016.0.056/linux/daal/examples/cpp/source/utils \
    -fPIC

gcc/4.4.6/bin/g++ \
    kmeans_batch.o \
    -o kmeans_batch \
    -g \
    -m64 \
    -L rpm/intel-tbb-libs-056-4.3.4-056.noarch/opt/intel/compilers_and_libraries_2016.0.056/linux/tbb/lib/intel64_lin/gcc4.4 \
    -L rpm/intel-openmp-l-all-056-16.0.0-056.x86_64/opt/intel/compilers_and_libraries_2016.0.056/linux/compiler/lib/intel64_lin \
    rpm/intel-daal-056-2016.0-056.x86_64/opt/intel/compilers_and_libraries_2016.0.056/linux/daal/lib/intel64_lin/libdaal_core.a \
    rpm/intel-daal-056-2016.0-056.x86_64/opt/intel/compilers_and_libraries_2016.0.056/linux/daal/lib/intel64_lin/libdaal_thread.a \
    -ltbb \
    -liomp5
Let me know if you need any further details to reproduce.
Thanks.
ACS
I can reproduce the problem. However, if I use the latest release (DAAL Beta update 3), then it works fine. This might only be an issue for update 2 or earlier releases.