After success with SVM in batch mode, I'm now looking at KMeans. In SVM I could get the model out with something like this:
auto model = trainingResults->get(classifier::training::model);
services::SharedPtr<kmeans::Result> trainingResults = algorithm.getResult();
Do you have to load the centroids into the KMeans algorithm along with your new data and run it with a special number of iterations, for example?
How can I persist the model? Do I just persist the centroids numeric table?
KMeans in DAAL doesn't follow a "model training" --> "prediction" usage model, and there is no opaque "model" object in the KMeans result. You can, however, mimic a model object by extracting the centroids from the result and serializing the centroids numeric table (together with other information such as the number of clusters). You then use these as input when clustering your new data.
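To make the idea of "persisting the centroids" concrete, here is a minimal sketch in plain C++. It does not use DAAL's own serialization facilities. It assumes the centroid values have already been copied out of the numeric table into a row-major `std::vector<double>`. The `CentroidModel` struct, the `saveCentroids` and `loadCentroids` functions, and the text file layout are all hypothetical names and choices made for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <iomanip>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for a saved K-Means "model": only the centroids
// and their dimensions. Not a DAAL type.
struct CentroidModel {
    std::size_t nClusters = 0;
    std::size_t nFeatures = 0;
    std::vector<double> values;  // row-major, nClusters * nFeatures entries
};

// Write the dimensions, then one centroid per line, as plain text.
bool saveCentroids(const CentroidModel& m, const std::string& path) {
    if (m.values.size() != m.nClusters * m.nFeatures) return false;
    std::ofstream out(path);
    if (!out) return false;
    out << m.nClusters << ' ' << m.nFeatures << '\n';
    // max_digits10 significant digits are enough to round-trip a double.
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    for (std::size_t i = 0; i < m.values.size(); ++i) {
        out << m.values[i] << ((i + 1) % m.nFeatures == 0 ? '\n' : ' ');
    }
    return static_cast<bool>(out);
}

// Read back a file written by saveCentroids.
bool loadCentroids(const std::string& path, CentroidModel& m) {
    std::ifstream in(path);
    if (!in) return false;
    if (!(in >> m.nClusters >> m.nFeatures)) return false;
    m.values.assign(m.nClusters * m.nFeatures, 0.0);
    for (double& v : m.values) {
        if (!(in >> v)) return false;
    }
    return true;
}
```

The loaded values would then be copied back into a numeric table and passed as the input centroids for the new data.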
OK, but if I initialized KMeans with the previously calculated centroids (as well as the number of clusters) and recomputed, the centroids would probably move. Is there a way to call compute on the KMeans algorithm without it trying to recalculate the centroids? Would iterations=0 do this? I guess it's essentially nearest neighbour on the centroids?
Surely there has to be a way. For this particular application I may well have several billion rows, so I have to segment on a sample, but I still have to come out with a segmentation for all of them.
Starting with the upcoming release, we've updated the algorithm logic a bit, improved the documentation, and added specific examples for your case. The only thing you'll need to do is set iterations=0.
The simplest way to do it with Intel DAAL 2016 is:
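// Step-1 (local) distributed algorithm; the second argument asks it to compute assignments
kmeans::Distributed<step1Local> alg(nClusters, true);
alg.input.set(kmeans::data, data);                 // new observations to assign
alg.input.set(kmeans::inputCentroids, centroids);  // previously computed centroids
alg.compute();
alg.finalizeCompute();
// One cluster index per row of data
assignments = alg.getResult()->get(kmeans::assignments);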
The key here is to use the distributed version of the algorithm even if the data is all local. The centroids will not move in this case.
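For intuition about what assignment against fixed centroids computes, here is a minimal sketch in plain C++ rather than DAAL. It assumes squared Euclidean distance and row-major `std::vector<double>` inputs. The function name `assignToCentroids` is hypothetical, and the code is only an illustration of the "nearest neighbour on the centroids" idea raised above, not the library's implementation.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Assign each row of `data` to the index of its nearest centroid
// (squared Euclidean distance). Both inputs are row-major with
// `nFeatures` columns. The centroids are only read, so they never move.
std::vector<int> assignToCentroids(const std::vector<double>& data,
                                   const std::vector<double>& centroids,
                                   std::size_t nFeatures) {
    std::vector<int> assignments;
    if (nFeatures == 0 || centroids.empty()) return assignments;
    const std::size_t nRows = data.size() / nFeatures;
    const std::size_t nClusters = centroids.size() / nFeatures;
    assignments.reserve(nRows);
    for (std::size_t r = 0; r < nRows; ++r) {
        double best = std::numeric_limits<double>::max();
        int bestIdx = 0;
        for (std::size_t c = 0; c < nClusters; ++c) {
            double d2 = 0.0;
            for (std::size_t f = 0; f < nFeatures; ++f) {
                const double diff =
                    data[r * nFeatures + f] - centroids[c * nFeatures + f];
                d2 += diff * diff;
            }
            // Keep the closest centroid seen so far; ties keep the lower index.
            if (d2 < best) {
                best = d2;
                bestIdx = static_cast<int>(c);
            }
        }
        assignments.push_back(bestIdx);
    }
    return assignments;
}
```

Because each row is assigned independently of every other row, a very large data set can be processed in chunks against the same centroids.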
Excellent, I'll try that - thanks very much.
