Intel® oneAPI Data Analytics Library
Community support for building compute-intensive applications that run fast on Intel® architecture.

K-means empty action



The MATLAB version of the K-means algorithm has a very useful flag that sets the action to take if a cluster loses all of its member observations during the optimization. MATLAB offers three possibilities: (1) treat the empty cluster as an error, (2) remove any clusters that become empty, or (3) create a new cluster consisting of the one point furthest from its centroid.

Does anyone know what happens in DAAL K-means in that case? I could not find anything about this in the documentation.

Thanks a lot!


1 Reply

It seems that the DAAL documentation was updated (or I missed this the first time). Anyway, here is what it says:

In some cases, if no vectors are assigned to some clusters on a particular iteration, the iteration produces an empty cluster. It may occur due to bad initialization of centroids or the dataset structure. In this case, the algorithm uses the following strategy to replace the empty cluster centers and decrease the value of the overall goal function.

Feature vectors, most distant from their assigned centroids, are selected as the new cluster centers. Information about these vectors is gathered automatically during the algorithm execution.
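To make the quoted strategy concrete, here is a minimal pure-Python sketch of one centroid update step that replaces empty clusters with the points farthest from their assigned centroids. This is only an illustration of the idea as I read it, not DAAL's actual implementation: the function names `assign` and `update_centroids` are made up for this example, and details such as whether the chosen point still counts toward its old cluster's mean are my assumptions.

```python
import math


def assign(points, centroids):
    """Return (labels, dists): index of and distance to the nearest centroid."""
    labels, dists = [], []
    for p in points:
        d = [math.dist(p, c) for c in centroids]
        j = min(range(len(centroids)), key=d.__getitem__)
        labels.append(j)
        dists.append(d[j])
    return labels, dists


def update_centroids(points, centroids):
    """One update step; empty clusters are re-seeded (hypothetical helper)."""
    k = len(centroids)
    labels, dists = assign(points, centroids)

    # Group the points by their assigned cluster.
    members = [[] for _ in range(k)]
    for p, j in zip(points, labels):
        members[j].append(p)

    # Point indices ordered from farthest to nearest to their own centroid.
    farthest = iter(sorted(range(len(points)),
                           key=lambda i: dists[i], reverse=True))

    new_centroids = []
    for j in range(k):
        if members[j]:
            # Non-empty cluster: the new center is the mean of its members.
            n = len(members[j])
            dim = len(members[j][0])
            new_centroids.append(
                tuple(sum(p[t] for p in members[j]) / n for t in range(dim)))
        else:
            # Empty cluster: take the next most distant point as the center.
            # Assumption: that point still contributes to its old cluster's
            # mean in this step; the next iteration will reassign it.
            new_centroids.append(tuple(points[next(farthest)]))
    return new_centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
# The third starting centroid is so far away that no point is assigned to it.
print(update_centroids(points, [(0, 0), (10, 10), (1000, 1000)]))
```

In this toy run the third cluster starts empty, so it is re-seeded at (50, 50), the point farthest from its assigned centroid. This resembles option 3 in MATLAB, except that several empty clusters can be filled in the same step.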

The answer is on this page: