Hi all,
I am using the k-means algorithm from DAAL and noticed what I think is a bug.
My data consists of 150 observations with 2 features. I want to group them into 3 clusters.
When I use the deterministicDense initialisation, the algorithm uses the first 3 observations as initial centroids. However, in my particular case, observations #2 and #3 are identical, which yields two identical initial centroids. In that case, k-means fails to converge to three clusters: one of the clusters is empty, and its centroid falls far outside the range of the input data. The algorithm behaves as if I had configured it for 2 clusters.
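To illustrate what I think is going on, here is a minimal pure-Python sketch of the first assignment step only. It is not DAAL code: the toy data, the function names (`nearest`, `cluster_sizes`, `duplicated_rows`) and the tie-breaking rule (ties go to the lowest centroid index) are all my own assumptions. It does not model what DAAL does with the empty cluster afterwards.

```python
def nearest(point, centroids):
    """Index of the closest centroid (squared Euclidean distance).

    Assumption: ties are broken in favour of the lowest index.
    """
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def cluster_sizes(points, centroids):
    """Number of points assigned to each centroid in one assignment step."""
    sizes = [0] * len(centroids)
    for p in points:
        sizes[nearest(p, centroids)] += 1
    return sizes


def duplicated_rows(rows):
    """Indices of rows that repeat an earlier row."""
    seen = set()
    dups = []
    for i, row in enumerate(rows):
        key = tuple(row)
        if key in seen:
            dups.append(i)
        seen.add(key)
    return dups


# Hypothetical toy data: observations #2 and #3 are identical.
points = [
    (0.0, 0.0), (1.0, 1.0), (1.0, 1.0),
    (5.0, 5.0), (5.0, 6.0), (9.0, 0.0), (9.0, 1.0),
]

# Mimic "take the first 3 observations as initial centroids".
init = points[:3]

sizes = cluster_sizes(points, init)
print(sizes)                  # [1, 6, 0]: the third cluster receives no points
print(duplicated_rows(init))  # [2]: the third initial centroid repeats the second
```

Under a consistent tie-breaking rule, every point that is closest to the duplicated centroid goes to the same copy, so the other copy starts with an empty cluster. A check like `duplicated_rows` on the first few observations would at least detect the situation before the algorithm is run.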
The problem does not appear with the other initialisation methods, because they are very unlikely to yield repeated initial centroids.
I do not know whether this is a bug or a feature, but I think it should at least be documented.
Best regards,
Tim
Do you see the same behavior with the latest version of DAAL (2018 u1)?
I think so: it is the version of DAAL that comes with compilers_and_libraries_2018.1.156.
Tim