I would like to know whether DBSCAN clustering in sklearn from the Intel Distribution for Python (IDP) is accelerated. I tried both regular Python and IDP for clustering 4,000,000 samples x 3 dimensions.
normal python:
(new) Kritiks-MacBook-Air:clustering kritiksoman$ python clustering.py
The job took 21.877681970596313 seconds to complete
(new) Kritiks-MacBook-Air:clustering kritiksoman$ source deactivate new
idp:
Kritiks-MacBook-Air:clustering kritiksoman$ source activate gdal_env
(gdal_env) Kritiks-MacBook-Air:clustering kritiksoman$ python clustering.py
The job took 23.75513982772827 seconds to complete
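The clustering.py script itself was not posted in this thread. A comparable timing script might look roughly like the sketch below, which times one DBSCAN fit on random 3D points. The helper names `time_job` and `run_dbscan`, the random data, and the `eps` and `min_samples` values are all assumptions for illustration, not the original code. The default sample count is deliberately small; the run above used 4,000,000 x 3.

```python
import time


def time_job(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def run_dbscan(n_samples=100_000, eps=0.02, min_samples=5, seed=0):
    """Cluster random 3D points with scikit-learn's DBSCAN (hypothetical setup).

    The data and parameters are placeholders, not the DEM data or the
    settings used in the original post.
    """
    # Imported here so the timing helper can be used without scikit-learn.
    import numpy as np
    from sklearn.cluster import DBSCAN

    rng = np.random.default_rng(seed)
    X = rng.random((n_samples, 3))  # shape: (samples, dimensions)
    model = DBSCAN(eps=eps, min_samples=min_samples).fit(X)
    return model.labels_  # label -1 marks noise points
```

Running `labels, elapsed = time_job(run_dbscan)` and then `print(f"The job took {elapsed} seconds to complete")` in each environment would give output in the same form as the runs shown above.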
I am using a 2013 MacBook Air (Intel i5, 4 GB RAM).
Please help.
Thanks
Hello,
No, dbscan clustering is not accelerated. We can discuss your request with the Intel DAAL team to prioritize this functionality for the Intel DAAL product. Please help us to understand the priority. So far this is the only request that we have for dbscan.
Thank you,
Sergey Maidanov
Hi
I am interested in clustering Digital Elevation Model (DEM) data, which has 3 dimensions. DBSCAN is very efficient for this application and runs in 10-20 seconds for 10^6 samples. But as soon as I move from prototyping my algorithm to large-scale deployment, the clustering becomes so slow that even on a server with 8 cores it takes many hours, which defeats the purpose of my algorithm. I want to cluster elevation data for large areas, which means around 10^8 to 10^12 samples in 3 dimensions. I was hoping IDP would help, as its stated performance benchmarks for sklearn are far better than those of regular Python.
In terms of priority, this clustering algorithm has many applications, including use cases involving depth sensing, 3D point clouds, etc. I cannot use k-means because the number of clusters in the data is not known a priori. Please advise whether any workaround for this issue exists, or whether any other accelerated clustering algorithm is available for 3D spatial data.
Thanks
Kritik
Hi Kritik,
Thank you for your inputs. These are useful! I will bring these inputs to the DAAL team for discussion. We will see if we can prioritize this work soon.
Thank you,
Sergey