Intel® DAAL provides algorithm implementations for distributed computing.
All data management objects support Java serialization and can be elements of Apache Spark RDD collections.
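For example, a DAAL numeric table can be packed for Java serialization and distributed as an RDD element. The following is a minimal sketch; the HomogenNumericTable constructor and the pack()/unpack() calls follow the DAAL Java API as used in the shipped samples, so treat the exact signatures as assumptions to verify against the documentation.

```java
import java.util.Arrays;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import com.intel.daal.data_management.data.HomogenNumericTable;
import com.intel.daal.services.DaalContext;

public class NumericTableRddSketch {
    public static JavaRDD<HomogenNumericTable> makeRdd(JavaSparkContext sc) {
        DaalContext context = new DaalContext();

        // Two small 2x2 data blocks (row-major) for illustration.
        HomogenNumericTable block1 =
            new HomogenNumericTable(context, new double[] {1.0, 2.0, 3.0, 4.0}, 2, 2);
        HomogenNumericTable block2 =
            new HomogenNumericTable(context, new double[] {5.0, 6.0, 7.0, 8.0}, 2, 2);

        // pack() copies the native data into the Java object so the table
        // survives Java serialization when Spark ships it to executors.
        block1.pack();
        block2.pack();

        // On an executor, call table.unpack(localDaalContext) before use.
        return sc.parallelize(Arrays.asList(block1, block2));
    }
}
```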
Algorithms are divided into two or more steps that correspond to general map/reduce computation tasks (or, in more complex cases, to a sequence of map/reduce tasks).
The simpler scheme can be described as follows (a code sketch is given after this list):
- You map over the available portions of data stored in the RDD:
  - call the computation of the first step of the chosen algorithm for each element of the RDD and obtain a PartialResult object;
  - return the PartialResult to the resulting RDD.
- You call collect() on the PartialResults RDD.
- You call the computation of the second step of the chosen algorithm and obtain a Result object that contains the algorithm results.
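Here is a minimal sketch of this two-step flow for PCA with the correlation method, modeled on the shipped Spark samples. The class and identifier names (DistributedStep1Local, DistributedStep2Master, InputId.data, MasterInputId.partialResults) follow the DAAL Java API; please verify them against the programming guide for your DAAL version.

```java
import java.util.List;

import org.apache.spark.api.java.JavaRDD;

import com.intel.daal.algorithms.pca.*;
import com.intel.daal.data_management.data.HomogenNumericTable;
import com.intel.daal.services.DaalContext;

public class PcaOnSparkSketch {
    public static Result computePca(JavaRDD<HomogenNumericTable> dataRDD) {
        // Step 1 (map): compute a PartialResult on each data block.
        JavaRDD<PartialResult> partsRDD = dataRDD.map(table -> {
            DaalContext localContext = new DaalContext();
            table.unpack(localContext); // restore native data after deserialization

            DistributedStep1Local step1 =
                new DistributedStep1Local(localContext, Double.class, Method.correlationDense);
            step1.input.set(InputId.data, table);

            PartialResult pres = step1.compute();
            pres.pack();                // make the native result Java-serializable
            localContext.dispose();
            return pres;
        });

        // Gather all PartialResults on the driver.
        List<PartialResult> parts = partsRDD.collect();

        // Step 2 (reduce): combine the partial results into the final Result.
        DaalContext context = new DaalContext();
        DistributedStep2Master step2 =
            new DistributedStep2Master(context, Double.class, Method.correlationDense);
        for (PartialResult pres : parts) {
            pres.unpack(context);
            step2.input.add(MasterInputId.partialResults, pres);
        }
        step2.compute();
        return step2.finalizeCompute();
    }
}
```

The returned Result exposes the computed eigenvalues and eigenvectors (via ResultId), analogous to the batch PCA interface.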
More details on each particular algorithm can be found in the programming guide.
The package is supplied with four Spark samples that implement this scheme for PCA (correlation and SVD methods) and for QR and SVD decompositions. We will extend the number of samples in future updates. Let us know if you are interested in specific algorithms.