
Al-Heji__Ali

Beginner

01-12-2019
11:50 AM

234 Views

How to Utilize DAAL KNN to Sort Feature Vectors by Distance?

Hello DAALers,

I have written a simple serial implementation of KNN to calculate the distance from vector "a" for every vector in matrix "b", where

"a" is 1 by n

"b" is m by n

The distance is recorded for every i'th vector in "b" in result[i].distance. At the end, the data in the struct array are sorted by distance.

I have been trying to map my inputs and outputs to DAAL's KD-Tree KNN, but no luck so far. I seem to be having difficulty passing "a" and "b" in the data frame format expected by the function. Also, the example that comes with DAAL only shows how to train the model and then invoke prediction on the testing data; it is not clear how to retrieve the distances and indices from the model. I would highly appreciate the help, as the KNN function is the performance bottleneck in my program.

struct knn_output
{
    int index;
    double distance;
};

void knn_serial(double* a, double* b, int m, int n, knn_output* result)
{
    //Norm level used for distance calculation
    double L = 2;
    for (int i = 0; i < m; i++)
    {
        result[i].distance = 0;
        result[i].index = i;
        for (int j = 0; j < n; j++)
        {
            result[i].distance = result[i].distance + pow(abs(a…

int compare_knn(const void* a, const void* b)
{
    knn_output* a_knn = (knn_output*)a;
    knn_output* b_knn = (knn_output*)b;
    if (a_knn->distance < b_knn->distance) return -1;
    else if (a_knn->distance > b_knn->distance) return 1;
    else return 0;
}

Best regards,

Ali
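Since the posted knn_serial is cut off mid-line, here is a minimal, self-contained sketch of what the description above seems to compute. It is not the original code. It assumes that "b" is stored row-major (row i starts at b + i * n), that the distance is the L-norm of the difference with a final 1/L root, and that the result is returned as a std::vector sorted with std::sort instead of qsort.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct knn_output
{
    int index;
    double distance;
};

// Assumption: "b" is m by n in row-major order; "a" has n entries.
std::vector<knn_output> knn_serial(const double* a, const double* b,
                                   int m, int n, double L = 2.0)
{
    std::vector<knn_output> result(m);
    for (int i = 0; i < m; i++)
    {
        const double* row = b + static_cast<std::size_t>(i) * n;
        double sum = 0.0;
        for (int j = 0; j < n; j++)
        {
            // std::fabs avoids accidentally picking the integer abs()
            sum += std::pow(std::fabs(a[j] - row[j]), L);
        }
        result[i].index = i;
        result[i].distance = std::pow(sum, 1.0 / L);
    }
    // Sort ascending by distance, nearest vector first
    std::sort(result.begin(), result.end(),
              [](const knn_output& x, const knn_output& y)
              { return x.distance < y.distance; });
    return result;
}
```

After the call, result[0].index is the row of "b" closest to "a", and result[0].distance is its distance.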

8 Replies

Mikhail_A_Intel

Employee

01-14-2019
02:33 AM

Hello.

Intel® Data Analytics Acceleration Library does not provide indices and distances to the user at the moment.

Could you please provide some additional details about your use-case and the reason to sort feature vectors?

Al-Heji__Ali

Beginner

01-14-2019
04:44 AM

Given an out-of-sample feature vector "a", I attempt to find the k-nearest (sorted by distance) feature vectors in the in-sample feature matrix "b". Then, by applying a statistical learning algorithm, I can learn the relationship between the k-nearest feature vectors and their corresponding labels to predict the label of "a". Looking forward to seeing updates in the future. Also, please let me know if there is a current workaround for this. DAAL KD-Tree KNN must internally calculate the distances and sort the feature vectors to produce the final prediction.

Thanks,

Ali
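If only the k nearest vectors are needed, a full sort of all m distances is more work than necessary. The sketch below is one possible way to do that outside DAAL, not a DAAL API. It assumes row-major "b" and plain Euclidean distance (L = 2), so it can compare squared distances and skip pow entirely. The names neighbor and k_nearest are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct neighbor            // hypothetical name, mirrors knn_output
{
    int index;
    double distance;       // squared Euclidean distance
};

// Returns the k rows of "b" closest to "a", nearest first.
// Assumption: "b" is m by n in row-major order and k >= 0.
std::vector<neighbor> k_nearest(const double* a, const double* b,
                                int m, int n, int k)
{
    std::vector<neighbor> all(m);
    for (int i = 0; i < m; i++)
    {
        const double* row = b + static_cast<std::size_t>(i) * n;
        double sum = 0.0;
        for (int j = 0; j < n; j++)
        {
            const double d = a[j] - row[j];
            sum += d * d;  // no sqrt: ordering is unchanged
        }
        all[i] = {i, sum};
    }
    k = std::min(k, m);
    // Only the first k entries end up sorted; the rest are left unordered
    std::partial_sort(all.begin(), all.begin() + k, all.end(),
                      [](const neighbor& x, const neighbor& y)
                      { return x.distance < y.distance; });
    all.resize(k);
    return all;
}
```

std::partial_sort does roughly m log k comparisons instead of m log m, which may help when k is much smaller than m. Take the square root of distance afterwards if the true Euclidean distance is needed.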

Mikhail_A_Intel

Employee

01-14-2019
05:09 AM

So, is the k-nearest neighbors search one step of a larger algorithm that predicts the label of "a" with a different approach from standard k-nearest neighbors classification? Or do you just want to use a specific approach to estimate the label from the distances (such as weighting the neighbors' classes by distance)?
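To make the "weight classes of neighbors based on distances" option concrete, here is a small sketch of inverse-distance voting. It is only an illustration of the general idea, not something DAAL exposes. It assumes the neighbor indices and distances are already available from a separate search, that labels are integer class ids, and that the weight is 1 / (distance + eps). The names neighbor and weighted_vote are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <vector>

struct neighbor            // hypothetical: one of the k nearest rows of "b"
{
    int index;
    double distance;
};

// labels[i] is the class id of row i of "b" (assumed to be integers).
// Returns the class with the largest summed weight, or -1 if there
// are no neighbors.
int weighted_vote(const std::vector<neighbor>& nearest,
                  const std::vector<int>& labels)
{
    const double eps = 1e-12;  // guards against division by zero
    std::map<int, double> score;
    for (const neighbor& nb : nearest)
    {
        score[labels[nb.index]] += 1.0 / (nb.distance + eps);
    }

    int best_label = -1;
    double best_score = -1.0;
    for (const auto& entry : score)
    {
        if (entry.second > best_score)
        {
            best_score = entry.second;
            best_label = entry.first;
        }
    }
    return best_label;
}
```

With this weighting, one very close neighbor can outvote several distant ones, which is the main difference from a plain majority vote.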

Al-Heji__Ali

Beginner

01-14-2019
05:19 AM

Yes, exactly. I use the KNN as a setup step for my own statistical learning algorithm, to identify the indices (row positions) of the k-nearest feature vectors in "b". I only apply learning on the k-nearest "b" feature vectors and their corresponding labels, then I apply the learned model to the out-of-sample feature vector "a" to predict its "unknown" label. Sorry if I'm repeating myself. Please let me know if we can still do this using current Intel libraries.

Thanks,

Ali
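For readers following the workflow described here, this is a rough sketch of the step between the neighbor search and the local learning: copying the k selected rows of "b" and their labels into a small training set. It assumes row-major "b", numeric labels stored as double, and neighbor indices coming from a separate search. local_set and gather_neighbors are hypothetical names, and the learning step itself is left out.

```cpp
#include <cstddef>
#include <vector>

struct local_set                    // hypothetical container
{
    std::vector<double> features;   // k by n, row-major
    std::vector<double> labels;     // k entries
};

// Copies the rows of "b" listed in nearest_indices, in that order,
// together with their labels. Assumption: every index is a valid row.
local_set gather_neighbors(const double* b, const double* labels, int n,
                           const std::vector<int>& nearest_indices)
{
    local_set out;
    out.features.reserve(nearest_indices.size() * static_cast<std::size_t>(n));
    out.labels.reserve(nearest_indices.size());
    for (int idx : nearest_indices)
    {
        const double* row = b + static_cast<std::size_t>(idx) * n;
        out.features.insert(out.features.end(), row, row + n);
        out.labels.push_back(labels[idx]);
    }
    return out;
}
```

The returned features and labels can then be passed to whatever model is trained on the neighborhood before predicting the label of "a".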

Al-Heji__Ali

Beginner

01-15-2019
11:04 PM

Hi Mikhail,

Please let me know if there's any workaround to retrieve indices and distances from KD-Tree KNN. If not, I would appreciate it if you can pass my request to the DAAL development team. Either way, please let me know.

Many thanks,

Ali

Mikhail_A_Intel

Employee

01-16-2019
12:18 AM

Hi,

Thank you for the request and for providing details about your use case. At the moment, the only option is to take the open source DAAL (https://github.com/intel/daal) and manually add an interface that exposes the data you need. Please note that the distances are calculated internally in the algorithm, as you mentioned. The DAAL team will take your request into account when expanding the API in future releases.

Al-Heji__Ali

Beginner

01-16-2019
03:06 AM

Thank you and looking forward to the updates.

Al-Heji__Ali

Beginner

01-23-2019
04:06 AM

[…].idx". They can be found in "algorithms\kernel\k_nearest_neighbors\kdtree_knn_impl.i" (file is attached). However, I feel challenged when it comes to understanding how to call "indexes" after executing the model.
