Intel® oneAPI DPC++/C++ Compiler

SYCL kernel cannot call an undefined function without SYCL_EXTERNAL attribute

amaltaha
New Contributor I
I am trying to calculate the Euclidean distance for KNN in parallel using DPC++. The training dataset contains 5 features and 1600 rows, and I want to calculate the distance between the current test point and each training point on the grid in parallel, but I keep getting an error regarding the SYCL kernel. Code for the function:
std::vector<double> distance_calculation_FPGA(queue& q, const std::vector<std::vector<double>>& dataset, const std::vector<double>& curr_test) {
    range<1> num_items{ dataset.size() };
    std::vector<double> res;

    res.resize(dataset.size());
    buffer dataset_buf(dataset);
    buffer curr_test_buf(curr_test);
    buffer res_buf(res.data(), num_items);

    q.submit([&](handler& h) {
        accessor a(dataset_buf, h, read_only);
        accessor b(curr_test_buf, h, read_only);

        accessor dif(res_buf, h, write_only, no_init);

        h.parallel_for(num_items, [=](auto i) {
            for (int j = 0; j < (const int) a[i].size(); ++j) {
                dif[i] += (a[i][j] - b[j]) * (a[i][j] - b[j]);
            }
        });
    });
    for (int i = 0; i < res.size(); ++i) {
        std::cout << res[i] << std::endl;
    }
    // old distance calculation (serial)
    // for (int i = 0; i < dataset.size(); ++i) {
    //     double dis = 0;
    //     for (int j = 0; j < dataset[i].size(); ++j) {
    //         dis += (curr_test[j] - dataset[i][j]) * (curr_test[j] - dataset[i][j]);
    //     }
    //     res.push_back(dis);
    // }

    return res;
}

The errors I am receiving:

SYCL kernel cannot call a variadic function

SYCL kernel cannot call an undefined function without SYCL_EXTERNAL attribute

Would be extremely grateful for any help!

Thanks

1 Solution
SantoshY_Intel
Moderator

Hi,

 

We tried running your code with dummy 'dataset' and 'curr_test' variables, and the program ran successfully, as shown in the screenshot below.

[Screenshot: SantoshY_Intel_0-1652344669271.png]

Please refer to the complete code below.

 

#include <CL/sycl.hpp>
#include <iostream>
#include <vector>
using namespace sycl;

std::vector<double> distance_calculation_FPGA(queue& q,
                                              const std::vector<std::vector<double>>& dataset,
                                              const std::vector<double>& curr_test)
{
    range<1> num_items{ dataset.size() };
    std::vector<double> res(dataset.size());

    {   // The buffers live in this scope; res is updated when res_buf is destroyed.
        buffer dataset_buf(dataset);
        buffer curr_test_buf(curr_test);
        buffer res_buf(res.data(), num_items);

        q.submit([&](handler& h) {
            accessor a(dataset_buf, h, read_only);
            accessor b(curr_test_buf, h, read_only);
            accessor dif(res_buf, h, write_only, no_init);

            h.parallel_for(num_items, [=](auto i) {
                // Accumulate locally: dif is write_only with no_init, so it must not be read.
                double sum = 0.0;
                for (int j = 0; j < static_cast<int>(a[i].size()); ++j) {
                    // sum += (a[i][j] - b[j]) * (a[i][j] - b[j]);
                    sum += a[i][j];
                }
                dif[i] = sum;
            });
        });
        q.wait(); // We have added this line of code for synchronization.
    }

    // Print as double so the values are not truncated to int.
    for (double v : res) {
        std::cout << v << std::endl;
    }
    return res;
}


int main() {
    // Dummy data: 5 rows of 1600 values each.
    std::vector<std::vector<double>> dataset;
    for (int i = 0; i < 5; i++) {
        std::vector<double> d;
        for (int j = 0; j < 1600; j++) {
            d.push_back((double)j);
        }
        dataset.push_back(d);
    }

    std::vector<double> curr_test;
    for (int i = 0; i < 1600; i++) {
        curr_test.push_back((double)i);
    }

    queue q;
    // Print the device name as a test to check the parallelisation.
    std::cout << "Running on "
              << q.get_device().get_info<sycl::info::device::name>() << std::endl;

    distance_calculation_FPGA(q, dataset, curr_test);

    return 0;
}

 

 

Thanks & Regards,

Santosh

 

5 Replies
SantoshY_Intel
Moderator

Hi,

 

Thank you for posting in Intel Communities.

 

We assume that your scenario is the following:

You are trying to call the distance_calculation_FPGA() function inside a SYCL kernel and print its return value.

To resolve your issue, you can use the workaround below:

1. Create a header file (for example, Header.h) and declare the function as "SYCL_EXTERNAL".

 

#include <CL/sycl.hpp>
#include <vector>
extern SYCL_EXTERNAL std::vector<double> distance_calculation_FPGA(sycl::queue& q, const std::vector<std::vector<double>>& dataset, const std::vector<double>& curr_test);

 

2. Now include the header file in your source code.

 

#include <CL/sycl.hpp>
#include "Header.h"
std::vector<double> distance_calculation_FPGA(sycl::queue& q, const std::vector<std::vector<double>>& dataset, const std::vector<double>& curr_test) {
    // Function body as in your original code.
}

 

3. Try to recompile and run your program.

If this resolves your issue, make sure to accept this as a solution. This would help others with a similar issue.

 

If you still face any issues, please provide us with the complete sample reproducer code and the steps to reproduce your issue on our end.

 

Thanks & Regards,

Santosh

 

 

amaltaha
New Contributor I

I am not calling the function inside the kernel. The problem actually lies in the way I am doing the parallelism in parallel_for.

It works normally if I do this:

 

h.parallel_for(num_items, [=](auto i) {
  
        for (int j = 0; j <(const int) a[i].size(); ++j) {
            dif[i] += a[i];
        }
        });

 

But when I try to take the values in the second dimension like this:

h.parallel_for(num_items, [=](auto i) {
  
        for (int j = 0; j <(const int) a[i].size(); ++j) {
            dif[i] += a[i][j] ;
        }
        });

it gives an error.

Please see the previous full code; I don't know if I defined the buffers and accessors correctly.

I tried the solution you provided, but it gave me even more errors than before, so unfortunately it is not working.
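
On whether the buffers are defined correctly: buffer dataset_buf(dataset) over a std::vector<std::vector<double>> has std::vector<double> as its element type. Each inner vector keeps its data behind a pointer to host heap memory and is not trivially copyable, whereas SYCL 2020 expects buffer element types to be device copyable (trivially copyable types qualify by default). Indexing a[i][j] inside the kernel also calls std::vector::operator[] in device code; with a standard library that adds checks to that operator, this could plausibly produce the two diagnostics quoted in the question, although nothing in this thread confirms that. The sketch below is plain C++ (no SYCL needed) and only checks the trivially-copyable property. Row and is_plain_data_v are hypothetical names, and the row width of 5 is an assumption taken from the five features mentioned in the question.

```cpp
#include <array>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size row type, assuming the five features from the question.
using Row = std::array<double, 5>;

// Hypothetical helper: true when T can be copied byte-for-byte.
template <typename T>
constexpr bool is_plain_data_v = std::is_trivially_copyable_v<T>;

// std::vector owns heap storage through a pointer and has a user-provided
// copy constructor and destructor, so it is not trivially copyable.
static_assert(!is_plain_data_v<std::vector<double>>,
              "std::vector<double> is not trivially copyable");

// A plain double and a fixed-size array of doubles are trivially copyable.
static_assert(is_plain_data_v<double>, "double is trivially copyable");
static_assert(is_plain_data_v<Row>, "std::array<double, 5> is trivially copyable");
```

A buffer whose elements are plain double values (or fixed-size rows such as Row) avoids both the pointer indirection and the std::vector member calls inside the kernel.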

amaltaha
New Contributor I

Hello,
Thank you so much for the support. I am using Microsoft Visual Studio 2022 to run the oneAPI compiler for DPC++. I tried to run your code in it, but it gave me the exact same error. When I ran it on the cloud, as in the screenshot, it worked normally. I am very glad I was able to run it on the cloud, thank you so much.
I solved the problem in Visual Studio by using a one-dimensional vector<double> instead of vector<vector<double>> and accessing the elements I wanted with x + y * col.
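
A minimal sketch of the flattening described above, in plain C++ so it can be tried without a SYCL device. It assumes row-major storage, so the element in row y and column x sits at x + y * cols. flatten and squared_distances are hypothetical helper names, not code from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy a rectangular vector-of-vectors into one contiguous row-major vector.
// `flatten` is a hypothetical helper name.
std::vector<double> flatten(const std::vector<std::vector<double>>& dataset,
                            std::size_t cols) {
    std::vector<double> flat;
    flat.reserve(dataset.size() * cols);
    for (const auto& row : dataset) {
        assert(row.size() == cols);  // every row must have the same width
        flat.insert(flat.end(), row.begin(), row.end());
    }
    return flat;
}

// Squared Euclidean distance from `test` to every row of the flattened data.
// The body of the outer loop is the per-row work that one parallel_for
// work-item would do, with `row` taking the place of the work-item index.
std::vector<double> squared_distances(const std::vector<double>& flat,
                                      std::size_t cols,
                                      const std::vector<double>& test) {
    assert(cols > 0 && test.size() == cols && flat.size() % cols == 0);
    const std::size_t rows = flat.size() / cols;
    std::vector<double> result(rows);
    for (std::size_t row = 0; row < rows; ++row) {
        double sum = 0.0;  // accumulate locally, then store once
        for (std::size_t col = 0; col < cols; ++col) {
            const double d = flat[row * cols + col] - test[col];
            sum += d * d;
        }
        result[row] = sum;
    }
    return result;
}
```

Calling squared_distances(flatten(dataset, 5), 5, curr_test) on the 1600 x 5 dataset from the question yields one squared distance per training row. In a SYCL version, the flat vector would back a one-dimensional buffer of double, and the body of the outer loop would become the parallel_for kernel using the same index expression.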

I really appreciate the very professional efforts and support from you.

Thank you! 

SantoshY_Intel
Moderator

Hi,


Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Thanks & Regards,

Santosh


