Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™
660 Discussions

DPC++ project built for FPGA emulation takes a very long time to run on CPU

student4
Beginner
859 Views

Hi,

I have a project developed in standard C++ and another project with the same function developed using DPC++ libraries and built in FPGA emulation mode. The standard C++ version runs fine when used with my test project, but the DPC++ version takes a very long time to produce results when used in my test program. Should I disable or change anything in the project settings to make it run faster?

I appreciate your help.

Thank you.

6 Replies
BoonBengT_Intel
Moderator
819 Views

Hi @student4,

Thank you for posting in the Intel community forum and for your interest in oneAPI; hope all is well.
May I ask which libraries are used in the DPC++ code? Would it be convenient for you to share the mentioned test project, which contains both the C++ and DPC++ code?

Also, are you using the Intel DevCloud to run the emulation?
Hope to hear from you soon.

Best Wishes
BB

student4
Beginner
802 Views

Hi BB,

Thank you for your reply.

I am using Microsoft Visual Studio 2019 to run the emulation.

I am not able to attach the file, so please refer to the test project implementation below:

#include <iostream>
#include <numeric>
#include <chrono>
#include <iomanip>
#include <complex>
#include <array>
#include <vector>
#include <fstream>

// Define exactly one of the two macros to select the implementation under test.
//#define standard_cpp
#define oneApi_FPGA

#ifdef standard_cpp
#include "../Standard_CPP/standardCPPFile.h"
#endif

#ifdef oneApi_FPGA
#include "../Oneapi_FPGA/DPCPPFile.h"
#endif
using namespace std;

void Test()
{
    // Each macro instantiates the class from the header it includes above.
#ifdef standard_cpp
    standardCPPFile obj;
#endif
#ifdef oneApi_FPGA
    DPCPPFile obj;
#endif
    for (int i = 0; i < 999000; i++)
    {
        obj.func();
    }
}

int main()
{
    Test();
    cout << "success" << endl;

    return 0;
}

The test project calls either the standard C++ or the DPC++ version of the function func, depending on which macro is defined.

This is a sample implementation of func in DPC++.

void DPCPPFile::func(int v)
{
    std::vector<std::complex<float>> Atemp(7);
    std::vector<std::complex<float>> Btemp(7);
    std::vector<std::vector<std::complex<float>>> Ztemp(7);
    // std::cout << "find best symbol index is" << v << std::endl;

    cl::sycl::ext::intel::fpga_emulator_selector d_selector;
    // queue declaration
    cl::sycl::queue Q(d_selector);
    sycl::buffer AtempBuff(Atemp);
    sycl::buffer BtempBuff(Btemp);
    sycl::buffer ZtempBuff(Ztemp);

    Q.submit([&](sycl::handler& h)
    {
        sycl::accessor AtempAccess(AtempBuff, h, sycl::write_only);
        sycl::accessor BtempAccess(BtempBuff, h, sycl::read_only);
        sycl::accessor ZtempAccess(ZtempBuff, h, sycl::read_only);

        h.parallel_for(sycl::range<1>(Nr), [=](auto idx)
        {
            AtempAccess[idx] = std::conj(BtempAccess[v - 1]) * ZtempAccess[v][idx];
        });
    });

    sycl::host_accessor AtempHost(AtempBuff);
}

Below is the same implementation in standard C++:

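void standardCPPFile::func(int v, int Nr)
{
    std::vector<std::complex<float>> Atemp(7);
    std::vector<std::complex<float>> Btemp(7);
    // Ztemp is two-dimensional, as in the DPC++ version, so that Ztemp[v][i] is valid.
    std::vector<std::vector<std::complex<float>>> Ztemp(7, std::vector<std::complex<float>>(7));
    // std::cout << "find best symbol index is" << v << std::endl;
    // Index with the loop variable i; indexing with Nr would read past the last element.
    for (int i = 0; i < Nr; i++)
    {
        Atemp[i] += std::conj(Btemp[v - 1]) * Ztemp[v][i];
    }
}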
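
For reference, the per-element computation in func can be sketched as a small self-contained function in plain C++. The name conj_scale_row, the cfloat alias, and the explicit B, Z, v and Nr parameters are assumptions introduced only for illustration; they are not part of the original project.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical helper: returns A where A[i] = conj(B[v - 1]) * Z[v][i]
// for every i in [0, Nr).
std::vector<cfloat> conj_scale_row(const std::vector<cfloat>& B,
                                   const std::vector<std::vector<cfloat>>& Z,
                                   std::size_t v, std::size_t Nr)
{
    // Guard the index ranges that the loop below relies on.
    assert(v >= 1 && v < Z.size() && v - 1 < B.size() && Nr <= Z[v].size());

    std::vector<cfloat> A(Nr);
    const cfloat scale = std::conj(B[v - 1]);  // same factor for every element
    for (std::size_t i = 0; i < Nr; ++i)
    {
        A[i] = scale * Z[v][i];
    }
    return A;
}
```

A function like this could serve as a host-side reference when checking that the emulated kernel produces the same values.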

Thank you.

BoonBengT_Intel
Moderator
737 Views

Hi @student4,

I would recommend looking into the basic structure of how to target devices and offload work to them in DPC++.

Please refer to sample 1 at the link below, which has a detailed explanation and an example.

https://www.intel.com/content/www/us/en/develop/documentation/explore-dpcpp-samples-from-intel/top.html

Hope that clarifies things.


Best Wishes

BB


HRZ
Valued Contributor III
684 Views

Since you are building in emulation mode, it is expected that the code would take a very long time to execute since it is emulating the FPGA hardware in software. If you compile for and run your code on an actual FPGA, then it will be much faster. Emulation mode is just to ensure code correctness; the time it takes for the application to execute in emulation mode does NOT represent the time it would take for the code to run on an actual FPGA.
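
To see how much host time each build spends, one option is to wrap the calls in a small timer. The sketch below is plain C++ using std::chrono; the helper name time_ms is hypothetical and not part of any Intel library. Whatever it reports for an emulation build is host time only and, as noted above, says nothing about performance on FPGA hardware.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `fn` `repeats` times and returns the total
// wall-clock time in milliseconds, measured on the host.
template <typename Fn>
double time_ms(Fn&& fn, int repeats)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i)
    {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, time_ms([&] { obj.func(); }, 1000) would time 1000 calls of the function from the test project, which makes it easy to compare the standard C++ build with the emulation build on a smaller iteration count before running the full loop.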

BoonBengT_Intel
Moderator
664 Views

Hi @student4,


Good day, just checking in to see if there are any further doubts regarding this matter.

Hope we have clarified your doubts.


Best Wishes

BB


BoonBengT_Intel
Moderator
630 Views

Hi @student4,


Greetings. As we have not received any further questions about the information provided, we will assume the issue has been resolved, and this thread will no longer be monitored. For new queries, please feel free to open a new thread and we will be right with you. It was a pleasure having you here.


Best Wishes

BB

