I have a project developed in standard C++ and another project with the same functionality developed using DPC++ libraries and built in FPGA emulation mode. The standard C++ version runs fine when used with my test project, but the DPC++ version takes a very long time to produce results in the same test program. Should I disable or change anything in the project settings to make it run faster?
I appreciate your help.
Thank you for posting in the Intel community forum and for your interest in oneAPI. I hope all is well.
May I ask which libraries are used in the DPC++ code? Would it be convenient for you to share the test project you mentioned, which contains both the C++ and DPC++ code?
Also, are you using the Intel DevCloud to run the emulation?
Hope to hear from you soon.
Thank you for your reply.
I am using Microsoft Visual Studio 2019 to run the emulation.
I am not able to attach the file. Please refer to the test project implementation below:
using namespace std;
for (int i = 0; i < 999000; i++)
cout << "success" << endl;
The test project calls both the standard C++ and the DPC++ versions of the function func.
This is a sample implementation of func in DPC++:
void DPCPPFile::func(int v)
// std::cout << "find best symbol index is" << v << std::endl;
// queue declaration
sycl::accessor AtempAccess(AtempBuff, h, sycl::write_only);
sycl::accessor BtempAccess(BtempBuff, h, sycl::read_only);
sycl::accessor ZtempAccess(ZtempBuff, h, sycl::read_only);
h.parallel_for(sycl::range<1>(Nr), [=](auto idx)
AtempAccess[idx] = std::conj(BtempAccess[v - 1]) * ZtempAccess[v][idx];
Below is the same implementation in standard C++:
void standardCPPFile::func(int v, int Nr)
{
    // std::cout << "find best symbol index is" << v << std::endl;
    // Index with the loop variable i; indexing with Nr would read and write one past the last element.
    for (int i = 0; i < Nr; i++)
        Atemp[i] += std::conj(Btemp[v - 1]) * Ztemp[v][i];
}
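For anyone who wants to run the standard C++ loop in isolation, here is a minimal self-contained sketch. It rests on several assumptions that the post does not state: the element type is taken to be std::complex<float>, the arrays are held in std::vector, they are passed as parameters rather than accessed as class members, and the loop is assumed to index by the loop variable i over the range [0, Nr).

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;  // assumed element type, not stated in the post

// Accumulates conj(Btemp[v - 1]) * Ztemp[v][i] into Atemp[i] for i in [0, Nr).
// The containers are parameters here (an assumption); the post uses class members.
void func(std::vector<cfloat>& Atemp,
          const std::vector<cfloat>& Btemp,
          const std::vector<std::vector<cfloat>>& Ztemp,
          int v, int Nr)
{
    // Guard the index ranges the loop relies on.
    assert(Nr >= 0);
    assert(v >= 1 && static_cast<std::size_t>(v) <= Btemp.size());
    assert(static_cast<std::size_t>(v) < Ztemp.size());
    assert(Atemp.size() >= static_cast<std::size_t>(Nr));
    assert(Ztemp[v].size() >= static_cast<std::size_t>(Nr));

    const cfloat scale = std::conj(Btemp[v - 1]);  // loop-invariant factor
    for (int i = 0; i < Nr; i++)
        Atemp[i] += scale * Ztemp[v][i];
}
```

Note that this version accumulates with +=, as the standard C++ code in the post does, while the DPC++ kernel assigns with =. The two only produce the same values if Atemp starts at zero.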
I would recommend looking into the basic structure of how to target devices and offload work to them in DPC++.
Please refer to the link below for sample 1, which includes a detailed explanation and an example.
I hope that clarifies things.
Since you are building in emulation mode, it is expected that the code would take a very long time to execute since it is emulating the FPGA hardware in software. If you compile for and run your code on an actual FPGA, then it will be much faster. Emulation mode is just to ensure code correctness; the time it takes for the application to execute in emulation mode does NOT represent the time it would take for the code to run on an actual FPGA.
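To see how much time each call costs under the emulator compared with the standard C++ build, a small timing helper like the one below could be wrapped around either version. This is only a sketch: the helper name average_call_time_us is made up for illustration, and, as noted above, the numbers it reports for the emulator say nothing about performance on real FPGA hardware.

```cpp
#include <chrono>
#include <cstddef>

// Calls `f` `calls` times and returns the average wall-clock time per call
// in microseconds. The helper name is hypothetical, chosen for this sketch.
template <typename F>
double average_call_time_us(F&& f, std::size_t calls)
{
    using clock = std::chrono::steady_clock;  // monotonic, suited to interval timing

    const auto start = clock::now();
    for (std::size_t i = 0; i < calls; ++i)
        f();
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;

    // Avoid dividing by zero when no calls were requested.
    return calls == 0 ? 0.0 : elapsed.count() / static_cast<double>(calls);
}
```

A call such as average_call_time_us([&] { obj.func(v); }, 1000), where obj stands for an instance of either class, gives a per-call figure without waiting for all 999000 iterations of the test loop.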
Greetings. As we have not received any further questions about the information provided, we assume the issue has been resolved, so this thread will no longer be monitored. For new queries, please feel free to open a new thread and we will be right with you. It was a pleasure having you here.