Migrating to SYCL

CL_INVALID_OPERATION Error on OneAPI FPGA Emulator

jg_UF
Beginner

Hello,

This issue follows directly from an earlier thread: https://community.intel.com/t5/Intel-oneAPI-Base-Toolkit/Error-running-OneAPI-FPGA-emulator/m-p/1258533/thread-id/1115

We converted CUDA code to DPC++ with the DPCT tool, but are having trouble running it on the FPGA emulator on the DevCloud. The goal is to eventually run this app on an actual FPGA. Attached are the original source code (CUDA and DPCT-migrated) and the Intel-provided fixed code. Please note that the 'USE_GPU' flag must be set to 1 in order to target an accelerator as opposed to the CPU.

When I compile and run the provided fixed source code for the FPGA emulator on Stratix 10 PAC, I get this error:

u75801@s001-n142:~/temp-cmt$ dpcpp -fintelfpga CMT-bone-pca-fix.dp.cpp -DFPGA_EMULATOR=1 -o cmt.out
u75801@s001-n142:~/temp-cmt$ ./cmt.out
Running on device: Intel(R) FPGA Emulation Device
HOST MESSAGE : Memory Allocation took, 0.00121431 seconds
Max work group size: 4100
Native API failed. Native API returns: -59 (CL_INVALID_OPERATION) -59 (CL_INVALID_OPERATION)Exception caught at file:CMT-bone-pca-fix.dp.cpp, line:677
u75801@s001-n142:~/temp-cmt$
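
For readers decoding the message above: -59 is the value the OpenCL headers (CL/cl.h) assign to CL_INVALID_OPERATION, which the SYCL runtime reports here as a "Native API" failure. The lookup below is only a sketch of how such codes can be turned into names when logging. `cl_error_name` is a hypothetical helper, not part of DPC++ or of the attached code, and it deliberately lists only a handful of codes.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not a DPC++ API):
// maps a few OpenCL status codes to the names used in CL/cl.h.
// The list is incomplete on purpose; anything else falls through to "unknown".
std::string cl_error_name(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -30: return "CL_INVALID_VALUE";
        case -52: return "CL_INVALID_KERNEL_ARGS";
        case -54: return "CL_INVALID_WORK_GROUP_SIZE";
        case -59: return "CL_INVALID_OPERATION";
        default:
            return "unknown OpenCL status (" + std::to_string(code) + ")";
    }
}
```

A catch block that logs the exception could print `cl_error_name(-59)`, which yields `CL_INVALID_OPERATION`. The name alone does not identify the root cause.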

 

This error is very similar to one posted in the previous thread (community.intel.com/t5/Intel-oneAPI-Base-Toolkit/Error-running-OneAPI-FPGA-emulator/m-p/1268231#M1252). However, it seems that you had success running the fixed code on the FPGA emulator, FPGA, and GPU.

Any suggestions on what the issue could be?

Thanks

1 Solution
cw_intel
Moderator

Hi,

 

I migrated the CUDA code and made some modifications. It now runs successfully on the FPGA Emulator, CPU, and GPU.

To run on the FPGA Emulator:

$ dpcpp CMT-bone-pca_workarounds.dp.cpp

$ export SYCL_DEVICE_TYPE=ACC

$ SYCL_PI_TRACE=1 ./a.out


SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so
SYCL_PI_TRACE[all]: Selected device ->
SYCL_PI_TRACE[all]: platform: Intel(R) FPGA Emulation Platform for OpenCL(TM)
SYCL_PI_TRACE[all]: device: Intel(R) FPGA Emulation Device
HOST MESSAGE : Memory Allocation took, 0.00134858 seconds
CUDA kernel avg duration: 0.00101494 seconds
CUDA kernel total duration: 0.97434193 seconds
Total kernel iterations: 960
Total time for grid dim 4 and element dim 5 : 0.980248
Cleanup: 0.00037760 seconds
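
The three timing lines in this output are related in a simple way: the average is the total divided by the iteration count (0.97434193 s / 960 ≈ 0.00101494 s). How the application measures this is not shown in the thread. The following is a minimal sketch of one way to accumulate per-iteration timings with std::chrono; `KernelTimer` is a hypothetical name, not taken from the CMT-bone source.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical accumulator (an assumption for illustration, not the
// application's actual timing code): sums per-iteration durations and
// reports the total and the average in seconds.
class KernelTimer {
public:
    using clock = std::chrono::steady_clock;

    // Record one already-measured duration.
    void add(std::chrono::duration<double> d) {
        total_ += d;
        ++count_;
    }

    // Run a callable once and record how long it took.
    template <class F>
    void time(F&& f) {
        const auto start = clock::now();
        f();
        add(clock::now() - start);
    }

    double total_seconds() const { return total_.count(); }
    std::size_t iterations() const { return count_; }

    // Average per iteration; 0 if nothing has been recorded yet.
    double average_seconds() const {
        return count_ ? total_.count() / static_cast<double>(count_) : 0.0;
    }

private:
    std::chrono::duration<double> total_{0.0};
    std::size_t count_ = 0;
};
```

When timing SYCL kernels this way, the timed callable would need to wait for the kernel to finish (for example with `queue::wait()`), since kernel submission is asynchronous.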

5 Replies
SantoshY_Intel
Moderator

Hi,

 

Thanks for reaching out to us.

Could you please confirm which version of DPC++ you used by running the command below?

dpcpp --version

We successfully ran the code on GPU after compiling it with the command below:

dpcpp filename.cpp -o executable

Refer to the screenshot below.

MicrosoftTeams-image (6).png

Regarding the FPGA emulator error, we will get back to you soon.

 

Thanks & Regards,

Santosh

jg_UF
Beginner

Hi Santosh,

 

The DPC++ version is

Intel(R) oneAPI DPC++/C++ Compiler 2021.3.0 (2021.3.0.20210619)

 

Thanks

SantoshY_Intel
Moderator

Hi,


We are working on your issue and we will get back to you soon.


Thanks & Regards,

Santosh Yeduru


cw_intel
Moderator

Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.

