Issues with the use of the flag "-fintelfpga"

FelipeML · ‎02-25-2020

Hello!

We are trying to use the emulator to check the results of an algorithm and we have found that the execution fails when we use the "-fintelfpga" flag.

The running example is a simple "triad" that implements two SYCL queues to heterogeneously perform the C=A+alpha*B operation splitting the computation between the CPU and FPGA. To launch the compilation and execution we use the following script:

#!/bin/bash
rm -rf *.o *.d *.out *.mon *.emu *.aocr *.aoco *.prj *.fpga_emu *.a triadd-sycl-gpu
source /opt/intel/inteloneapi/setvars.sh > /dev/null
 
rm triadd-sycl-emu
dpcpp src/main.cpp src/kernel.cpp -o triadd-sycl-emu -DFPGA_EMULATOR
sleep 1
./triadd-sycl-emu

Up to this point we don't have any problem and the execution works using the s001-n147 compilation node in interactive mode:

u37462@s001-n147:~/triadd$ ./triadd-sycl-emu
Experiment: 32 elements, offload-ratio 0.5
Device: Intel(R) FPGA Emulation Device
CPU: Intel(R) Xeon(R) Platinum 8153 CPU @ 2.00GHz
 
RESULT: 0.000 1.500 3.000 4.500 6.000 7.500 9.000 10.500 12.000 13.500 15.000 16.500 18.000 19.500 21.000 22.500 24.000 25.500 27.000 28.500 30.000 31.500 33.000 34.500 36.000 37.500 39.000 40.500 42.000 43.500 45.000 46.500
 
CPU: Heterogenous triad correct.
Kernel:    1.05579 s
Total:     1.05587 s

Then, when we add the "-fintelfpga" flag to the compilation, as recommended in the BaseKit-code-samples of "Get started with the Intel oneAPI Base Toolkit on the DevCloud", compilation succeeds but we get the following error at runtime:

u37462@s001-n147:~/triadd$ ./triadd-sycl-emu
Experiment: 32 elements, offload-ratio 0.5
Device: Intel(R) FPGA Emulation Device
CPU: Intel(R) Xeon(R) Platinum 8153 CPU @ 2.00GHz
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  OpenCL API failed. OpenCL API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted

Looking at the link stages with the "-v" flag, we have seen that the only difference introduced by "-fintelfpga" is that the target for "clang-offload-wrapper" changes from "spir64" to "spir64_fpga". This may be one source of the errors we get when compiling for FPGA.

Also, when we are not in interactive mode and we send the compilation to a queue without "-fintelfpga" we get this permission warning:

/var/spool/torque/mom_priv/epilogue.parallel: line 12: /var/spool/torque/mom_priv/epilogue.d//95-nvdir.epilogue: Permission denied

However, there are no errors when executing.

We would really appreciate any help!

Greetings.

MEIYAN_L_Intel · ‎02-26-2020

Hi,

I am checking internally with the developer about the "-fintelfpga" option.

In the meantime, could you provide the link to the page you mentioned, where adding the "-fintelfpga" flag is recommended in the BaseKit-code-samples of "Get started with the Intel oneAPI Base Toolkit on the DevCloud"?

Thanks

FelipeML · ‎02-26-2020

Hello!

Thank you for the attention.

The specific example I was referring to was the one in: https://devcloud.intel.com/oneapi/get-started/base-toolkit/

In the Vector-Add example when we do "make fpga_emu -f Makefile.fpga" we are using "-fintelfpga" to compile when we want to use the emulator.

In addition, the "-fintelfpga" flag is also used for emulation in all the FPGA examples within the BaseKit-code-samples/DPC++Compiler folder. For example, "FPGATutorials/BestPractices/double_buffering/" sets it in its CMakeLists.txt with "set(EMULATOR_COMPILE_FLAGS "-fintelfpga -DFPGA_EMULATOR")", as do all the other examples in the "FPGAExampleDesigns/" and "FPGATutorials/" directories.

So I assumed that the use of "-fintelfpga" is necessary to use the emulator.

Regards.

FelipeML · ‎02-26-2020

Hi again!

I just found in the documentation a section where it says explicitly that we should "Use the following command to compile for emulation: dpcpp -fintelfpga *.cpp"

https://software.intel.com/en-us/oneapi-programming-guide-offline-compilation-for-fpga

Thank you for your help.

MEIYAN_L_Intel · ‎02-27-2020

Hi,

I have found a useful link that provides information about compilation in SYCL:

https://github.com/intel/BaseKit-code-samples/blob/master/DPC%2B%2BCompiler/FPGATutorials/Compilation/compile_flow/README.md

You can find the file directory used in the DevCloud at the link below:

https://github.com/intel/BaseKit-code-samples/tree/master/DPC%2B%2BCompiler

I tried the command, but I got a permission-denied error when trying to view the report.

The permission-denied error is used to prevent IE from issuing "block content".

Thanks

FelipeML · ‎01-02-2021

Hi all and thanks in advance for any help that you could provide.

It seems that the problem with which I opened this thread is still unsolved, and we have just found a modification that seems to narrow down the problem.

We've been playing with the following code that is one of the examples available in the oneAPI training material:

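#include <CL/sycl.hpp>
#include <CL/sycl/INTEL/fpga_extensions.hpp>  // INTEL::fpga_emulator_selector, INTEL::fpga_selector
#include <iostream>
// Note: SIZE is not defined in the posted snippet; it must be defined elsewhere.
using namespace sycl;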
int main() 
{
  {
  range<1> r{SIZE};
  #ifdef FPGA_EMULATOR
  INTEL::fpga_emulator_selector device_selector;
  #else
  INTEL::fpga_selector device_selector;
  #endif
  queue q{device_selector};
  queue q_cpu{cpu_selector{}};
  buffer<int, 1> a_buf{r};
  buffer<int, 1> b_buf{r};
  buffer<int, 1> c_buf{r};
  // a ---- c --- d
  // b __/ 
  q.submit([&](handler& h) {
    accessor a(a_buf, h, write_only);
    h.parallel_for(r, [=](auto idx) {
      a[idx] = idx; }); 
  });
  q.submit([&](handler& h) {
    accessor b(b_buf, h, write_only);
    h.parallel_for(r, [=](auto idx) {
      b[idx] = -idx; }); 
  });
  q_cpu.submit([&](handler& h) { //fails with q_cpu, but not with q
    accessor a(a_buf, h, read_only);
    accessor b(b_buf, h, read_only);
    accessor c(c_buf, h, write_only);
    h.parallel_for(r, [=](auto idx) {
      c[idx] = a[idx] + b[idx]; }); 
  });
  q.submit([&](handler& h) {
    accessor c(c_buf, h, read_write);
    h.parallel_for(r, [=](auto idx) {
      c[idx] += 1; }); 
  }).wait();
  }

  std::cout << "DONE.\n";
  return 0;
}

As noted in the comment on the third kernel submission, submitting kernels to both the FPGA and the CPU in the same program and relying on the runtime to resolve the data-flow dependencies fails with the following message:

u32284@s001-n081:~/oneTBB/examples/SC20/lab$ dpcpp -fintelfpga vector-add-fpga.cpp -DFPGA_EMULATOR -o vadd.emu
u32284@s001-n081:~/oneTBB/examples/SC20/lab$ ./vadd.emu 
terminate called after throwing an instance of 'cl::sycl::runtime_error'
what(): Native API failed. Native API returns: -42 (CL_INVALID_BINARY) -42 (CL_INVALID_BINARY)
Aborted

If we replace the third kernel (the q_cpu submission) with host code, so that nothing is submitted to the CPU device, the program never returns (it hangs):

// replace q_cpu.submit(...) with this:
host_accessor a(a_buf, read_only);
host_accessor b(b_buf, read_only);
host_accessor c(c_buf, write_only);
for(int idx=0; idx<SIZE;idx++){
  c[idx] = a[idx] + b[idx]; 
}

The only way we have found to make it work is to enclose the host_accessors in a scope, so that they are destroyed before the next kernel is submitted:

{
  host_accessor a(a_buf, read_only);
  host_accessor b(b_buf, read_only);
  host_accessor c(c_buf, write_only);
  for(int idx=0; idx<SIZE;idx++){
    c[idx] = a[idx] + b[idx]; 
  }
}

u32284@s001-n081:~/oneTBB/examples/SC20/lab$ dpcpp -fintelfpga vector-add-fpga3.cpp -DFPGA_EMULATOR -o vadd.emu
u32284@s001-n081:~/oneTBB/examples/SC20/lab$ ./vadd.emu 
DONE.
u32284@s001-n081:~/oneTBB/examples/SC20/lab$

Has this been reported before? Are we doing something wrong, or is it the compiler/runtime that still needs some improvement?

Thanks once again.

KennyTan_Altera · ‎01-21-2021

Hi,

We are sorry to inform you that Intel closed this thread quite a while ago, and it has been transitioned to community support.

If you require further support from Intel, you will need to open a new thread on this.

Thanks,
Best regards,

Kenny

KennyTan_Altera · ‎01-21-2021

Please open a new thread on this.

FelipeML · ‎01-21-2021

Hi Kenny,

Thank you for letting me know. I just opened a new thread:

https://community.intel.com/t5/Intel-High-Level-Design/Problem-mixing-FPGA-and-CPU-kernels-that-resort-to-accessors-for/m-p/1248551/emcs_t/S2h8ZW1haWx8dG9waWNfc3Vic2NyaXB0aW9ufEtLNlJWVlk5UFdDNTdYfDEyNDg1NTF8U1VCU0NSSVBUSU9OU3xoSw#M1458

Best regards,

Felipe.