Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™
722 Discussions

II is an approximation due to the following stallable instructions

JSYOO
Novice
682 Views

Hi I am analyzing the report from oneAPI FPGA report generation.

I am currently facing

 

Compiler failed to schedule this loop with smaller II due to memory dependency


So I came back to a simple vector add example provided from github oneAPI C++_SYCL_FPGA, but I am still seeing the same errors

Another message that concerns me is

II is an approximation due to the following stallable instructions:
Load Operation (handler.hpp: 1531 > vector_add.cpp: 19)
Load Operation (handler.hpp: 1531 > vector_add.cpp: 20)
Store Operation (handler.hpp: 1531 > vector_add.cpp: 22)

 

In my application, I also need to load data from global memory, compute and store back to global memory.

Can you suggest me a way to resolve this issue?

 

the source code of vector_add.cpp:

 

#include <iostream>

// oneAPI headers
#include <sycl/ext/intel/fpga_extensions.hpp>
#include <sycl/sycl.hpp>

// Forward declare the kernel name in the global scope. This is an FPGA best
// practice that reduces name mangling in the optimization reports.
class VectorAddID;

struct VectorAdd {
  int *const vec_a_in;
  int *const vec_b_in;
  int *const vec_c_out;
  int len;

  void operator()() const {
    for (int idx = 0; idx < len; idx++) {
      int a_val = vec_a_in[idx];
      int b_val = vec_b_in[idx];
      int sum = a_val + b_val;
      vec_c_out[idx] = sum;
    }
  }
};

constexpr int kVectSize = 256;

int main() {
  bool passed = true;
  try {
    // Use compile-time macros to select either:
    //  - the FPGA emulator device (CPU emulation of the FPGA)
    //  - the FPGA device (a real FPGA)
    //  - the simulator device
#if FPGA_SIMULATOR
    auto selector = sycl::ext::intel::fpga_simulator_selector_v;
#elif FPGA_HARDWARE
    auto selector = sycl::ext::intel::fpga_selector_v;
#else  // #if FPGA_EMULATOR
    auto selector = sycl::ext::intel::fpga_emulator_selector_v;
#endif

    // create the device queue
    sycl::queue q(selector);

    auto device = q.get_device();

    std::cout << "Running on device: "
              << device.get_info<sycl::info::device::name>().c_str()
              << std::endl;

    if (!device.has(sycl::aspect::usm_host_allocations)) {
      std::terminate();
    }

    // declare arrays and fill them
    // allocate in shared memory so the kernel can see them
    int *vec_a = sycl::malloc_shared<int>(kVectSize, q);
    int *vec_b = sycl::malloc_shared<int>(kVectSize, q);
    int *vec_c = sycl::malloc_shared<int>(kVectSize, q);
    for (int i = 0; i < kVectSize; i++) {
      vec_a[i] = i;
      vec_b[i] = (kVectSize - i);
    }

    std::cout << "add two vectors of size " << kVectSize << std::endl;

    q.single_task<VectorAddID>(VectorAdd{vec_a, vec_b, vec_c, kVectSize})
        .wait();

    // verify that vec_c is correct
    for (int i = 0; i < kVectSize; i++) {
      int expected = vec_a[i] + vec_b[i];
      if (vec_c[i] != expected) {
        std::cout << "idx=" << i << ": result " << vec_c[i] << ", expected ("
                  << expected << ") A=" << vec_a[i] << " + B=" << vec_b[i]
                  << std::endl;
        passed = false;
      }
    }

    std::cout << (passed ? "PASSED" : "FAILED") << std::endl;

    sycl::free(vec_a, q);
    sycl::free(vec_b, q);
    sycl::free(vec_c, q);
  } catch (sycl::exception const &e) {
    // Catches exceptions in the host code.
    std::cerr << "Caught a SYCL host exception:\n" << e.what() << "\n";

    // Most likely the runtime couldn't find FPGA hardware!
    if (e.code().value() == CL_DEVICE_NOT_FOUND) {
      std::cerr << "If you are targeting an FPGA, please ensure that your "
                   "system has a correctly configured FPGA board.\n";
      std::cerr << "Run sys_check in the oneAPI root directory to verify.\n";
      std::cerr << "If you are targeting the FPGA emulator, compile with "
                   "-DFPGA_EMULATOR.\n";
    }
    std::terminate();
  }
  return passed ? EXIT_SUCCESS : EXIT_FAILURE;
}

 

 

the full message from loop analysis details:

VectorAddID.B1:
  • Hyper-Optimized loop structure: disabled.
  • Memory dependency
  • Compiler failed to schedule this loop with smaller II due to memory dependency:
  • Compiler failed to schedule this loop with smaller II due to memory dependency:
  • Most critical loop feedback path during scheduling:
  • II is an approximation due to the following stallable instructions:
  • Maximum concurrent iterations: Capacity of loop
  • See
    FPGA Handbook : Loops
    for more information
0 Kudos
3 Replies
whitepau_altera
Employee
639 Views

Hello!

You can learn about this in the loop_initiation_interval tutorial and the kernel_args_restrict tutorial.

 

Basically, you need to tell the compiler that the kernel arguments don't alias with the kernel_args_restrict attribute:

struct FunctorKernel {
   // -------------------------------------------
   //         Kernel interface definition.
   // -------------------------------------------
   
   [[intel::kernel_args_restrict]]
   void operator()() const {
      // ----------------------------------------
      //       Kernel code implementation.
      // ----------------------------------------
   }
};

 

0 Kudos
BoonBengT_Altera
Moderator
590 Views

Hi @JSYOO,


Greetings, just checking in to see if there is any further doubts in regards to this matter.

Hope your doubts have been clarified.


Best Wishes

BB


0 Kudos
BoonBengT_Altera
Moderator
561 Views

Hi @JSYOO,

Greetings, as we do not receive any further clarification/updates on the matter, hence would assume challenge are overcome. Please login to ‘ https://supporttickets.intel.com/s/?language=en_US’, view details of the desire request, and post a feed/response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support. The community users will be able to help you on your follow-up questions. For new queries, please feel free to open a new thread and we will be right with you. Pleasure having you here.


Best Wishes

BB


0 Kudos
Reply