- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi I am analyzing the report from oneAPI FPGA report generation.
I am currently facing
Compiler failed to schedule this loop with smaller II due to memory dependency
So I came back to a simple vector add example provided from github oneAPI C++_SYCL_FPGA, but I am still seeing the same errors
Another message that concerns me is
II is an approximation due to the following stallable instructions:
Load Operation (handler.hpp: 1531 > vector_add.cpp: 19)
Load Operation (handler.hpp: 1531 > vector_add.cpp: 20)
Store Operation (handler.hpp: 1531 > vector_add.cpp: 22)
In my application, I also need to load data from global memory, compute and store back to global memory.
Can you suggest me a way to resolve this issue?
the source code of vector_add.cpp:
#include <iostream>
// oneAPI headers
#include <sycl/ext/intel/fpga_extensions.hpp>
#include <sycl/sycl.hpp>
// Forward declare the kernel name in the global scope. This is an FPGA best
// practice that reduces name mangling in the optimization reports.
class VectorAddID;
struct VectorAdd {
int *const vec_a_in;
int *const vec_b_in;
int *const vec_c_out;
int len;
void operator()() const {
for (int idx = 0; idx < len; idx++) {
int a_val = vec_a_in[idx];
int b_val = vec_b_in[idx];
int sum = a_val + b_val;
vec_c_out[idx] = sum;
}
}
};
constexpr int kVectSize = 256;
int main() {
bool passed = true;
try {
// Use compile-time macros to select either:
// - the FPGA emulator device (CPU emulation of the FPGA)
// - the FPGA device (a real FPGA)
// - the simulator device
#if FPGA_SIMULATOR
auto selector = sycl::ext::intel::fpga_simulator_selector_v;
#elif FPGA_HARDWARE
auto selector = sycl::ext::intel::fpga_selector_v;
#else // #if FPGA_EMULATOR
auto selector = sycl::ext::intel::fpga_emulator_selector_v;
#endif
// create the device queue
sycl::queue q(selector);
auto device = q.get_device();
std::cout << "Running on device: "
<< device.get_info<sycl::info::device::name>().c_str()
<< std::endl;
if (!device.has(sycl::aspect::usm_host_allocations)) {
std::terminate();
}
// declare arrays and fill them
// allocate in shared memory so the kernel can see them
int *vec_a = sycl::malloc_shared<int>(kVectSize, q);
int *vec_b = sycl::malloc_shared<int>(kVectSize, q);
int *vec_c = sycl::malloc_shared<int>(kVectSize, q);
for (int i = 0; i < kVectSize; i++) {
vec_a[i] = i;
vec_b[i] = (kVectSize - i);
}
std::cout << "add two vectors of size " << kVectSize << std::endl;
q.single_task<VectorAddID>(VectorAdd{vec_a, vec_b, vec_c, kVectSize})
.wait();
// verify that vec_c is correct
for (int i = 0; i < kVectSize; i++) {
int expected = vec_a[i] + vec_b[i];
if (vec_c[i] != expected) {
std::cout << "idx=" << i << ": result " << vec_c[i] << ", expected ("
<< expected << ") A=" << vec_a[i] << " + B=" << vec_b[i]
<< std::endl;
passed = false;
}
}
std::cout << (passed ? "PASSED" : "FAILED") << std::endl;
sycl::free(vec_a, q);
sycl::free(vec_b, q);
sycl::free(vec_c, q);
} catch (sycl::exception const &e) {
// Catches exceptions in the host code.
std::cerr << "Caught a SYCL host exception:\n" << e.what() << "\n";
// Most likely the runtime couldn't find FPGA hardware!
if (e.code().value() == CL_DEVICE_NOT_FOUND) {
std::cerr << "If you are targeting an FPGA, please ensure that your "
"system has a correctly configured FPGA board.\n";
std::cerr << "Run sys_check in the oneAPI root directory to verify.\n";
std::cerr << "If you are targeting the FPGA emulator, compile with "
"-DFPGA_EMULATOR.\n";
}
std::terminate();
}
return passed ? EXIT_SUCCESS : EXIT_FAILURE;
}
the full message from loop analysis details:
- Hyper-Optimized loop structure: disabled.
- Memory dependency
- Compiler failed to schedule this loop with smaller II due to memory dependency:
- From: Load Operation (handler.hpp: 1531>vector_add.cpp: 19)
- To: Store Operation (handler.hpp: 1531>vector_add.cpp: 22)
- Compiler failed to schedule this loop with smaller II due to memory dependency:
- From: Load Operation (handler.hpp: 1531>vector_add.cpp: 20)
- To: Store Operation (handler.hpp: 1531>vector_add.cpp: 22)
- Most critical loop feedback path during scheduling:
- 70.00 clock cycles Load Operation (handler.hpp: 1531>vector_add.cpp: 19)
- 10.00 clock cycles Store Operation (handler.hpp: 1531>vector_add.cpp: 22)
- 1.16 clock cycle 32-bit Integer Add Operation (handler.hpp: 1531>vector_add.cpp: 21)
- II is an approximation due to the following stallable instructions:
- Load Operation (handler.hpp: 1531>vector_add.cpp: 19)
- Load Operation (handler.hpp: 1531>vector_add.cpp: 20)
- Store Operation (handler.hpp: 1531>vector_add.cpp: 22)
- Maximum concurrent iterations: Capacity of loop
- Use theLoop Analysisviewer to estimate capacity
- SeeFPGA Handbook : Loopsfor more information
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello!
You can learn about this in the loop_initiation_interval tutorial and the kernel_args_restrict tutorial.
Basically, you need to tell the compiler that the kernel arguments don't alias with the kernel_args_restrict attribute:
struct FunctorKernel { // ------------------------------------------- // Kernel interface definition. // ------------------------------------------- [[intel::kernel_args_restrict]] void operator()() const { // ---------------------------------------- // Kernel code implementation. // ---------------------------------------- } };
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @JSYOO,
Greetings, just checking in to see if there is any further doubts in regards to this matter.
Hope your doubts have been clarified.
Best Wishes
BB
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @JSYOO,
Greetings, as we do not receive any further clarification/updates on the matter, hence would assume challenge are overcome. Please login to ‘ https://supporttickets.intel.com/s/?language=en_US’, view details of the desire request, and post a feed/response within the next 15 days to allow me to continue to support you. After 15 days, this thread will be transitioned to community support. The community users will be able to help you on your follow-up questions. For new queries, please feel free to open a new thread and we will be right with you. Pleasure having you here.
Best Wishes
BB

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page