Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Ignored explicit memory operation with PAC FPGA


Hello everyone,

we've recently released a C++ template library for stencil simulations on FPGAs, called StencilStream. It uses a central working kernel that communicates with IO kernels. The working kernel is submitted only once, but the IO kernels are submitted multiple times with different buffers. After all of these kernel submissions have finished, the final data needs to be copied back to another buffer. We tried to do this with an explicit memory operation (see the SYCL specification 1.2.1, section 4.8.6), but apparently it is not executed.

The version is tagged and we've synthesized the "main" and "hotspot" examples for the Intel PAC S10 with USM features, so you can reproduce the error. The code in question is stencil/stencil.hpp, line 305.

Steps to reproduce the error:

  • Download and unpack main.intel_pac.beta-10.tar.gz.
  • Load Intel oneAPI beta-10 and the BSP for the Intel PAC S10 with USM features.
  • Run the following command:


./main 512 512 100

The program runs and terminates without errors, because the grid is smaller than 1024x1024 cells and the manual copy operation is therefore used.


  • Run the following command:


./main 1024 1024 100

The program will terminate with a long list of error messages like these:

(0, 1) => 1 (!= -1)
(0, 2) => 2 (!= -2)
(0, 3) => 3 (!= -3)
(0, 4) => 4 (!= -4)
(0, 5) => 5 (!= -5)
(0, 6) => 6 (!= -6)
(0, 7) => 7 (!= -7)
(0, 8) => 8 (!= -8)
(0, 9) => 9 (!= -9)

An error message of the form (X, Y) => A (!= B) says that the final value of the cell at position (X, Y) is A, but it is supposed to be B. The initial value of each cell is always X+Y, and the working kernel is only supposed to negate it. Since the grid is exactly 1024x1024 cells big, the explicit, built-in memory operation is used and awaited. The cells in the output still hold their initial values, which suggests that the built-in copy operation is not executed.
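To make the check behind these messages concrete, here is a minimal sketch in plain C++ of how such a validation could look. It is not taken from StencilStream: the helper names (initial_value, expected_value, check_grid) and the column-wise buffer layout are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helpers for illustration; not part of StencilStream.

// Initial value of the cell at (x, y), as described in the post.
int initial_value(int x, int y) { return x + y; }

// Value the cell should hold after the working kernel has negated it.
int expected_value(int x, int y) { return -initial_value(x, y); }

// Compares a result grid against the expected values and returns one
// message per mismatching cell, formatted as "(X, Y) => A (!= B)".
// Assumed layout: the cell (x, y) is stored at index x * height + y.
std::vector<std::string> check_grid(const std::vector<int>& grid,
                                    int width, int height) {
    assert(grid.size() == static_cast<std::size_t>(width) * height);
    std::vector<std::string> errors;
    for (int x = 0; x < width; ++x) {
        for (int y = 0; y < height; ++y) {
            const int actual = grid[static_cast<std::size_t>(x) * height + y];
            const int expected = expected_value(x, y);
            if (actual != expected) {
                std::ostringstream msg;
                msg << "(" << x << ", " << y << ") => " << actual
                    << " (!= " << expected << ")";
                errors.push_back(msg.str());
            }
        }
    }
    return errors;
}
```

If the output buffer still holds the initial values because the copy never happened, a check like this reports every cell except (0, 0), whose initial and negated values are both 0. That is consistent with the list above starting at (0, 1).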


We initially discovered the problem with our Nallatech/Bittware 520N card, which means that it isn't exclusive to the PAC. However, we weren't able to reproduce it with a smaller application, so it might have something to do with the number of kernel submissions. Is this a known problem?


Jan-Oliver Opdenhövel

Student Assistant at Paderborn Center for Parallel Computing

2 Replies

Hi @JanOliverOpdenhoevel

Thanks for reaching out to us!

We have a dedicated forum for FPGA-related queries. Since your issue is related to FPGAs, we are moving this query to the FPGA forum for a faster response.

Have a good day!

Thanks & Regards



Hi,

Please let us know whether the same issue also occurs with the latest version of oneAPI.

Thanks and Regards

