Hi everyone,

I downloaded several example designs from altera.com (https://www.altera.com/products/design-software/embedded-software-developers/opencl/developer-zone.h...). I have the intelFPGA_pro 17.0 suite installed on my machine. Compiling and running hello_world and vector_addition went fine. Since I am more interested in building an HDL library, I tried the fourth example, OpenCL_library (available at https://www.altera.com/support/support-resources/design-examples/design-software/opencl/library-desi...). I followed the README.html strictly. Producing double_lib.aoclib went fine, but when I tried to build the .aocx with

aoc device/example1.cl -o bin/example1.aocx -I device/lib1 -L device -l double_lib.aoclib

the synthesis failed after about an hour. I attached quartus_sh_compile.log because I found errors in it; if you think another log file deserves attention, let me know. The first error I encountered was:
Error (13661): VHDL Association List error at i_sfc_logic_c0_entry_test_builtin_c0_enter31.vhd(187): formal "reset_kind" does not exist File: /home_nfs/bourgea/project/xpress_rtl_lib1/library_example1/bin/example1/ip/kernel_system/kernel_system_example1_system/example1_system_140/synth/i_sfc_logic_c0_entry_test_builtin_c0_enter31.vhd Line: 187

The same error is repeated many times on other lines and in other files. They all report:

formal "reset_kind" does not exist

How can I fix this error? IMHO this is not a huge bug, but a design example that does not work out of the box is concerning. Some details about my tools:
$ aoc --version
Intel(R) FPGA SDK for OpenCL(TM), 64-Bit Offline Compiler
Version 17.0.0 Build 290
I've run into this error before. If memory serves, reset_kind is defined by the VHDL in the dspba_library folder.

If reset_kind is missing, try copying the VHDL files from ALTERA_ROOT/17.0/base/quartus/dspba/Libraries/vhdl/base/ into the lib1/dspba_library folder, then recompile the aoclib and the aocx binary. More specifically, the files to replace are dspba_library.vhd and dspba_library_package.vhd (you might want to open them to check that they contain the reset_kind generic string as well).

If that doesn't work, you can try removing the dspba_library requirements from the XML. I haven't found any use for them when making HDL modules for OpenCL so far, but I kept them in when I was running the library examples.

Edit: I should probably also note that these VHDL files are generated by DSP Builder, which is probably why they reference the dspba_library files, although creating custom logic in DSP Builder seems to work fine for me without this folder when integrating into OpenCL.
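The manual check suggested above (opening the library files to see whether they mention reset_kind) can be automated. Below is a minimal sketch in Python; the function name files_missing_generic and the directory device/lib1/dspba_library are assumptions made for illustration, not part of the design example. It only searches for the text, so a match does not prove that the generic is declared correctly.

```python
from pathlib import Path


def files_missing_generic(directory, generic="reset_kind"):
    """Return names of .vhd files under `directory` that never mention `generic`."""
    missing = []
    for path in sorted(Path(directory).rglob("*.vhd")):
        text = path.read_text(errors="ignore")
        # VHDL identifiers are case-insensitive, so compare in lower case.
        if generic.lower() not in text.lower():
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the example was unpacked.
    lib_dir = "device/lib1/dspba_library"
    for name in files_missing_generic(lib_dir):
        print(f"{name}: no mention of reset_kind")
```

Any file listed by this script is a candidate for replacement with the copy from the Quartus installation.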
Thank you for your answer!

I replaced the dspba library (which contained only a delay) with the one found under the 17.0/quartus ... directory, and everything went fine. The version of dspba_library_package.vhd shipped in the design archive is not consistent with the instantiation made in the RTL of that same archive; I guess this is an error on Intel's side.

I still have some issues :)

host/src/main.cpp is buggy and caused a segfault: line 109 is wrong, and the condition should have been argc > 1. That was not a big deal, since compiling with -g was enough to find the bug. Nevertheless, it is not clear to me how code with such a basic mistake gets distributed.

When I finally managed to run the code, I got differences between the library and the builtin computations. As I am no FP expert, I cannot judge whether these differences are a problem for this representation. Can anyone give me their opinion on that? Here is what I obtain:
Loading bin/example1.aocx ...
Reprogramming device with handle 1
Create buffers
Generate random data for conversion...
Enqueueing both library and builtin in kernels 4 times with global size 65536
Kernel computation using library function took 5.80343 seconds
Kernel computation using built-in function took 5.50581 seconds
Reading results to buffers...
Checking results...
ERROR at i=378457, library = 4.34817e+78, builtin = 4.34817e+78 (diff = 8.22752e+62)
ERROR at i=2110585, library = 3.85196e+78, builtin = 3.85196e+78 (diff = 8.22752e+62)
ERROR at i=2443099, library = 3.83155e+78, builtin = 3.83155e+78 (diff = 8.22752e+62)
ERROR at i=4106325, library = 3.59287e+78, builtin = 3.59287e+78 (diff = 4.11376e+62)
ERROR at i=5232350, library = 3.96714e+78, builtin = 3.96714e+78 (diff = 8.22752e+62)
ERROR at i=5772993, library = 3.42469e+78, builtin = 3.42469e+78 (diff = 4.11376e+62)
ERROR at i=9001548, library = 3.24539e+78, builtin = 3.24539e+78 (diff = 4.11376e+62)
ERROR at i=9420205, library = 3.29633e+78, builtin = 3.29633e+78 (diff = 4.11376e+62)
ERROR at i=9897702, library = 3.33317e+78, builtin = 3.33317e+78 (diff = 4.11376e+62)
ERROR at i=14739922, library = 3.18863e+78, builtin = 3.18863e+78 (diff = 4.11376e+62)
ERROR at i=16586817, library = 3.24749e+78, builtin = 3.24749e+78 (diff = 8.22752e+62)
ERROR at i=16911415, library = 5.01582e+78, builtin = 5.01582e+78 (diff = 8.22752e+62)
ERROR at i=18581853, library = 4.12608e+78, builtin = 4.12608e+78 (diff = 8.22752e+62)
ERROR at i=22256451, library = 4.04979e+78, builtin = 4.04979e+78 (diff = 8.22752e+62)
ERROR at i=23040205, library = 6.14437e+78, builtin = 6.14437e+78 (diff = 8.22752e+62)
ERROR at i=24551760, library = 3.9172e+78, builtin = 3.9172e+78 (diff = 8.22752e+62)
ERROR at i=28629117, library = 3.30068e+78, builtin = 3.30068e+78 (diff = 4.11376e+62)
ERROR at i=29464917, library = 3.21391e+78, builtin = 3.21391e+78 (diff = 8.22752e+62)
ERROR at i=32483842, library = 3.21641e+78, builtin = 3.21641e+78 (diff = 8.22752e+62)
ERROR at i=33286137, library = 3.81234e+78, builtin = 3.81234e+78 (diff = 8.22752e+62)
ERROR at i=37822849, library = 3.22525e+78, builtin = 3.22525e+78 (diff = 4.11376e+62)
ERROR at i=39991094, library = 3.26492e+78, builtin = 3.26492e+78 (diff = 4.11376e+62)
ERROR at i=40443348, library = 5.47729e+78, builtin = 5.47729e+78 (diff = 8.22752e+62)
ERROR at i=40957965, library = 4.07241e+78, builtin = 4.07241e+78 (diff = 8.22752e+62)
ERROR at i=42809418, library = 4.01519e+78, builtin = 4.01519e+78 (diff = 8.22752e+62)
ERROR at i=43761108, library = 3.22017e+78, builtin = 3.22017e+78 (diff = 4.11376e+62)
ERROR at i=46441867, library = 3.93527e+78, builtin = 3.93527e+78 (diff = 8.22752e+62)
ERROR at i=47337034, library = 3.51861e+78, builtin = 3.51861e+78 (diff = 4.11376e+62)
ERROR at i=49678506, library = 6.00668e+78, builtin = 6.00668e+78 (diff = 8.22752e+62)
ERROR at i=51987862, library = 4.11377e+78, builtin = 4.11377e+78 (diff = 8.22752e+62)
ERROR at i=52642351, library = 4.33496e+78, builtin = 4.33496e+78 (diff = 8.22752e+62)
ERROR at i=53943756, library = 3.39121e+78, builtin = 3.39121e+78 (diff = 4.11376e+62)
ERROR at i=54387122, library = 3.1474e+78, builtin = 3.1474e+78 (diff = 4.11376e+62)
ERROR at i=54568122, library = 3.14223e+78, builtin = 3.14223e+78 (diff = 4.11376e+62)
ERROR at i=54889226, library = 3.28632e+78, builtin = 3.28632e+78 (diff = 8.22752e+62)
ERROR at i=56124223, library = 3.64337e+78, builtin = 3.64337e+78 (diff = 4.11376e+62)
ERROR at i=60308209, library = 3.51323e+78, builtin = 3.51323e+78 (diff = 4.11376e+62)
ERROR at i=63655771, library = 3.92707e+78, builtin = 3.92707e+78 (diff = 8.22752e+62)
ERROR at i=64794095, library = 4.98027e+78, builtin = 4.98027e+78 (diff = 8.22752e+62)
FAILED with 39 errors.

To make it a little more clear, I printed the whole bunch of errors (instead of 10 initially) and added the result of the fabs function on line 227. Thanks!
Haha, yeah, no problem. I've had a few hurdles with these.

I believe these values are accurate. It largely depends on your application and its tolerance for error, but with a float I generally see about 7 digits of accuracy (about -130 dB or so) before floating-point error shows up. For double precision, I usually see around 14 digits of accuracy (about -300 dB, I think?). Since your values are around e+78 and the differences are around e+62, the two results agree to about 15 or 16 significant digits, so I would presume this is double precision and that you're getting about as close as double precision can represent. The error here should be negligible, I would think.

The error would then come from the computations being carried out differently (i.e. the OpenCL builtin and the library differ slightly at the binary level). The interesting thing to me is that I would normally see this kind of FP error when comparing across different architectures such as CPU/GPU/FPGA, but since both results are computed on the FPGA, I'd imagine there should be a way to get them to come out exactly the same. Since they run on the same FPGA, it would have to come down to the FP operations being done differently in OpenCL versus the HDL code, possibly in how each conforms to the IEEE-754 standard. If I had to guess, I'd say it is due to how they handle rounding or denormalized numbers/normalization. I'm not entirely sure; that's been my experience without diving too deep, but that's my two cents.
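One way to put a number on "about as close as double precision can represent" is to express each reported difference in units in the last place (ULP), the spacing between adjacent doubles at that magnitude. The following is a minimal sketch, assuming the results are IEEE-754 doubles; the (value, diff) pairs are copied from the log above, the helper name diff_in_ulps is made up for illustration, and math.ulp requires Python 3.9 or later.

```python
import math


def diff_in_ulps(value: float, diff: float) -> float:
    """Express an absolute difference as a multiple of the ULP at `value`."""
    return diff / math.ulp(value)


# (builtin value, reported diff) pairs copied from the log above.
samples = [
    (4.34817e78, 8.22752e62),
    (3.59287e78, 4.11376e62),
    (3.21391e78, 8.22752e62),
]

for value, diff in samples:
    ulp = math.ulp(value)
    print(f"value = {value:.5e}  ulp = {ulp:.5e}  diff = {diff_in_ulps(value, diff):.2f} ULP")
```

Under these assumptions, the sampled differences work out to roughly one or two ULPs, which would be consistent with the two implementations differing only in how the last bit or two is rounded.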