Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

Intel HLS (19.3) - Bad implementation of memory.

Pan

The expected implementation has one write port (shared by two stores) and one read port (one load). Instead, these are the implementations I get for different values of REG2_DATA_WIDTH:

  1. REG2_DATA_WIDTH 40 - Wrong implementation. The memory ends up with multiple write ports and an arbiter (ARB). II=33.
  2. REG2_DATA_WIDTH 41 - Expected implementation. II=1.
  3. REG2_DATA_WIDTH 48 - Wrong implementation. Multiple stores. II=3.
  4. REG2_DATA_WIDTH 49 - Same as item 1 (40).
  5. REG2_DATA_WIDTH 64 - Expected implementation. II=1.

There is a way to trick the compiler: define only one write into the memory. But this shouldn't be necessary, and it causes other problems in more complex components.
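A minimal sketch of that trick, written in plain C++ rather than as the actual component (which is not shown in this post): the two mutually exclusive stores are merged by selecting the address and data first and then issuing a single store. All names here (`reg_mem`, `write_two_stores`, `write_one_store`, `read_reg`, `MEM_DEPTH`) are hypothetical, and whether the compiler then infers a single write port still depends on the compiler version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; the original component is not shown in the thread.
constexpr std::size_t MEM_DEPTH = 16;
static std::uint64_t reg_mem[MEM_DEPTH];

// Original style: two mutually exclusive store sites into the same memory.
void write_two_stores(bool sel,
                      std::size_t addr_a, std::uint64_t data_a,
                      std::size_t addr_b, std::uint64_t data_b) {
    assert(addr_a < MEM_DEPTH && addr_b < MEM_DEPTH);
    if (sel) {
        reg_mem[addr_a] = data_a;  // store site 1
    } else {
        reg_mem[addr_b] = data_b;  // store site 2
    }
}

// Restructured: select address and data first, then use one store site.
void write_one_store(bool sel,
                     std::size_t addr_a, std::uint64_t data_a,
                     std::size_t addr_b, std::uint64_t data_b) {
    assert(addr_a < MEM_DEPTH && addr_b < MEM_DEPTH);
    const std::size_t addr = sel ? addr_a : addr_b;
    const std::uint64_t data = sel ? data_a : data_b;
    reg_mem[addr] = data;          // the only store site
}

// Single load site.
std::uint64_t read_reg(std::size_t addr) {
    assert(addr < MEM_DEPTH);
    return reg_mem[addr];
}
```

Both write functions have the same behavior; only the number of store statements the compiler sees differs.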

The question: why does this happen, and what can I do to avoid it? Is it a compiler fault that I have to work around?

Conclusion: I ended up with two possible explanations for why this happens:

  1. The compiler tries to optimize the memory implementation, so it emits multiple narrower stores rather than one wider store.
  2. The compiler doesn't properly recognize that the stores are mutually exclusive. (Less likely.)

 

HRZ

If you post a simplified code example that can quickly be compiled to show the problem, it would be much easier to understand and try to solve the problem.

Pan

I had already uploaded the component part only, but here is the full main.cpp with a testbench. The results should be:

15
30
201
101

(For some reason I cannot upload more than one file at once.)

MEIYAN_L_Intel

Hi,

I am still looking into the memory implementation in HLS.

Thanks

Pan

Hi,

Is there any progress?

Have a nice day

MEIYAN_L_Intel

Hi,

I tried to compile the attached code, but it does not compile as posted.

I added the following lines to make it compile:

#include <HLS/ac_int.h>
#include <HLS/hls.h>
#define HLS_COMPONENT component
#define REG_ADDR_WIDTH 64

This appears to be a regression from 19.1: when compiled with 19.1, both the 40-bit and the 41-bit cases work as expected. In the 40-bit case, the 19.3 compiler appears to decompose the 40-bit store into power-of-2 stores (8+8+8+16). A workaround is to make the type stored in memory a 64-bit ac_int.
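As a rough illustration of that workaround, sketched in plain C++ instead of `ac_int` so that it compiles without the HLS headers: each memory element is padded to a 64-bit container and only the low REG2_DATA_WIDTH bits carry data, so every store is a single 64-bit access. `padded_mem`, `store_reg`, `load_reg`, `DATA_MASK`, and `MEM_DEPTH` are hypothetical names; the real component would use a 64-bit `ac_int` element type, and this sketch does not show what hardware the compiler generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr unsigned REG2_DATA_WIDTH = 40;  // logical data width from the thread
constexpr std::size_t MEM_DEPTH = 16;     // hypothetical memory depth

// Mask with the low REG2_DATA_WIDTH bits set (valid for widths 1..64).
constexpr std::uint64_t DATA_MASK =
    (REG2_DATA_WIDTH >= 64) ? ~std::uint64_t{0}
                            : ((std::uint64_t{1} << REG2_DATA_WIDTH) - 1);

// Each element occupies a full 64-bit word, regardless of REG2_DATA_WIDTH.
static std::uint64_t padded_mem[MEM_DEPTH];

// One full-width store per write; bits above REG2_DATA_WIDTH are cleared.
void store_reg(std::size_t addr, std::uint64_t value) {
    assert(addr < MEM_DEPTH);
    padded_mem[addr] = value & DATA_MASK;
}

// One full-width load per read; only the logical bits are returned.
std::uint64_t load_reg(std::size_t addr) {
    assert(addr < MEM_DEPTH);
    return padded_mem[addr] & DATA_MASK;
}
```

For a 40-bit value, 24 of the 64 bits in each element are padding, which is the resource cost of this approach.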

 

Thanks

Pan

Hi,

Thanks for your time. I am familiar with how 19.3 compiles this; the fact that 19.1 compiles it correctly is interesting. The workaround is obvious, but it consumes additional resources.

Perhaps the real question is whether this problem will be resolved in a newer version of Intel HLS.

 

Again, thanks for your time and have a great day.

MEIYAN_L_Intel

Hi,

I have reported this problem to the developers, and they have noted it as well.

They plan to solve it in a newer version of Intel HLS.

Thank you for your information.

Thanks
