Application Acceleration With FPGAs

Double Buffering on Intel FPGA?

RJin1
Beginner

Hi all,

I read the paper "Best-Effort FPGA Programming: A Few Steps Can Go a Long Way". The authors use HLS on Xilinx devices as their example. Besides the usual optimizations, I found their double-buffering scheme interesting:

void aes(...) { ... }
void load(...) { ... }
void store(...) { ... }
void compute(...) { ... }

void kernel(char *data, int size) {
  char buf_data[3][PE_NUM][PE_BATCH];

#pragma HLS array_partition variable=buf_data complete dim=1
#pragma HLS array_partition variable=buf_data cyclic factor=PE_NUM dim=2

  for (int i = 0; i < size/BATCH_SIZE; i++) {
    switch (i % 3) {
    case 0:
      load(buf_data[0], data + i*BATCH_SIZE);
      compute(buf_data[1]);
      store(data + i*BATCH_SIZE, buf_data[2]);
      break;
    case 1:
      load(buf_data[1], data + i*BATCH_SIZE);
      compute(buf_data[2]);
      store(data + i*BATCH_SIZE, buf_data[0]);
      break;
    case 2:
      load(buf_data[2], data + i*BATCH_SIZE);
      compute(buf_data[0]);
      store(data + i*BATCH_SIZE, buf_data[1]);
      break;
    }
  }
}

Can the Intel compiler successfully infer this pipeline? I tried it with compiler version 16, but the throughput improvement seems very limited.
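For comparison, on the Intel side I would expect to have to express the load/compute/store overlap explicitly as separate kernels connected by channels (assuming the Intel FPGA SDK for OpenCL is the relevant toolchain here). This is just my untested sketch with placeholder kernel names and a dummy computation standing in for aes(), not code from the paper:

#pragma OPENCL EXTENSION cl_intel_channels : enable

// FIFOs that decouple the three stages so they can run concurrently
channel char load_to_compute __attribute__((depth(256)));
channel char compute_to_store __attribute__((depth(256)));

__kernel void load_k(__global const char *restrict in, int size) {
  for (int i = 0; i < size; i++)
    write_channel_intel(load_to_compute, in[i]);
}

__kernel void compute_k(int size) {
  for (int i = 0; i < size; i++) {
    char v = read_channel_intel(load_to_compute);
    // dummy computation standing in for aes()
    write_channel_intel(compute_to_store, (char)(v ^ 0x5A));
  }
}

__kernel void store_k(__global char *restrict out, int size) {
  for (int i = 0; i < size; i++)
    out[i] = read_channel_intel(compute_to_store);
}

Each kernel would be launched on its own command queue so all three stages overlap; whether the single-kernel, switch-based version above gets the same overlap automatically is exactly what I am unsure about.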

1 Reply
MuhammadAr_U_Intel
Hi,

There are continuous improvements to the HLS software with every release. I would suggest using the latest version of the HLS compiler, 18.1, to see what optimizations/pipelining are done by the software.

Thanks,
Arslan