
Performance on Arria 10

scali

Hello,

 

I would like to ask a question about performance on the Arria 10 FPGA board.

I'm using DPC++ and have adapted the matrix multiplication sample code

https://github.com/oneapi-src/oneAPI-samples/tree/master/DirectProgramming/DPC%2B%2B/DenseLinearAlgebra/matrix_mul

to run on FPGAs, as described in

https://pp4fpgas.readthedocs.io/en/latest/devcloud.html

The kernel is the following:

h.parallel_for(range(M, P), [=](auto index) {
    // Get global position in Y direction.
    int row = index[0];
    // Get global position in X direction.
    int col = index[1];

    float sum = 0.0f;

    // Compute the result of one element of c
    for (int i = 0; i < width_a; i++) {
        sum += a[row][i] * b[i][col];
    }

    c[index] = sum;
});

Varying the matrix size from 128 to 4096 and running the kernel on GPUs, CPUs, and FPGAs, I observe that the performance on the Arria 10 is always below 1 GFLOPS, while on GPUs and CPUs I reach far better performance.
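For readers comparing numbers, here is a minimal sketch of how a GFLOPS figure for this kernel is commonly derived. It assumes the usual convention of counting one multiply and one add per inner-loop iteration, and that the elapsed time covers only the kernel. The function names are hypothetical and the post does not say how its figures were measured.

```cpp
#include <cstddef>

// Floating-point operations for C = A * B, with A of size M x N and B of size N x P.
// Each of the M * P output elements takes N multiplies and N adds.
// (Assumed counting convention; not stated in the original post.)
double matmul_flops(std::size_t M, std::size_t N, std::size_t P) {
    return 2.0 * static_cast<double>(M) * static_cast<double>(N) *
           static_cast<double>(P);
}

// Achieved throughput in GFLOPS for a measured kernel time in seconds.
double matmul_gflops(std::size_t M, std::size_t N, std::size_t P,
                     double seconds) {
    return matmul_flops(M, N, P) / seconds / 1e9;
}
```

Under this convention a 4096 x 4096 x 4096 product is 2 * 4096^3 = 137,438,953,472 operations, so staying below 1 GFLOPS corresponds to a kernel time of more than about 137 seconds at that size.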

 

I've recently found in

https://software.intel.com/content/www/us/en/develop/download/oneapi-fpga-optimization-guide.html

(Section 4.2.2) that I probably need to specify the work-group size manually, but even then I always get performance below 1 GFLOPS.
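In SYCL, one way to set the work-group size explicitly is to launch with an nd_range (global and local sizes) instead of a plain range. Each global dimension must then be a multiple of the corresponding local dimension. The helper below is a hypothetical host-side sketch for choosing such a local size. It is not taken from the optimization guide, and whether any particular value improves throughput on the Arria 10 is not established here.

```cpp
#include <cstddef>

// Return the largest value <= max_local that evenly divides global.
// Falls back to 1 when either argument is 0.
// Hypothetical helper, for illustration only.
std::size_t pick_local_size(std::size_t global, std::size_t max_local) {
    if (global == 0 || max_local == 0) return 1;
    std::size_t local = max_local < global ? max_local : global;
    while (global % local != 0) --local;  // stops at 1 at the latest
    return local;
}

// Intended use with the kernel from the post (needs a SYCL compiler):
//   std::size_t lm = pick_local_size(M, 16);
//   std::size_t lp = pick_local_size(P, 16);
//   h.parallel_for(sycl::nd_range<2>(sycl::range<2>(M, P),
//                                    sycl::range<2>(lm, lp)),
//                  [=](sycl::nd_item<2> item) { /* use item.get_global_id(0) and (1) */ });
```

The cap of 16 per dimension is an arbitrary starting point for experiments, not a recommendation from the guide.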

 

Could you please tell me whether I need to change some lines in the kernel or use particular compilation flags for optimization?

 

Many thanks. Any suggestion is very welcome.

 

AnilErinch_A_Intel

Hi,

Please let us know which CPU and GPU you are comparing the A10 against.

Also, Section 4 of the optimization guide describes several optimization methods. Are you getting the same results when you apply those optimizations? When you say less than 1 GFLOPS, does the figure vary, for example 800 MFLOPS, 900 MFLOPS, and so on?

Please let us know further details.

Thanks and Regards

Anil

