Basic Roofline Model for Arria 10

In the HPC community, a rough model based on two ceilings, the peak flop/s and the bandwidth, is used as a framework to evaluate architectures and algorithms (Sam Williams, D. Patterson, The Roofline Model).

This one is for the Arria 10 FPGA, where the measured PCIe Gen3 bandwidth is around 6 GB/s and the peak performance is around 1.5 Tflop/s.

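A simple vector addition c[i] = a[i] + b[i] performs 1 flop for every 12 bytes transferred (two 4-byte loads and one 4-byte store, assuming single precision), therefore the arithmetic intensity is 1/12 flop/byte and the performance limit is 6 GB/s * 1/12 flop/byte = 0.5 Gflop/s.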
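
For anyone who wants to try other kernels, here is a minimal Python sketch of the same calculation. It only assumes the two ceilings quoted above (6 GB/s and 1.5 Tflop/s); the function and variable names are my own for illustration, and the 4-byte element size is an assumption (single precision).

```python
# Roofline sketch: attainable performance = min(peak, bandwidth * arithmetic intensity).
# The two ceilings are the rough figures quoted in this post, not vendor specifications.
PEAK_GFLOPS = 1500.0      # ~1.5 Tflop/s peak
BANDWIDTH_GB_S = 6.0      # ~6 GB/s measured over PCIe Gen3


def attainable_gflops(flops, bytes_moved,
                      bandwidth_gb_s=BANDWIDTH_GB_S, peak_gflops=PEAK_GFLOPS):
    """Return the roofline limit in Gflop/s for a kernel that performs
    `flops` floating-point operations per `bytes_moved` bytes transferred."""
    bandwidth_bound = bandwidth_gb_s * flops / bytes_moved
    return min(peak_gflops, bandwidth_bound)


# Vector addition c[i] = a[i] + b[i]: 1 flop per element,
# two loads and one store of 4 bytes each (assumed single precision).
BYTES_PER_ELEMENT = 3 * 4
vector_add_limit = attainable_gflops(1, BYTES_PER_ELEMENT)

# Ridge point: the arithmetic intensity (flop/byte) above which a kernel
# stops being bandwidth-bound and becomes compute-bound.
ridge_point = PEAK_GFLOPS / BANDWIDTH_GB_S

print(f"vector add limit: {vector_add_limit} Gflop/s")
print(f"ridge point: {ridge_point} flop/byte")
```

With these numbers a kernel needs about 250 flop/byte before the 1.5 Tflop/s ceiling matters, so anything streaming its data over PCIe, like the vector addition, sits far down on the bandwidth slope.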