Altera_Forum
Honored Contributor I

How to understand burst size info from the profiler

Hi,

The profiler shows the following measurements for the read on the "K_contributors" global argument:

Bandwidth: 0.1 MB/s, 100% efficiency
Average Burst Size: 2.0
(Max Burst Size: 16)

// THIS IS A SINGLE WORK-ITEM KERNEL
#define MAX_CONTRIBUTORS 8128

void Krnl_IntraE(...
                 __global const char3* restrict K_contributors)
{
    __local char3 localcache[MAX_CONTRIBUTORS];

    for (ushort i = 0; i < MAX_CONTRIBUTORS; i++) {
        localcache[i] = K_contributors[i];
    }
    ...
}

Since the loop index "i" increases consecutively, I expected burst sizes larger than 2.

Is there any explanation for this?
2 Replies
Altera_Forum
Honored Contributor I

You should unroll the loop so that the compiler infers a wider port to memory, which allows a larger burst size. Little to no runtime coalescing is done for single work-item kernels, so you should not expect a large burst size without unrolling just because the accesses are consecutive.
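As a rough sketch of that suggestion (not the original poster's full kernel): the copy loop below is unrolled by 16, so each iteration would request 16 consecutive char3 elements (4 bytes each in OpenCL, 64 bytes in total) that the compiler can merge into one wider access. The kernel name, the unroll factor of 16, the "checksum" argument and the summing loop are assumptions made for illustration; the summing loop only stands in for the elided "..." part of the kernel. The shim at the top is also an assumption, added so the same text builds as plain C on a host; an OpenCL compiler skips it.

```c
/* Host-only shim (illustrative, not part of any SDK): an OpenCL compiler
 * defines __OPENCL_VERSION__ and skips this part. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __local
typedef struct { char x, y, z, pad; } char3; /* OpenCL stores char3 in 4 bytes */
#endif

#define MAX_CONTRIBUTORS 8128 /* divisible by the unroll factor of 16 */

/* Single work-item kernel. Name and `checksum` argument are hypothetical. */
__kernel void Krnl_IntraE_unrolled(__global const char3 *restrict K_contributors,
                                   __global int *restrict checksum)
{
    __local char3 localcache[MAX_CONTRIBUTORS];

    /* Unrolling by 16 gives the compiler 16 consecutive loads per iteration,
     * which it can coalesce statically into one wide memory access. */
    #pragma unroll 16
    for (int i = 0; i < MAX_CONTRIBUTORS; i++) {
        localcache[i] = K_contributors[i];
    }

    /* Hypothetical consumer of the cached data, replacing the elided "...". */
    int sum = 0;
    for (int i = 0; i < MAX_CONTRIBUTORS; i++) {
        sum += localcache[i].x;
    }
    *checksum = sum;
}
```

The factor is a trade-off: a larger factor widens the memory port but costs more area, so it is worth re-checking Average Burst Size in the profiler after each change.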

Altera_Forum
Honored Contributor I

1. Why is the bandwidth 0.1 MB/s? Is something wrong with the profiler? I also see this problem in Quartus 17.0.

2. My kernel code looks like this:

typedef struct {
    float a[20];
} A;

__kernel void foo(__global A *data) {
    A localdata[100];
    for (int i = 0; i < 100; i++) {
        localdata[i] = data[i + index];  // index is defined elsewhere
    }
    ...
}

I expect each memory access to be a coalesced burst read of 20 floats from global memory, so the Average Burst Size should be larger than 1.

But the profiler shows an Average Burst Size of only 4~6. How can I increase my access efficiency?