Re: how to understand burst sizes info from profiler

Altera_Forum · 01-31-2018

hi,

The profiler is showing me the following measurements for the read on the "K_contributors" global arg:

Bandwidth: 0.1 MB/s, 100 % efficiency

Average Burst Size: 2.0

(Max Burst size: 16 )


// THIS IS A SINGLE WORK-ITEM KERNEL
#define MAX_CONTRIBUTORS 8128
void Krnl_IntraE(...
         __global const char3* restrict K_contributors,
)
{
    __local char3  localcache[MAX_CONTRIBUTORS];
    for (ushort i=0; i<MAX_CONTRIBUTORS; i++) {
         localcache[i] = K_contributors[i];
    }
...
}

As the loop index "i" increases consecutively, I expected burst sizes larger than 2.

Is there any explanation for this?

Altera_Forum · 01-31-2018

You should unroll the loop so that the compiler infers a wider port to memory, which allows a larger burst size. Little to no runtime coalescing is done for single work-item kernels, so you should not expect a large burst size without unrolling just because the accesses are consecutive.
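As a rough illustration of the unrolling suggestion, here is a minimal sketch of the copy loop with a partial unroll. The kernel name copy_contributors, the sum_x output argument, and the unroll factor of 16 are assumptions made up for this example and are not from the original kernel. The #ifndef block at the top is only a shim so that the same file also compiles as plain C on a host. Since a char3 occupies 4 bytes in OpenCL, 16 unrolled reads cover 64 consecutive bytes per iteration. Whether that matches the memory interface width of a given board is not something this thread confirms.

```c
/* Host-build shim (an assumption, only so this sketch also compiles as plain C).
   An OpenCL compiler defines __OPENCL_VERSION__ and skips this part. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __local
typedef struct { signed char x, y, z, w; } char3; /* OpenCL char3 occupies 4 bytes */
typedef unsigned short ushort;
#endif

#define MAX_CONTRIBUTORS 8128

/* Hypothetical stand-in for the elided Krnl_IntraE. The sum_x argument exists
   only so that the cached data is actually used. */
__kernel void copy_contributors(__global const char3 *restrict K_contributors,
                                __global int *restrict sum_x)
{
    __local char3 localcache[MAX_CONTRIBUTORS];

    /* Partial unroll by 16 (hypothetical factor; 8128 = 16 * 508, so there is
       no remainder). Each iteration then reads 16 consecutive char3 elements,
       which gives the compiler the chance to infer one wider load. */
    #pragma unroll 16
    for (ushort i = 0; i < MAX_CONTRIBUTORS; i++) {
        localcache[i] = K_contributors[i];
    }

    /* Consume the cache so the copy is not optimized away. */
    int acc = 0;
    for (ushort i = 0; i < MAX_CONTRIBUTORS; i++) {
        acc += localcache[i].x;
    }
    *sum_x = acc;
}
```

A full unroll of 8128 iterations would probably cost far too much area, which is why the sketch uses a partial unroll. The right factor depends on the board, so it is worth trying a few values and comparing the Average Burst Size in the profiler.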

Altera_Forum · 02-08-2018

1. Why is the bandwidth 0.1 MB/s? Is there something wrong with the profiler? I also encounter this problem in Quartus 17.0.

2. My kernel code looks like this:

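typedef struct {
    float a[20];
} A;

__kernel void foo(__global A *data) {
    A localdata[100];

    // "index" is defined elsewhere in the kernel (not shown in this post)
    for (int i = 0; i < 100; i++) {
        localdata[i] = data[i + index];
    }
    // rest of the kernel not shown
}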

I expected each memory access to be a coalesced burst read of 20 floats from global memory, so the Average Burst Size should be larger than 1.

But the profiler shows an Average Burst Size of only 4~6. How can I increase my access efficiency?