memory replication

Altera_Forum · 04-10-2018

I'm confused about when and why the compiler replicates memory. I have a kernel that reads in a block of data and fills a 2D local memory array. The data arrives serially through a channel. Once a block has been read in, the computation portion of the kernel begins, and during that phase data is read out of the memory in parallel. I want the memory to be implemented as a bank of parallel BRAMs and not replicated. The code is below.

local msg_t msgMem[180][512];

// ****************************************
// store input data across M memory banks
// ****************************************
for (uint k = 0; k < NLDPC/M; k++)
{
    #pragma unroll 1
    for (uint r = 0; r < M; r++) {
        msgMem[k][r] = read_channel_intel(LDPC_DEC_SDATA_OUT);
    }
}

/*
More code here
*/

#pragma unroll
for (int r = 0; r < M; r++)
    msg[r] = msgMem[jOffset][r];

Thanks in advance for the help,

Altera_Forum · 04-11-2018

What does your msg_t struct look like?

The compiler automatically banks a local memory on its lowest dimension.

In your code, each element of the lowest dimension is a single msg_t.

Maybe you can try wrapping a whole row in another struct:

typedef struct {
    msg_t kk[512];
} msg_ch;

local msg_ch msgMem[180];

for (uint k = 0; k < NLDPC/M; k++)
{
    #pragma unroll 1
    for (uint r = 0; r < M; r++) {
        msgMem[k].kk[r] = read_channel_intel(LDPC_DEC_SDATA_OUT);
    }
}

// msg has to be declared as msg_ch for this whole-row copy
msg = msgMem[jOffset];
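
To make "banking on the lowest dimension" concrete, here is a minimal plain-C sketch that can be run on a host. It is only a model: the mapping (bank = flattened element index modulo the number of banks) and the helper names `bank_of` and `row_read_is_conflict_free` are assumptions made for illustration, not the compiler's documented behaviour. The compilation report remains the authority on how msgMem is actually built.

```c
#include <stdbool.h>
#include <stddef.h>

#define ROWS 180   /* first dimension of msgMem in the thread */
#define COLS 512   /* second (lowest) dimension of msgMem */

/* Hypothetical model: the bank is chosen by the low-order part of the
   row-major element index, so neighbouring columns land in neighbouring banks. */
size_t bank_of(size_t row, size_t col, size_t num_banks)
{
    size_t flat = row * COLS + col;   /* row-major element index */
    return flat % num_banks;
}

/* Returns true when reading columns 0..m-1 of one row touches m distinct
   banks, i.e. the fully unrolled reads never share a bank in this model. */
bool row_read_is_conflict_free(size_t row, size_t m, size_t num_banks)
{
    bool used[COLS] = { false };

    if (row >= ROWS || m > COLS || num_banks == 0 || num_banks > COLS)
        return false;                 /* outside what the model covers */

    for (size_t col = 0; col < m; col++) {
        size_t b = bank_of(row, col, num_banks);
        if (used[b])
            return false;             /* two unrolled reads hit the same bank */
        used[b] = true;
    }
    return true;
}
```

In this model, 512 banks let all 512 unrolled reads of one row proceed without sharing a bank, while 256 banks make columns 0 and 256 collide. The serial write loop only ever touches one bank per iteration, so it does not add conflicts of its own.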

Altera_Forum · 04-11-2018

--- Quote Start ---

I want the memory bank to be implemented as a bank of parallel BRAMs and not replicated.

--- Quote End ---

Can you clarify what you mean by a "bank of parallel BRAMs"? Each BRAM on the FPGA has only two ports, so if a large local buffer is implemented in BRAMs and accessed dynamically, the compiler has to replicate the whole buffer enough times to serve all accesses to and from it in parallel. It would also help if you archive the report folder and post it here, so that we can see exactly how many times the buffer is replicated and why.
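
The port argument above can be turned into a rough back-of-the-envelope calculation. The plain-C sketch below is a simplified model, not the compiler's actual algorithm: it assumes every copy of the buffer must receive all writes, so only the ports left over after the writes can serve reads. The function name `replicates_needed` and the `ports_per_copy` parameter are hypothetical.

```c
#include <stddef.h>

/* Simplified model (an assumption, not the compiler's real rule):
   each replicate sees every write, and the remaining ports serve reads.
   Returns 0 when the writes alone already use up all ports of one copy. */
size_t replicates_needed(size_t reads, size_t writes, size_t ports_per_copy)
{
    if (writes >= ports_per_copy)
        return 0;                      /* not satisfiable in this model */
    if (reads == 0)
        return 1;                      /* one copy is enough to hold the data */

    size_t read_ports = ports_per_copy - writes;
    return (reads + read_ports - 1) / read_ports;   /* ceiling division */
}
```

With the two ports per BRAM mentioned above, 1 write and 512 unbanked parallel reads would need 512 copies in this model. Passing 4 for `ports_per_copy` (modelling a double-pumped memory, which is an assumption here) brings that down to 171. Either way the count grows with the number of parallel reads, which is why banking the reads matters.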