Community
cancel
Showing results for 
Search instead for 
Did you mean: 
tde_m
Novice
802 Views

Systolic array: channel indexing

Hello,

I'm trying to write a basic example of a systolic array. It is essentially a chain of processing elements connected through channels: the first one read data from memory and inject it into the chain. Each of the internal PE read the data from the previous neighbor, increments it by one and forward to the following neighbor. At the end, data is written into memory.

 

The code is the following. I have used the num_compute_units to have hardware replication.

 

 

 

#pragma OPENCL EXTENSION cl_intel_channels : enable #define N 16 channel float chan[N-1];   //reads 10 elements and inject into channel void start_pe(__global const float *A){ for(int i=0;i<10;i++) write_channel_intel(chan[0],A[0]); }   //read the element, increments and then pass it to the next one void generic_pe(const int idx){ for(int i=0;i<10;i++){ float el=read_channel_intel(chan[idx-1]); el++; write_channel_intel(chan[idx],el); } }     //receives the elements and write them in memory void end_pe(__global float * B){ for(int i=0;i<10;i++) { float el=read_channel_intel(chan[N-2]); B[i]=el; } }   __attribute__((max_work_group_size(64,1,1))) __attribute__((num_compute_units(N))) __kernel void chain(__global const float * restrict A,__global float * restrict B) { const size_t gid = get_global_id(0); if(gid==0) start_pe(A); else if(gid==N-1) end_pe(B); else generic_pe(gid); }

Each PEs, loops over 10 elements.

 

With Quartus 18.0, emulation is ok, but when I try to compile for the arria10, the compilations stops immediately by stating:

 

 

 

Compiler Error: Indexing into channel array chan could not be resolved to all constants

and indicate the line 16.

It looks strange to me, since the index is indeed constant.

 

In the Programming Guide it is written that "channels extension does not support dynamic indexing into arrays of channel IDs" and to use static indexes.

 

However, in the same guide (under "Using Channels with Kernel Copies"), it seems that channel arrays indexed with the return value of global id are valid.

 

Do you have any idea of the source of the problem?

Thanks

0 Kudos
2 Replies
HRZ
Valued Contributor II
38 Views

You seem to be confusing replication of NDRange kernels with Single Work-item ones. The former is completely controlled by the compiler and you CANNOT use it to design a systolic array since you cannot customize the kernel copies. What the guide is referring to is replication of Single Work-item kernels where you can then use the get_compute_id() function to get a compile-time constant ID for each kernel copy and use it to customize each copy and its channel connections. In your case the get_global_id() returns a run-time variable value which cannot be used as channel index.

 

tde_m
Novice
38 Views

Yes, thanks. I should use the get_compute_id!

Reply