Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)
All support for Intel NUC 7 - 13 systems has transitioned to ASUS. Read latest update.
16483 Discussions

Systolic array: channel indexing



I'm trying to write a basic example of a systolic array. It is essentially a chain of processing elements connected through channels: the first one read data from memory and inject it into the chain. Each of the internal PE read the data from the previous neighbor, increments it by one and forward to the following neighbor. At the end, data is written into memory.


The code is the following. I have used the num_compute_units to have hardware replication.




#pragma OPENCL EXTENSION cl_intel_channels : enable #define N 16 channel float chan[N-1];   //reads 10 elements and inject into channel void start_pe(__global const float *A){ for(int i=0;i<10;i++) write_channel_intel(chan[0],A[0]); }   //read the element, increments and then pass it to the next one void generic_pe(const int idx){ for(int i=0;i<10;i++){ float el=read_channel_intel(chan[idx-1]); el++; write_channel_intel(chan[idx],el); } }     //receives the elements and write them in memory void end_pe(__global float * B){ for(int i=0;i<10;i++) { float el=read_channel_intel(chan[N-2]); B[i]=el; } }   __attribute__((max_work_group_size(64,1,1))) __attribute__((num_compute_units(N))) __kernel void chain(__global const float * restrict A,__global float * restrict B) { const size_t gid = get_global_id(0); if(gid==0) start_pe(A); else if(gid==N-1) end_pe(B); else generic_pe(gid); }

Each PEs, loops over 10 elements.


With Quartus 18.0, emulation is ok, but when I try to compile for the arria10, the compilations stops immediately by stating:




Compiler Error: Indexing into channel array chan could not be resolved to all constants

and indicate the line 16.

It looks strange to me, since the index is indeed constant.


In the Programming Guide it is written that "channels extension does not support dynamic indexing into arrays of channel IDs" and to use static indexes.


However, in the same guide (under "Using Channels with Kernel Copies"), it seems that channel arrays indexed with the return value of global id are valid.


Do you have any idea of the source of the problem?


0 Kudos
2 Replies
Valued Contributor III

You seem to be confusing replication of NDRange kernels with Single Work-item ones. The former is completely controlled by the compiler and you CANNOT use it to design a systolic array since you cannot customize the kernel copies. What the guide is referring to is replication of Single Work-item kernels where you can then use the get_compute_id() function to get a compile-time constant ID for each kernel copy and use it to customize each copy and its channel connections. In your case the get_global_id() returns a run-time variable value which cannot be used as channel index.


0 Kudos

Yes, thanks. I should use the get_compute_id!

0 Kudos