
Ordering of channel operations

tde_m
Novice

Hello,

I'm currently struggling to enforce the order of writes and reads to channels.

I know, from the Intel programming guide, that two independent channel operations can be reordered by the compiler to generate efficient hardware.

 

However, this seems to occur even if there is a clear dependence. I've created a minimal working example for this:

#pragma OPENCL EXTENSION cl_intel_channels : enable

// message
typedef struct {
    bool request;
    int data;
} message_t;

// represents the status of the computation
typedef struct {
    bool start;
    message_t m;
} computation_t;

channel message_t channels[2] __attribute__((depth(2)));

// Auxiliary function for receiving data
void receive(computation_t *status, int *data) {
    if (status->start) {
        // at the beginning send the request for data
        write_channel_intel(channels[0], status->m);
        status->start = false;
    }
    // receive the data and store it
    status->m = read_channel_intel(channels[1]);
    *data = status->m.data;
}

__kernel void comp(const int N, const int start, __global int *mem) {
    int data;
    computation_t status;
    status.start = true;
    status.m.data = N;
    for (int i = 0; i < N; i++) {
        // receive data, increment and store it to memory
        receive(&status, &data);
        data++;
        mem[i] = data;
    }
}

// generates a stream of data upon request
__kernel void generator() {
    // receive the request
    message_t m = read_channel_intel(channels[0]);
    for (int i = 0; i < m.data; i++) {
        message_t send;
        send.data = i;
        send.request = false;
        write_channel_intel(channels[1], send);
    }
}

The "comp" kernel consists of a pipelined loop in which it receives data from the "generator" kernel through the "receive" function. In the first iteration, a request is sent to the generator so that it generates the right amount of data.

 

If I try to compile this, the channel operations in the "receive" function are reordered, as can be seen in the report: quartus_report.png

 

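This occurs even though there is a clear dependency between the two.

Clearly, if the read is executed first in hardware, this will lead to deadlock: "comp" blocks reading channels[1] before it has sent the request, while "generator" blocks waiting for that request on channels[0].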

 

This happens with Quartus 18.1 and 19.1 (Stratix 10 as target board).

 

In your opinion, is this a compiler bug, or do I have to handle this in a different way?

 

Thanks

22 Replies
HRZ
Valued Contributor III

Regarding the channel reordering, I think I now understand that the compiler always detaches channel operations from other read/write operations and uses extra registers (register renaming?) to handle dependencies such as the one discussed here, which makes sense. Hence, in this case, if there were no cycle of channels, the channel operations in the "receive" function would still have been reordered, but no data corruption would have happened because the dependency is handled using extra registers. However, because of the cycle of channels combined with the channel reordering, a deadlock happens at run time unless channel ordering is enforced using mem_fence.
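
For anyone landing on this thread, here is a minimal sketch of what enforcing the order with mem_fence could look like in the "receive" function from the original post. It assumes the CLK_CHANNEL_MEM_FENCE flag described in the Intel FPGA SDK for OpenCL programming guide for ordering channel calls; the exact placement of the fence is my assumption and has not been re-checked against the 18.1/19.1 reports here.

```opencl
#pragma OPENCL EXTENSION cl_intel_channels : enable

// Same types and channels as in the original post
typedef struct {
    bool request;
    int data;
} message_t;

typedef struct {
    bool start;
    message_t m;
} computation_t;

channel message_t channels[2] __attribute__((depth(2)));

// Sketch only: receive() with a channel fence between the two channel calls
void receive(computation_t *status, int *data) {
    if (status->start) {
        // first call only: send the request for data
        write_channel_intel(channels[0], status->m);
        status->start = false;
    }
    // Assumed fix: ask the compiler to keep the request write
    // ahead of the data read on the other channel
    mem_fence(CLK_CHANNEL_MEM_FENCE);
    // receive the data and store it
    status->m = read_channel_intel(channels[1]);
    *data = status->m.data;
}
```

After recompiling, the report should be checked again to confirm that the write to channels[0] is now scheduled before the read from channels[1].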

 

Still, since I also thought all this time that channel reordering would not happen when data dependencies are involved, I would say the relationship between channel ordering and data dependencies could be very confusing for people who do not come across this thread, and it is probably best if it is explained somewhere in the documentation.

Douglas_P_Intel
Employee

I agree that the documentation needs to be clarified regarding channel ordering. I will request clarification in the documents.
