Honored Contributor I

Documentation is confusing about calls and synchronization


I am reading the OpenCL documentation, and I ran into two points that confuse me: 


1. It is mentioned on page 28 of the Programming Guide that "for a given kernel, you can only assign one call site per channel ID" (and the same applies to pipes), yet the guide itself calls the same channel multiple times in the same kernel, even inside loops (pages 31 and 33). 

Can anyone clarify this detail to me? 


2. In the synchronization of pipes (example on page 56), why are the calls not ordered when we use the blocking attribute, so that we need to add fences (mem_fence())? How can the calls be blocking and unordered at the same time?
Honored Contributor I

1. Reading from or writing to the same channel in a for loop, unless the loop is unrolled, creates just one "call site"; after all, only one iteration of the loop accesses the channel in each clock cycle, so only one read/write port will exist. Why do you think this creates multiple call sites? Other than the first example on page 28, where the documentation shows the type of channel usage that results in a compilation failure due to multiple call sites, no example with multiple call sites appears anywhere else in the documentation. Note that it is also possible to declare multiple channels the same way you declare an array (channel_name[number_of_channels]); in that case, each channel_name[i] (0 <= i <= number_of_channels - 1) is a separate channel, and call sites to all of these channels can coexist in the same kernel, as long as no channel ID is repeated. 
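To illustrate the distinction, here is a minimal sketch using the Intel FPGA channel extension (older versions of the guide use the `_altera` suffix instead of `_intel`; channel and kernel names here are illustrative):

```c
#pragma OPENCL EXTENSION cl_intel_channels : enable

channel int c0;        // a single channel: only one call site allowed per kernel
channel int lanes[4];  // an array of 4 independent channels

__kernel void producer() {
    // One call site: the rolled loop body contains a single
    // write_channel_intel to c0, executed once per iteration.
    for (int i = 0; i < 1024; i++) {
        write_channel_intel(c0, i);
    }

    // Each lanes[j] has its own channel ID, so even when the loop is
    // unrolled these are call sites to *different* channels and can
    // coexist in the same kernel.
    #pragma unroll
    for (int j = 0; j < 4; j++) {
        write_channel_intel(lanes[j], j);
    }
}
```

A second write to c0 elsewhere in the same kernel (outside the loop, or in an unrolled copy of it) would be a second call site to the same channel ID and would fail to compile.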


2. Blocking channel reads/writes and the ordering of channel operations are not tied to each other; unless you use fences or there is a dependency between your channel operations, the compiler can and will reorder the channel accesses (just as it reorders the rest of the operations in the kernel) to create the most efficient hardware. "Blocking" only means each individual call stalls until its own read/write succeeds; it says nothing about the order of two independent calls relative to each other.
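As a sketch of what this means in practice (channel names are illustrative): the two writes below target different channels and have no data dependency, so without the fence the compiler is free to commit them in either order, even though each write is individually blocking.

```c
#pragma OPENCL EXTENSION cl_intel_channels : enable

channel int ch_a;
channel int ch_b;

__kernel void ordered_writes(int x) {
    write_channel_intel(ch_a, x);
    // Channel-scope fence: forces the write to ch_a to commit
    // before the write to ch_b is issued.
    mem_fence(CLK_CHANNEL_MEM_FENCE);
    write_channel_intel(ch_b, x);
}
```

If a consumer kernel expects the value on ch_a to arrive before the one on ch_b, the fence is what guarantees that, not the blocking attribute.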