When we access memory in an OpenCL kernel like this:
for (int i = 0; i < N; i++)
... = A[i];
are the loads executed in a non-blocking manner? That is, does the generated FSM wait for each memory load to complete before sending the next load request to memory, or does it send out multiple load requests one after another and then handle the responses in order as they come back?
1 Reply
In Single Work-item kernels, loops are pipelined, and this also applies to the memory accesses inside those loops. Hence, load requests are sent back to back and, after a certain delay, the data is received in the same order. If the buffer between the kernel and memory becomes empty, the kernel stalls until new data arrives.
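
To get a feel for the difference, here is a rough cycle-count sketch in plain C. It is only a toy model under stated assumptions, not what the compiler actually generates: the function names `blocking_cycles` and `pipelined_cycles` are made up for illustration, the memory latency is assumed to be a fixed number of cycles, one request is assumed to issue per cycle, and bandwidth limits and stalls are ignored.

```c
#include <stdio.h>

/* Toy model (assumption): every load takes exactly `latency` cycles. */

/* Blocking: the next request is issued only after the previous
 * response has arrived, so the latencies add up. */
long blocking_cycles(long n, long latency) {
    long now = 0;
    for (long i = 0; i < n; i++) {
        now += latency;          /* issue at `now`, wait for the data */
    }
    return now;
}

/* Pipelined: request i is issued at cycle i (back to back) and its
 * response arrives `latency` cycles later, in the same order. */
long pipelined_cycles(long n, long latency) {
    long done = 0;
    for (long i = 0; i < n; i++) {
        long issue = i;          /* one new request per cycle */
        done = issue + latency;  /* arrival cycle of response i */
    }
    return done;                 /* cycle at which the last data arrives */
}
```

With these assumptions, `blocking_cycles(1000, 100)` gives 100000 cycles while `pipelined_cycles(1000, 100)` gives 1099, so the latency is paid roughly once instead of once per iteration. Real numbers will depend on the memory system and on how often the kernel stalls.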
