OpenCL* for CPU

Strange crash... access to __local memory

Polar01
Beginner
Hi,
I have implemented a 'scan' algorithm in OpenCL. Because it is an open-source library, I test it on several machines and OpenCL SDKs.
But it crashes with the Intel SDK (and not with the others).
What I have discovered is that the problem may be related to the "__local" memory.
I have the following kernel:
__kernel void kernel__ExclusivePrefixScan(..., __local T* localBuffer, ...)
And I set up my buffer with the following command:
clStatus = clSetKernelArg(_kernel_Scan, 2, _workgroupSize * 2 * sizeof(int), 0);
checkCLStatus(clStatus); // returns CL_SUCCESS
where _workgroupSize = 128, so I reserve only 1024 bytes.
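For a __local kernel argument, clSetKernelArg takes the buffer size in bytes and a null argument value, so the 1024-byte figure follows directly from the arithmetic. A minimal sketch of that calculation, assuming 4-byte elements; the helper name local_scan_buffer_bytes is hypothetical and not part of clpp:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bytes of __local memory for a scan buffer that holds two elements
 * per work-item (hypothetical helper, for illustration only).
 */
size_t local_scan_buffer_bytes(size_t work_group_size, size_t element_size)
{
    assert(work_group_size > 0 && element_size > 0);
    return work_group_size * 2 * element_size;
}
```

With a work-group size of 128 and 4-byte elements this gives 128 * 2 * 4 = 1024 bytes, matching the value above.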
You can find the code at: http://code.google.com/p/clpp/
Krys
1 Solution
Evgeny_F_Intel
Employee
Hi,

I investigated this on rev41.

Well, the issue here is not the __local memory but a memory overrun during the write at:
line 175, clppScan.cl: blockSums[bid] = localBuffer[localBufferFullSize-1];

Looking at the host code, I saw that you allocate a memory buffer that is not large enough for the algorithm.

One of the issues is the buffer size calculation (line 219, clppScan.cpp). It uses a workgroup size of 128, while a local size of 64 (line 72) is passed to the NDRange. As a result, the number of workgroups is greater than the size of the allocated buffer, and you get a memory overrun.
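The mismatch can be checked on the host before enqueueing. The following is only a sketch, not the clpp code: the function names are hypothetical, and the global size of 1024 used in the note below is an assumed value chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups an NDRange launches: ceil(global_size / local_size). */
size_t work_group_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    return (global_size + local_size - 1) / local_size;
}

/*
 * Returns 1 if a buffer with block_sum_count elements has one slot for
 * every launched work-group, 0 if blockSums[bid] would run past the end.
 */
int block_sums_fit(size_t block_sum_count, size_t global_size, size_t local_size)
{
    return work_group_count(global_size, local_size) <= block_sum_count;
}
```

For an assumed global size of 1024, sizing the buffer with a work-group size of 128 gives 8 slots, but launching with a local size of 64 creates 16 work-groups, so block_sums_fit(8, 1024, 64) returns 0. Using the same value for both the allocation and the launch makes the check pass.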

After that change, the first NDRange passed. I added clFinish() after it, but then the next NDRange failed, for the same reason: the intermediate buffer size doesn't match the number of work groups. You should probably decrease the global size in the next pass.
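One way to keep each pass consistent is to derive every pass's work-group count from the previous pass's output size. The sketch below assumes a generic multi-level scan in which each work-group consumes a fixed number of inputs and emits one block sum; that layout, the function name plan_scan_passes, and its parameters are assumptions for illustration, not taken from clpp.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fills group_counts[i] with the number of work-groups for pass i of a
 * hypothetical multi-level scan, where each work-group consumes
 * elems_per_group inputs and emits one block sum for the next pass.
 * Returns the number of passes written (at most max_passes).
 */
size_t plan_scan_passes(size_t element_count, size_t elems_per_group,
                        size_t *group_counts, size_t max_passes)
{
    size_t passes = 0;
    size_t remaining = element_count;

    assert(elems_per_group > 1);   /* otherwise the input never shrinks */

    while (passes < max_passes && remaining > 0) {
        size_t groups = (remaining + elems_per_group - 1) / elems_per_group;
        group_counts[passes++] = groups;
        if (groups == 1)
            break;                 /* a single group: no block sums left to scan */
        remaining = groups;        /* next pass scans this pass's block sums */
    }
    return passes;
}
```

Under these assumptions, the block-sum buffer for pass i needs group_counts[i] elements, and the global size of pass i is group_counts[i] times the local size actually passed to the NDRange.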

Regards,
Evgeny

21 Replies
Polar01
Beginner
Great,
Thanks for your support