OpenCL* for CPU

Another kernel crash, with reproducer

ABoxe
Beginner

This one is very simple: it just reads in blocks of an image and stores them in local memory (LDS).

Crashes with access violation on read.

Windows 7, latest SDK, CPU device.

//////////////////////////////////////////////////////////////////////////////////////////////////////////

// image is of dimension 512 x 512
//size_t local_work_size[3] = {32, 32/4, 1};
//size_t global_work_size[3] = {512, 512/4,1};


#define CODEBLOCKX 32
#define CODEBLOCKY 32
#define CODEBLOCKY_QUARTER 8
#define BOUNDARY 1
#define STATE_BUFFER_SIZE 1156
#define STATE_BUFFER_SIZE_QUARTER 289


void kernel run(read_only image2d_t R) {

    local int state[STATE_BUFFER_SIZE];

    //initialize pixels (including top and bottom boundary pixels)
    int2 posIn = (int2)(get_global_id(0) + get_global_id(0)*CODEBLOCKX,  get_global_id(1)*CODEBLOCKY);
    local int* statePtr = state + BOUNDARY + get_local_id(0);
    for (int i = 0; i < 4; ++i) {
        *statePtr = read_imagei(R, sampler, posIn).x;
        posIn.y += CODEBLOCKY_QUARTER;
        statePtr += STATE_BUFFER_SIZE_QUARTER;

    }
}

ABoxe
Beginner

Unfortunately, I am simply unable to work on this kernel anymore on my laptop, which only has a CPU: it is constantly crashing.

It has been four days since I reported this, and no response. Very frustrated!!!!

Robert_I_Intel
Employee

Aaron,

Sorry for the late reply. Hopefully you have resolved this yourself by now. If not, here is the problem:

Your STATE_BUFFER_SIZE is 1156. Your get_local_id(0) could be from 0 to 31.

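Note that statePtr starts at state + 32 when get_local_id(0) == 31.

Each pass through the loop advances statePtr by STATE_BUFFER_SIZE_QUARTER, and 4 * STATE_BUFFER_SIZE_QUARTER == STATE_BUFFER_SIZE. So by the end of the fourth iteration, your statePtr has been advanced to state + 32 + STATE_BUFFER_SIZE, way beyond the extents of your state buffer.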

One simple way to fix it is to make your state buffer larger by 33, so the last pointer advance is still valid.
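
The pointer arithmetic can be checked on the host. What follows is a minimal sketch in plain C (not OpenCL C) that models only get_local_id(0) and the four-pass loop, assuming the local size of 32 in x given in the original post. The helper names are hypothetical and are not part of the kernel. Under those assumptions, the largest offset the loop writes to is 899, while the value left in statePtr after the final advance is 1188 for local id 31, which is the out-of-range pointer value described above.

```c
#include <assert.h>

/* Constants copied from the kernel in the original post. */
#define BOUNDARY 1
#define STATE_BUFFER_SIZE 1156
#define STATE_BUFFER_SIZE_QUARTER 289

/* Assumptions taken from the post's comments and loop header. */
#define LOCAL_SIZE_X 32 /* local_work_size[0] */
#define ITERATIONS 4    /* trip count of the for loop */

/* Offset from `state` that the work-item with local id `lid` writes to
   on loop pass `iter` (0-based). Hypothetical helper for illustration. */
int state_write_offset(int lid, int iter)
{
    assert(lid >= 0 && lid < LOCAL_SIZE_X);
    assert(iter >= 0 && iter < ITERATIONS);
    return BOUNDARY + lid + iter * STATE_BUFFER_SIZE_QUARTER;
}

/* Offset held by statePtr once the last `statePtr += ...` has run.
   The loop as posted exits at this point without writing through it. */
int state_final_offset(int lid)
{
    assert(lid >= 0 && lid < LOCAL_SIZE_X);
    return BOUNDARY + lid + ITERATIONS * STATE_BUFFER_SIZE_QUARTER;
}

/* Largest offset written by any local id on any pass. */
int max_state_write_offset(void)
{
    int max = 0;
    for (int lid = 0; lid < LOCAL_SIZE_X; ++lid) {
        for (int iter = 0; iter < ITERATIONS; ++iter) {
            int off = state_write_offset(lid, iter);
            if (off > max)
                max = off;
        }
    }
    return max;
}
```

With the buffer enlarged by 33 elements (1189 in total), state_final_offset(31) == 1188 falls inside the array.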

Hope that helps!

Robert_I_Intel
Employee

Aaron,

One more thing: in this line

 int2 posIn = (int2)(get_global_id(0) + get_global_id(0)*CODEBLOCKX,  get_global_id(1)*CODEBLOCKY);

With your work sizes, get_global_id(0) is between 0 and 511 and get_global_id(1) is between 0 and 127. Are you sure that is what you want?

Or did you mean get_group_id(0) and get_group_id(1), which will be between 0 and 15?
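
To make the difference concrete, here is a minimal host-side sketch in plain C (not OpenCL C) that evaluates the posIn expression from the kernel for the largest id in each case. It assumes the 512 x 512 image and the work sizes given in the original post; the helper names are hypothetical. With global ids the coordinates reach (16863, 4088), far outside the image, while with group ids they stay within (495, 504). What an out-of-range read does depends on the sampler's addressing mode, which the post does not show.

```c
#include <assert.h>

/* Constants copied from the kernel in the original post. */
#define CODEBLOCKX 32
#define CODEBLOCKY 32
#define CODEBLOCKY_QUARTER 8

/* Assumptions taken from the post's comments. */
#define IMAGE_WIDTH 512
#define IMAGE_HEIGHT 512
#define ITERATIONS 4 /* trip count of the kernel's loop */

/* x coordinate exactly as written in the kernel: id + id * CODEBLOCKX.
   `id0` stands for either get_global_id(0) or get_group_id(0). */
int pos_x(int id0)
{
    return id0 + id0 * CODEBLOCKX;
}

/* y coordinate read on loop pass `iter` (0-based): the initial
   id * CODEBLOCKY plus `iter` advances of CODEBLOCKY_QUARTER. */
int pos_y(int id1, int iter)
{
    assert(iter >= 0 && iter < ITERATIONS);
    return id1 * CODEBLOCKY + iter * CODEBLOCKY_QUARTER;
}

/* Returns 1 if (x, y) lies inside the image, 0 otherwise. */
int in_image(int x, int y)
{
    return x >= 0 && x < IMAGE_WIDTH && y >= 0 && y < IMAGE_HEIGHT;
}
```

For example, in_image(pos_x(511), pos_y(127, 3)) is 0 for the largest global ids, and in_image(pos_x(15), pos_y(15, 3)) is 1 for the largest group ids.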
