OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.

Kernel execution crash on Linux!

zhaopeng
Hello,
I wrote a kernel for streamline generation. Its interface is as follows.
__kernel void generateStreamlines(__global float3* flowData,
                                  __global float3* seeds,
                                  __global float3* streamlineVertex,
                                  __global float3* streamlineTangent,
                                  __global uint* streamlineVertexNum,
                                  GenerationParameters generationParameters,
                                  UniformGridInfo gridInfo)
GenerationParameters and UniformGridInfo are structs defined as follows.
typedef struct {
    float3 lowBound;
    float3 upBound;
    float initStepSize;
    float maxLength;        // max streamline length
    float terminationSpeed;
    float adaptiveUpBound;  // if > adaptiveUpBound, enlarge step size
    float adaptiveLowBound; // if < adaptiveLowBound, reduce step size
    uint maxStepNum;
} GenerationParameters;
typedef struct {
    float3 origin;
    int3 extentLow;
    int3 extentUp;
    float3 spacing;
    float3 reciprocalSpacing;
    float reciprocalSpacingXY;
    float reciprocalSpacingYZ;
    float reciprocalSpacingZX;
    float reciprocalSpacingXYZ;
} UniformGridInfo;
The kernel crashes when executed on openSUSE 12.1 (64-bit) with a Core i5-2500K. I tested the same kernel with the AMD and NVIDIA OpenCL implementations and it runs fine there. Strangely, changing the order of the arguments or removing one argument makes the problem go away with the Intel OpenCL driver. Any advice? Thanks a lot!
ZHAO Peng
zhaopeng
After more testing, I found that the cause may be related to the alignment of float3. I replaced float3 with plain float and everything works.
Be careful with float3, which may introduce problems. Even worse, different OpenCL implementations show different symptoms.
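
For anyone wanting to try the same workaround, here is a minimal sketch in plain C of what indexing a tightly packed float array looks like. The names Vec3, load_vec3 and store_vec3 are hypothetical and are not taken from the original kernel. In OpenCL C itself, the built-in vload3 and vstore3 functions do the same job on a __global float* pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3-component value type; not part of any OpenCL header. */
typedef struct { float x, y, z; } Vec3;

/* Three packed floats take 12 bytes, so no padding component is stored. */
static_assert(sizeof(Vec3) == 3 * sizeof(float), "Vec3 is expected to be packed");

/* Read element i from a tightly packed x,y,z,x,y,z,... float array. */
static Vec3 load_vec3(const float *data, size_t i)
{
    Vec3 v = { data[3 * i], data[3 * i + 1], data[3 * i + 2] };
    return v;
}

/* Write element i back into the same packed layout. */
static void store_vec3(float *data, size_t i, Vec3 v)
{
    data[3 * i]     = v.x;
    data[3 * i + 1] = v.y;
    data[3 * i + 2] = v.z;
}
```

With this layout only individual floats are accessed, so the buffer needs nothing stricter than 4-byte alignment.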
IDZ_A_Intel
Hi,

float3 needs to be aligned to a 16-byte boundary. I am guessing that your data was not aligned. The times it worked, the data happened to be aligned by chance, so you didn't see the crash.

Make sure you align the data to a float4 boundary and let me know if you still see the crash.
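
To illustrate the difference, here is a rough host-side sketch in C11. HostFloat3 and PackedFloat3 are hypothetical stand-ins written for this example; in a real host program the cl_float3 type from the OpenCL headers would normally be used instead. The sketch assumes a float3 occupies the same 16 bytes as a float4.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical host-side stand-in for an OpenCL float3:
   three floats padded and aligned to 16 bytes, like a float4. */
typedef struct {
    alignas(16) float s[4];   /* s[3] is padding */
} HostFloat3;

/* Naive 12-byte layout that does not match a device-side float3. */
typedef struct { float x, y, z; } PackedFloat3;

static_assert(sizeof(HostFloat3) == 16, "padded element is 16 bytes");
static_assert(alignof(HostFloat3) == 16, "padded element is 16-byte aligned");
static_assert(sizeof(PackedFloat3) == 12, "packed element is 12 bytes");

/* Byte offset of element i in each layout. */
static size_t padded_offset(size_t i) { return i * sizeof(HostFloat3); }
static size_t packed_offset(size_t i) { return i * sizeof(PackedFloat3); }
```

In the packed layout element 1 starts at byte 12, which is not a multiple of 16, so a kernel reading it as float3 would see misaligned data.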

Thanks,
Raghu
zhaopeng
Thanks for your reply!
From the specification:
The OpenCL compiler is responsible for aligning data items to the appropriate alignment as required by the data type. For arguments to a __kernel function declared to be a pointer to a data type, the OpenCL compiler can assume that the pointee is always appropriately aligned as required by the data type.
Do you mean that Intel's compiler does nothing about alignment and leaves it to the user? It is not clear to me how to align data to a float4 boundary. Do you mean changing my 3-component data array to a 4-component data array by adding one more component, or just making the whole array size a multiple of 16?
I'll stick with float because I need to process large data sets and don't want to waste an extra component. But I'd like to give it a test.
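
For reference, the "one more component" option could be sketched on the host side roughly as follows. The function name pad_xyz_to_xyzw is hypothetical, and the sketch assumes the destination buffer has room for 4 floats per element. It costs one third more memory than the packed layout.

```c
#include <stddef.h>

/* Copy n tightly packed xyz triples into a layout with 4 floats per
   element. The fourth component is padding and is set to zero.
   dst must have room for 4 * n floats. */
static void pad_xyz_to_xyzw(const float *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[4 * i + 0] = src[3 * i + 0];
        dst[4 * i + 1] = src[3 * i + 1];
        dst[4 * i + 2] = src[3 * i + 2];
        dst[4 * i + 3] = 0.0f;
    }
}
```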
Thanks again!
ZHAO Peng
Raghupathi_M_Intel
If the compiler is responsible for aligning data, then it should do so; otherwise it is a bug. If my understanding is right, the compiler assumes that the __kernel arguments are properly aligned.

What exactly is causing the problem: the structs or the float3 arguments? The float3 components are aligned to a 16-byte boundary, so you don't need to add the padding. Can you create a small reproducer?
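
One way to check the struct side is to mirror GenerationParameters on the host and assert the offsets a conforming OpenCL C compiler would be expected to use. This is only a sketch: HostFloat3 and HostGenerationParameters are hypothetical names, and the offsets assume float3 is 16 bytes with 16-byte alignment.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host stand-in for float3 (16 bytes, 16-byte aligned). */
typedef struct { alignas(16) float s[4]; } HostFloat3;

/* Host mirror of the GenerationParameters struct from the question. */
typedef struct {
    HostFloat3 lowBound;         /* offset  0 */
    HostFloat3 upBound;          /* offset 16 */
    float      initStepSize;     /* offset 32 */
    float      maxLength;
    float      terminationSpeed;
    float      adaptiveUpBound;
    float      adaptiveLowBound;
    uint32_t   maxStepNum;       /* offset 52 */
} HostGenerationParameters;

static_assert(offsetof(HostGenerationParameters, upBound) == 16, "upBound");
static_assert(offsetof(HostGenerationParameters, initStepSize) == 32, "initStepSize");
static_assert(offsetof(HostGenerationParameters, maxStepNum) == 52, "maxStepNum");
/* 56 bytes of members, rounded up to the 16-byte struct alignment. */
static_assert(sizeof(HostGenerationParameters) == 64, "total size");
```

If the host struct passed to clSetKernelArg has a different size or different offsets, the mismatch would be a likely place to look first.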

You can read about the alignment attribute, which can be used to force alignment, in section 6.10 of the OpenCL 1.1 specification or section 6.11 of the OpenCL 1.2 specification.
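
As a rough illustration, the aligned attribute in OpenCL C is spelled the same way as in GCC and Clang, so the idea can be sketched in host C. AlignedVec3, alloc_aligned_vec3 and is_aligned16 are hypothetical names, and the sketch assumes a C11 library that provides aligned_alloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Force 16-byte alignment on a 3-float struct (GCC/Clang syntax).
   The compiler pads the size up to the alignment. */
typedef struct { float x, y, z; } __attribute__((aligned(16))) AlignedVec3;

static_assert(sizeof(AlignedVec3) == 16, "size is padded to the alignment");

/* Allocate n elements on a 16-byte boundary; release with free(). */
static AlignedVec3 *alloc_aligned_vec3(size_t n)
{
    return aligned_alloc(16, n * sizeof(AlignedVec3));
}

/* Check whether a pointer sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}
```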

Thanks,
Raghu