OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.
Announcements
This forum covers OpenCL* for CPU only. OpenCL* for GPU questions can be asked in the GPU Compute Software forum. Intel® FPGA SDK for OpenCL™ questions can be asked in the FPGA Intel® High Level Design forum.

Intel Phi does not write data to buffer when clEnqueueReadBuffer is called on CentOS

Jackson_H_
Beginner

Hello everyone!

I'm running into a problem where data is not being written to my buffer when the kernels finish. I've tested my kernel in isolation in Eclipse running in Ubuntu on an Intel i5 CPU and it seems to output the correct results. When I move it over to CentOS I can't get printf statements to return from the kernel and my output buffers are never written to. Here is an example of my code:

double * coef_elts = (double *) calloc(p * voxels, sizeof(double));

return_vec_1 = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, sizeof(double) * p * voxels, coef_elts, &err);

err = clSetKernelArg(kernel, 26, sizeof(cl_mem), &return_vec_1);

err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, NULL, 0, NULL, NULL);

err = clEnqueueReadBuffer(queue, return_vec_1, CL_TRUE, 0, sizeof(double) * p * voxels, coef_elts, 0, NULL, NULL);

When I read the output, it contains only the zeros assigned by calloc. This wasn't the case in Eclipse. If anyone has any suggestions about the code or about getting output on CentOS, it would be much appreciated. I am aware CentOS is not supported, but unfortunately I cannot change the OS.
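Because calloc zeros look the same as a kernel that legitimately wrote 0.0, one way to tell "never written" apart from "written as zero" is to pre-fill the host buffer with a sentinel. The following is only a minimal sketch of that idea; the helper names fill_sentinel and count_untouched are hypothetical and not part of OpenCL, and it assumes the kernel never produces NaN as a valid result.

```c
#include <math.h>
#include <stddef.h>

/* Fill the host buffer with NaN so untouched elements are recognizable. */
static void fill_sentinel(double *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        buf[i] = NAN;
}

/* Count elements that still hold the sentinel after the read-back. */
static size_t count_untouched(const double *buf, size_t n)
{
    size_t untouched = 0;
    for (size_t i = 0; i < n; ++i)
        if (isnan(buf[i]))
            ++untouched;
    return untouched;
}
```

In the code above, this would mean calling fill_sentinel(coef_elts, p * voxels) before clCreateBuffer and count_untouched(coef_elts, p * voxels) after clEnqueueReadBuffer returns. If the count equals p * voxels, nothing reached the host buffer at all.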

Thank you!

8 Replies
Robert_I_Intel
Employee

Try the following just to make sure that the kernel completes:

cl_event event = NULL;

err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, NULL, 0, NULL, &event);

err = clWaitForEvents(1, &event);

err = clReleaseEvent(event);

You could also put the completion event on clEnqueueReadBuffer, though that is not strictly necessary.

Another workaround is to put clFlush and/or clFinish between clEnqueueNDRangeKernel and clEnqueueReadBuffer.
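Whichever synchronization is used, it also helps to look at the value stored in err after every call, since the snippets in this thread assign it but never inspect it. Below is a rough sketch of such a check. The helper names cl_status_name and check_cl are hypothetical. The numeric values are the ones defined in the Khronos CL/cl.h header, written as literals here so the sketch compiles without the SDK; in real host code the CL_* constants from the header would be used instead.

```c
#include <stdio.h>

/* Map a few common OpenCL status codes to their names.
   Values match the definitions in the Khronos CL/cl.h header. */
static const char *cl_status_name(int status)
{
    switch (status) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -42: return "CL_INVALID_BINARY";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case -52: return "CL_INVALID_KERNEL_ARGS";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    default:  return "unlisted OpenCL status";
    }
}

/* Report a failed call on stderr; returns 1 on success, 0 on failure. */
static int check_cl(int status, const char *call)
{
    if (status != 0) {
        fprintf(stderr, "%s failed: %s (%d)\n",
                call, cl_status_name(status), status);
        return 0;
    }
    return 1;
}
```

A typical use would be `if (!check_cl(err, "clEnqueueNDRangeKernel")) return 1;` directly after each call, so a failed kernel launch is not mistaken for a buffer that was silently left untouched.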

Let me know how it worked out.

Jackson_H_
Beginner

I've tried what you suggested and I'm still getting the same results. I tried changing the CL_MEM_USE_HOST_PTR flag to CL_MEM_COPY_HOST_PTR and CL_MEM_ALLOC_HOST_PTR just in case it would make a difference, but those didn't work either. I'm now trying to run the code on the Intel Xeon processor instead of the Phi to check whether the problem is with the Phi or the OS.

Robert_I_Intel
Employee

Jackson,

Could you please provide the reproducer, if possible? Also, which OS version, processor, and OpenCL driver version are you using?

At a minimum, what is your kernel?

Thanks!

Jackson_H_
Beginner

Update: The code works on the Xeon processor so the problem is most likely with the Phi. I do have to leave the computer for a while but I will return with the information you wanted. Also, I'm unfamiliar with what "reproducer" means. If you can elaborate for me I would be happy to provide it when I return.

Robert_I_Intel
Employee

A reproducer is a buildable minimal code sample that reproduces the problem. Usually, we use it to reproduce the issue on our end and file the bug with the driver team.

Jackson_H_
Beginner

I've talked to my partner in charge of the dev environment, and he says the problem might be that I can't compile the code for the Phi on the machine where I'm writing it, since that machine does not have a Phi. I've been sending the bin files over to the node that does have the Phi. If that is the problem, we can fix it pretty easily. Here is the information you asked for, in case you want to test it anyway.

OS: CentOS version 7.0.1406

Processor: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

Co-processor: Intel Corporation Xeon Phi coprocessor 31S1 (rev 11)

OpenCL driver version: 4.6.0.92_x64
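If the suspicion about binaries built on a machine without a Phi turns out to be right, one possible guard is to store the device name and driver version next to each cached binary and only load the binary when both match the current device. This is just a sketch of that check under stated assumptions: struct binary_tag and binary_matches_device are hypothetical names, and the strings are assumed to come from clGetDeviceInfo with CL_DEVICE_NAME and CL_DRIVER_VERSION.

```c
#include <string.h>

/* Hypothetical record saved alongside a cached program binary. */
struct binary_tag {
    char device_name[128];
    char driver_version[64];
};

/* Return 1 if the cached binary was built for this device and driver,
   0 otherwise. */
static int binary_matches_device(const struct binary_tag *tag,
                                 const char *device_name,
                                 const char *driver_version)
{
    return strcmp(tag->device_name, device_name) == 0 &&
           strcmp(tag->driver_version, driver_version) == 0;
}
```

On a mismatch, the host code could fall back to clCreateProgramWithSource and clBuildProgram on the node that actually has the Phi, instead of loading a binary that was produced for a different device.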

Jackson_H_
Beginner

I've attached the reproducer and kernel.

Yuri_K_Intel
Employee
Hi Jackson,

So far I cannot reproduce the behavior you described: I get the same results for both the Xeon Phi and the CPU. The output is attached.

I'm using Intel OpenCL runtime version 14.2 (package opencl_runtime_14.2_x64_4.5.0.8.tgz from https://software.intel.com/en-us/articles/opencl-drivers), which is the latest supported version for Xeon Phi. Please also make sure that a corresponding, compatible version of MPSS is installed on the target machine. As far as I remember, it should be version 3.3.x.

Thanks,
Yuri