OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.

Serializing 2 kernels

ZVere
Beginner

Hello,

My input data is an N-row by M-column matrix. Each cell is a float.

The first stage:
Subtract each row from its previous one.
The output data is (N-1) rows X M columns.
For the subtraction, I think (not sure) I have to keep the input matrix and put the output in a new matrix.
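
A minimal host-side reference sketch of the first stage, in plain C. The function name `row_diff` and the row-major layout are assumptions for illustration. It reads "subtract each row from its previous one" as previous minus current; swap the operands if the opposite sign is intended. Writing to a separate output buffer matters once each element becomes its own work-item: in place, the result for row r would overwrite data that the work-items computing row r-1 still need to read.

```c
#include <stddef.h>

/* Stage 1 reference: out is (n-1) x m, in is n x m, both row-major.
   out[r][c] = in[r][c] - in[r+1][c]  (previous row minus current row). */
void row_diff(const float *in, float *out, size_t n, size_t m)
{
    for (size_t r = 0; r + 1 < n; ++r) {
        for (size_t c = 0; c < m; ++c) {
            out[r * m + c] = in[r * m + c] - in[(r + 1) * m + c];
        }
    }
}
```

In an OpenCL kernel the two loops would typically become the two dimensions of the NDRange, with one work-item per output cell.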

Second stage:
FFT on each row. The output is (N-1) rows X M columns.
For the FFT, each work-item is one butterfly. For M items in a row I have M/4 butterflies.
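
A count of M/4 butterflies for M items matches a radix-4 FFT, where each butterfly consumes four points per stage. Assuming that is the scheme meant here, and that M is a power of 4, the work per row can be sketched as follows (the helper names are hypothetical):

```c
#include <stddef.h>

/* Radix-4 butterflies needed in one FFT stage for a row of m points. */
size_t butterflies_per_stage(size_t m)
{
    return m / 4;
}

/* Number of radix-4 stages, i.e. log4(m), assuming m is a power of 4. */
unsigned radix4_stages(size_t m)
{
    unsigned stages = 0;
    while (m > 1) {
        m /= 4;
        ++stages;
    }
    return stages;
}
```

Under these assumptions each stage over the whole matrix needs (N-1) * M/4 work-items, and every stage must finish before the next one starts.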

Is it possible to run both operations without returning to the host after the first stage?

Best regards,
Z.V

Robert_I_Intel
Employee

Hi Zvi,

It depends on your hardware. If you have a 5th or 6th generation Intel processor (Broadwell or Skylake), which supports OpenCL 2.0, you can enqueue the second kernel from the first one. See https://software.intel.com/en-us/articles/gpu-quicksort-in-opencl-20-using-nested-parallelism-and-work-group-scan-functions for an example of how to do that, or https://software.intel.com/en-us/articles/sierpinski-carpet-in-opencl-20 for a toy example.
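
A rough sketch of what device-side enqueue could look like for this case, written as OpenCL C 2.0 source held in a host-side C string. The kernel names `diff_then_fft` and `fft_rows` are hypothetical, `fft_rows` is only a placeholder (no FFT is implemented), the subtraction sign is one reading of the question, and the child range of (N-1) * M/4 work-items assumes one radix-4 butterfly per work-item. This has not been tuned or verified on a specific device.

```c
/* OpenCL C 2.0 source kept as a host-side string. It is meant to be passed
   to clCreateProgramWithSource and built with the option "-cl-std=CL2.0". */
static const char *const kDiffThenFftSource =
    "/* Hypothetical placeholder for the per-row FFT stage. */\n"
    "kernel void fft_rows(global float *diff, int m)\n"
    "{\n"
    "    /* butterfly work for get_global_id(0) would go here */\n"
    "}\n"
    "\n"
    "kernel void diff_then_fft(global const float *in, global float *diff,\n"
    "                          int n, int m)\n"
    "{\n"
    "    size_t col = get_global_id(0);   /* 0 .. m-1 */\n"
    "    size_t row = get_global_id(1);   /* 0 .. n-2 */\n"
    "    diff[row * m + col] = in[row * m + col] - in[(row + 1) * m + col];\n"
    "\n"
    "    /* One work-item launches stage 2. WAIT_KERNEL holds the child  */\n"
    "    /* back until all work-items of this kernel have finished.      */\n"
    "    if (col == 0 && row == 0) {\n"
    "        ndrange_t range = ndrange_1D((size_t)(n - 1) * (m / 4));\n"
    "        enqueue_kernel(get_default_queue(),\n"
    "                       CLK_ENQUEUE_FLAGS_WAIT_KERNEL,\n"
    "                       range,\n"
    "                       ^{ fft_rows(diff, m); });\n"
    "    }\n"
    "}\n";
```

The host would launch `diff_then_fft` once with a 2D global size of M by N-1. For `get_default_queue()` to return a usable queue, the host also has to create a default on-device queue beforehand with `clCreateCommandQueueWithProperties`, passing `CL_QUEUE_ON_DEVICE | CL_QUEUE_ON_DEVICE_DEFAULT | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE` as the queue properties. A multi-stage FFT would additionally need its own synchronization between stages, which this sketch leaves out.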
