Hello,
My input data is a matrix of N rows by M columns. Each cell is a float.
The first stage:
Subtract each row from its previous one.
The output data is (N-1) rows X M columns.
For the subtraction, I think (but I am not sure) that I have to keep the input matrix and write the output to a new matrix.
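To make the first stage concrete, here is a minimal host-side reference sketch in plain Python, not a kernel. The function name `row_differences` and the sign convention (previous row minus current row) are assumptions made for illustration. It writes to a new matrix, which mirrors the concern above: in a parallel version, input row i is read to produce two neighbouring output rows, so overwriting it in place could race with the other reader.

```python
def row_differences(matrix):
    """Return a new (N-1) x M matrix; output row i is matrix[i] - matrix[i + 1].

    The sign convention (previous minus current) is an assumption.
    The input matrix is left untouched.
    """
    if len(matrix) < 2:
        return []
    return [
        [prev - cur for prev, cur in zip(matrix[i], matrix[i + 1])]
        for i in range(len(matrix) - 1)
    ]


# Small example: N = 3 rows, M = 4 columns.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 1.0, 1.5, 2.0],
    [4.0, 3.0, 2.0, 1.0],
]
diff = row_differences(data)  # 2 rows x 4 columns
```

Each output element depends on only two input elements, so in a kernel this would map naturally to one work-item per output cell.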
Second stage:
FFT on each row. The output is (N-1) rows X M columns.
For the FFT, each work item is a butterfly. For M items in a row, I have M/4 butterflies.
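As a reference for the second stage, the sketch below runs an iterative radix-2 FFT on each row in plain Python. It is an illustration under assumptions, not the poster's implementation: the names `fft_radix2` and `fft_rows` are hypothetical, M is assumed to be a power of two, and radix-2 is chosen for simplicity, which gives M/2 butterflies per stage. A radix-4 decomposition would give M/4 butterflies per stage, which may be what the figure above refers to. The inner loop over `b` is the part that would map to one work-item per butterfly.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_radix2(row):
    """Iterative radix-2 decimation-in-time FFT of one row (length a power of two)."""
    m = len(row)
    if m == 0 or m & (m - 1):
        raise ValueError("row length must be a power of two")
    bits = m.bit_length() - 1
    # Reorder the input into bit-reversed order.
    out = [0j] * m
    for i in range(m):
        out[bit_reverse(i, bits)] = complex(row[i])
    # log2(M) stages; each stage has M/2 independent butterflies.
    half = 1
    while half < m:
        for b in range(m // 2):  # one butterfly per iteration
            group, k = divmod(b, half)
            top = group * 2 * half + k
            bot = top + half
            w = cmath.exp(-1j * cmath.pi * k / half)  # twiddle factor
            t = w * out[bot]
            out[top], out[bot] = out[top] + t, out[top] - t
        half *= 2
    return out


def fft_rows(matrix):
    """Apply the FFT to every row independently."""
    return [fft_radix2(row) for row in matrix]


spectra = fft_rows([[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
```

Butterflies within one stage are independent of each other, but every stage depends on the previous one, so a parallel version needs a synchronization point between stages.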
Is it possible to run the two operations without returning to the host after the first stage?
Best regards,
Z.V
Hi Zvi,
It depends on your hardware. If you have a 5th or 6th generation Intel processor (Broadwell or Skylake), which supports OpenCL 2.0, you can enqueue the second kernel from the first one. See https://software.intel.com/en-us/articles/gpu-quicksort-in-opencl-20-using-nested-parallelism-and-work-group-scan-functions for an example of how to do that, or https://software.intel.com/en-us/articles/sierpinski-carpet-in-opencl-20 for a toy example.