Intel® oneAPI Math Kernel Library

How to call the DFT function in oneMKL in the sycl kernel?

XinyeChu
Compiling the code below reports multiple errors:
SYCL kernel cannot call an undefined function without SYCL_EXTERNAL attribute
SYCL kernel cannot call a variadic function

Code:
#include <CL/sycl.hpp>
#include <mkl_dfti.h>
#include <iostream>
#include <complex>
#include <vector>

using namespace std;
using namespace sycl;

int main() {
    complex<double> input[48] = { 0.5, -0.991445, 0.258819, 0.793353, -0.866025, -0.130526, 0.965926, -0.608761, -0.5, 0.991445, -0.258819, -0.793353, 0.866025, 0.130526, -0.965926, 0.608761, 0.5, -0.991445, 0.258819, 0.793353, -0.866025, -0.130526, 0.965926, -0.608761, -0.5, 0.991445, -0.258819, -0.793353, 0.866025, 0.130526, -0.965926, 0.608761, 0.5, -0.991445, 0.258819, 0.793353, -0.866025, -0.130526, 0.965926, -0.608761, -0.5, 0.991445, -0.258819, -0.793353, 0.866025, 0.130526, -0.965926, 0.608761 };
    complex<double> output[48];  // Output for complex-to-complex DFT;

    // Create queue and device selector
    queue q(cpu_selector{});

    // Allocate memory for input and output on the device
    buffer<complex<double>, 1> input_buffer(input, range<1>(48));
    buffer<complex<double>, 1> output_buffer(output, range<1>(48));

    // Submit work to the queue for execution
    q.submit([&](handler& h) {
        // Get accessors to access data on the device
        auto input_accessor = input_buffer.get_access<access::mode::read>(h);
        auto output_accessor = output_buffer.get_access<access::mode::write>(h);

        // Execute DFT in kernel
        h.parallel_for(range<1>(1), [=](id<1> idx) {
            // Call MKL DFT setup and compute in kernel
            DFTI_DESCRIPTOR_HANDLE my_desc_handle = NULL; // Descriptor handle for complex-to-complex FFT;  
            MKL_LONG status; // Variable to store command execution status;
            status = DftiCreateDescriptor(&my_desc_handle, DFTI_DOUBLE, DFTI_COMPLEX, 1, 48); // Create a descriptor;  
            status = DftiSetValue(my_desc_handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE); // Set non-inplace operation;
            status = DftiSetValue(my_desc_handle, DFTI_NUMBER_OF_TRANSFORMS, 1); // Set number of transforms to 1;
            status = DftiCommitDescriptor(my_desc_handle); // Commit descriptor to make configuration effective;
            status = DftiComputeForward(my_desc_handle, input_accessor.get_pointer(), output_accessor.get_pointer()); // Perform forward FFT;
            status = DftiFreeDescriptor(&my_desc_handle); // Free descriptor;
            });
        });

    // Wait for the queue to finish execution
    q.wait();

    // Output complex-to-complex FFT result
    cout << "FFT result:" << "\n";
    auto result = output_buffer.get_access<access::mode::read>();
    for (int i = 0; i < 48; i++) {
        cout << result[i] << "\n";
    }
    cout << "\n";

    return 0;
}
1 Solution
Gajanan_Choudhary

Hi @XinyeChu,

 

DftiCreateDescriptor and the other Dfti* functions used in your code are not SYCL APIs. They are C APIs that run only on the host CPU, so they cannot be called from inside a SYCL kernel; that is why the compiler reports the SYCL_EXTERNAL and variadic-function errors. To run a DFT on a GPU device, use oneMKL's SYCL/DPC++ APIs, which are called from host code with a SYCL queue rather than from within a kernel. Examples of their usage are distributed in the share/doc/mkl/examples/ directory of the oneMKL installation that you have already downloaded. Look in the sycl/dft/source/ directory under it for the DFT samples (you may first need to extract the "examples_sycl.zip"/"examples_sycl.tgz" archive if it has not been extracted already). One example is also available online in the documentation.

 

Hope that helps.

6 Replies
SofeaAzrin_A_Intel

Thanks for reaching out to us.


We are trying to reproduce the error on our side. Meanwhile, could you please provide the following details so that we can try to address the issue you are seeing?


The version of the DPC++ compiler being used and the OS environment details.





XinyeChu

My environment:

CPU: Intel(R) Core(TM) i7-1065G7 CPU 

GPU: Intel(R) Iris(R) Plus Graphics

OS: Win11

IDE: VS2022

OneAPI BaseToolkit: 2024.0.1.45

XinyeChu

I can currently use oneMKL's DFT functions on the CPU (host) to do the computation. Now I want to run oneMKL's DFT on the GPU device. I have tried many things without success, so I came here hoping for help. Thank you! By the way, I am using Intel oneAPI's DPC++ language.

My environment:

CPU: Intel(R) Core(TM) i7-1065G7 CPU;

GPU: Intel(R) Iris(R) Plus Graphics;

OS: Win11;

IDE: VS2022;

OneAPI BaseToolkit: 2024.0.1.45;

oneAPI Math Kernel Library: 2024.1.0.696;

  1. I tried wrapping the DFT setup in a non-kernel function and calling that function from the device kernel, but the compiler reports these errors: "SYCL kernel cannot call an undefined function without SYCL_EXTERNAL attribute" and "SYCL kernel cannot call a variadic function".

  2. I checked the SYCL 2020 standard: "SYCL device code, as defined by this specification, does not support virtual function calls, function pointers in general, exceptions, runtime type information or the full set of C++ libraries that may depend on these features or on features of a particular host compiler. Nevertheless, these basic restrictions can be relieved by some specific Khronos or vendor extensions." From this, I concluded that the DFT in Intel's oneMKL library might be callable in a kernel (maybe?).

  3. I expect to be able to call this DFT in a kernel (on an Intel GPU).

  4. Below is my DFT implementation using oneMKL on the CPU:

 

#include <iostream>
#include <mkl_dfti.h>
#include <complex>
#include <vector>
using namespace std;

int main() {

    complex<double> input[48] = { 0.5, -0.991445, 0.258819, 0.793353, -0.866025, -0.130526, 0.965926, -0.608761, -0.5, 0.991445, -0.258819, -0.793353, 0.866025, 0.130526, -0.965926, 0.608761, 0.5, -0.991445, 0.258819, 0.793353, -0.866025, -0.130526, 0.965926, -0.608761, -0.5, 0.991445, -0.258819, -0.793353, 0.866025, 0.130526, -0.965926, 0.608761, 0.5, -0.991445, 0.258819, 0.793353, -0.866025, -0.130526, 0.965926, -0.608761, -0.5, 0.991445, -0.258819, -0.793353, 0.866025, 0.130526, -0.965926, 0.608761 };
    complex<double> output[48]; // Output for complex-to-complex DFT;
    
    DFTI_DESCRIPTOR_HANDLE my_desc_handle = NULL; // Descriptor handle for complex to complex FFT;

    MKL_LONG status; // Variable to store command execution status;
    status = DftiCreateDescriptor(&my_desc_handle, DFTI_DOUBLE, DFTI_COMPLEX, 1, 48); // Create a descriptor;
    status = DftiSetValue(my_desc_handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE); // Set non-in-place operations;
    status = DftiSetValue(my_desc_handle, DFTI_NUMBER_OF_TRANSFORMS, 1); //Set the number of transformations to 1;
    status = DftiCommitDescriptor(my_desc_handle); // Submit the descriptor to make its configuration effective;
    status = DftiComputeForward(my_desc_handle, input, output); // Execute forward FFT;
    status = DftiFreeDescriptor(&my_desc_handle); // Release descriptor;

    cout << "FFT result:" << endl; //Output the FFT result from complex number to complex number
    for (int i = 0; i < 48; i++) {
        cout << output[i] << "\n";
    }
    cout << endl << endl;

    return 0;
}
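
For cross-checking the output of the CPU code above (or of a later GPU port), a direct evaluation of the DFT definition can serve as an independent baseline. The following is a minimal sketch in standard C++ only. The function name `reference_forward_dft` is hypothetical. It assumes the unscaled forward transform with a negative exponent, X[k] = sum over j of x[j] * exp(-2*pi*i*j*k/N), which matches the DFTI forward transform when the forward scale is left at its default of 1.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reference helper: O(N^2) forward DFT straight from the
// definition, intended only for validating small transforms.
std::vector<std::complex<double>> reference_forward_dft(
    const std::vector<std::complex<double>>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> result(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reduce j*k modulo n so the angle stays within one full turn.
            const double angle = -2.0 * pi *
                static_cast<double>((j * k) % n) / static_cast<double>(n);
            sum += x[j] * std::polar(1.0, angle);
        }
        result[k] = sum;
    }
    return result;
}
```

Comparing each element of the library output against this baseline with a small tolerance (looser for single precision than for double) is usually enough to confirm that a port is configured correctly.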

 

XinyeChu
Alternatively, is there any DFT sample code that runs on the GPU? I would like to try it.
XinyeChu

I hope to get a reply as soon as possible. Thank you!
