Program runs fine on CPU but not on GPU (Inbuilt Intel GPU)

Abhijit4 · 02-23-2025

Hello Intel Community,

I am running a simple program (chirp generation) that works correctly when executed on the CPU. However, when I run the same program on the integrated Intel GPU, it encounters issues and does not behave as expected.

My system details are shown below:

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) i5-10310U CPU @ 1.70GHz OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics OpenCL 2.1 NEO [27.20.100.8478]

The code is shown below:

#include <iostream>   // C++ I/O
#include <cmath>      // Mathematical operations
#include <mkl.h>      // Intel MKL functions
#include <complex>    // C++ complex numbers
#include <CL/sycl.hpp>  // SYCL Header

// Constants
constexpr double SPEED_OF_SOUND = 1500.0;  
constexpr double SAMPLING_FREQUENCY = 1000000.0;  
constexpr double CENTER_FREQUENCY = 115000.0;  
constexpr double BANDWIDTH = 10000.0;  
constexpr double CHIRP_DURATION = 0.001;  
constexpr int NUM_SAMPLES = static_cast<int>(CHIRP_DURATION * SAMPLING_FREQUENCY);
constexpr double FSTART = (CENTER_FREQUENCY - (BANDWIDTH / 2));  

// Function to generate LFM chirp signal using SYCL
void generateChirp(double* signal, int num_samples, double fs, double fSTART, double bandwidth, double duration, sycl::queue& q) {
    double k = bandwidth / duration;  // Chirp rate (Hz/s)

    // SYCL parallel execution
    q.submit([&](sycl::handler& h) {
        h.parallel_for(sycl::range<1>(num_samples), [=](sycl::id<1> n) {
            double t = static_cast<double>(n) / fs;  // Time in seconds
            double phase = 2.0 * 3.14159265358979323846 * (fSTART * t + 0.5 * k * t * t);  // 2*pi*(f0*t + k*t^2/2)
            signal[n] = cos(phase);  // Generate chirp waveform
        });
    }).wait(); // Wait for GPU execution to complete
}

int main() {
    std::cout << "C++ with SYCL Conversion Started..." << std::endl;

    // Create a SYCL queue for GPU execution
    sycl::queue q{ sycl::gpu_selector() };

    // Allocate memory for signals
    double* transmitted_signal = new double[NUM_SAMPLES];
    double* received_signal = new double[NUM_SAMPLES];
    double* received_signal_noisy = new double[NUM_SAMPLES];

    // Generate Chirp Signal
    generateChirp(transmitted_signal, NUM_SAMPLES, SAMPLING_FREQUENCY, FSTART, BANDWIDTH, CHIRP_DURATION, q);

    std::cout << "Chirp signal generated successfully!" << std::endl;

    // Free allocated memory
    delete[] transmitted_signal;
    delete[] received_signal;
    delete[] received_signal_noisy;

    return 0;
}

Ben_A_Intel · 02-27-2025

Hello,

I think your program is not working on your integrated GPU because it is operating on memory allocated using "new":

double* transmitted_signal = new double[NUM_SAMPLES];

This requires the "usm_system_allocations" device aspect, which many GPUs do not support.

For most GPUs, you want to allocate some other type of USM via SYCL instead, such as device USM, host USM, or shared USM. See:

https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#_kinds_of_unified_shared_memory

I also see that your drivers are fairly old. I would strongly suggest updating your drivers to the latest version if at all possible.

Cheers!
