Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and GDB*

No device error with vector_add sample

Daniel_D
Beginner

Hi,

 

I created a new DPC++ project (vector_add) with the VS2019 wizard. When I debug it, I get an exception at the point where the queue is created, saying that no device is available.

 

This is the function causing the exception:

queue create_device_queue() {
  // Create a device selector for the device of interest.
#ifdef FPGA_EMULATOR
  // DPC++ extension: FPGA emulator selector on systems without an FPGA card.
  ext::intel::fpga_emulator_selector dselector;
#elif defined(FPGA)
  // DPC++ extension: FPGA selector on systems with an FPGA card.
  ext::intel::fpga_selector dselector;
#else
  // The default device selector: it selects the most performant device
  // available at runtime.
  default_selector dselector; // also tried cpu_selector and gpu_selector - always the same exception
#endif

  try {
    // Create the device queue with the selector above and the exception
    // handler to catch async runtime errors. The device queue is used to
    // enqueue the kernels and encapsulates all the state needed for execution.
    queue q(dselector, ehandler); // EXCEPTION HAPPENS HERE

    return q;
  }
  catch (const sycl::exception& e) {
    // Catch the exception from devices that are not supported.
    std::cerr << "An exception is caught when creating a device queue."
              << std::endl;
    std::cerr << EXCEPTION_MSG;
    std::terminate();
  }
}

 

The exception message is:

No device of requested type available. Please check https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements... -1 (CL_DEVICE_NOT_FOUND)

 

My hardware configuration is:

Intel® Core™ i9-9900K CPU @ 3.60GHz with 32GB RAM and 6GB available

 

 

I also tried the console application, but this one failed to compile:

Rebuild started...
1>------ Rebuild All started: Project: DPCPPConsoleApplication1, Configuration: Debug x64 ------
1>llvm-objcopy.exe: : error : 'x64\\Debug\\DPCPPConsoleApplication1.obj': function not supported
1>C:\PROGRA~2\Intel\oneAPI\compiler\20214~1.0\windows\bin\clang-offload-bundler: : error : 'llvm-objcopy' tool failed
1>dpcpp: : error : clang-offload-bundler command failed with exit code 1 (use -v to see invocation)
1>Done building project "DPCPPConsoleApplication1.vcxproj" -- FAILED.
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========

Any idea how I can get the sample to work?

 

Thanks,

Daniel

DitiD_Intel
Moderator

Hi,

 

Thank you for posting in Intel Communities.

 

>>No device of requested type available. Please check https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements... -1 (CL_DEVICE_NOT_FOUND)

 

You will get an exception if the filtered list of devices does not include a device that satisfies the selector.

 

>> How I can get the sample to work?

 

Could you please try switching the backend to OpenCL in Visual Studio, as described below:

 

In Visual Studio, go to Project Properties > Debugging > Environment, click Edit, and enter the environment variable and its value.

 

Environment variable and value: SYCL_DEVICE_FILTER=opencl:cpu
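
To confirm that a value entered in the Debugging > Environment field actually reaches the program, a small check at the start of main can print what the process sees. This is only a sketch in standard C++, not part of the sample: the helper names describe_device_filter and print_device_filter are made up for illustration, and the code merely splits the value at the first colon (backend:device_type) instead of validating it the way the SYCL runtime does.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical helper (not part of SYCL): turns the raw value of
// SYCL_DEVICE_FILTER into a human-readable line. Pass nullptr when the
// variable is not set.
std::string describe_device_filter(const char* raw) {
    if (raw == nullptr || *raw == '\0') {
        return "SYCL_DEVICE_FILTER is not set (no filter applied)";
    }
    const std::string value(raw);
    const std::size_t colon = value.find(':');
    // substr(0, npos) yields the whole string when there is no colon.
    const std::string backend = value.substr(0, colon);
    const std::string device =
        (colon == std::string::npos) ? "any" : value.substr(colon + 1);
    return "SYCL_DEVICE_FILTER requests backend '" + backend +
           "', device type '" + device + "'";
}

// Call this first thing in main() of the sample to see what the
// process inherited from the Visual Studio debugging environment.
void print_device_filter() {
    std::cout << describe_device_filter(std::getenv("SYCL_DEVICE_FILTER"))
              << "\n";
}
```

If the printed line reports that the variable is not set, the value in the project properties did not reach the process, and the runtime is still choosing among all devices.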

 

Then rebuild and run the sample again.

 

Please do let us know if this helps resolve your issue.

 

Thanks & Regards,

Ditipriya.

 

 

Daniel_D
Beginner

Hi Ditipriya,

 

Thanks for your reply. I reinstalled Windows 10, VS2019, and oneAPI, and I'm one step further. When I run the VectorAdd sample, I now see this:

Running on device: Intel(R) Graphics [0x3e98]
Vector size: 10000
Vector add failed on device.

 

After switching to OpenCL with SYCL_DEVICE_FILTER=opencl:cpu, the printed result is different:

Running on device: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
Vector size: 10000
dummy: error: cannot open output file R:\Temp\OpenCLKernel-43e20f.dll: function not supported

 

Drive R: is my RAM drive, with 6 GB of free space.
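
Because the error names a file under R:\Temp, one thing worth ruling out is whether ordinary file creation works in the temporary directory at all. The following is a minimal sketch in standard C++17, not part of the sample; the function names and the probe file name are invented for illustration. A passing check only shows that plain file creation works there. It does not prove that the OpenCL CPU runtime can build and load a DLL from that location.

```cpp
#include <filesystem>
#include <fstream>
#include <iostream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns true if a small file can be created,
// written, and removed inside `dir`.
bool can_create_file_in(const fs::path& dir) {
    std::error_code ec;
    if (!fs::is_directory(dir, ec)) return false;

    const fs::path probe = dir / "sycl_temp_probe.tmp";  // made-up name
    {
        std::ofstream out(probe, std::ios::binary);
        if (!out) return false;
        out << "probe";
        out.flush();
        if (!out) return false;
    }  // the stream is closed here, before the file is removed
    fs::remove(probe, ec);
    return !ec;
}

// Reports on the directory the standard library resolves as the temp
// directory (on Windows this normally follows the TMP/TEMP variables).
void report_temp_directory() {
    std::error_code ec;
    const fs::path tmp = fs::temp_directory_path(ec);
    if (ec) {
        std::cout << "Could not resolve the temp directory: "
                  << ec.message() << "\n";
        return;
    }
    std::cout << tmp.string()
              << (can_create_file_in(tmp) ? " is writable\n"
                                          : " is NOT writable\n");
}
```

Calling report_temp_directory() once with the RAM drive as the temp location and once with a directory on a regular disk gives a quick comparison of the two setups.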

 

Daniel

 

 

 

DitiD_Intel
Moderator

Hi,

 

Could you please try building and running this vector add sample from the oneAPI command prompt?

 

Sample Code: 

 


#include <CL/sycl.hpp>
#include <vector>
#include <iostream>
#include <string>  // std::stoi

#include <CL/sycl/INTEL/fpga_extensions.hpp>


using namespace sycl;

// Vector type and data size for this example.
size_t vector_size = 10000;
typedef std::vector<int> IntVector;

// Create an exception handler for asynchronous SYCL exceptions
static auto exception_handler = [](sycl::exception_list e_list) {
    for (std::exception_ptr const& e : e_list) {
        try {
            std::rethrow_exception(e);
        }
        catch (std::exception const& ex) {
            // Report the reason for the failure in every build configuration.
            std::cerr << "Asynchronous SYCL exception: " << ex.what() << std::endl;
            std::terminate();
        }
    }
};

//************************************
// Vector add in DPC++ on device: returns sum in 4th parameter "sum_parallel".
//************************************
void VectorAdd(queue& q, const IntVector& a_vector, const IntVector& b_vector,
    IntVector& sum_parallel) {
    // Create the range object for the vectors managed by the buffer.
    range<1> num_items{ a_vector.size() };

    // Create buffers that hold the data shared between the host and the devices.
    // The buffer destructor is responsible for copying the data back to the host
    // when the buffer goes out of scope.
    buffer a_buf(a_vector);
    buffer b_buf(b_vector);
    buffer sum_buf(sum_parallel.data(), num_items);

    // Submit a command group to the queue by a lambda function that contains the
    // data access permission and device computation (kernel).
    q.submit([&](handler& h) {
        // Create an accessor for each buffer with access permission: read, write or
        // read/write. The accessor is a means to access the memory in the buffer.
        accessor a(a_buf, h, read_only);
        accessor b(b_buf, h, read_only);

        // The sum_accessor is used to store (with write permission) the sum data.
        accessor sum(sum_buf, h, write_only, no_init);

        // Use parallel_for to run vector addition in parallel on device. This
        // executes the kernel.
        //    1st parameter is the number of work items.
        //    2nd parameter is the kernel, a lambda that specifies what to do per
        //    work item. The parameter of the lambda is the work item id.
        // DPC++ supports unnamed lambda kernel by default.
        h.parallel_for(num_items, [=](auto i) { sum[i] = a[i] + b[i]; });
        });
}

//************************************
// Initialize the vector from 0 to vector_size - 1
//************************************
void InitializeVector(IntVector& a) {
    for (size_t i = 0; i < a.size(); i++) a.at(i) = static_cast<int>(i);
}

//************************************
// Demonstrate vector add both in sequential on CPU and in parallel on device.
//************************************
int main(int argc, char* argv[]) {
    // Change vector_size if it was passed as argument
    if (argc > 1) vector_size = std::stoi(argv[1]);
  
    cpu_selector d_selector;


    // Create vector objects with "vector_size" to store the input and output data.
    IntVector a, b, sum_sequential, sum_parallel;
    a.resize(vector_size);
    b.resize(vector_size);
    sum_sequential.resize(vector_size);
    sum_parallel.resize(vector_size);

    // Initialize input vectors with values from 0 to vector_size - 1
    InitializeVector(a);
    InitializeVector(b);

    try {
        queue q(d_selector, exception_handler);

        // Print out the device information used for the kernel code.
        std::cout << "Running on device: "
            << q.get_device().get_info<info::device::name>() << "\n";
        std::cout << "Vector size: " << a.size() << "\n";

        // Vector addition in DPC++
        VectorAdd(q, a, b, sum_parallel);
    }
    catch (exception const& e) {
        std::cout << "An exception is caught for vector add: " << e.what() << "\n";
        std::terminate();
    }

    // Compute the sum of two vectors in sequential for validation.
    for (size_t i = 0; i < sum_sequential.size(); i++)
        sum_sequential.at(i) = a.at(i) + b.at(i);

    // Verify that the two vectors are equal.  
    for (size_t i = 0; i < sum_sequential.size(); i++) {
        if (sum_parallel.at(i) != sum_sequential.at(i)) {
            std::cout << "Vector add failed on device.\n";
            return -1;
        }
    }

    int indices[]{ 0, 1, 2, (static_cast<int>(a.size()) - 1) };
    constexpr size_t indices_size = sizeof(indices) / sizeof(int);

    // Print out the result of vector add.
    for (size_t i = 0; i < indices_size; i++) {
        int j = indices[i];
        if (i == indices_size - 1) std::cout << "...\n";
        std::cout << "[" << j << "]: " << a[j] << " + " << b[j] << " = "
            << sum_parallel[j] << "\n";
    }

    a.clear();
    b.clear();
    sum_sequential.clear();
    sum_parallel.clear();

    std::cout << "Vector add successfully completed on device.\n";
    return 0;
}

 

 

 

If this code produces a similar error, please set the environment variable below, run the executable again, and send us the resulting log.

set SYCL_PI_TRACE=2

 

Thanks & Regards,

Ditipriya.

 

 

 

DitiD_Intel
Moderator

Hi,

 

We have not heard back from you. Could you please provide an update?

 

Thanks & Regards,

Ditipriya.

 

DitiD_Intel
Moderator

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks & Regards,

Ditipriya.

