Intel® oneAPI DPC++/C++ Compiler

How can I submit work to two queues in parallel in DPC++?

tanzl_ustc
Beginner

Hello, I am new to DPC++. I recently ran into a problem with submitting work to two queues in parallel on two devices.

I have two Intel GPUs and want to submit one queue to each of them, so that my task might take roughly half the original time to compute.

Could you give me a simple code example of parallel task submission? I cannot post my own code online for various reasons. Thanks!

SeshaP_Intel
Moderator

Hi,

 

Thanks for posting in Intel communities.

 

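We can use custom device selectors to run work on multiple device queues in parallel.

A custom device selector is a user-defined class derived from the device_selector class. The runtime calls its operator() on every available device and picks the device with the highest score; a device with a negative score is never selected.

With a custom device selector, we can select any device (a CPU or any accelerator).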
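To make the selection rule concrete, here is a small host-only model of how selector scores drive the choice. It is a sketch, not SYCL code: FakeDevice, score_by_name and pick_device are hypothetical names made up for illustration, and the device names in the usage note are placeholders.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for sycl::device: only the fields a selector inspects.
struct FakeDevice {
    std::string name;
    bool is_gpu;
};

// Same shape as a selector's operator(): a negative score means "never pick this device".
int score_by_name(const FakeDevice& dev, const std::string& wanted) {
    if (dev.is_gpu && dev.name.find(wanted) != std::string::npos) {
        return 100;
    }
    return -1;
}

// Models the runtime's choice: the highest non-negative score wins.
// Returns the index of the chosen device, or -1 if every score is negative
// (an assumption of this model; a real runtime reports an error instead).
int pick_device(const std::vector<FakeDevice>& devices, const std::string& wanted) {
    int best_index = -1;
    int best_score = -1;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        const int score = score_by_name(devices[i], wanted);
        if (score > best_score) {
            best_score = score;
            best_index = static_cast<int>(i);
        }
    }
    return best_index;
}
```

With a list such as {"Host CPU", "Example GPU A", "Example GPU B"}, pick_device(devices, "GPU A") returns index 1 and pick_device(devices, "GPU B") returns index 2. If two GPUs report the same name, both get the same score; this model then keeps the first one, whereas a real runtime makes no promise about which tied device it returns, so name matching alone cannot separate identical GPUs.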

Please refer to the code snippet below for more details.

It uses two custom device selectors, one per Intel GPU, so that the program runs on both GPUs.

 

#include <CL/sycl.hpp>
#include <iostream>
#include <string>
using namespace cl::sycl;

static const int N = 4;

// Selects a GPU whose name contains "GPU1".
// Replace "GPU1" with a substring of the name of your first Intel GPU.
class my_selector1 : public device_selector
{
public:
  int operator()(const device &dev) const override
  {
    int score = -1; // a negative score means the device is never selected
    if (dev.is_gpu() && dev.get_info<info::device::name>().find("GPU1") != std::string::npos)
    {
      score += 25;
      std::cout << "my_selector1 = " << dev.get_info<info::device::name>() << "\n";
    }
    return score;
  }
};

// Selects a GPU whose name contains "GPU2".
// Replace "GPU2" with a substring of the name of your second Intel GPU.
// If both GPUs report the same name, name matching cannot tell them apart;
// device::get_devices(info::device_type::gpu) returns the full list to pick from.
class my_selector2 : public device_selector
{
public:
  int operator()(const device &dev) const override
  {
    int score = -1;
    if (dev.is_gpu() && dev.get_info<info::device::name>().find("GPU2") != std::string::npos)
    {
      score += 800;
      std::cout << "my_selector2 = " << dev.get_info<info::device::name>() << "\n";
    }
    return score;
  }
};

int main()
{
  // Create both queues first, one per GPU.
  auto Q1 = queue{ my_selector1{} };
  auto Q2 = queue{ my_selector2{} };
  std::cout << "Selected device: " << Q1.get_device().get_info<info::device::name>() << "\n";
  std::cout << "Selected device: " << Q2.get_device().get_info<info::device::name>() << "\n";

  // Allocate and initialize one shared array per queue.
  int *a1 = malloc_shared<int>(N, Q1);
  int *a2 = malloc_shared<int>(N, Q2);
  for (int i = 0; i < N; i++) { a1[i] = i; a2[i] = i; }

  // Submit to both queues before waiting on either one, so the two kernels
  // can overlap. Calling wait() right after the first submission would
  // block the host until Q1 finishes and only then start the work on Q2.
  auto e1 = Q1.single_task([=]() {
    for (int i = 0; i < N; i++) {
      a1[i] *= 2;
    }
  });
  auto e2 = Q2.single_task([=]() {
    for (int i = 0; i < N; i++) {
      a2[i] *= 3;
    }
  });
  e1.wait();
  e2.wait();

  for (int i = 0; i < N; i++) std::cout << a1[i] << std::endl;
  for (int i = 0; i < N; i++) std::cout << a2[i] << std::endl;
  free(a1, Q1);
  free(a2, Q2);

  return 0;
}
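Whether two GPUs actually bring the run time close to half depends on when the host waits. The sketch below is an idealized host-only model, not SYCL code: the function names are hypothetical, the durations are made-up inputs, and it ignores launch overhead, data movement, and any contention between the devices. It only contrasts waiting after each submission with submitting to every queue first and waiting afterwards.

```cpp
#include <algorithm>
#include <vector>

// Pattern A: submit, wait, submit, wait.
// Each kernel starts only after the previous one has finished,
// so the elapsed time is the sum of the kernel durations.
double elapsed_wait_after_each(const std::vector<double>& kernel_ms) {
    double total = 0.0;
    for (double ms : kernel_ms) {
        total += ms;
    }
    return total;
}

// Pattern B: submit one kernel to each device, then wait for all of them.
// The kernels overlap, so the elapsed time is that of the slowest kernel.
double elapsed_submit_all_then_wait(const std::vector<double>& kernel_ms) {
    double longest = 0.0;
    for (double ms : kernel_ms) {
        longest = std::max(longest, ms);
    }
    return longest;
}
```

For two kernels of 40 ms each, pattern A gives 80 ms and pattern B gives 40 ms, which is the "half the time" case. With an uneven split such as 30 ms and 50 ms, pattern B gives 50 ms, so balancing the work between the two GPUs matters as much as overlapping it.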

 

Thanks and Regards,

Pendyala Sesha Srinivas

 

SeshaP_Intel
Moderator

Hi,


We haven't heard back from you. Could you please provide an update on your issue?


Thanks and Regards,

Pendyala Sesha Srinivas


SeshaP_Intel
Moderator

Hi,


We assume that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


Thanks and Regards,

Pendyala Sesha Srinivas

