Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel ICX Compiler, Intel® DPC++ Compatibility Tool, and GDB*

char array sample

bo__john
Beginner
1,719 Views

I would like sample code that stores a char-type array in a buffer and processes it with multiple loops, for the DPC++ compiler on DevCloud.

 

 

like

 
struct s
{
    int id;
    char a[50];
};
       

 

struct s dt_v1[1188];

 

     strcpy(dt_v1[0].a, "102755703");
strcpy(dt_v1[1].a, "ab10");
strcpy(dt_v1[2].a, "cd10");
strcpy(dt_v1[3].a, "EF13");
strcpy(dt_v1[4].a, "5");
strcpy(dt_v1[5].a, "##184");
strcpy(dt_v1[6].a, "@@1");
strcpy(dt_v1[7].a, "&&13");
strcpy(dt_v1[8].a, "%%14");
strcpy(dt_v1[9].a, "!!1");

 

 

and how to convert the string to an int or float, like

 

dint_1 = atoi(dt_r1[dv_i1].a);

 

 

How do I connect this with a parallel loop and a buffer array? I need a complete code sample. Thanks.

 

buffer<char, 1>

    h.parallel_for<class computeB>(range<1>{N}, [=](id<1> ID) {
      accB[ID] = accA[ID] + 1;

 

 

 

I have searched a lot of SYCL documentation, but what I found does not work on the DevCloud oneAPI DPC++ platform, and

I do not know how to write a full example for this kind of operation.

 

https://github.com/intel/HPCKit-code-samples

 

 

This repository has a lot of complex samples for int and float arrays, but no sample for a multi-dimensional char array.

 

 

Thanks a lot.

 

 

9 Replies
GouthamK_Intel
Moderator

Hi,

Thanks for reaching out to us.

We are working on it and will get back to you.

 

-Goutham

GouthamK_Intel
Moderator

Hi John,

Please find the below sample code for creating buffers and accessors for struct.

#include <CL/sycl.hpp>
#include <cstdlib>   // atoi
#include <cstring>   // strcpy
#include <iostream>

using namespace std;
using namespace cl::sycl;

 struct myStruct
    {
        int id=1234;
        char a[50];
    };

int main(){

    struct myStruct dt_v1[10];
    
    strcpy(dt_v1[0].a, "102755703");
    strcpy(dt_v1[1].a, "ab10");
    strcpy(dt_v1[2].a, "cd10");
    strcpy(dt_v1[3].a, "EF13");
    strcpy(dt_v1[4].a, "5");
    strcpy(dt_v1[5].a, "##184");
    strcpy(dt_v1[6].a, "@@1");
    strcpy(dt_v1[7].a, "&&13");
    strcpy(dt_v1[8].a, "%%14");
    strcpy(dt_v1[9].a, "!!1");
    
    int test1=atoi(dt_v1[4].a);
    std::cout<<"Converting char[] to int "<<test1<<std::endl;

// ---------SYCL SCOPE STARTS------------
    {
        default_selector device_selector; 
        queue device_queue(device_selector);
        cout<<device_queue.get_device().get_info<info::device::name>()<<std::endl;  //print name of the device it is running on.
        buffer<struct myStruct,1> a(dt_v1,range<1>{10});               
        device_queue.submit([&](handler &cgh){        
            auto Accessor =a.get_access<access::mode::write>(cgh);           
            cgh.parallel_for<class StructClass>(range<1>{10},[=](id<1> index){
             struct myStruct* myAcc=(struct myStruct*)(&Accessor[index]);      
                // your code logic starts from here..
                // to access array "a" use this    
                     //char* myArray=myAcc->a;
                // To access int "id" use this
                    //int myId=myAcc->id;
            });
        });
    }
    return 0;
}
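
For the kernel logic marked in the comments above, one thing to watch is that standard library functions such as atoi may not be callable from device code, depending on the compiler version (this is an assumption to check against your toolchain). A minimal sketch of a library-free replacement follows. parse_int is a hypothetical helper name, not part of SYCL or DPC++, and it does not guard against integer overflow.

```cpp
#include <cstddef>

// Parse a decimal integer from a fixed-size, NUL-terminated char field
// without calling any library function.
// Like atoi, it stops at the first non-digit and returns 0 if no digits
// are found. max_len is the size of the field, so the loop never reads
// past the end of the array even if the terminator is missing.
inline int parse_int(const char* s, std::size_t max_len) {
    std::size_t i = 0;

    // Skip leading spaces.
    while (i < max_len && s[i] == ' ') ++i;

    // Optional sign.
    bool negative = false;
    if (i < max_len && (s[i] == '-' || s[i] == '+')) {
        negative = (s[i] == '-');
        ++i;
    }

    // Accumulate digits; the NUL terminator is not a digit, so it ends the loop.
    int value = 0;
    for (; i < max_len && s[i] >= '0' && s[i] <= '9'; ++i) {
        value = value * 10 + (s[i] - '0');
    }
    return negative ? -value : value;
}
```

Inside the kernel it could be called as parse_int(myAcc->a, sizeof(myAcc->a)). On the host it behaves like atoi for plain decimal text such as "25", and returns 0 for fields such as "##184" that do not start with a digit.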

Feel free to reach out to us, in case you have any more queries.

 

Thanks

Goutham

 

 

 

bo__john
Beginner

Hi,

Thank you for your quick response. I tried this code and it works, but I could not find a way

to split it into separate functions with a global declaration of the struct array. It only runs when I put everything in the main() function.

 

#include <CL/sycl.hpp>
#include <iostream>

using namespace std;
using namespace cl::sycl;

 struct myStruct
    {
        int id=1234;
        char a[50];
    };
   

   
struct myStruct dt_v1[1650];
 
int main(){

 
     abc();

 

}

// or void

int abc()

{


    strcpy(dt_v1[1].a, "102738922");
strcpy(dt_v1[2].a, "2019");
strcpy(dt_v1[3].a, "10");
strcpy(dt_v1[4].a, "25");
strcpy(dt_v1[5].a, "3U8093");
strcpy(dt_v1[6].a, "ZUUU");
strcpy(dt_v1[7].a, "WSSS");
strcpy(dt_v1[8].a, "2019");
strcpy(dt_v1[9].a, "10");
strcpy(dt_v1[10].a, "25");
strcpy(dt_v1[11].a, "13");
strcpy(dt_v1[12].a, "45");
strcpy(dt_v1[13].a, "0");
strcpy(dt_v1[14].a, "2019");
strcpy(dt_v1[15].a, "10");
strcpy(dt_v1[16].a, "25");
strcpy(dt_v1[17].a, "18");
strcpy(dt_v1[18].a, "40");
strcpy(dt_v1[19].a, "0");
strcpy(dt_v1[20].a, "I");
strcpy(dt_v1[21].a, "ac_1");
strcpy(dt_v1[22].a, "TYPE_C");
strcpy(dt_v1[23].a, "184");
strcpy(dt_v1[24].a, "43");
strcpy(dt_v1[25].a, "1");

...

}

 

 

int abc2()

{


    strcpy(dt_v2[1].a, "102738922");
strcpy(dt_v2[2].a, "2019");
strcpy(dt_v2[3].a, "10");
strcpy(dt_v2[4].a, "25");
strcpy(dt_v2[5].a, "3U8093");
strcpy(dt_v2[6].a, "ZUUU");
strcpy(dt_v2[7].a, "WSSS");
strcpy(dt_v2[8].a, "2019");
strcpy(dt_v2[9].a, "10");
strcpy(dt_v2[10].a, "25");
strcpy(dt_v2[11].a, "13");
strcpy(dt_v2[12].a, "45");
strcpy(dt_v2[13].a, "0");
strcpy(dt_v2[14].a, "2019");
strcpy(dt_v2[15].a, "10");
strcpy(dt_v2[16].a, "25");
strcpy(dt_v2[17].a, "18");
strcpy(dt_v2[18].a, "40");
strcpy(dt_v2[19].a, "0");
strcpy(dt_v2[20].a, "I");
strcpy(dt_v2[21].a, "ac_1");
strcpy(dt_v2[22].a, "TYPE_C");
strcpy(dt_v2[23].a, "184");
strcpy(dt_v2[24].a, "43");
strcpy(dt_v2[25].a, "1");

...

}

Second, is it possible to run the parallel function like this?

Please help me write the dpcpp_parallel function like the one below for the case where I have more initial data arrays:

struct myStruct dt_v1[1650];

struct myStruct dt_v2[1650];

struct myStruct dt_v3[1650];

 

If all the functions are in main(), the code becomes very large and it is hard to find a given function in the structure.

 


void dpcpp_parallel(float *a,float *v,float *d)
{
    try{
        // Setting up a queue to default DPC++ device selected by runtime
        queue device_queue;
        // Setup buffers for input and output vectors
        buffer<float, 1> bufv1(a,range<1>(N));
        buffer<float, 1> bufv2(v, range<1>(N));
        buffer<float, 1> bufv3(d, range<1>(N));
    
        auto start_time = std::chrono::high_resolution_clock::now();
        std::cout<<"Target Device: "<<device_queue.get_device().get_info<info::device::name>()<<"\n";
        //Submit Command group function object to the queue
        device_queue.submit([&](handler& cgh) {

        auto acc_vect1 = bufv1.get_access<access::mode::read>(cgh);
        auto acc_vect2 = bufv2.get_access<access::mode::read>(cgh);
        auto acc_vect3 = bufv3.get_access<access::mode::write>(cgh);
    
        cgh.parallel_for<class CompIntegral>(range<1>(N), [=](id<1> i) {

            float dx = (float) ((acc_vect1[i] - acc_vect2[i])/d_x);
            float area_int = 0;

            for(int j =0; j< d_x; j++)
            {
                float xC = acc_vect1[i] + dx * j;
                float yC = function_x(xC);
                area_int = xC * yC;
                area_int += area_int;
            }
            acc_vect3[i] = area_int;


            });
        });
        device_queue.wait();
        auto current_time = std::chrono::high_resolution_clock::now();
        std::cout << "Parallel: Program has been running for " << std::chrono::duration<double>(current_time - start_time).count() << " seconds" << std::endl;
        }
     catch (cl::sycl::exception e) {
         std::cout << "SYCL exception caught: " << e.what() << std::endl;
     }
}


int main(){
    float lBound,uBound,i_area,d;

    //Initialize the lower bound and upper bound of the x axis arrays. Below we are initializing such that upper bound is always greater than the lower bound.
    for (int i=0;i<N;i++)
    {
        lBound = i+40 + 10;
        uBound = (i+1)*40 + 70;
        i_area = 0;
        d = 0;
    }
    //Call the dpcpp_parallel function with lBound and uBound as inputs and i_area array as the output
    dpcpp_parallel(lBound,uBound,i_area);

 

}

 

 

 

GouthamK_Intel
Moderator

Hi John,

Please find the below sample code as per your requirements.

 

#include <CL/sycl.hpp>
#include <cstdlib>   // atoi
#include <cstring>   // strcpy
#include <iostream>

using namespace std;
using namespace cl::sycl;


 struct myStruct
    {
        int id=1234;
        char a[50];
    };

struct myStruct dt_v1[10];
struct myStruct dt_v2[10];


void abc1(){
    strcpy(dt_v1[0].a, "102755703");
    strcpy(dt_v1[1].a, "ab10");
    strcpy(dt_v1[2].a, "cd10");
    strcpy(dt_v1[3].a, "EF13");
    strcpy(dt_v1[4].a, "5");
    strcpy(dt_v1[5].a, "##184");
    strcpy(dt_v1[6].a, "@@1");
    strcpy(dt_v1[7].a, "&&13");
    strcpy(dt_v1[8].a, "%%14");
    strcpy(dt_v1[9].a, "!!1");

}


void abc2(){
    strcpy(dt_v2[0].a, "102755703");
    strcpy(dt_v2[1].a, "ab10");
    strcpy(dt_v2[2].a, "cd10");
    strcpy(dt_v2[3].a, "EF13");
    strcpy(dt_v2[4].a, "5");
    strcpy(dt_v2[5].a, "##184");
    strcpy(dt_v2[6].a, "@@1");
    strcpy(dt_v2[7].a, "&&13");
    strcpy(dt_v2[8].a, "%%14");
    strcpy(dt_v2[9].a, "!!1");

}


void dpcpp_parallel(){ 
    // ---------SYCL SCOPE STARTS------------
    {
        default_selector device_selector; 
        queue device_queue(device_selector);
        cout<<device_queue.get_device().get_info<info::device::name>()<<std::endl;  //print name of the device it is running on.
        buffer<struct myStruct,1> buff_dt_v1(dt_v1,range<1>{10});
        buffer<struct myStruct,1> buff_dt_v2(dt_v2,range<1>{10});
        device_queue.submit([&](handler &cgh){        
            auto acc_dt_v1 =buff_dt_v1.get_access<access::mode::write>(cgh);        
            auto acc_dt_v2 =buff_dt_v2.get_access<access::mode::write>(cgh); 
            cgh.parallel_for<class StructClass>(range<1>{10},[=](id<1> index){
             struct myStruct* myAcc1=(struct myStruct*)(&acc_dt_v1[index]);
             struct myStruct* myAcc2=(struct myStruct*)(&acc_dt_v2[index]);
            //**************your code logic starts from here**************************
                // to access array "a" use this    
                     char* myArray1=myAcc1->a;
                     char* myArray2=myAcc2->a;
                // To access int "id" use this
                    int myId=myAcc1->id;
            });
        });
    }
}

int main(){

    abc1(); // Initialisation for dt_v1[]
    abc2(); // Initialisation for dt_v2[]
    int test1=atoi(dt_v1[4].a); // convert only after dt_v1[] has been initialised
    std::cout<<"Converting char[] to int "<<test1<<std::endl;
    dpcpp_parallel();    
    return 0;
}

 

Hope this helps. Feel free to reach out to us, in case you have any more queries.

 

 

Thanks

Goutham

bo__john
Beginner

Hi, Goutham,

 

   I copied this code to the DevCloud platform (ssh devcloud, using bash) and it is not working. Please see the information below:

 

Makefile

 

 

//

DPCPP_CXX = dpcpp
DPCPP_CXXFLAGS = -o
DPCPP_LDFLAGS =
DPCPP_EXE_NAME = t27
DPCPP_SOURCES = src/t27.cpp

 

all:
    $(DPCPP_CXX) $(DPCPP_CXXFLAGS) $(DPCPP_EXE_NAME) $(DPCPP_SOURCES) $(DPCPP_LDFLAGS)

 

build_dpcpp:
    $(DPCPP_CXX) $(DPCPP_CXXFLAGS) $(DPCPP_EXE_NAME) $(DPCPP_SOURCES) $(DPCPP_LDFLAGS)

 


run:
    ./$(DPCPP_EXE_NAME)

 

//

 

 

 

instruction

 

//   

qsub -l nodes=2:gpu:ppn=2 -d . build.sh
    
 
    qsub -l nodes=2:gpu:ppn=2 -d . run.sh

//

 

 

 

result

 

 

 

//

 

u35272@login-1:~/exc/dpc1/dpc_4$ cat *o*493

########################################################################
#      Date:           Sat Jan 25 13:01:30 PST 2020
#    Job ID:           473493.v-qsvr-1.aidevcloud
#      User:           u35272
# Resources:           neednodes=2:gpu:ppn=2,nodes=2:gpu:ppn=2,walltime=06:00:00
########################################################################

dpcpp -o t27 src/t27.cpp
Makefile:23: recipe for target 'build_dpcpp' failed

 

//

 

 

and the code I wrote is also not working on this platform.

 

//

 

u35272@login-1:~/exc/dpc1/dpc_4$ cat src/t26.cpp
#include <CL/sycl.hpp>
#include <iostream>

using namespace std;
using namespace cl::sycl;

 struct myStruct
    {
        int id=1234;
        char a[50];
    };
    
    
struct myStruct dt_v1[1650];
 
int main(){

 
     abc();
     
     
     
 
    
    int test1=atoi(dt_v1[4].a);
    std::cout<<"Converting char[] to int "<<test1<<std::endl;

// ---------SYCL SCOPE STARTS------------
    {
        default_selector device_selector;
        queue device_queue(device_selector);
        cout<<device_queue.get_device().get_info<info::device::name>()<<std::endl;  //print name of the device it is running on.
        buffer<struct myStruct,1> a(dt_v1,range<1>{1650});               
        device_queue.submit([&](handler &cgh){        
            auto Accessor =a.get_access<access::mode::write>(cgh);           
            cgh.parallel_for<class StructClass>(range<1>{1650},[=](id<1> index){
             struct myStruct* myAcc=(struct myStruct*)(&Accessor[index]);      
                // your code logic starts from here..
                // to access array "a" use this    
                     //char* myArray=myAcc->a;
                // To access int "id" use this
                    //int myId=myAcc->id;
            });
        });
    }
    
    
      for (int i = 1; i < 1651; i++) {
    std::cout << dt_v1[i].a <<std::endl;
  }
    return 0;
}

 
int abc(){
     
     
 
    strcpy(dt_v1[1].a, "102738922");
strcpy(dt_v1[2].a, "2019");
strcpy(dt_v1[3].a, "10");
strcpy(dt_v1[4].a, "25");
strcpy(dt_v1[5].a, "3U8093");
strcpy(dt_v1[6].a, "ZUUU");
strcpy(dt_v1[7].a, "WSSS");
strcpy(dt_v1[8].a, "2019");
strcpy(dt_v1[9].a, "10");
strcpy(dt_v1[10].a, "25");
strcpy(dt_v1[11].a, "13");
strcpy(dt_v1[12].a, "45");
strcpy(dt_v1[13].a, "0");
strcpy(dt_v1[14].a, "2019");
strcpy(dt_v1[15].a, "10");

 

...

//

 

 

If I copy all the functions into the main() function, it works.

 

//

u35272@login-1:~/exc/dpc1/dpc_4$ cat src/t25.cpp
#include <CL/sycl.hpp>
#include <iostream>

using namespace std;
using namespace cl::sycl;

 struct myStruct
    {
        int id=1234;
        char a[50];
    };
    

 
int main(){

 
     struct myStruct dt_v1[1650];
     
     
 
    strcpy(dt_v1[1].a, "102738922");
strcpy(dt_v1[2].a, "2019");
strcpy(dt_v1[3].a, "10");
strcpy(dt_v1[4].a, "25");
strcpy(dt_v1[5].a, "3U8093");
strcpy(dt_v1[6].a, "ZUUU");
strcpy(dt_v1[7].a, "WSSS");
strcpy(dt_v1[8].a, "2019");
strcpy(dt_v1[9].a, "10");
strcpy(dt_v1[10].a, "25");

 

 

...

 

strcpy(dt_v1[1649].a, "39");
strcpy(dt_v1[1650].a, "1");

 
 
    
    int test1=atoi(dt_v1[4].a);
    std::cout<<"Converting char[] to int "<<test1<<std::endl;

// ---------SYCL SCOPE STARTS------------
    {
        default_selector device_selector;
        queue device_queue(device_selector);
        cout<<device_queue.get_device().get_info<info::device::name>()<<std::endl;  //print name of the device it is running on.
        buffer<struct myStruct,1> a(dt_v1,range<1>{1650});               
        device_queue.submit([&](handler &cgh){        
            auto Accessor =a.get_access<access::mode::write>(cgh);           
            cgh.parallel_for<class StructClass>(range<1>{1650},[=](id<1> index){
             struct myStruct* myAcc=(struct myStruct*)(&Accessor[index]);      
                // your code logic starts from here..
                // to access array "a" use this    
                     //char* myArray=myAcc->a;
                // To access int "id" use this
                    //int myId=myAcc->id;
            });
        });
    }
    
    
      for (int i = 1; i < 1651; i++) {
    std::cout << dt_v1[i].a <<std::endl;
  }
    return 0;
}

 

 

//

GouthamK_Intel
Moderator

Hi John,

Can you please try running the code below again? I verified it on DevCloud and it is working fine.

If your problem still persists, please attach your source code, Makefile, build.sh, run.sh, and the .o**** and .e***** files so that we can investigate your issue further.

#include <CL/sycl.hpp>
#include <cstdlib>   // atoi
#include <cstring>   // strcpy
#include <iostream>

using namespace std;
using namespace cl::sycl;


 struct myStruct
    {
        int id=1234;
        char a[50];
    };

struct myStruct dt_v1[10];
struct myStruct dt_v2[10];


void abc1(){
    strcpy(dt_v1[0].a, "102755703");
    strcpy(dt_v1[1].a, "ab10");
    strcpy(dt_v1[2].a, "cd10");
    strcpy(dt_v1[3].a, "EF13");
    strcpy(dt_v1[4].a, "5");
    strcpy(dt_v1[5].a, "##184");
    strcpy(dt_v1[6].a, "@@1");
    strcpy(dt_v1[7].a, "&&13");
    strcpy(dt_v1[8].a, "%%14");
    strcpy(dt_v1[9].a, "!!1");

}


void abc2(){
    strcpy(dt_v2[0].a, "102755703");
    strcpy(dt_v2[1].a, "ab10");
    strcpy(dt_v2[2].a, "cd10");
    strcpy(dt_v2[3].a, "EF13");
    strcpy(dt_v2[4].a, "5");
    strcpy(dt_v2[5].a, "##184");
    strcpy(dt_v2[6].a, "@@1");
    strcpy(dt_v2[7].a, "&&13");
    strcpy(dt_v2[8].a, "%%14");
    strcpy(dt_v2[9].a, "!!1");

}


void dpcpp_parallel(){ 
    // ---------SYCL SCOPE STARTS------------
    {
        default_selector device_selector; 
        queue device_queue(device_selector);
        cout<<device_queue.get_device().get_info<info::device::name>()<<std::endl;  //print name of the device it is running on.
        buffer<struct myStruct,1> buff_dt_v1(dt_v1,range<1>{10});
        buffer<struct myStruct,1> buff_dt_v2(dt_v2,range<1>{10});
        device_queue.submit([&](handler &cgh){        
            auto acc_dt_v1 =buff_dt_v1.get_access<access::mode::write>(cgh);        
            auto acc_dt_v2 =buff_dt_v2.get_access<access::mode::write>(cgh); 
            cgh.parallel_for<class StructClass>(range<1>{10},[=](id<1> index){
             struct myStruct* myAcc1=(struct myStruct*)(&acc_dt_v1[index]);
             struct myStruct* myAcc2=(struct myStruct*)(&acc_dt_v2[index]);
            //**************your code logic starts from here**************************
                // to access array "a" use this    
                     char* myArray1=myAcc1->a;
                     char* myArray2=myAcc2->a;
                // To access int "id" use this
                    int myId=myAcc1->id;
            });
        });
    }
}

int main(){

    abc1(); // Initialisation for dt_v1[]
    abc2(); // Initialisation for dt_v2[]
    int test1=atoi(dt_v1[4].a); // convert only after dt_v1[] has been initialised
    std::cout<<"Converting char[] to int "<<test1<<std::endl;
    dpcpp_parallel();    
    return 0;
}

 

Thanks

Goutham

bo__john
Beginner

Hi, Goutham,

 

  The last code is OK, but the data functions must be placed above the main function.
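
The ordering requirement comes from C++ itself rather than from DPC++: a function must be declared before the point where it is called. A forward declaration lets the definition stay below its caller. The sketch below uses hypothetical names (Record, table, fill_table, count_filled), with count_filled standing in for main():

```cpp
#include <cstring>

struct Record {
    int id = 1234;
    char a[50];
};

// Global array. Objects with static storage are zero-initialized first,
// so every text field starts out as an empty string.
Record table[4];

// Forward declaration (prototype): tells the compiler the signature of
// fill_table() before its definition appears further down the file.
void fill_table();

// Plays the role of main() in the thread: it calls fill_table() even
// though the body of fill_table() is defined below this function.
int count_filled() {
    fill_table();
    int n = 0;
    for (const Record& r : table) {
        if (r.a[0] != '\0') ++n;   // count non-empty text fields
    }
    return n;
}

// Definition placed after its caller; this compiles because of the prototype.
void fill_table() {
    std::strcpy(table[0].a, "102738922");
    std::strcpy(table[1].a, "2019");
    std::strcpy(table[2].a, "10");
}
```

With the prototype in place, the data functions can live at the bottom of the file or in a separate source file, which keeps main() short.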

 

  I have a new question. I want to copy the first array into further arrays at a regular interval, like this:

 

struct myStruct dt_v1[10];
struct myStruct dt_v2[10];
struct myStruct dt_v13[3];
struct myStruct dt_v14[3];
struct myStruct dt_v15[3];



void abc1(){
    strcpy(dt_v1[0].a, "102755703");
    strcpy(dt_v1[1].a, "ab10");
    strcpy(dt_v1[2].a, "cd10");
    strcpy(dt_v1[3].a, "13");
    strcpy(dt_v1[4].a, "aa5");
    strcpy(dt_v1[5].a, "aa184");
    strcpy(dt_v1[6].a, "1");
    strcpy(dt_v1[7].a, "&&13");
    strcpy(dt_v1[8].a, "%%14");
    strcpy(dt_v1[9].a, "!!1");

}

int i, j1, j2, j3, k;


// I put this loop in main(). It compiles, but it fails at run time for a large number of elements.
// It only runs for a little over 100 elements, but my data has more than 1000.

for(i=0; i<10; i++)
{
	j1 =i*3;
    j2 = j1+1;
    j3 = j1+2;
    k=j1/3;
strcpy(dt_v13[i].a, dt_v1[j1].a);
strcpy(dt_v14[i].a, dt_v1[j2].a);
strcpy(dt_v15[i].a, dt_v1[j3].a);

}


// this loop in main()
 for (int i = 0; i < 3; i++) {
    std::cout << dt_v3[i].a <<std::endl;
  }
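
With the sizes declared above, the copy loop computes j3 up to 29 while dt_v1 has only 10 elements, and (assuming the destination index is i) it writes up to index 9 of arrays that hold 3 elements. Out-of-range accesses like these are one plausible reason for a run-time failure. A hedged sketch of a bounds-checked version follows; Record and split_by_three are illustrative names, and the output capacity is passed in explicitly:

```cpp
#include <cstddef>
#include <cstring>

struct Record {
    int id = 1234;
    char a[50];
};

// Copies src[3k], src[3k+1], src[3k+2] into out0[k], out1[k], out2[k].
// Only complete groups of three are copied, and never more than out_cap
// groups, so no index runs past the end of any of the four arrays.
// Returns the number of groups actually copied.
std::size_t split_by_three(const Record* src, std::size_t src_len,
                           Record* out0, Record* out1, Record* out2,
                           std::size_t out_cap) {
    std::size_t groups = src_len / 3;          // complete groups available
    if (groups > out_cap) groups = out_cap;    // respect destination size
    for (std::size_t k = 0; k < groups; ++k) {
        std::strcpy(out0[k].a, src[3 * k].a);
        std::strcpy(out1[k].a, src[3 * k + 1].a);
        std::strcpy(out2[k].a, src[3 * k + 2].a);
    }
    return groups;
}
```

For a 10-element source and 3-element destinations this copies three groups and leaves the tenth source element untouched. For real data the destination arrays would need at least src_len / 3 elements each.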

 

 

 

 

my last code and result

int main(){

 
   
     int j, test1;
     
     
 

    abc(); // Initialisation for dt_v1[]
 
    dpcpp_parallel();    
 
     
      for (int i = 1; i < 1651; i++) {
		  j=i*25;
    std::cout << dt_v1.a <<std::endl;
 
  }
    return 0;
}

 

 

 

u35272@login-2:~/exc/dpc1/dpc_5$ cat *o*917

########################################################################
#      Date:           Wed Jan 29 12:59:16 PST 2020
#    Job ID:           477917.v-qsvr-1.aidevcloud
#      User:           u35272
# Resources:           neednodes=2:gpu:ppn=2,nodes=2:gpu:ppn=2,walltime=06:00:00
########################################################################

./t29
Intel(R) Gen9 HD Graphics NEO
102809408
102755703
102772414
102756988
102782321
102755748
102743577
102824987
102750121
102784113
102752389
102834161
102818434
102829292
102813096
102744544
102838772
102847134
102738468
102740481
102844083
102752193
102757704
102822715
102815659
102807175
102801481
102760240
102744845
102746524
102750141
102812579
102813198
102817848
102775056
102826594
102738494
102753935
102761366
102745783
102797165
102749800
102754511
102843550
102749845
102805684
102739925
102822741
102825670
102834045
102771447
102749888
102781802
102834752
102829958
102990200
102782250
102756310
102831321
102827544
102820133
102754463
102829326
102752257
102738743

Makefile:25: recipe for target 'run_dpcpp' failed

########################################################################
# End of output for job 477917.v-qsvr-1.aidevcloud
# Date: Wed Jan 29 12:59:19 PST 2020
########################################################################
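
The log above shows 65 values before the run fails. One possible explanation, assuming dt_v1 has 1650 elements and the loop prints dt_v1[j].a with j = i*25, is that i = 66 gives j = 1650, which is one past the last valid index 1649. A sketch that bounds the loop by the index actually used (every_nth and Record are hypothetical names):

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Record {
    int id = 1234;
    char a[50];
};

// Collects the text field of every `stride`-th record, starting at index 0.
// The loop condition tests j, the index that is actually dereferenced,
// so the access can never go past recs[len - 1].
std::vector<std::string> every_nth(const Record* recs, std::size_t len,
                                   std::size_t stride) {
    std::vector<std::string> out;
    if (stride == 0) return out;   // a zero stride would never advance
    for (std::size_t j = 0; j < len; j += stride) {
        out.emplace_back(recs[j].a);
    }
    return out;
}
```

For 1650 records and a stride of 25 this visits indices 0, 25, ..., 1625 and stops there.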

 

 

 

then

First, copy dt_v1[0].a and dt_v1[5].a to a new array as dt_v3[0].a and dt_v3[1].a, then convert them to integers with atoi,

then operate on the resulting integer array,

for example an arithmetic operation such as a sum written to a new array, C[index] = sum. Also, where can I print those arrays with cout?

Please give an example like this:

 

 

 

 

        //Submitting command group to queue to compute matrix multiplication c=a*b
        device_queue.submit([&](handler &cgh){
            // Read from a and b, write to c
            auto A = a.get_access<access::mode::read>(cgh);
            auto B = b.get_access<access::mode::read>(cgh);
            auto C = c.get_access<access::mode::write>(cgh);

            int WidthA = a.get_range()[1];

            //Executing kernel
            cgh.parallel_for<class MatrixMult>(range<2>{M, P}, [=](id<2> index){
	        //Get global position in Y direction
	        int row = index[0];
	        //Get global position in X direction
	        int col = index[1];

	        double sum = 0.0;
	        //Compute the result of one element in c
	        for (int i = 0; i < WidthA; i++) {
	            sum += A[row][i] * B[i][col];
	        }

	        C[index] = sum;
            });

        });
    }    //End of scope, so we wait for kernel producing result data to host memory c_back to complete
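
As a host-side reference for checking a kernel like the one above, the same computation can be written in plain C++. This is a sketch under the assumption that the matrices are stored row-major in flat arrays; matmul is an illustrative name. Each output element is the dot product of one row of A and one column of B, so the inner index i appears in both operands:

```cpp
#include <cstddef>
#include <vector>

// Computes C = A * B where A is rows x inner, B is inner x cols and the
// result C is rows x cols. All three matrices are flat, row-major vectors.
std::vector<double> matmul(const std::vector<double>& A,
                           const std::vector<double>& B,
                           std::size_t rows, std::size_t inner,
                           std::size_t cols) {
    std::vector<double> C(rows * cols, 0.0);
    for (std::size_t row = 0; row < rows; ++row) {
        for (std::size_t col = 0; col < cols; ++col) {
            double sum = 0.0;
            // Dot product of row `row` of A with column `col` of B.
            for (std::size_t i = 0; i < inner; ++i) {
                sum += A[row * inner + i] * B[i * cols + col];
            }
            C[row * cols + col] = sum;
        }
    }
    return C;
}
```

The three sizes may all differ, which mirrors the M, N and P ranges used in the thread: the two outer loops cover the output range and the inner loop covers the shared dimension.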
   

 

 

 

Also, if I run parallel loops over arrays of different lengths, should I use functions like the ones below?

 

 

 

// Matrix size constants
#define SIZE     1200     // Must be a multiple of 8.
#define M        SIZE/8
#define N        SIZE/4
#define P        SIZE/2

     // Submitting command group to queue to initialize matrix a
        device_queue.submit([&](handler &cgh) {
            // Getting write only access to the buffer on a device
            auto Accessor = a.get_access<access::mode::write>(cgh);
            // Executing kernel
            cgh.parallel_for<class FillBuffer_a>( range<2>{M, N}, [=](id<2> index) {
                // a is identity matrix
                Accessor[index] = 1.0;
            });
        });
    
        //Submitting command group to queue to initialize matrix b
        device_queue.submit([&](handler &cgh) {
            // Getting write only access to the buffer on a device
            auto Accessor = b.get_access<access::mode::write>(cgh);
            //Executing kernel
            cgh.parallel_for<class FillBuffer_b>( range<2>{N, P}, [=](id<2> index){
	        // each column of b is the sequence 1,2,...,N	    
                Accessor[index] = index[0] + 1.;
            });    
        });   

 

 

 

 

Please give a complete code example.

 

Thank You!

 

John

GouthamK_Intel
Moderator

Hi John,

Glad to hear that the solution provided helped and your issue is resolved. 

Can you please raise a new thread for your new issue? 

Please confirm whether we can close this thread.

Thanks, Have a good day!

 

Regards,

Goutham

GouthamK_Intel
Moderator

Hi John,

Could you please confirm whether we can close this thread?

 

Regards

Goutham
