Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel ICX Compiler, Intel® DPC++ Compatibility Tool, and GDB*

malloc_device a 2D array

leilag
Novice
2,799 Views

Hi,

 

This is mainly a follow-up on my previous issue .

  1. @NoorjahanSk_Intel recommended creating a 1D pointer and iterating through "row*array_width+column". But the point in creating those three row arrays was to treat each of those rows separately. Basically, like the previous example, we would loop over all of the indices in a row in one kernel and we loop over the elements of other rows in different kernels. This means that if I want to create a 1D array instead of my current 2D arrays, I would need to pass the whole array to a kernel and define the range such that it contains certain elements of the array (i.e. one third of the array). I haven't seen any examples or functionalities in DPC++ that can define a range which is a sub-array of the original array (or even with a specific stride). Am I missing some concepts in DPC++ or is this kind of functionality not supported by DPC++ yet?
  2. I tried @NoorjahanSk_Intel's solution and it did work using "malloc_shared", but this isn't my ultimate goal. I was hoping to manage memory explicitly and allocate memory on the device, but the code crashes with a segmentation fault when running on Intel GPUs. Here is the code:
    #include <CL/sycl.hpp>
    #include <array>
    #include <iostream>
    #if FPGA || FPGA_EMULATOR
    #include <CL/sycl/INTEL/fpga_extensions.hpp>
    #endif
    
    using namespace sycl;
    
    #define M 4
    #define N 5
    #define M_LEN (M + 2)
    #define N_LEN (N + 2)
    constexpr size_t  DOMAIN_SIZE = M_LEN*N_LEN;
    #define DIM 1
    
    int main() {
        auto R = range<1>{DOMAIN_SIZE};
        default_selector d_selector;
        queue q(d_selector);
    
        int **u = malloc_device<int *>(DOMAIN_SIZE, q);
        int **v = malloc_device<int *>(DOMAIN_SIZE, q);
        int **p = malloc_device<int *>(DOMAIN_SIZE, q);
        for(int i=0;i<3;i++) {
                u[i] = malloc_device<int>(DOMAIN_SIZE, q);
                v[i] = malloc_device<int>(DOMAIN_SIZE, q);
                p[i] = malloc_device<int>(DOMAIN_SIZE, q);
        }
       free(u,q);
       free(v,q);
       free(p,q);
       return 0;
    }

Am I doing something wrong or is this kind of memory allocation not supported on the device?

 

Thanks,

Leila

1 Solution
10 Replies
NoorjahanSk_Intel
Moderator
2,771 Views

Hi,

Thanks for reaching out to us.

 1) To pass a sub-array of the original array, we can use explicit memory operations, which are one of the actions available in a command group handler.

These operations include a copy action that copies data between different indexes.

Please refer to the Data Parallel C++ textbook by James Reinders, page no. 80.


 2)>>segmentation fault

  If we use the malloc_device() allocation type, the memory is allocated on the device and is not accessible on the host. In the posted code, the host for loop assigns to u[i], v[i], and p[i], which dereferences device pointers on the host and leads to the segmentation fault. To move data from device to host and vice versa, we should use the memcpy() operation.

  We can use the memset function to initialize memory allocated with malloc_device(). Instead of using a for loop on the host, we can use memset, as it is executed on the device.

We can also achieve this by writing kernels that fill the memory.

Please refer to the Data Parallel C++ textbook by James Reinders, page no. 184.


Thanks & Regards

Noorjahan.


leilag
Novice
2,721 Views

Hello @NoorjahanSk_Intel ,

 

Thanks for your reply.

I need to spend some more time to go through the book and understand your response.

I am writing to request you to keep this thread open until I am done with some of the deadlines in the next couple of weeks. I would like to follow up on this topic.

 

Best,

Leila

NoorjahanSk_Intel
Moderator
2,640 Views

Hi,

Reminder:

Could you please let us know whether you have any issues related to the information provided above.


Thanks & Regards

Noorjahan


leilag
Novice
2,600 Views

Hello Noorjahan,

 

Thanks for your followup.

I did look into those pages of the book, but I only saw examples of the buffer model, so I couldn't quite understand what you meant. Are there any sources with examples for the USM model that I could look at?

Also, I just realized that I could check out the SYCL tutorials/sources as well to learn about DPC++! Would you recommend any sources in that direction?

 

Thanks,

Leila

NoorjahanSk_Intel
Moderator
2,569 Views

Hi,

>>Are there any sources with examples for the USM model that I could look at. 

 

The page numbers I referred to in my previous response were page numbers of the PDF. Apologies for any confusion.

The corresponding page numbers of the textbook are given below.

Please refer to the Data Parallel C++ textbook by James Reinders, page no. 54.

Please refer to the Data Parallel C++ textbook by James Reinders, page no. 160.

 

>>Would you recommend any sources in that direction?

 

Please refer to the links below:

 

https://techdecoded.intel.io/essentials/dpc-part-1-an-introduction-to-the-new-programming-model/#gs.8r8j97

https://techdecoded.intel.io/essentials/dpc-part-2-programming-best-practices/#gs.8r8md1

https://software.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html

The Data Parallel C++ textbook by James Reinders.

 

Thanks & Regards

Noorjahan

 

NoorjahanSk_Intel
Moderator
2,543 Views

Hi,

 Has the information provided helped? If yes then could you please confirm whether we can close this thread from our end.

 

Thanks & Regards

Noorjahan

 

leilag
Novice
2,531 Views

Hello Noorjahan,

 

Sorry for my late reply. I was traveling!

>> The one which i have referred in previous response is page no. of pdf. Apologies for any confusion. 

The page number is still not clear to me. Could you please repeat that? Thanks!

Other than that, the resources are helpful. Thanks!

 

Best,

Leila

 

NoorjahanSk_Intel
Moderator
2,487 Views

Hi,

>>The page number is not still clear to me. Could you please repeat that?


1) To pass part of an array, we can use explicit memory operations, which are one of the actions of a command group handler.

Using these operations we can copy data between pointers/accessors.

Please go through Chapter 2 (page no. 54) of the Data Parallel C++ textbook by James Reinders for a better understanding.


2) Memory of the malloc_device() allocation type is accessible only on the device side.

To initialize memory of the malloc_device() allocation type, we can use the memset function, which is called from the host and operates on the device memory.

Please go through Chapter 6 (page no. 160) of the Data Parallel C++ textbook by James Reinders for a better understanding.
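For point 1, the index arithmetic behind treating one row of a flattened array as its own sub-array can be sketched in plain C++, with no SYCL needed to see the idea. The helper names row_ptr, copy_row, and make_flat are hypothetical. With USM device pointers, the std::memcpy call would be replaced by the queue's memcpy applied to the same kind of offset pointer; that substitution is an assumption of this sketch and is not shown here.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: pointer to the first element of `row` in a
// row-major array stored as one contiguous 1D block.
inline int* row_ptr(int* base, std::size_t row, std::size_t width) {
    return base + row * width;
}

// Hypothetical helper: copy a single row out of the flat array.
inline std::vector<int> copy_row(const std::vector<int>& flat,
                                 std::size_t row, std::size_t width) {
    std::vector<int> out(width);
    std::memcpy(out.data(), flat.data() + row * width, width * sizeof(int));
    return out;
}

// Build a rows x width array where element (r, c) holds r * 100 + c,
// addressed as flat[r * width + c].
inline std::vector<int> make_flat(std::size_t rows, std::size_t width) {
    std::vector<int> flat(rows * width);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < width; ++c) {
            flat[r * width + c] = static_cast<int>(r * 100 + c);
        }
    }
    return flat;
}
```

With make_flat(3, 4), copy_row(flat, 2, 4) yields the third row only, so a kernel or copy that starts at row * width and covers width elements never touches the other rows.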


I hope this answers your queries.


As you have marked a solution, can we close this thread?


Thanks & Regards

Noorjahan


leilag
Novice
2,481 Views

Thanks for the clarification.

NoorjahanSk_Intel
Moderator
2,455 Views

Hi,

Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.


Thanks & Regards

Noorjahan.

