Intel® oneAPI DPC++/C++ Compiler

malloc_device a 2D array

leilag
Beginner
6,072 Views

Hi,

 

This is mainly a follow-up on my previous issue.

  1. @NoorjahanSk_Intel recommended creating a 1D pointer and indexing it with "row*array_width+column". But the point of creating those three row arrays was to treat each row separately. As in the previous example, one kernel would loop over all of the indices in one row, and other kernels would loop over the elements of the other rows. So if I replace my current 2D arrays with a 1D array, I would need to pass the whole array to a kernel and define the range so that it covers only certain elements of the array (e.g. one third of it). I haven't seen any examples or functionality in DPC++ that define a range over a sub-array of the original array (or with a specific stride). Am I missing some concept in DPC++, or is this kind of functionality not supported yet?
  2. I tried @NoorjahanSk_Intel's solution and it did work using "malloc_shared", but this isn't my ultimate goal. I was hoping to manage memory explicitly and allocate memory on the device, but the code crashes with a segmentation fault when running on Intel GPUs. Here is the code:
    #include <CL/sycl.hpp>
    #include <array>
    #include <iostream>
    #if FPGA || FPGA_EMULATOR
    #include <CL/sycl/INTEL/fpga_extensions.hpp>
    #endif
    
    using namespace sycl;
    
    #define M 4
    #define N 5
    #define M_LEN (M + 2)
    #define N_LEN (N + 2)
    constexpr size_t  DOMAIN_SIZE = M_LEN*N_LEN;
    #define DIM 1
    
    int main() {
        auto R = range<1>{DOMAIN_SIZE};
        default_selector d_selector;
        queue q(d_selector);
    
        int **u = malloc_device<int *>(DOMAIN_SIZE, q);
        int **v = malloc_device<int *>(DOMAIN_SIZE, q);
        int **p = malloc_device<int *>(DOMAIN_SIZE, q);
        for(int i=0;i<3;i++) {
                u[i] = malloc_device<int>(DOMAIN_SIZE, q);
                v[i] = malloc_device<int>(DOMAIN_SIZE, q);
                p[i] = malloc_device<int>(DOMAIN_SIZE, q);
        }
       free(u,q);
       free(v,q);
       free(p,q);
       return 0;
    }

Am I doing something wrong or is this kind of memory allocation not supported on the device?

 

Thanks,

Leila

0 Kudos

10 Replies
NoorjahanSk_Intel
Moderator
6,044 Views

Hi,

Thanks for reaching out to us.

 1) To pass a sub-array of the original array, we can try the explicit memory operations, which are one of the actions available in a command group handler.

These include an action that copies data between locations at different offsets.

Please refer to the Data Parallel C++ textbook by James Reinders, page 80.


 2) >>segmentation fault

  Memory allocated with malloc_device() lives on the device and is not accessible from the host, so dereferencing such a pointer in host code (as the host for loop does) causes the segmentation fault. To move data from the device to the host and vice versa, we should use the memcpy() operation.

  To initialize a malloc_device() allocation, we can use the memset function instead of a for loop on the host; the operation is carried out on the device.

We can also write a kernel to fill the memory.

Please refer to the Data Parallel C++ textbook by James Reinders, page 184.


Thanks & Regards

Noorjahan.


leilag
Beginner
5,994 Views

Hello @NoorjahanSk_Intel ,

 

Thanks for your reply.

I need to spend some more time to go through the book and understand your response.

I am writing to request you to keep this thread open until I am done with some of the deadlines in the next couple of weeks. I would like to follow up on this topic.

 

Best,

Leila

NoorjahanSk_Intel
Moderator
5,913 Views

Hi,

Reminder:

Could you please let us know whether you have any issues related to the information provided above.


Thanks & Regards

Noorjahan


leilag
Beginner
5,873 Views

Hello Noorjahan,

 

Thanks for your follow-up.

I did look into those pages of the book, but I only saw examples of the buffer model, and I couldn't quite understand what you meant. Are there any sources with examples for the USM model that I could look at?

Also, I just realized that I could check out the SYCL tutorials/sources as well to learn about DPC++! Would you recommend any sources in that direction?

 

Thanks,

Leila

NoorjahanSk_Intel
Moderator (accepted solution)
5,842 Views

Hi,

>>Are there any sources with examples for the USM model that I could look at. 

 

The page numbers I gave in my previous response were page numbers of the PDF. Apologies for any confusion.

The corresponding page numbers in the textbook are below.

 

Please refer to the Data Parallel C++ textbook by James Reinders, page 54.

Please refer to the Data Parallel C++ textbook by James Reinders, page 160.

 

>>Would you recommend any sources in that direction?

 

Please refer to the links below:

 

https://techdecoded.intel.io/essentials/dpc-part-1-an-introduction-to-the-new-programming-model/#gs.8r8j97

https://techdecoded.intel.io/essentials/dpc-part-2-programming-best-practices/#gs.8r8md1

https://software.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html

The Data Parallel C++ textbook by James Reinders.

 

Thanks & Regards

Noorjahan

 

NoorjahanSk_Intel
Moderator
5,816 Views

Hi,

 Has the information provided helped? If yes then could you please confirm whether we can close this thread from our end.

 

Thanks & Regards

Noorjahan

 

leilag
Beginner
5,804 Views

Hello Noorjahan,

 

Sorry for my late reply. I was traveling!

>> The one which i have referred in previous response is page no. of pdf. Apologies for any confusion. 

The page number is still not clear to me. Could you please repeat that? Thanks!

 

Other than that, the resources are helpful. Thanks!

 

Best,

Leila

 

NoorjahanSk_Intel
Moderator
5,760 Views

Hi,

>>The page number is not still clear to me. Could you please repeat that?


1) To pass part of an array, we can try the explicit memory operations, which are one of the actions available in a command group handler.

Using these operations, we can copy data between pointers/accessors.

Please go through Chapter 2 (page 54) of the Data Parallel C++ textbook by James Reinders for a better understanding.


2) Memory of the malloc_device() allocation type is accessible only on the device side.

To initialize memory of the malloc_device() allocation type, we can use the memset function, which is submitted from the host and carried out on the device.

Please go through Chapter 6 (page 160) of the Data Parallel C++ textbook by James Reinders for a better understanding.


I hope this will clear your queries.


As you have marked a reply as the solution, can we close this thread?


Thanks & Regards

Noorjahan


leilag
Beginner
5,754 Views

Thanks for the clarification.

NoorjahanSk_Intel
Moderator
5,728 Views

Hi,

Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.


Thanks & Regards

Noorjahan.

