Solved: malloc_device a 2D array

leilag · 07-15-2021

Hi,

This is mainly a follow-up on my previous issue.

@NoorjahanSk_Intel recommended creating a 1D pointer and indexing it with "row*array_width+column". But the point of creating those three row arrays was to treat each row separately. As in the previous example, we would loop over all of the indices in one row in one kernel and loop over the elements of the other rows in different kernels. This means that if I want to use a 1D array instead of my current 2D arrays, I would need to pass the whole array to a kernel and define the range so that it covers only certain elements of the array (i.e. one third of the array). I haven't seen any examples or functionality in DPC++ that can define a range that is a sub-array of the original array (or even one with a specific stride). Am I missing some concept in DPC++, or is this kind of functionality not supported by DPC++ yet?
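
For reference, the "row*array_width+column" layout can be sketched in plain host C++ (no SYCL involved). The helper names flat_index and row_begin are hypothetical and only illustrate that each row is a contiguous chunk of the flat array, so a row can be described by a base pointer plus a length:

```cpp
#include <cstddef>

// Hypothetical helpers for illustration; they are not part of SYCL or DPC++.

// Offset of element (row, column) in a row-major flat array.
constexpr std::size_t flat_index(std::size_t row, std::size_t column,
                                 std::size_t array_width) {
    return row * array_width + column;
}

// Pointer to the first element of a row. The row occupies the next
// array_width elements starting at this address.
inline int *row_begin(int *data, std::size_t row, std::size_t array_width) {
    return data + flat_index(row, 0, array_width);
}
```

With array_width = 7, element (1, 2) sits at offset 9, and row 1 starts 7 elements after the start of the array.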

I tried @NoorjahanSk_Intel's solution and it did work with "malloc_shared", but that isn't my ultimate goal. I was hoping to manage memory explicitly and allocate memory on the device, but the code crashes with a segmentation fault when running on Intel GPUs. Here is the code:

#include <CL/sycl.hpp>
#include <array>
#include <iostream>
#if FPGA || FPGA_EMULATOR
#include <CL/sycl/INTEL/fpga_extensions.hpp>
#endif

using namespace sycl;

#define M 4
#define N 5
#define M_LEN (M + 2)
#define N_LEN (N + 2)
constexpr size_t  DOMAIN_SIZE = M_LEN*N_LEN;
#define DIM 1

int main() {
    auto R = range<1>{DOMAIN_SIZE};
    default_selector d_selector;
    queue q(d_selector);

    int **u = malloc_device<int *>(DOMAIN_SIZE, q);
    int **v = malloc_device<int *>(DOMAIN_SIZE, q);
    int **p = malloc_device<int *>(DOMAIN_SIZE, q);
    for(int i=0;i<3;i++) {
        u[i] = malloc_device<int>(DOMAIN_SIZE, q);
        v[i] = malloc_device<int>(DOMAIN_SIZE, q);
        p[i] = malloc_device<int>(DOMAIN_SIZE, q);
    }
    free(u,q);
    free(v,q);
    free(p,q);
    return 0;
}

Am I doing something wrong or is this kind of memory allocation not supported on the device?

Thanks,

Leila

NoorjahanSk_Intel · 07-16-2021

Hi,

Thanks for reaching out to us.

1) To pass a sub-array of the original array, we can try explicit memory operations, which are one of the actions available in a command group handler.

Among these operations there is an action that copies data between different indexes.

Please refer to page 80 of the Data Parallel C++ textbook by James Reinders.

2) >>segmentation fault

If we use the malloc_device() allocation type, the memory is allocated on the device and is not accessible from the host. To move data from device to host and vice versa, we should use the memcpy() operation.

We can use the memset function with the malloc_device() allocation type to initialize the memory. Instead of a for loop on the host, we can use memset, as it works in parallel.

We can also achieve this by writing kernels that fill the memory.

Please refer to page 184 of the Data Parallel C++ textbook by James Reinders.

Thanks & Regards

Noorjahan.

leilag · 07-20-2021

Hello @NoorjahanSk_Intel ,

Thanks for your reply.

I need to spend some more time to go through the book and understand your response.

I am writing to ask you to keep this thread open until I am done with some deadlines over the next couple of weeks. I would like to follow up on this topic.

Best,

Leila

NoorjahanSk_Intel · 08-03-2021

Hi,

Reminder:

Could you please let us know whether you have any issues related to the information provided above?

Thanks & Regards

Noorjahan

leilag · 08-05-2021

Hello Noorjahan,

Thanks for your follow-up.

I did look into those pages from the book, but I only saw examples of the buffer model, so I couldn't quite understand what you meant. Are there any sources with examples for the USM model that I could look at?

Also, I just realized that I could check out the SYCL tutorials/sources as well to learn about DPC++! Would you recommend any sources in that direction?

Thanks,

Leila

NoorjahanSk_Intel · 08-12-2021

Hi,

>>Are there any sources with examples for the USM model that I could look at.

The page numbers I referred to in my previous response were page numbers of the PDF. Apologies for any confusion.

You can find the page numbers of the textbook below:

Please refer to page 54 of the Data Parallel C++ textbook by James Reinders.

Please refer to page 160 of the Data Parallel C++ textbook by James Reinders.

>>Would you recommend any sources in that direction?

Please refer to the links below:

https://techdecoded.intel.io/essentials/dpc-part-1-an-introduction-to-the-new-programming-model/#gs.8r8j97

https://techdecoded.intel.io/essentials/dpc-part-2-programming-best-practices/#gs.8r8md1

https://software.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html

The Data Parallel C++ textbook by James Reinders.

Thanks & Regards

Noorjahan

NoorjahanSk_Intel · 08-19-2021

Hi,

Has the information provided helped? If yes, could you please confirm whether we can close this thread from our end?

Thanks & Regards

Noorjahan

leilag · 08-19-2021

Hello Noorjahan,

Sorry for my late reply. I was traveling!

>> The one which i have referred in previous response is page no. of pdf. Apologies for any confusion.

The page number is still not clear to me. Could you please repeat that? Thanks!

Other than that, the resources are helpful. Thanks!

Best,

Leila

NoorjahanSk_Intel · 08-23-2021

Hi,

>>The page number is not still clear to me. Could you please repeat that?

1) To pass part of an array, we can try explicit memory operations, which are one of the actions available in a command group handler.

Using these operations, we can copy data between pointers/accessors.

Please go through Chapter 2 (page 54) of the Data Parallel C++ textbook by James Reinders for a better understanding.

2) Memory allocated with malloc_device() is accessible only on the device side.

To initialize memory allocated with malloc_device(), we can use the memset function, which is submitted from the host and operates on the device allocation.

Please go through Chapter 6 (page 160) of the Data Parallel C++ textbook by James Reinders for a better understanding.

I hope this clears up your queries.

As you have marked a solution, can we close this thread?

Thanks & Regards

Noorjahan

leilag · 08-23-2021

Thanks for the clarification.

NoorjahanSk_Intel · 08-23-2021

Hi,

Thanks for the confirmation!

As this issue has been resolved, we will no longer respond to this thread.

If you require any additional assistance from Intel, please start a new thread.

Thanks & Regards

Noorjahan.