Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel ICX Compiler, Intel® DPC++ Compatibility Tool, and GDB*

Problem with image (as a buffer) size and media_block_load/store

tanzl_ustc
Beginner
731 Views

Hello, Intel engineers.

Recently I ran into a problem with image size. As you will see in the image below, I wrote a simple piece of code to illustrate it. When SIZE is 64, 128, or even 2048, it works well. As the size gets larger, problems arise.

[Image: tanzl_ustc_0-1652022935841.png]

I was wondering whether this is due to a size limitation of image. Could you please give me some possible reasons for this problem? Thanks. I have attached this code (acc_test.cpp) in my zip file.

Besides, I really need a very large image, because I am currently rewriting a CM kernel (gemm_genx.cpp, attached in the zip file) as a DPC++ ESIMD version. The CM kernel implements large-scale GEMM. At first I wrote a version (gemm_genx_dpcpp.cpp in the attached zip file) that uses shared pointers to move data between host and device, but I found that the algorithm requires 2D reads and writes. However, media_block_load and media_block_store require an accessor instead of a pointer, so I chose to use block_load/store: through coordinate conversion, one-dimensional reads replace the two-dimensional reads. After testing, the performance of the DPC++ code is very poor.

I then used VTune to analyze the DPC++ code and found that heavy address computation and memory access operations could be the bottleneck. So now I have to use media_block_load and media_block_store. Could you please tell me how to use media_block_load/store properly? Thank you very much!
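For readers following the coordinate conversion described in this post, here is a minimal host-side sketch in plain C++. It is not ESIMD code and not the code from the attached zip; the names linear_index and read_tile are hypothetical, and a row-major layout is assumed. It only illustrates why replacing one 2D read with 1D reads costs one offset computation and one contiguous read per tile row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of element (row, col) in a matrix with `pitch` elements per row.
inline std::size_t linear_index(std::size_t row, std::size_t col, std::size_t pitch) {
    return row * pitch + col;
}

// Copy a block_rows x block_cols tile whose top-left corner is (row0, col0).
// Each tile row becomes one contiguous 1D read with its own offset computation,
// which is the extra address arithmetic a single 2D block read would avoid.
inline std::vector<float> read_tile(const std::vector<float>& matrix, std::size_t pitch,
                                    std::size_t row0, std::size_t col0,
                                    std::size_t block_rows, std::size_t block_cols) {
    assert(col0 + block_cols <= pitch);                       // tile stays inside a row
    assert((row0 + block_rows) * pitch <= matrix.size());     // tile stays inside the matrix
    std::vector<float> tile;
    tile.reserve(block_rows * block_cols);
    for (std::size_t r = 0; r < block_rows; ++r) {
        const auto start = static_cast<std::ptrdiff_t>(linear_index(row0 + r, col0, pitch));
        const auto count = static_cast<std::ptrdiff_t>(block_cols);
        tile.insert(tile.end(), matrix.begin() + start, matrix.begin() + start + count);
    }
    return tile;
}
```

For example, on a 4 x 4 matrix filled with 0..15, read_tile(m, 4, 1, 1, 2, 2) performs two 1D reads, starting at offsets 5 and 9.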

5 Replies
tanzl_ustc
Beginner
720 Views

In addition, I have tried writing a version that uses buffer/accessor to move data between host and device. I have already found that the problem in my new code could be the image size issue I mentioned above. Could you take a look at my code to see whether there are any other problems? I lack experience with media_block_load/store, and it is difficult to find suitable examples on the Internet. Thank you very much; the file has already been attached.

tanzl_ustc
Beginner
720 Views

Some of the code is produced by my code generator, so the variable names may seem meaningless. Please use gemm_genx_dpcpp.cpp as a reference. Thanks!

NoorjahanSk_Intel
Moderator
656 Views

Hi,

Thanks for reaching out to us.

>>Could you please give me some possible reasons of this problem?

If you change your code to the code below, it will work without any errors.

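#include <CL/sycl.hpp>
#include <iostream>
using namespace sycl;
using namespace std;

// 2048 x 512 pixels; each pixel is one float4 (rgba, fp32), i.e. 16 bytes.
const long int SIZE = 2048 * (2048 / 4);

int main() {
  // Allocate the host data on the heap rather than on the stack.
  sycl::float4 *b = new sycl::float4[SIZE];
  for (long int i = 0; i < SIZE; ++i) {
    // Broadcast the scalar to all four channels.
    b[i] = sycl::float4(-static_cast<float>(i));
  }
  {
    // The image uses b as its host memory until this scope ends.
    image<2> b_device(b, image_channel_order::rgba,
                      image_channel_type::fp32, range<2>(2048, 512));
  }
  delete[] b;  // the image has already been destroyed here
  return 0;
}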

If you want to keep your initial code, you can increase the stack size with the command below and then run your program without any errors.

ulimit -s 65536

Here, we increase the stack size limit to 64 MB (ulimit -s takes its value in KB), as the default stack size is limited to 10 MB.
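As a rough check of the numbers in this reply, the sketch below computes the host buffer size for the 2048 x 512 rgba/fp32 image and compares it with a stack limit given in the units of ulimit -s. The helper names are hypothetical, and the 10 MB default is taken from this reply as an assumption (the actual default varies by system; check it with ulimit -s).

```cpp
#include <cstddef>

// Bytes needed for a width x height image with `channels` channels
// of `bytes_per_channel` bytes each.
constexpr std::size_t image_bytes(std::size_t width, std::size_t height,
                                  std::size_t channels, std::size_t bytes_per_channel) {
    return width * height * channels * bytes_per_channel;
}

// `ulimit -s` reports and accepts the stack limit in units of 1024 bytes.
constexpr std::size_t ulimit_kb_to_bytes(std::size_t kb) { return kb * 1024; }

// True if a local array of `bytes` bytes is smaller than the stack limit.
// This ignores everything else that lives on the stack, so it is optimistic.
constexpr bool fits_in_stack(std::size_t bytes, std::size_t stack_limit_bytes) {
    return bytes < stack_limit_bytes;
}

// rgba + fp32 means 4 channels of 4 bytes: 2048 * 512 * 16 bytes = 16 MiB.
constexpr std::size_t kBufferBytes = image_bytes(2048, 512, 4, 4);
static_assert(kBufferBytes == 16u * 1024u * 1024u, "buffer is 16 MiB");

// Assumed 10 MB default (10240 KB): too small for a 16 MiB local array.
static_assert(!fits_in_stack(kBufferBytes, ulimit_kb_to_bytes(10240)), "exceeds default");

// After `ulimit -s 65536` (64 MB) the same array fits.
static_assert(fits_in_stack(kBufferBytes, ulimit_kb_to_bytes(65536)), "fits after raise");
```

Allocating the buffer on the heap, as in the code above, avoids depending on the stack limit at all, which is usually the more portable choice.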

 

>>I used vtune to get an analysis of the DPC++ code

Could you please let us know how you analyzed the performance of the DPC++ code (with steps, if any)?

>>the DPC++ code performance is very poor

Could you please let us know on what basis you conclude that the DPC++ performance is poor?

Also, please provide us with the command that you used to run the CM kernel code.

Thanks & Regards,

Noorjahan.
NoorjahanSk_Intel
Moderator
618 Views

Hi,


We haven't heard back from you. Could you please provide an update on your issue?


Thanks & Regards,

Noorjahan.


NoorjahanSk_Intel
Moderator
592 Views

Hi,


We have not heard back from you, so we will close this inquiry now. If you need further assistance, please post a new question.


Thanks & Regards,

Noorjahan.

