Intel® oneAPI Base Toolkit
Support for core tools and libraries to build and deploy high-performance data-centric applications

Several DPCPP Document Issues

Jin_Hao
Employee

I found several issues in the DPC++ documents; they are listed in the attached file.

 

Thanks

AbhishekD_Intel
Moderator

Hi,


Thanks for reaching out to us, and thank you for the detailed information and suggestions. We are looking into all of your suggestions and issues internally, and we will update the documents wherever required.



Warm Regards,

Abhishek


Alina_S_Intel
Employee

Could you please send us a link to the slides you mentioned? If it is an internal link, please contact me via email.


1.   USM documentation error or implementation bug?

I understand this concept in the following order:


usm_restricted_shared_allocations can be 1 only if usm_shared_allocations is 1. If usm_shared_allocations = 0, usm_restricted_shared_allocations should also be 0. However, if usm_shared_allocations = 1, usm_restricted_shared_allocations is not required to be 1. So, based on your output:


  • Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz (by default_selector) - correct
    • USM shared allocations: 1 -> USM restricted shared allocations: 0
  • Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz (by host_selector) - correct
    • USM shared allocations: 1 -> USM restricted shared allocations: 0
  • Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz (by cpu_selector) - correct
    • USM shared allocations: 1 -> USM restricted shared allocations: 0
  • Intel(R) FPGA Emulation Device (by accelerator_selector) - correct
    • USM shared allocations: 1 -> USM restricted shared allocations: 1
  • Intel PAC Platform (pac_ee00000) (by fpga_selector) - incorrect
    • USM shared allocations: 0 -> USM restricted shared allocations: 1

Intel PAC Platform seems wrong to me. Is this the issue you reported?
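
To make the rule above concrete, here is a minimal sketch in plain C++ (no SYCL headers) that encodes "restricted shared implies shared" and applies it to the five results quoted in this post. The names UsmCaps, is_consistent, reported and count_inconsistent are hypothetical helpers made up for illustration; they are not part of DPC++ or SYCL, and the flag values are simply copied from the output above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical record of the two capability flags as printed in the report.
struct UsmCaps {
    const char* device;      // device name and the selector used to pick it
    bool shared;             // reported usm_shared_allocations
    bool restricted_shared;  // reported usm_restricted_shared_allocations
};

// The rule from the post: restricted shared support is only meaningful
// when shared allocations are supported (restricted => shared).
constexpr bool is_consistent(const UsmCaps& c) {
    return !c.restricted_shared || c.shared;
}

// Values copied from the output quoted in this thread.
constexpr std::array<UsmCaps, 5> reported{{
    {"Xeon Gold 6128 (default_selector)", true, false},
    {"Xeon Gold 6128 (host_selector)", true, false},
    {"Xeon Gold 6128 (cpu_selector)", true, false},
    {"FPGA Emulation Device (accelerator_selector)", true, true},
    {"PAC Platform pac_ee00000 (fpga_selector)", false, true},
}};

// Counts the reported devices that violate the rule.
constexpr int count_inconsistent() {
    int n = 0;
    for (std::size_t i = 0; i < reported.size(); ++i) {
        if (!is_consistent(reported[i])) {
            ++n;
        }
    }
    return n;
}
```

Under this reading, only the Intel PAC Platform entry (shared = 0, restricted shared = 1) breaks the rule; the other four combinations are all allowed.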


2.   dpcpp program structure.pptx Version 2021/3/23 Issue

  • page 9: Device_selector is an advanced concept compared with the cpu, gpu and default selectors. It can't be demonstrated as easily as the other selectors, but we need to mention it as a way to define a custom selector for full control.
  • page 11: thank you for letting us know.
  • page 20: I understand 'write a kernel' as 'write any operations with data' here. The order shown matches the order in the code.
  • page 22: thank you for letting us know.


3.   Data Parallel C++ - New Features P15

  • The difference between malloc_host and malloc_shared is that memory from malloc_host resides on the host, while memory from malloc_shared can migrate between host and device. The difference between implicit data management and malloc_device is that malloc_device allocations are controlled by the developer, while implicit data management is controlled by the runtime [1].


  • malloc_host is implicit, as the book explains:

"Implicit data movement with USM is accomplished with host and shared allocations. With these types of allocations, we do not need to explicitly insert copy operations to move data between host and device. Instead, we simply access the pointers inside a kernel, and any required data movement is performed automatically without programmer intervention (as long as your device supports these allocations). This greatly simplifies porting of existing codes: simply replace any malloc or new with the appropriate USM allocation functions (as well as the calls to free to deallocate memory), and everything should just work." [2]


[1] DPC++ Book, Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL, Chapter 9 Data Management, page 67

[2] DPC++ Book, Mastering DPC++ for Programming of Heterogeneous Systems using C++ and SYCL, Chapter 9 Data Management, page 70


Alina_S_Intel
Employee

We will no longer respond to this thread.  

If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.

Thanks,

