<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: How to port some cuda APIs in Intel® oneAPI DPC++/C++ Compiler</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1215123#M722</link>
    <description>&lt;P&gt;We are currently working on migrating some CUDA Driver APIs. &lt;/P&gt;&lt;P&gt;Regarding your question on free memory, we do not have any API to get that information at runtime. &lt;/P&gt;&lt;P&gt;We are working on this as well. &lt;/P&gt;&lt;BR /&gt;</description>
    <pubDate>Tue, 06 Oct 2020 04:41:14 GMT</pubDate>
    <dc:creator>Varsha_M_Intel</dc:creator>
    <dc:date>2020-10-06T04:41:14Z</dc:date>
    <item>
      <title>How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1209584#M706</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I have some cuda code that I am trying to port. It is using cuda functions like cuInit, cuDeviceGet, cuDeviceGetAttribute, cuCtxCreate, cuCtxEnablePeerAccess, cuModuleLoadData, cuModuleGetFunction, cuFuncSetCacheConfig, cuLaunchKernel, cuCtxSynchronize, cuMemAlloc, cuMemGetInfo, cuArray3DCreate, cuTexObjectCreate, cuTexObjectDestroy.&lt;/P&gt;
&lt;P&gt;Since sycl code gets directly compiled into binary, I understand that I do not need cuModuleXxx functions as I do not need to load anything dynamically. For cuLaunchKernel, I understand that I need to call .submit on sycl::queue or&amp;nbsp;dpct::get_default_queue_wait() to start the kernel execution. For cuDeviceGet, I can use dpct::dev_mgr::instance().get_device(0). For cuDeviceGetAttribute, I can use dpct::dev_mgr::instance().get_device(0).get_device_info(dpct::device_info). For cuMemAlloc, I can use sycl::malloc_device/sycl::free. Is that correct?&lt;/P&gt;
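&lt;P&gt;For illustration, the mapping I have in mind looks roughly like this (just a sketch of my own understanding using plain SYCL/USM calls, not code from the actual migration):&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;#include &amp;lt;CL/sycl.hpp&amp;gt;
using namespace cl::sycl;

int main() {
  /* cuInit/cuDeviceGet/cuCtxCreate: a queue selects a device; the context is implicit */
  queue q{default_selector{}};

  /* cuMemAlloc: a USM device allocation */
  const size_t n = 1024;
  float *d_data = malloc_device&amp;lt;float&amp;gt;(n, q);

  /* cuLaunchKernel: submit a kernel to the queue */
  q.submit([&amp;amp;](handler &amp;amp;h) {
    h.parallel_for&amp;lt;class zero_fill&amp;gt;(range&amp;lt;1&amp;gt;(n),
                                      [=](id&amp;lt;1&amp;gt; i) { d_data[i] = 0.0f; });
  });

  /* cuCtxSynchronize: block the host until the submitted work finishes */
  q.wait();

  free(d_data, q); /* USM counterpart of cuMemFree */
  return 0;
}&lt;/LI-CODE&gt;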
&lt;P&gt;Since dpct failed to convert these functions, I would appreciate it if you could help me with some pointers on what I can use instead. Any documentation on how to create textures would be helpful too.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;Gagan&lt;/P&gt;</description>
      <pubDate>Tue, 15 Sep 2020 14:56:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1209584#M706</guid>
      <dc:creator>Shukla__Gagandeep</dc:creator>
      <dc:date>2020-09-15T14:56:14Z</dc:date>
    </item>
    <item>
      <title>Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1209809#M708</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Yes, that's correct. The DPCT/DPC++ functions that you have mentioned are roughly equivalent to their CUDA counterparts. &lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Coming to texture memory, DPCT partially supports texture memory API calls. Internally, texture memory gets mapped to SYCL images (via dpct::image).&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;It would be really helpful if you could send a small CUDA source file containing the CUDA APIs that fail to migrate.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 16 Sep 2020 08:35:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1209809#M708</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2020-09-16T08:35:51Z</dc:date>
    </item>
    <item>
      <title>Re: Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1211498#M711</link>
      <description>&lt;P&gt;Here are some code excerpts:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;    /* Check if the device has P2P access to any other device in the system. */
    for (int peer_num = 0; peer_num &amp;lt; count &amp;amp;&amp;amp; !info.has_peer_memory; peer_num++) {
      if (num != peer_num) {
        int can_access = 0;
        cuDeviceCanAccessPeer(&amp;amp;can_access, num, peer_num);
        info.has_peer_memory = (can_access != 0);
      }
    }

    int pci_location[3] = {0, 0, 0};
    cuDeviceGetAttribute(&amp;amp;pci_location[0], CU_DEVICE_ATTRIBUTE_PCI_DOMAIN_ID, num);
    cuDeviceGetAttribute(&amp;amp;pci_location[1], CU_DEVICE_ATTRIBUTE_PCI_BUS_ID, num);
    cuDeviceGetAttribute(&amp;amp;pci_location[2], CU_DEVICE_ATTRIBUTE_PCI_DEVICE_ID, num);
&lt;/LI-CODE&gt;&lt;LI-CODE lang="markup"&gt;    result = cuCtxCreate(&amp;amp;cuContext, ctx_flags, cuDevice);

  // Ensure array access over the link is possible as well (for 3D textures)
  cuda_assert(cuDeviceGetP2PAttribute(&amp;amp;can_access,
                                      CU_DEVICE_P2P_ATTRIBUTE_ARRAY_ACCESS_ACCESS_SUPPORTED,
                                      cuDevice,
                                      peer_device_cuda-&amp;gt;cuDevice));
  if (can_access == 0) {
    return false;
  }

    int result = cuCtxEnablePeerAccess(peer_device_cuda-&amp;gt;cuContext, 0);

    CUmodule cuModule;
    result = cuModuleLoadData(&amp;amp;cuModule, cubin_data.c_str());

    cuModuleGetFunction(
      &amp;amp;functions.adaptive_stopping, cuModule, "kernel_cuda_adaptive_stopping");
    cuFuncSetCacheConfig(functions.adaptive_stopping, CU_FUNC_CACHE_PREFER_L1);
&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Code given above is from&amp;nbsp;&lt;A href="https://github.com/blender/blender" target="_self"&gt;blender&lt;/A&gt;&amp;nbsp; repo files -&amp;nbsp;&lt;A href="https://github.com/blender/blender/blob/master/intern/cycles/device/device_cuda.cpp" target="_self"&gt;intern/cycles/device/device_cuda.cpp&lt;/A&gt; and &lt;A href="https://github.com/blender/blender/blob/master/intern/cycles/device/cuda/device_cuda_impl.cpp" target="_self"&gt;intern/cycles/device/cuda/device_cuda_impl.cpp&lt;/A&gt; respectively.&lt;/P&gt;
&lt;P&gt;I understand that I don't need to explicitly load cubin files and compile them at run time, as sycl code is already compiled. But is there a way to specify the cache configuration as done by &lt;A href="https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__EXEC.html#group__CUDA__EXEC_1g40f8c11e81def95dc0072a375f965681" target="_self"&gt;cuFuncSetCacheConfig&lt;/A&gt;?&lt;/P&gt;
&lt;P&gt;And are there functions like &lt;A href="https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PEER__ACCESS.html#group__CUDA__PEER__ACCESS_1g4c55c60508f8eba4546b51f2ee545393" target="_self"&gt;cuDeviceGetP2PAttribute&lt;/A&gt; and &lt;A href="https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PEER__ACCESS.html#group__CUDA__PEER__ACCESS_1g0889ec6728e61c05ed359551d67b3f5a" target="_self"&gt;cuCtxEnablePeerAccess&lt;/A&gt;?&lt;/P&gt;
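&lt;P&gt;For context, the closest I have found so far is plain device enumeration; as far as I can tell there is no peer-access query in standard SYCL 1.2.1 (sketch only):&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;#include &amp;lt;CL/sycl.hpp&amp;gt;
#include &amp;lt;iostream&amp;gt;

int main() {
  /* cuDeviceGet-style enumeration; nothing like cuDeviceCanAccessPeer here */
  for (const auto &amp;amp;dev :
       cl::sycl::device::get_devices(cl::sycl::info::device_type::gpu)) {
    std::cout &amp;lt;&amp;lt; dev.get_info&amp;lt;cl::sycl::info::device::name&amp;gt;() &amp;lt;&amp;lt; "\n";
  }
  return 0;
}&lt;/LI-CODE&gt;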
&lt;P&gt;Regards,&lt;BR /&gt;Gagan&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Sep 2020 14:25:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1211498#M711</guid>
      <dc:creator>Shukla__Gagandeep</dc:creator>
      <dc:date>2020-09-22T14:25:58Z</dc:date>
    </item>
    <item>
      <title>Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1212127#M713</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Could you please try out with the latest beta09 release and let me know if there is any change?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 24 Sep 2020 07:59:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1212127#M713</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2020-09-24T07:59:37Z</dc:date>
    </item>
    <item>
      <title>Re: Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1212561#M714</link>
      <description>&lt;P&gt;Hi Rahul,&lt;/P&gt;
&lt;P&gt;I'm trying to convert the &lt;A href="https://github.com/blender/blender/tree/master/intern/cycles" target="_self"&gt;cycles&lt;/A&gt; library from blender. I tried beta09 (on Ubuntu 18.04). I'm using the same code, folder structure, and compile_commands.json file that I used with beta08 to convert the project.&lt;/P&gt;
&lt;P&gt;dpct stops converting randomly (no progress messages on the terminal even after waiting for 30+ minutes); I have seen this at least 7-8 times since yesterday. There is no output in the output folder, not even a log file, so I have no idea what happened.&lt;/P&gt;
&lt;P&gt;There are a couple of cuda files in this library, &lt;A href="https://github.com/blender/blender/blob/master/intern/cycles/kernel/kernels/cuda/filter.cu" target="_self"&gt;cycles/kernel/kernels/cuda/filter.cu&lt;/A&gt; and &lt;A href="https://github.com/blender/blender/blob/master/intern/cycles/kernel/kernels/cuda/kernel.cu" target="_self"&gt;cycles/kernel/kernels/cuda/kernel.cu&lt;/A&gt;. dpct stops responding when it reaches them. I had to remove the first cuda file to get to the second, and then remove the second too for it to proceed.&lt;/P&gt;
&lt;P&gt;A few times it failed with SIGABRT. I have attached the log files (.diags.log and the conversion log file). The error message on the terminal is:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;...
Processing: /home/kuljeet/Downloads/repos/Blender09/intern/cycles/render/denoising.cpp
Processing: /home/kuljeet/Downloads/repos/Blender09/intern/cycles/kernel/kernels/cpu/filter.cpp
Processing: /home/kuljeet/Downloads/repos/Blender09/intern/cycles/util/util_debug.cpp
Processing: /home/kuljeet/Downloads/repos/Blender09/intern/cycles/blender/blender_geometry.cpp
Processing: /home/kuljeet/Downloads/repos/Blender09/intern/opencolorio/ocio_impl_glsl.cc
Processing: /home/kuljeet/Downloads/repos/Blender09/intern/cycles/kernel/kernels/cpu/kernel_split_avx2.cpp
terminate called after throwing an instance of 'std::length_error'
  what():  basic_string::_M_create

dpct error: meet signal:SIGABRT Intel(R) DPC++ Compatibility Tool trys to write analysis reports and terminates...
&lt;/LI-CODE&gt;
&lt;P&gt;The error message is not very useful.&lt;/P&gt;
&lt;P&gt;I can't comment on the converted code quality, as I have not been able to get it to convert successfully.&lt;/P&gt;
&lt;P&gt;Command used for conversion:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;dpct --cuda-include-path=/usr/local/cuda/include -p=compile_commands.json --out-root=dpctx --in-root=. --output-file=conv_errs.txt&lt;/LI-CODE&gt;
&lt;P&gt;Regards,&lt;BR /&gt;Gagan&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Sep 2020 18:34:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1212561#M714</guid>
      <dc:creator>Shukla__Gagandeep</dc:creator>
      <dc:date>2020-09-25T18:34:27Z</dc:date>
    </item>
    <item>
      <title>Re: Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1212861#M715</link>
      <description>&lt;P&gt;Hi Rahul,&lt;/P&gt;
&lt;P&gt;I tried converting the code with beta09 but I see no change. All the functions related to cuda setup are left as-is in the converted code.&lt;/P&gt;
&lt;P&gt;So are there any parallel functions for cuDeviceCanAccessPeer, cuDeviceGetP2PAttribute, cuCtxEnablePeerAccess, cuOccupancyMaxPotentialBlockSize, and cuCtxSynchronize?&lt;/P&gt;
&lt;P&gt;I also see a function used in the cuda code to get total and free memory info. I understand that I can use cl::sycl::info::device::max_mem_alloc_size, or cl::sycl::info::device::global_mem_size / cl::sycl::info::device::local_mem_size, to get some information about memory, but is there a way to get free memory info after launching a kernel, to see how much memory it consumed? The CUDA call is: cuMemGetInfo(&amp;amp;free_before, &amp;amp;total);&lt;/P&gt;
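&lt;P&gt;Concretely, the kind of query I mean is (sketch only):&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;cl::sycl::queue q;
auto dev = q.get_device();

/* Total global memory on the device */
auto global_mem = dev.get_info&amp;lt;cl::sycl::info::device::global_mem_size&amp;gt;();

/* Largest single allocation the runtime allows */
auto max_alloc = dev.get_info&amp;lt;cl::sycl::info::device::max_mem_alloc_size&amp;gt;();

/* ...but I see no query here for free memory after a kernel runs,
   i.e. nothing like cuMemGetInfo(&amp;amp;free_before, &amp;amp;total); */&lt;/LI-CODE&gt;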
&lt;P&gt;Any documentation link would be helpful too.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;Gagan&lt;/P&gt;
&lt;P&gt;PS: The converted code is attached. There is not much conversion taking place for these files. It isn't cuda code per se but cuda env setup code, so maybe dpct wasn't expected to do much.&lt;/P&gt;
      <pubDate>Sun, 27 Sep 2020 11:06:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1212861#M715</guid>
      <dc:creator>Shukla__Gagandeep</dc:creator>
      <dc:date>2020-09-27T11:06:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1213111#M716</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;cuCtxSynchronize()&lt;/STRONG&gt;: CUDA contexts roughly map to SYCL contexts. This particular API blocks the calling host thread until all previously submitted tasks on the device have completed. In my opinion, the &lt;STRONG&gt;queue.wait()&lt;/STRONG&gt; function would be a good equivalent, since every queue has a particular context associated with it.&lt;/P&gt;
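&lt;P&gt;For example, roughly (an illustrative sketch, not code from your project):&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;cl::sycl::queue q;

q.submit([&amp;amp;](cl::sycl::handler &amp;amp;h) {
  h.single_task&amp;lt;class dummy_kernel&amp;gt;([=]() { /* kernel work */ });
});

/* Like cuCtxSynchronize(): blocks the calling host thread until all
   tasks previously submitted to this queue have completed */
q.wait();&lt;/LI-CODE&gt;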
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;cuMemGetInfo(): &lt;/STRONG&gt;The SYCL/DPC++ runtime automatically takes care of memory management in the buffer/accessor model, i.e. data copies to/from the device are completely abstracted. Please refer to the SYCL 1.2.1 specification.&lt;/P&gt;
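&lt;P&gt;A minimal buffer/accessor sketch (illustrative only): the runtime owns the device allocation and the copy-back, which is why there is no cuMemGetInfo-style bookkeeping to do by hand.&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;#include &amp;lt;CL/sycl.hpp&amp;gt;
#include &amp;lt;vector&amp;gt;

int main() {
  std::vector&amp;lt;float&amp;gt; host(1024, 1.0f);
  {
    /* The buffer manages device memory for the host data during its lifetime */
    cl::sycl::buffer&amp;lt;float, 1&amp;gt; buf(host.data(), cl::sycl::range&amp;lt;1&amp;gt;(host.size()));
    cl::sycl::queue q;
    q.submit([&amp;amp;](cl::sycl::handler &amp;amp;h) {
      auto acc = buf.get_access&amp;lt;cl::sycl::access::mode::read_write&amp;gt;(h);
      h.parallel_for&amp;lt;class scale&amp;gt;(cl::sycl::range&amp;lt;1&amp;gt;(host.size()),
                                    [=](cl::sycl::id&amp;lt;1&amp;gt; i) { acc[i] *= 2.0f; });
    });
  } /* buffer destructor synchronizes and copies results back to host */
  return 0;
}&lt;/LI-CODE&gt;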
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For the other CUDA APIs, DPCT subject matter experts (SMEs) will get in touch with you shortly.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Rahul&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Oct 2020 05:33:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1213111#M716</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2020-10-01T05:33:41Z</dc:date>
    </item>
    <item>
      <title>Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1215123#M722</link>
      <description>&lt;P&gt;We are currently working on migrating some CUDA Driver APIs. &lt;/P&gt;&lt;P&gt;Regarding your question on free memory, we do not have any API to get that information at runtime. &lt;/P&gt;&lt;P&gt;We are working on this as well. &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Oct 2020 04:41:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1215123#M722</guid>
      <dc:creator>Varsha_M_Intel</dc:creator>
      <dc:date>2020-10-06T04:41:14Z</dc:date>
    </item>
    <item>
      <title>Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1289535#M1296</link>
      <description>&lt;P&gt;Hi Gagan,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Could you try migrating with the latest oneAPI (2021.2) version and let us know if it works?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 14 Jun 2021 06:58:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1289535#M1296</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2021-06-14T06:58:10Z</dc:date>
    </item>
    <item>
      <title>Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1291540#M1324</link>
      <description>&lt;P&gt;Hi Gagan,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Do you have any updates on this?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 21 Jun 2021 06:06:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1291540#M1324</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2021-06-21T06:06:33Z</dc:date>
    </item>
    <item>
      <title>Re:How to port some cuda APIs</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1293762#M1337</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Since I have not heard back from you, we will no longer be monitoring this thread. If you need further assistance, please post a new thread.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 28 Jun 2021 05:51:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/How-to-port-some-cuda-APIs/m-p/1293762#M1337</guid>
      <dc:creator>RahulV_intel</dc:creator>
      <dc:date>2021-06-28T05:51:41Z</dc:date>
    </item>
  </channel>
</rss>

