<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re:Cannot create DLL which offloads part of the code to Intel Xe GPU in Intel® oneAPI DPC++/C++ Compiler</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1389955#M2251</link>
    <description>&lt;P&gt;Please try using explicit array sections for the offload mapping and not just pointers, for example:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;void Compute(int A[MAX_TEST][MAX_TEST], int B[MAX_TEST][MAX_TEST], int C[MAX_TEST][MAX_TEST])&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;{&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; int a = 0;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; bool is_cpu = true;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#ifdef USE_POINTER&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#pragma omp target teams distribute parallel for map(to: A, B, a) map(tofrom: C) map(from: is_cpu)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#else&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#pragma omp target teams distribute parallel for map(to: A[0:MAX_TEST][0:MAX_TEST], B[0:MAX_TEST][0:MAX_TEST], a) map(tofrom: C[0:MAX_TEST][0:MAX_TEST]) map(from: is_cpu)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#endif&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; for (int i = 0; i &amp;lt; MAX_TEST; i++) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  a = a + 1;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  if (i == 0) is_cpu = omp_is_initial_device();&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  for (int j = 0; j &amp;lt; MAX_TEST; j++) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;   for (int k = 0; k &amp;lt; MAX_TEST; k++) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;    C[i][j] += A[i][k] * B[k][j];&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;   
}&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; if (a == 0) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Offloaded on GPU &lt;LI-EMOJI id="lia_slightly-smiling-face" title=":slightly_smiling_face:"&gt;&lt;/LI-EMOJI&gt; " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; else&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Executed on CPU &lt;LI-EMOJI id="lia_disappointed-face" title=":disappointed_face:"&gt;&lt;/LI-EMOJI&gt; " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; if (! is_cpu) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Offloaded on GPU &lt;LI-EMOJI id="lia_slightly-smiling-face" title=":slightly_smiling_face:"&gt;&lt;/LI-EMOJI&gt; " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; else&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Executed on CPU &lt;LI-EMOJI id="lia_disappointed-face" title=":disappointed_face:"&gt;&lt;/LI-EMOJI&gt; " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;}&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;And you can use &lt;SPAN style="font-family: courier;"&gt;omp_is_initial_device()&lt;/SPAN&gt; to check whether the code is executed on the GPU or the host, see &lt;A 
href="https://www.openmp.org/spec-html/5.1/openmpsu166.html" target="_blank"&gt;https://www.openmp.org/spec-html/5.1/openmpsu166.html&lt;/A&gt;.&lt;/P&gt;&lt;BR /&gt;</description>
    <pubDate>Sat, 04 Jun 2022 00:08:48 GMT</pubDate>
    <dc:creator>Klaus-Dieter_O_Intel</dc:creator>
    <dc:date>2022-06-04T00:08:48Z</dc:date>
    <item>
      <title>Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384656#M2157</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;I have a DLL composed of three .cpp files. I have separated the "offloadable" code (basically matrix multiplications) into one of those three files.&lt;/P&gt;
&lt;P&gt;I have downloaded &amp;amp; installed the latest version of the Intel oneAPI HPC Toolkit.&lt;/P&gt;
&lt;P&gt;If I try to compile &amp;amp; link them all into a DLL with this command (executed in the Intel oneAPI Command Prompt for Intel64 for Visual Studio 2022):&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;icx /LD /out:MatMul.dll DLLMAIN.CPP MAIN.CPP OFFLOADABLE.CPP /c /nologo /Qiopenmp /Qopenmp-targets=spir64&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;I obtain:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;clang-cl: error: The use of '-LD' is not supported with '/Qiopenmp /Qopenmp-targets=spir64'.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;If I compile these 3 files separately, with these commands (same console):&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;icx DLLMAIN.CPP /c /nologo /Qiopenmp&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;icx MAIN.CPP /c /Qiopenmp&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;icx OFFLOADABLE.CPP /c /nologo /Qiopenmp /Qopenmp-targets=spir64&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;This creates the corresponding .obj files correctly, but when I try to link them using a &lt;STRONG&gt;source.def&lt;/STRONG&gt; file, which defines what this DLL must export, with:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;icx MAIN.OBJ DLLMAIN.OBJ OFFLOADABLE.OBJ /LD -o MatMul.dll /DEF:source.def /Qiopenmp /Qopenmp-targets=spir64&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;Response is:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2022.1.0 Build 20220316&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Copyright (C) 1985-2022 Intel Corporation. All rights reserved.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;a-3292ef.obj : warning LNK4078: multiple '__CLANG_OFFLOAD_BUNDLE__openmp-s' sections found with different attributes (40500040)&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;And it creates the DLL, but when I try to use it from a .NET program that declares one of its methods, an error pops up saying:&lt;/P&gt;
&lt;P&gt;"Error in MyInitialize: Unable to load DLL 'MatMul.dll' or one of its dependencies: The specified module could not be found. (0x8007007E)".&lt;/P&gt;
&lt;P&gt;Can you please help me? If I could solve this, then parts of DLLs could also be offloaded to Intel's hardware, opening lots of possibilities for businesses.&lt;/P&gt;
&lt;P&gt;I look forward to your help, thanks in advance.&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Mon, 16 May 2022 20:19:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384656#M2157</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-16T20:19:52Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384766#M2158</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for posting in Intel Communities.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could you please provide us with the sample reproducer code (DLLMAIN.CPP, MAIN.CPP, OFFLOADABLE.CPP &amp;amp; source.def file) so that we can investigate your issue further?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 17 May 2022 07:07:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384766#M2158</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-05-17T07:07:44Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384811#M2160</link>
      <description>&lt;P&gt;Hello Santosh,&lt;/P&gt;
&lt;P&gt;Thank you for your reply.&lt;/P&gt;
&lt;P&gt;Please find attached the file &lt;STRONG&gt;GPU_CPU_Example.zip&lt;/STRONG&gt; (compressed with the default Windows compressor), which includes:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Dll1 subfolder.&lt;/STRONG&gt;&amp;nbsp;It is a very simple Visual C++ DLL project which is usable from .NET. It has been developed with Microsoft Visual Studio Community 2022 (64-bit) - Version 17.3.0 Preview 1.0.&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Dll1_VB_Test subfolder&lt;/STRONG&gt;. You probably do not need to change this project. It is a very simple Visual Basic .NET Console project which uses the previous DLL. Developed with Microsoft Visual Studio Community 2019 (64-bit), because I don't have VB.NET on my VS2022, but it is very simple and should open with Visual Studio 2022.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The &lt;STRONG&gt;Dll1_VB_Test&lt;/STRONG&gt; is just a project to test the DLL.&lt;/P&gt;
&lt;P&gt;If you open a CMD console and run &lt;STRONG&gt;\GPU_CPU_Example\Dll1_VB_Test\bin\Release\net5.0\&lt;/STRONG&gt;&lt;STRONG&gt;Dll1_VB_Test.exe&lt;/STRONG&gt; as I sent it, it should run fine, but only on the CPU.&lt;/P&gt;
&lt;P&gt;If you open (with Visual Studio 2022) the Visual C++ DLL project Dll1 and change the parameter &lt;STRONG&gt;Enable OpenMP Offloading&lt;/STRONG&gt; to&amp;nbsp;&lt;STRONG&gt;Generate x86 + SPIR64 fat binary (/Qopenmp-targets:spir64)&lt;/STRONG&gt;, and follow the steps:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Rebuild the Dll1 project.&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Copy the &lt;/SPAN&gt;&lt;STRONG style="font-family: inherit;"&gt;Dll1.dll&lt;/STRONG&gt;&lt;SPAN&gt; from&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG style="font-family: inherit;"&gt;\GPU_CPU_Example\Dll1\x64\Release&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;to&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG style="font-family: inherit;"&gt;\GPU_CPU_Example\Dll1_VB_Test\bin\Release\net5.0&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Open a CMD console, CD to&amp;nbsp;&lt;STRONG style="font-family: inherit;"&gt;\GPU_CPU_Example\Dll1_VB_Test\bin\Release\net5.0&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Run: &lt;STRONG&gt;\GPU_CPU_Example\Dll1_VB_Test\bin\Release\net5.0\Dll1_VB_Test.exe&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The program will return:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Unhandled exception. System.DllNotFoundException: Unable to load DLL 'Dll1.dll' or one of its dependencies: The specified module could not be found. (0x8007007E)&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;at Dll1_VB_Test.Module1.CPULoad()&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;at Dll1_VB_Test.Program.Main(String[] args) in C:\temp\Dll1_VB_Test\Program.vb:line 10&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;I have tried the GPU code of the Dll1 as an EXE file, and it worked on the GPU. Then, what should I do for this DLL to run (partially, the part which is OpenMP "target") on the GPU?&lt;/P&gt;
&lt;P&gt;I have also tried compiling the cpp files separately, passing&amp;nbsp;&lt;STRONG&gt;/Qopenmp-targets:spir64&lt;/STRONG&gt; only for the &lt;STRONG&gt;offloadable.cpp&lt;/STRONG&gt; file, and then linking them with icx, but the same error occurs in the tester.&lt;/P&gt;
&lt;P&gt;If you need any other info, please ask.&lt;/P&gt;
&lt;P&gt;I look forward to your feedback and thanks in advance.&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Tue, 17 May 2022 08:43:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384811#M2160</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-17T08:43:54Z</dc:date>
    </item>
    <item>
      <title>Re:Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384860#M2164</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We can see that you are using an unsupported version of Visual Studio. Could you please try with any of the supported versions of Visual Studio and let us know if you still face the same issue?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;To check the supported versions of Visual Studio please refer to the below link:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/developer/articles/reference-implementation/intel-compilers-compatibility-with-microsoft-visual-studio-and-xcode.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/developer/articles/reference-implementation/intel-compilers-compatibility-with-microsoft-visual-studio-and-xcode.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 17 May 2022 12:48:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384860#M2164</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-05-17T12:48:38Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384909#M2166</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for your response.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have uninstalled and installed Microsoft Visual Studio Community 2022 (64-bit) - Current Version 17.2.0.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;After installing the indicated MVS2022, I reinstalled Intel oneAPI 2022.2 Base Toolkit and, later, the HPC Toolkit 2022.2.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I repeated the steps indicated above and obtained the same results.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you need any other info, please ask.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I look forward to your help.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best regards and thanks in advance,&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Tue, 17 May 2022 15:05:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1384909#M2166</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-17T15:05:26Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385139#M2167</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could you please confirm whether you are using VS2022 version 17.0.2 or 17.2.0?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Because VS2022 17.0.2 is a supported version whereas VS2022 17.2.0 is NOT yet supported. Refer to the below screenshot:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="SantoshY_Intel_0-1652850568144.png" style="width: 400px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/29534i85526B83E1EB6C6D/image-size/medium?v=v2&amp;amp;px=400&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="SantoshY_Intel_0-1652850568144.png" alt="SantoshY_Intel_0-1652850568144.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;</description>
      <pubDate>Wed, 18 May 2022 05:10:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385139#M2167</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-05-18T05:10:09Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385157#M2168</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Yes, for the response of the previous message I used VS2022 version 17.2.0, but I assumed forward compatibility. As far as I know, I cannot downgrade my VS2022 to 17.0.2.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anyway, I have also tried with&amp;nbsp;Microsoft Visual Studio Community 2019 Version 16.11.14 and, unfortunately, got the same result.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have provided you with the same example that I am running, so you can try it on a supported version and see whether it produces the result I described above (most probably it will). If you then find a solution, it will most probably apply to the other versions of Visual Studio as well.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We need to solve this problem, as it will allow DLLs developed with VS and Intel oneAPI to offload code to Intel GPUs.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I look forward to your help and thanks in advance,&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Wed, 18 May 2022 06:06:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385157#M2168</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-18T06:06:11Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385575#M2177</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We generated a dynamic-link library (DLL) and created a C++ application that uses the DLL to offload the code to Intel GPUs successfully using the Intel C++ compiler 2022.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We tried it using Visual Studio version 17.0.0 &amp;amp; followed the steps from the link below:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://docs.microsoft.com/en-us/cpp/build/walkthrough-creating-and-using-a-dynamic-link-library-cpp?view=msvc-170" target="_blank" rel="noopener"&gt;https://docs.microsoft.com/en-us/cpp/build/walkthrough-creating-and-using-a-dynamic-link-library-cpp?view=msvc-170&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please find the attachments below:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;DLL1&lt;/STRONG&gt;: This project generates a dynamic-link library.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;usingDLL&lt;/STRONG&gt;: A sample C++ project which uses the DLL and offloads the code to Intel GPUs.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We also tried running your .NET application from the command line and it worked fine at our end, as shown in the screenshot below. Before running the program (Module1.exe), copy the Dll1.dll file to the directory that contains Module1.exe.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dll.png" style="width: 999px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/29596iF3336E91AFC1E4EB/image-size/large?v=v2&amp;amp;px=999&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="dll.png" alt="dll.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2022 10:34:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385575#M2177</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-05-19T10:34:16Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385730#M2180</link>
      <description>&lt;P&gt;Hello Santosh,&lt;/P&gt;
&lt;P&gt;Thank you very much for your response and your effort.&lt;/P&gt;
&lt;P&gt;Unfortunately, your code does not appear to offload to my Intel Xe GPU. So that you have all the info, the laptop on which I am running your code is:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;O.S.: Windows 10 PRO&lt;/P&gt;
&lt;P&gt;CPU/GPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz 2.80 GHz&lt;/P&gt;
&lt;P&gt;RAM:&amp;nbsp;16,0 GB (15,7 GB usable)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let me explain the steps that I have followed (I did not change any other thing from your example):&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;I downloaded and extracted Downloads.zip&lt;/LI&gt;
&lt;LI&gt;I opened (with Microsoft Visual Studio Community 2022 (64-bit) - Current Version 17.2.0) the project usingDLL and changed your local paths to mine.&lt;/LI&gt;
&lt;LI&gt;I rebuilt as it is. Rebuilds fine.&lt;/LI&gt;
&lt;LI&gt;I opened a CMD session, went to path and executed \UsingDLL\x64\Debug\&lt;STRONG&gt;UsingDLL.exe&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;An error pops up: "... execution cannot continue... omptarget.dll not found...".&lt;/LI&gt;
&lt;LI&gt;I searched my computer for &lt;STRONG&gt;omptarget.dll&lt;/STRONG&gt; and found it here:&amp;nbsp;&lt;STRONG&gt;C:\Program Files (x86)\Intel\oneAPI\compiler\2022.1.0\windows\bin&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;Copied omptarget.dll from that path to:&amp;nbsp;\UsingDLL\x64\Debug&lt;/LI&gt;
&lt;LI&gt;I opened a CMD session and executed again UsingDLL.exe: &lt;STRONG&gt;it works! But, is it off-loading to GPU? Let us check.&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;I opened the Dll1 project with VS2022 and made a very small change to validate where the offloadable code is being executed. I added a module variable (a) that is modified inside the code that should run on the GPU, but is not included in the map(tofrom) clause of the omp directive; therefore, any change to this variable cannot come back to the CPU. I also added a new public function&amp;nbsp;&lt;STRONG&gt;executedOnGPU&lt;/STRONG&gt;, which returns true if the &lt;STRONG&gt;a&lt;/STRONG&gt; variable remains unchanged on the host, indicating that the variable was changed on the GPU.&lt;/LI&gt;
&lt;LI&gt;I rebuilt the &lt;STRONG&gt;Dll1.dll&lt;/STRONG&gt; and copied it to&amp;nbsp;\UsingDLL\x64\Debug.&lt;/LI&gt;
&lt;LI&gt;I opened the usingDLL project with VS2022. I needed to add the include directory of the Dll1 project to it, plus a call to executedOnGPU that reports where the code was executed. With that, it rebuilds fine.&lt;/LI&gt;
&lt;LI&gt;I opened a CMD session and executed UsingDLL.exe, which returned:&lt;BR /&gt;&lt;BR /&gt;C:\Users\david\Downloads\Downloads\usingDll\usingDll\x64\Debug&amp;gt;usingDll.exe&lt;BR /&gt;3 3 3&lt;BR /&gt;3 3 3&lt;BR /&gt;3 3 3&lt;BR /&gt;Executed on CPU &lt;LI-EMOJI id="lia_disappointed-face" title=":disappointed_face:"&gt;&lt;/LI-EMOJI&gt;&lt;/LI&gt;
&lt;/OL&gt;
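&lt;P&gt;The check described in step 9 can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the attached project: the names probe and offloadedWork are hypothetical, a local copy with an explicit map(to) clause stands in for the unmapped module variable, and the build is assumed to use icx with /Qiopenmp /Qopenmp-targets=spir64. The heuristic relies on the device working on its own copy of the variable.&lt;/P&gt;
&lt;LI-CODE lang="cpp"&gt;// Module-level variable used only to detect where the target region ran.
static int a = 0;

// Hypothetical stand-in for the offloadable routine in the DLL.
void offloadedWork() {
    int probe = a;
    // probe is mapped "to" only: a device copy is never copied back.
    #pragma omp target map(to: probe)
    {
        probe = probe + 1;
    }
    // On host fallback the region updates probe directly, so a becomes 1.
    a = probe;
}

// True if the change made inside the target region did not reach the host.
bool executedOnGPU() {
    return a == 0;
}&lt;/LI-CODE&gt;
&lt;P&gt;Call offloadedWork() once before executedOnGPU(); otherwise a is still 0 simply because nothing has run yet.&lt;/P&gt;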
&lt;P&gt;I have then compressed the example and uploaded it here.&lt;/P&gt;
&lt;P&gt;I look forward to your feedback on what I should do to really execute this example on the GPU.&lt;/P&gt;
&lt;P&gt;Thank you very much in advance.&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2022 19:37:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385730#M2180</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-19T19:37:15Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385748#M2181</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;For your convenience, I have changed all paths to be relative in the&amp;nbsp;&lt;STRONG&gt;usingDll&lt;/STRONG&gt; project, so that you can just download, open and try. Uploaded again.&lt;/P&gt;
&lt;P&gt;I look forward to your help. Thanks in advance.&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2022 20:24:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385748#M2181</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-19T20:24:18Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385885#M2182</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;I&gt;&amp;gt;&amp;gt;"&amp;nbsp;it works! But, is it off-loading to GPU? Let us check"&lt;/I&gt;&lt;/P&gt;
&lt;P&gt;To check whether the code is offloaded to the GPU or runs on the CPU, set the LIBOMPTARGET_DEBUG environment variable to 1.&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;set LIBOMPTARGET_DEBUG=1&lt;/LI-CODE&gt;
&lt;P&gt;Now, run the executable (usingDLL.exe). It will print debug information that shows whether GPU offloading took place.&lt;/P&gt;
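&lt;P&gt;As a complementary in-code check, a small standalone program can report what the OpenMP runtime sees. The sketch below is an illustration under assumptions, not part of the attached projects: it uses only the standard OpenMP routines omp_get_num_devices() and omp_is_initial_device(), and assumes a build such as icx check.cpp /Qiopenmp /Qopenmp-targets=spir64, where check.cpp is a hypothetical file name.&lt;/P&gt;
&lt;LI-CODE lang="cpp"&gt;#include &amp;lt;omp.h&amp;gt;
#include &amp;lt;cstdio&amp;gt;

int main() {
    // Number of non-host devices visible to the OpenMP runtime.
    std::printf("devices visible: %d\n", omp_get_num_devices());

    int on_host = 1;  // stays 1 if the region falls back to the host
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }

    std::printf("target region ran on the %s\n", on_host ? "host" : "device");
    return 0;
}&lt;/LI-CODE&gt;
&lt;P&gt;If the program reports zero devices or says the region ran on the host, the offload did not happen, and the LIBOMPTARGET_DEBUG output should help to show why.&lt;/P&gt;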
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I tried using this LIBOMPTARGET_DEBUG flag and was able to get the offloading information as shown below:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="SantoshY_Intel_0-1653031541729.png" style="width: 400px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/29640iCDE170335E4ECB50/image-size/medium?v=v2&amp;amp;px=400&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="SantoshY_Intel_0-1653031541729.png" alt="SantoshY_Intel_0-1653031541729.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="SantoshY_Intel_2-1653031602948.png" style="width: 400px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/29642iBBCC2B897AA015DC/image-size/medium?v=v2&amp;amp;px=400&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="SantoshY_Intel_2-1653031602948.png" alt="SantoshY_Intel_2-1653031602948.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For your reference, I am attaching the complete debug log below.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;</description>
      <pubDate>Fri, 20 May 2022 07:27:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385885#M2182</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-05-20T07:27:30Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385957#M2186</link>
      <description>&lt;P&gt;Hello Santosh,&lt;/P&gt;
&lt;P&gt;Thank you for your email and effort.&lt;/P&gt;
&lt;P&gt;My problem persists because the only way I have been able to off-load and run code on my Intel Xe GPU is by compiling the offloadable.cpp file as an .OBJ and then linking that offloadable.obj to build the &lt;STRONG&gt;executable&lt;/STRONG&gt; file, both with &lt;STRONG&gt;icx&lt;/STRONG&gt; and options&amp;nbsp;&lt;STRONG&gt;/Qiopenmp /Qopenmp-targets=spir64&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;In other words: &lt;STRONG&gt;I am still not able to compile a Visual C++ dll which eventually runs on an Intel GPU&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;As I mentioned, I have confirmed that example to run on the GPU because:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;A variable created outside of the OMP section, and changed inside it, does not show that change outside the omp section.&lt;/LI&gt;
&lt;LI&gt;Every time I run it, I can see 100% activity in my GPU with the Windows task manager. And if I increase the iterations, the 100% activity of the GPU lasts longer and proportionally with the iterations.&lt;/LI&gt;
&lt;LI&gt;It even blocks my whole Windows for a while, I assume while transferring the data CPU&amp;lt;-&amp;gt;GPU, which, by the way, I find worrying.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;On the other hand, unfortunately, I do not think that your example really runs on the GPU.&amp;nbsp;Please look at the debug log dumped by my successful example below and compare it with yours. Your debug log ends like this:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Libomptarget --&amp;gt; Done registering entries!&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;3 3 3&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;3 3 3&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;3 3 3&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Libomptarget --&amp;gt; Deinit target library!&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;But look at everything mine does between the two Libomptarget messages&lt;STRONG&gt;:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Libomptarget --&amp;gt; Done registering entries!&lt;/STRONG&gt;&lt;BR /&gt;Libomptarget --&amp;gt; Entering target region with entry point 0x00007ff7c4958e70 and device Id 0&lt;BR /&gt;Libomptarget --&amp;gt; Call to omp_get_num_devices returning 1&lt;BR /&gt;Libomptarget --&amp;gt; Call to omp_get_num_devices returning 1&lt;BR /&gt;Libomptarget --&amp;gt; Call to omp_get_initial_device returning 1&lt;BR /&gt;Libomptarget --&amp;gt; Checking whether device 0 is ready.&lt;BR /&gt;Libomptarget --&amp;gt; Is the device 0 (local ID 0) initialized? 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Initialize requires flags to 1&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated a host memory object 0x0000022cba0a0000&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Initialized host memory pool for device 0x0000000000000000: AllocUnit = 65536, AllocMax = 1048576, Capacity = 4, PoolSizeMax = 268435456&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated a shared memory object 0x0000022cba0a0000&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Initialized shared memory pool for device 0x0000022cb98047a8: AllocUnit = 65536, AllocMax = 1048576, Capacity = 4, PoolSizeMax = 268435456&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated a device memory object 0xffffb80200010000&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Initialized device memory pool for device 0x0000022cb98047a8: AllocUnit = 65536, AllocMax = 1048576, Capacity = 4, PoolSizeMax = 268435456&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created a command queue 0x0000022cb98dff48 (Ordinal: 0, Index: 0) for device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created a command list 0x0000022cb9d4cd98 (Ordinal: 0) for device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Initialized Level0 device 0&lt;BR /&gt;Libomptarget --&amp;gt; Device 0 is ready to use.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Device 0: Loading binary from 0x00007ff7c498c000&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Expecting to have 1 entries defined&lt;BR /&gt;Target 
LEVEL0 RTL --&amp;gt; Base L0 module compilation options: -cl-std=CL2.0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created module from image #0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Looking up device global variable '__omp_offloading_eeec4b03_1bfa1__Z7Compute_l9_kernel_info' of unknown size on device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Global variable lookup succeeded (size: 80 bytes).&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created a command list 0x0000022cb98e43c8 (Ordinal: 1) for device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created a command queue 0x0000022cb7647858 (Ordinal: 1, Index: 0) for device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel 0: Entry = 0x00007ff7c4958e70, Name = __omp_offloading_eeec4b03_1bfa1__Z7Compute_l9, NumArgs = 6, Handle = 0x0000022cc0f72cf0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Looking up device global variable '__omp_spirv_program_data' of size 48 bytes on device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Global variable lookup succeeded (size: 48 bytes).&lt;BR /&gt;Libomptarget --&amp;gt; Entry 0: Base=0x000000c977367f08, Begin=0x000000c977367f08, Size=8, Type=0x23, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Entry 1: Base=0x000000c977367ef8, Begin=0x000000c977367ef8, Size=8, Type=0x21, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Entry 2: Base=0x000000c977367f00, Begin=0x000000c977367f00, Size=8, Type=0x21, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Entry 3: Base=0x00007ff7c4976d80, Begin=0x00007ff7c4976d80, Size=4, Type=0x21, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Entry 4: Base=0x0000000000000000, Begin=0x0000000000000000, Size=0, Type=0x120, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Entry 5: Base=0x0000000000000059, Begin=0x0000000000000059, Size=0, Type=0x120, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Entry 6: Base=0x000000c977367f80, Begin=0x000000c977367f80, Size=32, Type=0x800, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f08, 
Size=8)...&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Ptr 0x000000c977367f08 is not a device accessible memory pointer.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated a shared memory object 0x0000022cba0b0000&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; New block allocation for shared memory pool: base = 0x0000022cba0b0000, size = 65536, pool size = 65536&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated target memory 0x0000022cba0b0000 (Base: 0x0000022cba0b0000, Size: 8) from memory pool for host ptr 0x000000c977367f08&lt;BR /&gt;Libomptarget --&amp;gt; Creating new map entry with HstPtrBegin=0x000000c977367f08, TgtPtrBegin=0x0000022cba0b0000, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Moving 8 bytes (hst:0x000000c977367f08) -&amp;gt; (tgt:0x0000022cba0b0000)&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Copied 8 bytes (hst:0x000000c977367f08) -&amp;gt; (tgt:0x0000022cba0b0000)&lt;BR /&gt;Libomptarget --&amp;gt; There are 8 bytes allocated at target address 0x0000022cba0b0000 - is new&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367ef8, Size=8)...&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Ptr 0x000000c977367ef8 is not a device accessible memory pointer.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated target memory 0x0000022cba0b0020 (Base: 0x0000022cba0b0020, Size: 8) from memory pool for host ptr 0x000000c977367ef8&lt;BR /&gt;Libomptarget --&amp;gt; Creating new map entry with HstPtrBegin=0x000000c977367ef8, TgtPtrBegin=0x0000022cba0b0020, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Moving 8 bytes (hst:0x000000c977367ef8) -&amp;gt; (tgt:0x0000022cba0b0020)&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Copied 8 bytes (hst:0x000000c977367ef8) -&amp;gt; 
(tgt:0x0000022cba0b0020)&lt;BR /&gt;Libomptarget --&amp;gt; There are 8 bytes allocated at target address 0x0000022cba0b0020 - is new&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f00, Size=8)...&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Ptr 0x000000c977367f00 is not a device accessible memory pointer.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated target memory 0x0000022cba0b0040 (Base: 0x0000022cba0b0040, Size: 8) from memory pool for host ptr 0x000000c977367f00&lt;BR /&gt;Libomptarget --&amp;gt; Creating new map entry with HstPtrBegin=0x000000c977367f00, TgtPtrBegin=0x0000022cba0b0040, Size=8, DynRefCount=1, HoldRefCount=0, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Moving 8 bytes (hst:0x000000c977367f00) -&amp;gt; (tgt:0x0000022cba0b0040)&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Copied 8 bytes (hst:0x000000c977367f00) -&amp;gt; (tgt:0x0000022cba0b0040)&lt;BR /&gt;Libomptarget --&amp;gt; There are 8 bytes allocated at target address 0x0000022cba0b0040 - is new&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x00007ff7c4976d80, Size=4)...&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Ptr 0x00007ff7c4976d80 is not a device accessible memory pointer.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Allocated target memory 0x0000022cba0b0060 (Base: 0x0000022cba0b0060, Size: 4) from memory pool for host ptr 0x00007ff7c4976d80&lt;BR /&gt;Libomptarget --&amp;gt; Creating new map entry with HstPtrBegin=0x00007ff7c4976d80, TgtPtrBegin=0x0000022cba0b0060, Size=4, DynRefCount=1, HoldRefCount=0, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Moving 4 bytes (hst:0x00007ff7c4976d80) -&amp;gt; (tgt:0x0000022cba0b0060)&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Copied 4 bytes (hst:0x00007ff7c4976d80) -&amp;gt; (tgt:0x0000022cba0b0060)&lt;BR /&gt;Libomptarget --&amp;gt; There are 4 bytes allocated at target address 
0x0000022cba0b0060 - is new&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f08, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x000000c977367f08, TgtPtrBegin=0x0000022cba0b0000, Size=8, DynRefCount=1 (update suppressed), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; Obtained target argument (Begin: 0x0000022cba0b0000, Offset: 0) from host pointer 0x000000c977367f08&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367ef8, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x000000c977367ef8, TgtPtrBegin=0x0000022cba0b0020, Size=8, DynRefCount=1 (update suppressed), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; Obtained target argument (Begin: 0x0000022cba0b0020, Offset: 0) from host pointer 0x000000c977367ef8&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f00, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x000000c977367f00, TgtPtrBegin=0x0000022cba0b0040, Size=8, DynRefCount=1 (update suppressed), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; Obtained target argument (Begin: 0x0000022cba0b0040, Offset: 0) from host pointer 0x000000c977367f00&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x00007ff7c4976d80, Size=4)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x00007ff7c4976d80, TgtPtrBegin=0x0000022cba0b0060, Size=4, DynRefCount=1 (update suppressed), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; Obtained target argument (Begin: 0x0000022cba0b0060, Offset: 0) from host pointer 0x00007ff7c4976d80&lt;BR /&gt;Libomptarget --&amp;gt; Forwarding first-private value 0x0000000000000000 to the target construct&lt;BR /&gt;Libomptarget --&amp;gt; Forwarding first-private value 0x0000000000000059 to the target construct&lt;BR /&gt;Libomptarget --&amp;gt; Launching target execution __omp_offloading_eeec4b03_1bfa1__Z7Compute_l9 with 
pointer 0x0000022cbc17f9c0 (index=0).&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Executing a kernel 0x0000022cbc17f9c0...&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Assumed kernel SIMD width is 32&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Preferred group size is multiple of 64&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Max group size is set to 80 (thread_limit clause)&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Level 0: Lb = 0, Ub = 89, Stride = 1&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Group sizes = {80, 1, 1}&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Group counts = {2, 1, 1}&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created a command list 0x0000022cbc102f98 (Ordinal: 0) for device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Created a command queue 0x0000022cb7647928 (Ordinal: 0, Index: 0) for device 0.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel Pointer argument 0 (value: 0x0000022cba0b0000) was set successfully.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel Pointer argument 1 (value: 0x0000022cba0b0020) was set successfully.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel Pointer argument 2 (value: 0x0000022cba0b0040) was set successfully.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel Pointer argument 3 (value: 0x0000022cba0b0060) was set successfully.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel Scalar argument 4 (value: 0x0000000000000000) was set successfully.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Kernel Scalar argument 5 (value: 0x0000000000000059) was set successfully.&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Setting indirect access flags 0x0000000000000004&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Executed a kernel 0x0000022cbc17f9c0&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x00007ff7c4976d80, Size=4)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x00007ff7c4976d80, TgtPtrBegin=0x0000022cba0b0060, Size=4, DynRefCount=1 (deferred final decrement), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; There are 4 bytes allocated 
at target address 0x0000022cba0b0060 - is last&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f00, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x000000c977367f00, TgtPtrBegin=0x0000022cba0b0040, Size=8, DynRefCount=1 (deferred final decrement), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; There are 8 bytes allocated at target address 0x0000022cba0b0040 - is last&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367ef8, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x000000c977367ef8, TgtPtrBegin=0x0000022cba0b0020, Size=8, DynRefCount=1 (deferred final decrement), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; There are 8 bytes allocated at target address 0x0000022cba0b0020 - is last&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f08, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Mapping exists with HstPtrBegin=0x000000c977367f08, TgtPtrBegin=0x0000022cba0b0000, Size=8, DynRefCount=1 (deferred final decrement), HoldRefCount=0&lt;BR /&gt;Libomptarget --&amp;gt; There are 8 bytes allocated at target address 0x0000022cba0b0000 - is last&lt;BR /&gt;Libomptarget --&amp;gt; Moving 8 bytes (tgt:0x0000022cba0b0000) -&amp;gt; (hst:0x000000c977367f08)&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Copied 8 bytes (tgt:0x0000022cba0b0000) -&amp;gt; (hst:0x000000c977367f08)&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x00007ff7c4976d80, Size=4)...&lt;BR /&gt;Libomptarget --&amp;gt; Deleting tgt data 0x0000022cba0b0060 of size 4&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Returned device memory 0x0000022cba0b0060 to memory pool&lt;BR /&gt;Libomptarget --&amp;gt; Removing map entry with HstPtrBegin=0x00007ff7c4976d80, TgtPtrBegin=0x0000022cba0b0060, Size=4, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f00, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; 
Deleting tgt data 0x0000022cba0b0040 of size 8&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Returned device memory 0x0000022cba0b0040 to memory pool&lt;BR /&gt;Libomptarget --&amp;gt; Removing map entry with HstPtrBegin=0x000000c977367f00, TgtPtrBegin=0x0000022cba0b0040, Size=8, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367ef8, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Deleting tgt data 0x0000022cba0b0020 of size 8&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Returned device memory 0x0000022cba0b0020 to memory pool&lt;BR /&gt;Libomptarget --&amp;gt; Removing map entry with HstPtrBegin=0x000000c977367ef8, TgtPtrBegin=0x0000022cba0b0020, Size=8, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Looking up mapping(HstPtrBegin=0x000000c977367f08, Size=8)...&lt;BR /&gt;Libomptarget --&amp;gt; Deleting tgt data 0x0000022cba0b0000 of size 8&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Returned device memory 0x0000022cba0b0000 to memory pool&lt;BR /&gt;Libomptarget --&amp;gt; Removing map entry with HstPtrBegin=0x000000c977367f08, TgtPtrBegin=0x0000022cba0b0000, Size=8, Name=unknown&lt;BR /&gt;Libomptarget --&amp;gt; Unloading target library!&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Target binary is a valid oneAPI OpenMP image.&lt;BR /&gt;Libomptarget --&amp;gt; Image 0x00007ff7c498c000 is compatible with RTL 0x00007ffa3d8b0000!&lt;BR /&gt;Libomptarget --&amp;gt; Unregistered image 0x00007ff7c498c000 from RTL 0x00007ffa3d8b0000!&lt;BR /&gt;Libomptarget --&amp;gt; Done unregistering images!&lt;BR /&gt;Libomptarget --&amp;gt; Removing translation table for descriptor 0x00007ff7c497d000&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Memory usage for host memory, device 0:&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Allocator: Native, Pool&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Requested: 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Allocated: 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Freed : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; 
-- InUse : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- PeakUse : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- NumAllocs: 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Memory usage for shared memory, device 0:&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Allocator: Native, Pool&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Requested: 65536, 28&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Allocated: 65536, 128&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Freed : 65536, 128&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- InUse : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- PeakUse : 65536, 128&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- NumAllocs: 1, 4&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Memory usage for device memory, device 0:&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Allocator: Native, Pool&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Requested: 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Allocated: 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- Freed : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- InUse : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- PeakUse : 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; -- NumAllocs: 0, 0&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Closed RTL successfully&lt;BR /&gt;Target LEVEL0 RTL --&amp;gt; Deinit Level0 plugin!&lt;BR /&gt;Libomptarget --&amp;gt; Done unregistering library!&lt;BR /&gt;&lt;STRONG&gt;Libomptarget --&amp;gt; Deinit target library!&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To sum up, I still need help obtaining an example of a DLL that really runs on the GPU, and runs stably.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I look forward to your feedback and thanks in advance,&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Fri, 20 May 2022 12:41:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1385957#M2186</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-20T12:41:13Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1386163#M2188</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe I almost have it. I have attached an example. If you run&amp;nbsp;&lt;STRONG&gt;\GPU_CPU_Example\Dll1_VB_Test\bin\Release\net5.0\Dll1_VB_Test.exe&amp;nbsp;&lt;/STRONG&gt;it will use a VC++ DLL which tries to execute on the GPU but returns this error:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Libomptarget --&amp;gt; &lt;STRONG&gt;Host ptr 0x00007ffa3b4c3410 does not have a matching target pointer.&lt;/STRONG&gt;&lt;BR /&gt;Libomptarget error: Run with&lt;BR /&gt;Libomptarget error: LIBOMPTARGET_DEBUG=1 to display basic debug information.&lt;BR /&gt;Libomptarget error: LIBOMPTARGET_DEBUG=2 to display calls to the compute runtime.&lt;BR /&gt;Libomptarget error: LIBOMPTARGET_INFO=4 to dump host-target pointer mappings.&lt;BR /&gt;unknown:0:12: &lt;STRONG&gt;Libomptarget fatal error 1: failure of target construct while offloading is mandatory&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do you know how to fix that? It looks like we need to solve this before we can, perhaps, succeed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have uploaded the DLL and the program which uses it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I look forward to your feedback and thanks in advance.&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Fri, 20 May 2022 20:31:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1386163#M2188</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-20T20:31:48Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1387415#M2206</link>
      <description>&lt;P&gt;Hello, any news on this ticket?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I look forward to your help and thanks in advance.&lt;/P&gt;</description>
      <pubDate>Wed, 25 May 2022 15:51:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1387415#M2206</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-05-25T15:51:17Z</dc:date>
    </item>
    <item>
      <title>Re:Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1387693#M2210</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We were able to reproduce your issue at our end using the steps you provided. We are working on it internally with the developers and will get back to you soon.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 26 May 2022 12:15:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1387693#M2210</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-05-26T12:15:16Z</dc:date>
    </item>
    <item>
      <title>Re:Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1389955#M2251</link>
      <description>&lt;P&gt;Please try using explicit array sections for the offload mapping and not just pointers, for example:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;void Compute(int A[MAX_TEST][MAX_TEST], int B[MAX_TEST][MAX_TEST], int C[MAX_TEST][MAX_TEST])&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;{&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; int a = 0;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; bool is_cpu = true;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#ifdef USE_POINTER&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#pragma omp target teams distribute parallel for map(to: A, B, a) map(tofrom: C) map(from: is_cpu)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#else&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#pragma omp target teams distribute parallel for map(to: A[0:MAX_TEST][0:MAX_TEST], B[0:MAX_TEST][0:MAX_TEST], a) map(tofrom: C[0:MAX_TEST][0:MAX_TEST]) map(from: is_cpu)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;#endif&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; for (int i = 0; i &amp;lt; MAX_TEST; i++) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  a = a + 1;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  if (i == 0) is_cpu = omp_is_initial_device();&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  for (int j = 0; j &amp;lt; MAX_TEST; j++) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;   for (int k = 0; k &amp;lt; MAX_TEST; k++) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;    C[i][j] += A[i][k] * B[k][j];&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;   
}&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; if (a == 0) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Offloaded on GPU :) " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; else&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Executed on CPU :( " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; if (! is_cpu) {&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Offloaded on GPU :) " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; }&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt; else&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;  std::cout &amp;lt;&amp;lt; "Executed on CPU :( " &amp;lt;&amp;lt; std::endl;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: courier;"&gt;}&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;And you can use &lt;SPAN style="font-family: courier;"&gt;omp_is_initial_device()&lt;/SPAN&gt; to check whether the code is executed on the GPU or the host, see &lt;A 
href="https://www.openmp.org/spec-html/5.1/openmpsu166.html" target="_blank"&gt;https://www.openmp.org/spec-html/5.1/openmpsu166.html&lt;/A&gt;.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Sat, 04 Jun 2022 00:08:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1389955#M2251</guid>
      <dc:creator>Klaus-Dieter_O_Intel</dc:creator>
      <dc:date>2022-06-04T00:08:48Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1390597#M2260</link>
      <description>&lt;P&gt;Thank you for the feedback. However, it does not resolve the problem because:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;I did what you suggested, recompiled and tried again, and I obtained the same result:&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Libomptarget error: Host ptr 0x00007fff8615344d does not have a matching target pointer.&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Libomptarget error: Run with&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Libomptarget error: LIBOMPTARGET_DEBUG=1 to display basic debug information.&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Libomptarget error: LIBOMPTARGET_DEBUG=2 to display calls to the compute runtime.&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;Libomptarget error: LIBOMPTARGET_INFO=4 to dump host-target pointer mappings.&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;unknown:0:15: Libomptarget fatal error 1: failure of target construct while offloading is mandatory&lt;/STRONG&gt;&lt;/LI&gt;
&lt;LI&gt;In the particular example that we have shared in this post, the matrices are of size&amp;nbsp;&lt;SPAN&gt;[MAX_TEST][MAX_TEST], but in the real case the matrices are created at run time by allocating memory (malloc) and are then sent to the GPU via the&amp;nbsp;OpenMP shared clause, so their dimensions and sizes cannot be written into that clause. If I try this: #pragma omp parallel default(none) shared(float *A, float *B, ...), I get an error from the Intel compiler.&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;I think the real problem was indicated in the first message of this thread, which probably means that the Intel compiler currently cannot create a DLL which offloads to a GPU:&amp;nbsp;&lt;STRONG class="sub_section_element_selectors"&gt;&lt;SPAN class="sub_section_element_selectors"&gt;clang-cl: error: The use of '-LD' is not supported with '/Qiopenmp /Qopenmp-targets=spir64'.&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/OL&gt;
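&lt;P&gt;For reference, OpenMP array sections in a map clause accept lengths computed at run time, so a heap-allocated buffer can be mapped without compile-time dimensions. A minimal sketch, assuming flat row-major buffers; the names multiply_flat and run_example are hypothetical, and this has not been verified inside a DLL build:&lt;/P&gt;
&lt;PRE&gt;
```cpp
// Sketch: multiply two n x n matrices stored as flat, heap-allocated buffers.
// The array-section length n * n is evaluated at run time, so the map
// clauses need no compile-time dimensions.
// Without an offload-capable compiler the pragma is ignored and the loop
// simply runs on the host.
void multiply_flat(const float *A, const float *B, float *C, int n)
{
#pragma omp target teams distribute parallel for map(to: A[0:n * n], B[0:n * n]) map(tofrom: C[0:n * n])
    for (int i = 0; i != n; ++i) {
        for (int j = 0; j != n; ++j) {
            float sum = 0.0f;
            for (int k = 0; k != n; ++k) {
                sum += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] += sum;
        }
    }
}

// Hypothetical helper: fills two n x n matrices with ones, multiplies them,
// and returns the first element of the result.
float run_example(int n)
{
    float *A = new float[n * n];
    float *B = new float[n * n];
    float *C = new float[n * n];
    for (int i = 0; i != n * n; ++i) {
        A[i] = 1.0f;
        B[i] = 1.0f;
        C[i] = 0.0f;
    }
    multiply_flat(A, B, C, n);
    float first = C[0];
    delete[] A;
    delete[] B;
    delete[] C;
    return first;
}
```
&lt;/PRE&gt;
&lt;P&gt;With all-ones inputs, every element of the product equals n, so run_example(3) should return 3 whether the loop runs on the GPU or on the host.&lt;/P&gt;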
&lt;P&gt;&lt;SPAN class="sub_section_element_selectors"&gt;We need your help to be able to publish DLLs which can be partially offloaded to Intel GPUs.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="sub_section_element_selectors"&gt;I look forward to your feedback and thanks in advance.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jun 2022 11:40:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1390597#M2260</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-06-07T11:40:35Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1392269#M2273</link>
      <description>&lt;P&gt;Hello! Any news? Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jun 2022 05:56:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1392269#M2273</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-06-14T05:56:43Z</dc:date>
    </item>
    <item>
      <title>Re:Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1403949#M2394</link>
      <description>&lt;P&gt;The developers are working on a solution.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 27 Jul 2022 18:58:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1403949#M2394</guid>
      <dc:creator>Klaus-Dieter_O_Intel</dc:creator>
      <dc:date>2022-07-27T18:58:35Z</dc:date>
    </item>
    <item>
      <title>Re: Cannot create DLL which offloads part of the code to Intel Xe GPU</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1404242#M2401</link>
      <description>&lt;P&gt;Hello Klaus,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for the update. That's very good news: it will allow us to reuse and distribute DLLs which offload (via OpenMP) to Intel GPUs.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I look forward to your updates.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;David.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2022 13:37:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Cannot-create-DLL-which-offloads-part-of-the-code-to-Intel-Xe/m-p/1404242#M2401</guid>
      <dc:creator>dtroncho</dc:creator>
      <dc:date>2022-07-28T13:37:12Z</dc:date>
    </item>
  </channel>
</rss>

