<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re:HelloMKLwithDPCPP - different behaviour selecting CPU and GPU - feature or bug ? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1353322#M32647</link>
    <description>&lt;P&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;This behaviour is specific to our implementation; it is not a bug. To have the output written to the freqData array, the user has to explicitly set &lt;/SPAN&gt;&lt;B style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;DFTI_NOT_INPLACE &lt;/B&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;mode before committing the descriptor.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;For example: desc.set_value(oneapi::mkl::dft::config_param::PLACEMENT, &lt;/SPAN&gt;&lt;B style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;DFTI_NOT_INPLACE&lt;/B&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;);&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;This thread is being closed and Intel will no longer respond to it. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 21 Jan 2022 04:38:18 GMT</pubDate>
    <dc:creator>Gennady_F_Intel</dc:creator>
    <dc:date>2022-01-21T04:38:18Z</dc:date>
    <item>
      <title>HelloMKLwithDPCPP - different behaviour selecting CPU and GPU - feature or bug ?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1346064#M32491</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;
&lt;P&gt;It looks like I have hit a rather interesting feature/bug using oneMKL and DPC++.&lt;/P&gt;
&lt;P&gt;I created a small test program to get an initial feel for oneMKL in combination with DPC++ (as opposed to using ISO C++, which IMHO imposes the use of the C API).&lt;/P&gt;
&lt;P&gt;The program calculates a single-precision FFT of a random sequence of real values of a specified length and lets me choose between the GPU (Intel UHD Graphics 630) and the CPU (Intel Core i9-9880H CPU&amp;nbsp;@ 2.30 GHz) of my Dell Precision 7540.&lt;/P&gt;
&lt;P&gt;I'm also roughly monitoring how long it takes to calculate the FFT, which for now is out-of-place so that I can keep both the time and the frequency data.&lt;/P&gt;
&lt;P&gt;When specifying 1k points, I get identical results for GPU and CPU. The FFT output data nicely shows up in the freqData variable while the timeData is maintained as (random) input data.&lt;/P&gt;
&lt;P&gt;As soon as I use 10k points (or 25 million points, as required in an upcoming application), everything works as expected on the GPU. However, when using the CPU, I obtain identical results but the FFT output data shows up in timeData (overwriting the FFT input data), implying that the FFT suddenly acts as if it were &lt;EM&gt;in-place&lt;/EM&gt; instead of &lt;EM&gt;out-of-place&lt;/EM&gt;.&lt;/P&gt;
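&lt;P&gt;For reference, the out-of-place behaviour expected here can be sketched in plain ISO C++, without oneMKL or SYCL. This is only a minimal illustration and not oneMKL's implementation: naiveRealDft is a hypothetical helper that computes a naive O(N^2) forward DFT, reads the input through a const reference so that it cannot be overwritten, and returns only the N/2 + 1 non-redundant complex bins of a real-valued input.&lt;/P&gt;
<![CDATA[
```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oneMKL): naive O(N^2) forward DFT of a
// real sequence, written out-of-place. The input is taken by const reference
// and is never modified; the result lives in a separate buffer.
// Only the N/2 + 1 non-redundant bins are returned, because the spectrum of
// a real-valued input is conjugate-symmetric.
std::vector<std::complex<float>> naiveRealDft(const std::vector<float>& timeData)
{
    const std::size_t n = timeData.size();
    assert(n > 0 && "input must not be empty");

    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<std::complex<float>> freqData(n / 2 + 1);

    for (std::size_t k = 0; k < freqData.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            // Forward transform uses the negative exponent: exp(-2*pi*i*k*t/N).
            const double angle =
                -twoPi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += static_cast<double>(timeData[t]) *
                   std::complex<double>(std::cos(angle), std::sin(angle));
        }
        // Each bin is stored as one (real, imag) pair.
        freqData[k] = std::complex<float>(static_cast<float>(sum.real()),
                                          static_cast<float>(sum.imag()));
    }
    return freqData;
}
```
]]>
&lt;P&gt;Calling naiveRealDft on the four samples {1, 0, -1, 0} leaves the input vector unchanged and returns three complex bins. That is the contract an out-of-place transform is expected to honour, whichever device is selected.&lt;/P&gt;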
&lt;P&gt;Here's the output; HelloMKLwithDPCPP accepts two arguments: the number of points and gpu|cpu.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Frans1_0-1640193652976.png" style="width: 400px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/25031i1A115E1929148B71/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="Frans1_0-1640193652976.png" alt="Frans1_0-1640193652976.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I'm using oneAPI Base Toolkit, v.2022 (downloaded last Friday, Dec 17th) in combination with Visual Studio 2017.&lt;/P&gt;
&lt;P&gt;I tried to attach the simple source code as a file, but for some reason I get the following error (?!)&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Frans1_0-1640194344450.png" style="width: 400px;"&gt;&lt;img src="https://community.intel.com/t5/image/serverpage/image-id/25032i020F71B5DC3ED2D5/image-size/medium/is-moderation-mode/true?v=v2&amp;amp;px=400&amp;amp;whitelist-exif-data=Orientation%2CResolution%2COriginalDefaultFinalSize%2CCopyright" role="button" title="Frans1_0-1640194344450.png" alt="Frans1_0-1640194344450.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;As such I had to copy-paste the source code below.&lt;/P&gt;
&lt;P&gt;Can you please confirm this is a bug?&lt;/P&gt;
&lt;P&gt;Also, is my statement that sticking to ISO C++ imposes the use of the oneMKL C API correct?&lt;/P&gt;
&lt;P&gt;Thanks and regards,&lt;/P&gt;
&lt;P&gt;Frans&lt;/P&gt;
&lt;P&gt;-----------------------------------------&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;#include &amp;lt;mkl.h&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;CL/sycl.hpp&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;iostream&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;string&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;oneapi/mkl/dfti.hpp&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;oneapi/mkl/rng.hpp&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;complex&amp;gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;#include &amp;lt;chrono&amp;gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;using namespace oneapi::mkl::dft;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;int main(int argc, char** argv)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;{&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;try&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;{&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Probably not 100% idiot-proof ... using 25 Mio points on CPU by default&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;unsigned int nrOfPoints = (argc &amp;lt; 2) ? 25000000U : std::stoi(argv[1]);&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;std::string selector = (argc &amp;lt; 3) ? "cpu" : argv[2];&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;sycl::queue Q;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;if (selector == "cpu")&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;Q = sycl::queue(sycl::cpu_selector{});&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;else if (selector == "gpu")&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;Q = sycl::queue(sycl::gpu_selector{});&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;else&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;{&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;std::cout &amp;lt;&amp;lt; "Please use: " &amp;lt;&amp;lt; argv[0] &amp;lt;&amp;lt; " &amp;lt;nrOfPoints (default 25Mio)&amp;gt; &amp;lt;selector cpu|gpu&amp;gt;" &amp;lt;&amp;lt; std::endl;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;return EXIT_FAILURE;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;}&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;std::cout &amp;lt;&amp;lt; "Running on: " &amp;lt;&amp;lt; Q.get_device().get_info&amp;lt;sycl::info::device::name&amp;gt;() &amp;lt;&amp;lt; "\n";&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;auto sycl_device = Q.get_device();&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;auto sycl_context = Q.get_context();&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;// For the time being not yet trying complex-valued IQ data due to missing random generation of complex values.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;auto timeData = sycl::malloc_shared&amp;lt;float&amp;gt;(nrOfPoints, sycl_device, sycl_context);&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Initially not in-place ... later in-place in an attempt to speed up things&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;auto freqData = sycl::malloc_shared&amp;lt;float&amp;gt;(nrOfPoints, sycl_device, sycl_context);&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;// Use fixed seed in combination with random data&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;std::uint32_t seed = 0;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;oneapi::mkl::rng::mcg31m1 pseudoRndGen(Q, seed); // Initialize the pseudo-random generator.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Uniform distribution only supports floats and doubles (e.g. not std::complex&amp;lt;float&amp;gt;)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;oneapi::mkl::rng::uniform&amp;lt;float, oneapi::mkl::rng::uniform_method::standard&amp;gt; uniformDistribution(-1, 1);&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;oneapi::mkl::rng::generate(uniformDistribution, pseudoRndGen, nrOfPoints, timeData).wait();&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;oneapi::mkl::dft::descriptor&amp;lt;oneapi::mkl::dft::precision::SINGLE, oneapi::mkl::dft::domain::REAL&amp;gt; fftDescriptor(nrOfPoints);&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Don't forget to commit the FFT descriptor to the queue.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;fftDescriptor.commit(Q);&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;// Calculate the forward FFT and wait until done before printing the first and last value.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Apparently no support to have N floats as input and N/2 + 1 complex&amp;lt;float&amp;gt;s as output.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Not sure how the complex FFT values are stored ... expect real|imag|real|imag|...&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;auto startTime = std::chrono::system_clock::now();&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;oneapi::mkl::dft::compute_forward&amp;lt;oneapi::mkl::dft::descriptor&amp;lt;oneapi::mkl::dft::precision::SINGLE, oneapi::mkl::dft::domain::REAL&amp;gt;, float, float&amp;gt;(fftDescriptor, timeData, freqData).wait();&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;auto stopTime = std::chrono::system_clock::now();&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;// +++ BUG ALERT +++&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// When using the CPU the freq data end up in the time data (?!) starting from 10k points while not the case for GPU.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// +++ BUG ALERT +++&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;std::cout &amp;lt;&amp;lt; "time data: " &amp;lt;&amp;lt; timeData[0] &amp;lt;&amp;lt; " .. " &amp;lt;&amp;lt; timeData[1] &amp;lt;&amp;lt; " .. " &amp;lt;&amp;lt; timeData[2] &amp;lt;&amp;lt; " .. " &amp;lt;&amp;lt; timeData[3] &amp;lt;&amp;lt; " .. " &amp;lt;&amp;lt; std::endl;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;std::cout &amp;lt;&amp;lt; "freq data: " &amp;lt;&amp;lt; freqData[0] &amp;lt;&amp;lt; " .. " &amp;lt;&amp;lt; freqData[1] &amp;lt;&amp;lt; "j .. " &amp;lt;&amp;lt; freqData[2] &amp;lt;&amp;lt; " .. " &amp;lt;&amp;lt; freqData[3] &amp;lt;&amp;lt; "j .. " &amp;lt;&amp;lt; std::endl;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;std::cout &amp;lt;&amp;lt; "Elapsed time (ms) for " &amp;lt;&amp;lt; nrOfPoints &amp;lt;&amp;lt; " points: " &amp;lt;&amp;lt; std::chrono::duration_cast&amp;lt;std::chrono::milliseconds&amp;gt;(stopTime - startTime).count() &amp;lt;&amp;lt; std::endl;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;return EXIT_SUCCESS;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;}&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;catch (const sycl::exception&amp;amp; e)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;{&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;std::cout &amp;lt;&amp;lt; "SYCL exception: " &amp;lt;&amp;lt; e.what() &amp;lt;&amp;lt; std::endl;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;// Report failure instead of falling through to an implicit return 0.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;return EXIT_FAILURE;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;}&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;}&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Dec 2021 17:42:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1346064#M32491</guid>
      <dc:creator>Frans1</dc:creator>
      <dc:date>2021-12-22T17:42:28Z</dc:date>
    </item>
    <item>
      <title>Re:HelloMKLwithDPCPP - different behaviour selecting CPU and GPU - feature or bug ?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1346233#M32492</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for reaching out to us.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We are able to reproduce your issue. We are working on it internally and will get back to you soon.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Varsha&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 23 Dec 2021 11:00:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1346233#M32492</guid>
      <dc:creator>VarshaS_Intel</dc:creator>
      <dc:date>2021-12-23T11:00:12Z</dc:date>
    </item>
    <item>
      <title>Re:HelloMKLwithDPCPP - different behaviour selecting CPU and GPU - feature or bug ?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1353322#M32647</link>
      <description>&lt;P&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;This behaviour is specific to our implementation; it is not a bug. To have the output written to the freqData array, the user has to explicitly set &lt;/SPAN&gt;&lt;B style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;DFTI_NOT_INPLACE &lt;/B&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;mode before committing the descriptor.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;For example: desc.set_value(oneapi::mkl::dft::config_param::PLACEMENT, &lt;/SPAN&gt;&lt;B style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;DFTI_NOT_INPLACE&lt;/B&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;);&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: Arial, sans-serif; font-size: 10pt;"&gt;This thread is being closed and Intel will no longer respond to it. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jan 2022 04:38:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/HelloMKLwithDPCPP-different-behaviour-selecting-CPU-and-GPU/m-p/1353322#M32647</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2022-01-21T04:38:18Z</dc:date>
    </item>
  </channel>
</rss>

