<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic I'm interested in the same in Analyzers</title>
    <link>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148255#M17329</link>
    <description>&lt;P&gt;I'm interested in the same thing! I'd like to hear some useful advice, thank you!&lt;/P&gt;</description>
    <pubDate>Fri, 31 May 2019 12:21:12 GMT</pubDate>
    <dc:creator>Gamer__Nathan</dc:creator>
    <dc:date>2019-05-31T12:21:12Z</dc:date>
    <item>
      <title>Is Intel MPI CUDA-aware?</title>
      <link>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148254#M17328</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am developing an MPI program in which each process offloads computation to GPUs. I am using the latest version of Intel Parallel Studio 2019 (Update 3, I think) and CUDA unified memory, to keep the code more maintainable. Unfortunately, the results are quite non-deterministic, even with all the synchronization mechanisms active. After some searching, I found it might be that Intel MPI doesn't recognize unified memory, so I would have to fall back to duplicate variables/allocations/tedious copies...&lt;/P&gt;</description>
      <pubDate>Fri, 17 May 2019 13:31:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148254#M17328</guid>
      <dc:creator>SPAstef</dc:creator>
      <dc:date>2019-05-17T13:31:39Z</dc:date>
    </item>
    <item>
      <title>I'm interested in the same</title>
      <link>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148255#M17329</link>
      <description>&lt;P&gt;I'm interested in the same thing! I'd like to hear some useful advice, thank you!&lt;/P&gt;</description>
      <pubDate>Fri, 31 May 2019 12:21:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148255#M17329</guid>
      <dc:creator>Gamer__Nathan</dc:creator>
      <dc:date>2019-05-31T12:21:12Z</dc:date>
    </item>
    <item>
      <title>Well, I personally ended up</title>
      <link>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148256#M17330</link>
      <description>&lt;P&gt;Well, I personally ended up writing two separate functions, with the right one selected at compile time by a flag... I don't know whether the Parallel Studio 2020 MPI library supports this (I have not tried), but after running lots of tests, there would really be no point in using "legacy" allocation instead of UVA anyway. In my specific case, legacy allocation happened to be faster for smaller problem sizes, but the gap narrowed as the size grew, and at some point UVA even surpassed it. So: easier to write and similar or better at run time!&lt;/P&gt;</description>
      <pubDate>Thu, 19 Dec 2019 20:16:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1148256#M17330</guid>
      <dc:creator>SPAstef</dc:creator>
      <dc:date>2019-12-19T20:16:47Z</dc:date>
    </item>
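    <!--
    A minimal sketch of the compile-time selection SPAstef describes: one code path
    passes the CUDA managed (unified-memory) pointer straight to MPI, the other stages
    the data through a host copy first. The function name send_managed and the USE_UVA
    flag are illustrative, not from the thread; whether the UVA path is safe depends on
    the MPI library actually being CUDA-aware.

    ```cuda
    #include <mpi.h>
    #include <cuda_runtime.h>

    /* Send 'count' doubles that live in CUDA managed (unified) memory.
       USE_UVA: hand the managed pointer straight to MPI; this requires a
       CUDA-aware MPI library. Otherwise: duplicate into a pinned host
       buffer first (the "legacy" allocation/copy path). */
    void send_managed(const double *managed_buf, int count, int dest, MPI_Comm comm)
    {
    #ifdef USE_UVA
        /* Ensure the GPU has finished writing before MPI reads the buffer. */
        cudaDeviceSynchronize();
        MPI_Send(managed_buf, count, MPI_DOUBLE, dest, 0, comm);
    #else
        double *host_buf;
        cudaMallocHost((void **)&host_buf, count * sizeof(double));
        /* cudaMemcpyDefault lets the runtime infer directions for managed memory. */
        cudaMemcpy(host_buf, managed_buf, count * sizeof(double), cudaMemcpyDefault);
        MPI_Send(host_buf, count, MPI_DOUBLE, dest, 0, comm);
        cudaFreeHost(host_buf);
    #endif
    }
    ```

    The flag would be chosen at build time, e.g. mpicc -DUSE_UVA ..., matching the
    "selected at compile time with a flag" approach in the post above.
    -->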
    <item>
      <title>Re: Is Intel MPI CUDA-aware?</title>
      <link>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1465342#M23125</link>
      <description>&lt;P&gt;Still can't find this info.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Mar 2023 02:21:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Is-Intel-MPI-CUDA-aware/m-p/1465342#M23125</guid>
      <dc:creator>Shihab</dc:creator>
      <dc:date>2023-03-14T02:21:54Z</dc:date>
    </item>
  </channel>
</rss>

