<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Case 1:  Both section a and b in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950121#M19335</link>
    <description>&lt;P&gt;Case 1: Sections a and b are both executed simultaneously. However, the signal from section a is clobbered by section b, so the offload_wait waits only for section b to complete.&lt;/P&gt;
&lt;P&gt;Case 2: Sections a and b are both executed simultaneously, and the offload_wait waits for both to complete.&lt;/P&gt;
&lt;P&gt;Case 3: Section b waits for section a to complete before proceeding.&lt;/P&gt;
&lt;P&gt;Case 4: The CPU computation overlaps with the computation of sections a and b. Once the CPU completes, it waits at offload_wait for sections a and b to complete.&lt;/P&gt;</description>
    <pubDate>Tue, 08 Oct 2013 20:45:21 GMT</pubDate>
    <dc:creator>Ravi_N_Intel</dc:creator>
    <dc:date>2013-10-08T20:45:21Z</dc:date>
    <item>
      <title>Asynchronous Patterns on MIC</title>
      <link>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950118#M19332</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;I have read about handling asynchronous communication and computation on MIC, but the examples I found are all very simple, and I need clarification on some patterns I need to implement in my code.&lt;/P&gt;
&lt;P&gt;1) In the following scenario, are sections a and b serialized on MIC, or are they executed simultaneously on MIC? In that case, how are they scheduled?&lt;/P&gt;
&lt;P&gt;[cpp]#pragma offload target(mic:0) .... signal(one)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section a&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload target(mic:0) .... signal(one)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section b&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload_wait target(mic:0) wait(one)[/cpp]&lt;/P&gt;
&lt;P&gt;2) The same questions as above, but with a different signal name for section b&lt;/P&gt;
&lt;P&gt;[cpp]#pragma offload target(mic:0) .... signal(one)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section a&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload target(mic:0) .... signal(two)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section b&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload_wait target(mic:0) wait(one,two)[/cpp]&lt;/P&gt;
&lt;P&gt;3) The same questions as above, with a wait clause added to the second offload&lt;/P&gt;
&lt;P&gt;[cpp]#pragma offload target(mic:0) .... signal(one)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section a&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload target(mic:0) .... signal(two) wait(one)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section b&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload_wait target(mic:0) wait(two)[/cpp]&lt;/P&gt;
&lt;P&gt;4) Now consider adding a CPU computing section at the end. How can I have this section execute simultaneously with the MIC computing sections? If possible, I would like to be able to choose whether MIC section a and MIC section b are serialized on MIC and, in either case, to have the CPU computing section execute while MIC processes both sections a and b.&lt;/P&gt;
&lt;P&gt;[cpp]#pragma offload target(mic:0) .... signal(one)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section a&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload target(mic:0) .... signal(two)&lt;/P&gt;
&lt;P&gt;&amp;lt;MIC computing section b&amp;gt;&lt;/P&gt;
&lt;P&gt;&amp;lt;CPU computing section&amp;gt;&lt;/P&gt;
&lt;P&gt;#pragma offload_wait target(mic:0) wait(one,two)[/cpp]&lt;/P&gt;
&lt;P&gt;Many thanks for any help,&lt;/P&gt;
&lt;P&gt;Francesco&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2013 08:51:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950118#M19332</guid>
      <dc:creator>Salvadore__Francesco</dc:creator>
      <dc:date>2013-10-08T08:51:01Z</dc:date>
    </item>
    <item>
      <title>Hi Francesco, </title>
      <link>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950119#M19333</link>
      <description>&lt;P&gt;Hi Francesco,&lt;/P&gt;
&lt;P&gt;I am investigating your issue. Let me get back to you with what I find.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2013 13:14:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950119#M19333</guid>
      <dc:creator>Sumedh_N_Intel</dc:creator>
      <dc:date>2013-10-08T13:14:03Z</dc:date>
    </item>
    <item>
      <title>Francesco,</title>
      <link>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950120#M19334</link>
      <description>&lt;P&gt;Francesco,&lt;/P&gt;
&lt;P&gt;While you await an answer from Sumedh: in your sketch code, how many threads do you intend to run on the Xeon Phi in offload section a and in offload section b? The issue here is that you want to avoid oversubscribing the Xeon Phi. The preferred approach is a programming structure with a single thread pool within the Xeon Phi. Avoid nested OpenMP levels (unless you take care to manage your thread teams properly). I do not have a Xeon Phi handy for testing, but you might want to see if you can run a "Task Manager"-like app on the Xeon Phi concurrently with your offload tests. This may help you visualize what is running. Alternatively, VTune may give you this information.&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2013 15:30:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950120#M19334</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2013-10-08T15:30:45Z</dc:date>
    </item>
    <item>
      <title>Case 1:  Both section a and b</title>
      <link>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950121#M19335</link>
      <description>&lt;P&gt;Case 1: Sections a and b are both executed simultaneously. However, the signal from section a is clobbered by section b, so the offload_wait waits only for section b to complete.&lt;/P&gt;
&lt;P&gt;Case 2: Sections a and b are both executed simultaneously, and the offload_wait waits for both to complete.&lt;/P&gt;
&lt;P&gt;Case 3: Section b waits for section a to complete before proceeding.&lt;/P&gt;
&lt;P&gt;Case 4: The CPU computation overlaps with the computation of sections a and b. Once the CPU completes, it waits at offload_wait for sections a and b to complete.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2013 20:45:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950121#M19335</guid>
      <dc:creator>Ravi_N_Intel</dc:creator>
      <dc:date>2013-10-08T20:45:21Z</dc:date>
    </item>
    <item>
      <title>In your final case CPU</title>
      <link>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950122#M19336</link>
      <description>&lt;P&gt;In your final case, the CPU computation overlaps only with section b. You could do some work on the CPU between section a and section b; if there is no work to be done there, you can merge section a and section b.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Oct 2013 21:33:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Asynchronous-Patterns-on-MIC/m-p/950122#M19336</guid>
      <dc:creator>Ravi_N_Intel</dc:creator>
      <dc:date>2013-10-08T21:33:30Z</dc:date>
    </item>
  </channel>
</rss>

