<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Quote:Kevin Davis (Intel) in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946891#M18482</link>
    <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Kevin Davis (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Regarding mixing pragma/scif, our developer replied:&lt;/P&gt;
&lt;P&gt;"Memory allocation/deallocation must be done either using malloc/free, or using the pragmas. It cannot be a mixture of the two.&lt;/P&gt;
&lt;P&gt;When memory is allocated on MIC using the pragmas, the alignment on MIC will equal that of the CPU, as long as the CPU alignment does not exceed 64 bytes. CPU data aligned higher than 64-bytes can be matched on MIC with an align modifier.&lt;/P&gt;
&lt;P&gt;We have not tested using offload and additional SCIF connections in the same program."&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi Kevin,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I wanted to know if the flags affecting the offloads will still work when you mix offload with SCIF?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-Sumedh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 28 Feb 2013 18:49:06 GMT</pubDate>
    <dc:creator>Sumedh_N_Intel</dc:creator>
    <dc:date>2013-02-28T18:49:06Z</dc:date>
    <item>
      <title>offload transfer (partial)</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946884#M18475</link>
      <description>&lt;P&gt;&amp;nbsp;Hi&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;Is it possible to use an offload transfer to copy only part of an array when the whole array is already allocated on both the host and the Phi?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;For example&lt;/P&gt;
&lt;P&gt;&amp;nbsp; void *p_host =&amp;nbsp;_mm_malloc(size,alignment);&lt;/P&gt;
&lt;P&gt;&amp;nbsp; void *p_phi=0;&lt;/P&gt;
&lt;P&gt;&lt;B&gt;#pragma&lt;/B&gt; offload target(mic) in(size) in(alignment) out(p_phi){&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; p_phi = _mm_malloc(size,alignment);&amp;nbsp;&lt;/P&gt;
&lt;P&gt;}&lt;/P&gt;
&lt;P&gt;&amp;nbsp;I would like to now transfer part of p_host to p_phi&lt;/P&gt;
&lt;P&gt;Is this possible?&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;
&lt;P&gt;Jamil&lt;/P&gt;</description>
      <pubDate>Sat, 16 Feb 2013 20:56:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946884#M18475</guid>
      <dc:creator>Jamil_A_</dc:creator>
      <dc:date>2013-02-16T20:56:51Z</dc:date>
    </item>
    <item>
      <title>Have a look at sampleC14.c</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946885#M18476</link>
      <description>&lt;P&gt;Have a look at sampleC14.c (under /opt/intel/composer_xe_2013/Samples/en_US/C++/mic_samples/intro_sampleC) for an example of using the &lt;STRONG&gt;alloc&lt;/STRONG&gt; and &lt;STRONG&gt;into&lt;/STRONG&gt; specifiers. I believe those (at least &lt;STRONG&gt;into&lt;/STRONG&gt;) provide the functionality and convenience you are interested in.&lt;/P&gt;</description>
      <pubDate>Sun, 17 Feb 2013 11:25:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946885#M18476</guid>
      <dc:creator>Kevin_D_Intel</dc:creator>
      <dc:date>2013-02-17T11:25:07Z</dc:date>
    </item>
    <item>
      <title>   Hi</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946886#M18477</link>
      <description>&lt;P&gt;&amp;nbsp; &amp;nbsp;Hi&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; I may have to drop back to using scif as it does not appear that I can get the fine grained functionality that I need using the pragma approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;Are there any issues mixing pragmas and scif?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;Jamil&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Feb 2013 11:17:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946886#M18477</guid>
      <dc:creator>Jamil_A_</dc:creator>
      <dc:date>2013-02-18T11:17:52Z</dc:date>
    </item>
    <item>
      <title>Hi Jamil,</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946887#M18478</link>
      <description>&lt;P&gt;Hi Jamil,&lt;/P&gt;
&lt;P&gt;I had used partial data transfers in one of the codes. Here is a sample code to demonstrate what I did.&lt;/P&gt;
&lt;P&gt;[cpp]&lt;/P&gt;
&lt;P&gt;#include&amp;lt;stdio.h&amp;gt;&lt;BR /&gt;#include&amp;lt;stdlib.h&amp;gt;&lt;BR /&gt;#include&amp;lt;immintrin.h&amp;gt; //declares _mm_malloc and _mm_free&lt;/P&gt;
&lt;P&gt;__attribute__((target(mic))) int *my_array;&lt;/P&gt;
&lt;P&gt;int main()&lt;BR /&gt;{&lt;/P&gt;
&lt;P&gt;//Allocate Host array&lt;BR /&gt;my_array=(int*)_mm_malloc(sizeof(int)*20,4096);&lt;/P&gt;
&lt;P&gt;//Initialize Host array&lt;BR /&gt;for(int i=0;i&amp;lt;20;i++)&lt;BR /&gt; my_array[i]=0;&lt;/P&gt;
&lt;P&gt;//Allocate on coprocessor and transfer the entire array&lt;BR /&gt;//to the coprocessor&lt;BR /&gt;#pragma offload target(mic:0)\&lt;BR /&gt; in(my_array:length(20) alloc_if(1) free_if(0))&lt;BR /&gt;{&lt;BR /&gt; for(int i=0;i&amp;lt;20;i++)&lt;BR /&gt; {&lt;BR /&gt; printf("%d",my_array[i]);fflush(0);&lt;BR /&gt; }&lt;BR /&gt; printf("\n");fflush(0);&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;//Changed something on the host&lt;BR /&gt;for(int i=0;i&amp;lt;5;i++)&lt;BR /&gt;{&lt;BR /&gt; my_array[i+5]=5;&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;//Transferred only the required bit to the card&lt;BR /&gt;#pragma offload target(mic:0)\&lt;BR /&gt; in(my_array[5:5] : into(my_array[5:5]) alloc_if(0) free_if(0))&lt;BR /&gt;{&lt;BR /&gt; //printf("\nPARTIAL TRANSFER\n");fflush(0);&lt;BR /&gt; for(int i=0;i&amp;lt;20;i++)&lt;BR /&gt; {&lt;BR /&gt; printf("%d",my_array[i]);fflush(0);&lt;BR /&gt; }&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;//Free memory on the coprocessor&lt;BR /&gt;#pragma offload target(mic:0)\&lt;BR /&gt; nocopy(my_array:length(20) alloc_if(0) free_if(1))&lt;BR /&gt;{&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;_mm_free(my_array);&lt;/P&gt;
&lt;P&gt;return 0;&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;[/cpp]&lt;/P&gt;
&lt;P&gt;I am not sure what exactly you are trying to do but I hope this helps.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-Sumedh&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Feb 2013 20:52:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946887#M18478</guid>
      <dc:creator>Sumedh_N_Intel</dc:creator>
      <dc:date>2013-02-19T20:52:34Z</dc:date>
    </item>
    <item>
      <title>Quote:Jamil A. wrote:</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946888#M18479</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Jamil A. wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;
&lt;P&gt;I may have to drop back to using scif as it does not appear that I can get the fine grained functionality that I need using the pragma approach.&lt;/P&gt;
&lt;P&gt;Are there any issues mixing pragmas and scif?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;
&lt;P&gt;I do not know; there could be issues, and I do not believe mixing the two will be supported. I'll have our developers weigh in. Can you offer more details about what control is lacking with offload for your case?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 20 Feb 2013 09:54:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946888#M18479</guid>
      <dc:creator>Kevin_D_Intel</dc:creator>
      <dc:date>2013-02-20T09:54:00Z</dc:date>
    </item>
    <item>
      <title>  Hi Sumedh</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946889#M18480</link>
      <description>&lt;P&gt;&amp;nbsp; Hi Sumedh&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;Thanks for the example. I will be getting access to a phi in the next couple of days, so I will post a message with a more detailed example expanding on your code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;Thanks&lt;/P&gt;
&lt;P&gt;&amp;nbsp;Jamil&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 20 Feb 2013 20:55:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946889#M18480</guid>
      <dc:creator>Jamil_A_</dc:creator>
      <dc:date>2013-02-20T20:55:58Z</dc:date>
    </item>
    <item>
      <title>Regarding mixing pragma/scif,</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946890#M18481</link>
      <description>&lt;P&gt;Regarding mixing pragma/scif, our developer replied:&lt;/P&gt;
&lt;P&gt;"Memory allocation/deallocation must be done either using malloc/free, or using the pragmas. It cannot be a mixture of the two.&lt;/P&gt;
&lt;P&gt;When memory is allocated on MIC using the pragmas, the alignment on MIC will equal that of the CPU, as long as the CPU alignment does not exceed 64 bytes. CPU data aligned higher than 64-bytes can be matched on MIC with an align modifier.&lt;/P&gt;
&lt;P&gt;We have not tested using offload and additional SCIF connections in the same program."&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Feb 2013 18:08:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946890#M18481</guid>
      <dc:creator>Kevin_D_Intel</dc:creator>
      <dc:date>2013-02-28T18:08:00Z</dc:date>
    </item>
    <item>
      <title>Quote:Kevin Davis (Intel)</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946891#M18482</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Kevin Davis (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Regarding mixing pragma/scif, our developer replied:&lt;/P&gt;
&lt;P&gt;"Memory allocation/deallocation must be done either using malloc/free, or using the pragmas. It cannot be a mixture of the two.&lt;/P&gt;
&lt;P&gt;When memory is allocated on MIC using the pragmas, the alignment on MIC will equal that of the CPU, as long as the CPU alignment does not exceed 64 bytes. CPU data aligned higher than 64-bytes can be matched on MIC with an align modifier.&lt;/P&gt;
&lt;P&gt;We have not tested using offload and additional SCIF connections in the same program."&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi Kevin,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I wanted to know if the flags affecting the offloads will still work when you mix offload with SCIF?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-Sumedh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 Feb 2013 18:49:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946891#M18482</guid>
      <dc:creator>Sumedh_N_Intel</dc:creator>
      <dc:date>2013-02-28T18:49:06Z</dc:date>
    </item>
    <item>
      <title>Hi Sumedh/Kevin</title>
      <link>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946892#M18483</link>
      <description>Hi Sumedh/Kevin

   I have got a case to work using the pragma approach.

   As the pragma approach does not allow you to pass a pointer by value (i.e. in my case I have a pointer to device memory stored on the host), I am casting my pointers to size_t before passing them as an argument to the pragma call to get around this issue.
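
<![CDATA[
   A minimal sketch of that cast workaround, assuming the device address is kept on the host as a size_t and cast back to a pointer only inside the offload region. The function names (dev_alloc_ints, dev_fill_and_sum, dev_free_ints) are hypothetical, and the sketch assumes pointers have the same width on the host and the coprocessor. The __INTEL_OFFLOAD guard is there so that a compiler without offload support simply runs each block on the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate n ints on the target and return the address as an integer handle. */
size_t dev_alloc_ints(size_t n)
{
    size_t handle = 0;
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic:0) in(n) out(handle)
#endif
    {
        /* The pointer never crosses the offload boundary as a pointer. */
        handle = (size_t)malloc(n * sizeof(int));
    }
    return handle;
}

/* Fill the target buffer behind `handle` with `value` and return the sum. */
long dev_fill_and_sum(size_t handle, size_t n, int value)
{
    long sum = 0;
    assert(handle != 0);
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic:0) in(handle, n, value) out(sum)
#endif
    {
        int *p = (int *)handle; /* cast back on the target side */
        size_t i;
        for (i = 0; i < n; i++) {
            p[i] = value;
            sum += p[i];
        }
    }
    return sum;
}

/* Release the target buffer behind `handle`. */
void dev_free_ints(size_t handle)
{
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic:0) in(handle)
#endif
    {
        free((void *)handle);
    }
}
```

   The handle returned by dev_alloc_ints can be stored in any host structure and passed to later offloads. It is only meaningful on the device that produced it and must not be dereferenced on the host.
]]>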

   For cases like this (i.e. I want to store the device pointer in a host structure), it would be very useful for the pragma approach to allow passing pointers by value (both in and out) without casting.

   It would also be useful to be able to override the bitwise copyable check for structures.  I am currently having to memcpy the structure into an unsigned char array which I pass as an argument to the offload call and reconstruct my structure on the device side.
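
<![CDATA[
   A minimal sketch of that byte-array workaround, assuming a structure with a pointer member (the kind of structure the bitwise copyable check is likely to reject). The type matrix_desc and the function desc_elements_on_target are hypothetical, and the sketch assumes the structure has the same size and layout on the host and the coprocessor. The __INTEL_OFFLOAD guard lets a compiler without offload support run the block on the host.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical descriptor that carries a device address next to plain data. */
typedef struct {
    void *dev_data; /* address valid only on the coprocessor */
    int   rows;
    int   cols;
} matrix_desc;

/* Ship the descriptor to the target as raw bytes and use it there. */
int desc_elements_on_target(const matrix_desc *host_desc)
{
    unsigned char raw[sizeof(matrix_desc)];
    int elements = 0;

    assert(host_desc != NULL);
    memcpy(raw, host_desc, sizeof raw); /* flatten on the host */

#ifdef __INTEL_OFFLOAD
#pragma offload target(mic:0) in(raw) out(elements)
#endif
    {
        matrix_desc d;
        memcpy(&d, raw, sizeof d); /* rebuild on the target */
        elements = d.rows * d.cols;
    }
    return elements;
}
```

   The fixed-size unsigned char array is transferred whole, so the structure's contents arrive unchanged, including the stored device address.
]]>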

   Thanks for your help

 Jamil</description>
      <pubDate>Fri, 01 Mar 2013 10:14:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/offload-transfer-partial/m-p/946892#M18483</guid>
      <dc:creator>Jamil_A_</dc:creator>
      <dc:date>2013-03-01T10:14:26Z</dc:date>
    </item>
  </channel>
</rss>

