<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Converting SSE packed integer handling to AVX in Intel® ISA Extensions</title>
    <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773924#M180</link>
    <description>PDEP/PEXT are indeed part of BMI2 and are planned to be available in the first CPU supporting BMI2.&lt;BR /&gt;&lt;BR /&gt;Trailing-bit manipulation instructions are useful for fast decoding of variable-bit-length codes (see e.g. Gamma codes: &lt;A href="http://nlp.stanford.edu/IR-book/html/htmledition/gamma-codes-1.html" target="_blank"&gt;http://nlp.stanford.edu/IR-book/html/htmledition/gamma-codes-1.html&lt;/A&gt;), where detecting the length of the next bit field is often on the critical path and reducing latency can help significantly. For example, a pair of BLSR and TZCNT can be used together to decode a unary-encoded bit stream.&lt;BR /&gt;&lt;BR /&gt;-Max</description>
    <pubDate>Thu, 18 Aug 2011 15:54:26 GMT</pubDate>
    <dc:creator>Max_L</dc:creator>
    <dc:date>2011-08-18T15:54:26Z</dc:date>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773917#M173</link>
      <description>&lt;P&gt;I have used SSE to work with, for example, 6-bit packed integers. SSE required a lot of heavy lifting through masking, shifting, and storing results into temporary registers prior to doing the arithmetic or logical operation. I've been studying AVX2 instructions looking for a more optimal set of instructions to do this. Has anyone on this forum already looked at optimizing this type of workload? There are a lot of new instructions for manipulating 32-bit and 64-bit data units, but it is not obvious to me at this point how these can help with this 6-bit packed integer problem.&lt;BR /&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2011 22:26:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773917#M173</guid>
      <dc:creator>Grace_Oliver__Intel_</dc:creator>
      <dc:date>2011-08-09T22:26:05Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773918#M174</link>
      <description>Which operations do you need to perform on these 6-bit integers exactly? I imagine the AVX2 vector-vector shift instructions, and gather instructions, could come in quite handy.</description>
      <pubDate>Wed, 10 Aug 2011 06:23:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773918#M174</guid>
      <dc:creator>capens__nicolas</dc:creator>
      <dc:date>2011-08-10T06:23:35Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773919#M175</link>
      <description>Vector-to-vector shifts work on dword and qword element sizes. Perhaps these are useful, but I believe it will still take multiple shifts and masks to get the packed data aligned in order to operate on elements smaller than a dword.&lt;BR /&gt;&lt;BR /&gt;With the larger AVX register size, the number of elements operated on will double, which I expect will increase performance.&lt;BR /&gt;&lt;BR /&gt;Thanks,</description>
      <pubDate>Wed, 10 Aug 2011 17:51:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773919#M175</guid>
      <dc:creator>Grace_Oliver__Intel_</dc:creator>
      <dc:date>2011-08-10T17:51:58Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773920#M176</link>
      <description>Grace,&lt;BR /&gt;&lt;BR /&gt;As asked in the second post, what operations do you intend to perform, and how long are your 6-bit vectors?&lt;BR /&gt;&lt;BR /&gt;Example:&lt;BR /&gt;&lt;BR /&gt;Do you intend to perform only 6-bit vector compares?&lt;BR /&gt;&lt;BR /&gt;Or do you intend to perform addition, subtraction, multiplication, division, rotates, etc.?&lt;BR /&gt;&lt;BR /&gt;Compare for equal could be done relatively easily using pxor (and possibly pand for a partial vector).&lt;BR /&gt;&lt;BR /&gt;For arithmetic, it might be easier to use the GP registers, and almost as fast, since you can handle 64 bits (or 60 bits) at a time.&lt;BR /&gt;&lt;BR /&gt;Stating what you want to do would certainly help us in providing you with advice.&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Fri, 12 Aug 2011 20:55:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773920#M176</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-08-12T20:55:22Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773921#M177</link>
      <description>You can unpack the data with the &lt;A href="http://software.intel.com/en-us/forums/showthread.php?t=83399&amp;amp;o=a&amp;amp;s=lr"&gt;proposed&lt;/A&gt; new &lt;A href="http://software.intel.com/file/36945"&gt;BMI2 command&lt;/A&gt; PDEP from several 6 bit entities to 8 bit ones, do your calculation and repack them with PEXT. These commands act on 32 or 64 bit general purpose registers (GPRs) like EAX or RAX; unfortunately there is no version for MMX/SSE/AVX. The mask to be used for both packing and unpacking will probably be 0x3f3f3f3f (32 bit) or similar.&lt;BR /&gt;I'm not sure whether this is what you want; at least you can test it with the &lt;A href="http://www.intel.com/software/sde"&gt;AVX emulator&lt;/A&gt;.&lt;BR /&gt;&lt;BR /&gt;Alternatively you can do the unpack/pack with some bit magic, see e.g. &lt;A href="http://hackersdelight.org/"&gt;Hacker's Delight&lt;/A&gt; (see the code of &lt;A href="http://hackersdelight.org/HDcode/compress.c.txt"&gt;compress&lt;/A&gt; and the PDF linked on &lt;A href="http://hackersdelight.org/revisions.pdf"&gt;revisions&lt;/A&gt;, figure 7-7 on page 43).&lt;BR /&gt;On &lt;A href="http://programming.sirrida.de/"&gt;my programming pages&lt;/A&gt; you will find similar routines under "bit permutations".&lt;BR /&gt;It should not be too difficult to adapt these routines to MMX/SSE/AVX, but doing so is not necessarily worthwhile. Be aware that the mask is a constant.&lt;BR /&gt;If you work with SSE registers it probably makes sense to unpack every 3 bytes to 4 (i.e. 12 to 16 bytes) via PSHUFB before doing the bit shuffling. For the packing afterwards, do this in the opposite direction.&lt;BR /&gt;&lt;BR /&gt;Is this what you want? Do you need explicit code snippets?</description>
      <pubDate>Tue, 16 Aug 2011 21:17:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773921#M177</guid>
      <dc:creator>sirrida</dc:creator>
      <dc:date>2011-08-16T21:17:02Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773922#M178</link>
      <description>&lt;DIV id="tiny_quote"&gt;
                &lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=520042" class="basic" href="https://community.intel.com/en-us/profile/520042/"&gt;sirrida&lt;/A&gt;&lt;/DIV&gt;
                &lt;DIV style="background-color: #e5e5e5; padding: 5px; border: 1px; border-style: inset; margin-left: 2px; margin-right: 2px;"&gt;&lt;I&gt;You can unpack the data with the &lt;A href="http://software.intel.com/en-us/forums/showthread.php?t=83399&amp;amp;o=a&amp;amp;s=lr"&gt;proposed&lt;/A&gt; new &lt;A href="http://software.intel.com/file/36945"&gt;BMI2 command&lt;/A&gt; PDEP from several 6 bit entities to 8 bit ones, do your calculation and repack them with PEXT.&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Wow, those instructions are fantastic. I never imagined a complex operation like that was even possible in a single pipelined instruction, but after reading up on the 'butterfly' datapath it's actually quite elegant.&lt;/P&gt;&lt;P&gt;I really look forward to CPUs with AVX2 and BMI. Am I correct that Haswell won't support BMI2 yet? PDEP and PEXT are not mentioned in the &lt;A href="http://software.intel.com/en-us/blogs/2011/06/13/haswell-new-instruction-descriptions-now-available/"&gt;Haswell New Instructions blog&lt;/A&gt;. Hopefully it's scheduled for Broadwell then.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Aug 2011 14:40:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773922#M178</guid>
      <dc:creator>capens__nicolas</dc:creator>
      <dc:date>2011-08-18T14:40:00Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773923#M179</link>
      <description>Strangely, PEXT and PDEP are missing from the blog, but &lt;B&gt;all&lt;/B&gt; the other BMI2 commands are mentioned: BZHI, MULX, RORX, SARX, SHLX, SHRX.&lt;BR /&gt;Nobody has reacted to my comment (2011-07-01 12:11) on this, and I still don't have any clue what these lowest-bit manipulation operations (BMI1 / XOP) are useful for.</description>
      <pubDate>Thu, 18 Aug 2011 14:53:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773923#M179</guid>
      <dc:creator>sirrida</dc:creator>
      <dc:date>2011-08-18T14:53:17Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773924#M180</link>
      <description>PDEP/PEXT are indeed part of BMI2 and are planned to be available in the first CPU supporting BMI2.&lt;BR /&gt;&lt;BR /&gt;Trailing-bit manipulation instructions are useful for fast decoding of variable-bit-length codes (see e.g. Gamma codes: &lt;A href="http://nlp.stanford.edu/IR-book/html/htmledition/gamma-codes-1.html" target="_blank"&gt;http://nlp.stanford.edu/IR-book/html/htmledition/gamma-codes-1.html&lt;/A&gt;), where detecting the length of the next bit field is often on the critical path and reducing latency can help significantly. For example, a pair of BLSR and TZCNT can be used together to decode a unary-encoded bit stream.&lt;BR /&gt;&lt;BR /&gt;-Max</description>
      <pubDate>Thu, 18 Aug 2011 15:54:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773924#M180</guid>
      <dc:creator>Max_L</dc:creator>
      <dc:date>2011-08-18T15:54:26Z</dc:date>
    </item>
    <item>
      <title>Converting SSE packed integer handling to AVX</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773925#M181</link>
      <description>Thanks everyone for the references and suggestions. &lt;BR /&gt;Grace</description>
      <pubDate>Fri, 19 Aug 2011 19:02:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/Converting-SSE-packed-integer-handling-to-AVX/m-p/773925#M181</guid>
      <dc:creator>Grace_Oliver__Intel_</dc:creator>
      <dc:date>2011-08-19T19:02:06Z</dc:date>
    </item>
  </channel>
</rss>

