Hello,
I can't find a way to cast a __m256i variable to integer!
Any ideas?
Thanks!
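One common way to read integers out of a __m256i is to store the vector to an ordinary array and index it. The sketch below assumes the vector holds eight 32-bit integers and that the code runs on an x86 CPU with AVX. The function names lane_of and first_lane and the TARGET_AVX macro are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <immintrin.h>

/* Lets GCC/Clang compile this file without -mavx; redundant when -mavx is given. */
#if defined(__GNUC__)
#define TARGET_AVX __attribute__((target("avx")))
#else
#define TARGET_AVX
#endif

/* Returns one 32-bit lane (0..7) of a 256-bit integer vector.
   The vector is loaded from in[] here only to keep the example callable;
   in real code it would be the result of earlier vector arithmetic. */
TARGET_AVX
int32_t lane_of(const int32_t in[8], int lane)
{
    assert(lane >= 0 && lane < 8);
    __m256i v = _mm256_loadu_si256((const __m256i *)in);

    int32_t tmp[8];
    _mm256_storeu_si256((__m256i *)tmp, v);   /* spill all eight lanes to memory */
    return tmp[lane];
}

/* Returns lane 0 only, without going through memory. */
TARGET_AVX
int32_t first_lane(const int32_t in[8])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* The cast keeps the low 128 bits; cvtsi128_si32 reads their low 32 bits. */
    return _mm_cvtsi128_si32(_mm256_castsi256_si128(v));
}
```

A __m256i cannot be converted with a plain C cast because it holds several integers at once, so the code has to choose which lane to read.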
Ok, now it is clear.
So, when we use intrinsics, we don't get a report that states 'vectorized' because the code is already vectorized.
Now, about the hint you gave me.
You say to determine whether any floats in the vector are less than D. Do you mean the floats in TempD?
Because we must compare:
if ( TempD[ j ] < D )
But then TempD is a float array and D is a vector (if I change it to a vector, as you said above).
If I make TempD a vector, I must also change:
_mm256_store_ps( TempD, _mm256_add_ps(...
Also, I have found a lot of comparison commands and I don't know which one to use! (Assume 256 or 512 bits for all of these.)
You mention cmpeq, but I can't find anything useful.
Regarding gmin for finding the minimum: I haven't found it anywhere. Instead, I am finding various min commands, such as _mm256_min_ps.
Thank you!
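For the 256-bit case, the comparison "TempD[ j ] < D" for all eight lanes at once can be sketched with _mm256_cmp_ps and the predicate _CMP_LT_OQ, followed by _mm256_movemask_ps to turn the result into an ordinary integer bitmask. _mm256_min_ps only takes the lane-wise minimum of two vectors, so finding the single smallest float in one vector takes a few extra steps. This is a minimal sketch that assumes an x86 CPU with AVX. The names lanes_below, min_of_lanes and TARGET_AVX are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <immintrin.h>

/* Lets GCC/Clang compile this file without -mavx; redundant when -mavx is given. */
#if defined(__GNUC__)
#define TARGET_AVX __attribute__((target("avx")))
#else
#define TARGET_AVX
#endif

/* Returns an 8-bit mask: bit j is set when tempD[j] < d. */
TARGET_AVX
int lanes_below(const float tempD[8], float d)
{
    assert(tempD != NULL);
    __m256 vSum = _mm256_loadu_ps(tempD);
    __m256 vD   = _mm256_set1_ps(d);                   /* d copied into all 8 lanes */
    __m256 lt   = _mm256_cmp_ps(vSum, vD, _CMP_LT_OQ); /* all-ones where vSum < vD */
    return _mm256_movemask_ps(lt);                     /* one sign bit per lane */
}

/* Returns the smallest of the eight floats in x[]. */
TARGET_AVX
float min_of_lanes(const float x[8])
{
    assert(x != NULL);
    __m256 v  = _mm256_loadu_ps(x);
    __m128 lo = _mm256_castps256_ps128(v);        /* lanes 0..3 */
    __m128 hi = _mm256_extractf128_ps(v, 1);      /* lanes 4..7 */
    __m128 m  = _mm_min_ps(lo, hi);               /* 4 candidates left */
    m = _mm_min_ps(m, _mm_movehl_ps(m, m));       /* 2 candidates in lanes 0 and 1 */
    m = _mm_min_ss(m, _mm_shuffle_ps(m, m, 1));   /* lane 0 now holds the minimum */
    return _mm_cvtss_f32(m);
}
```

A mask of zero from lanes_below means no lane beat the current minimum, so the more expensive minimum search can be skipped for that vector.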
Note that I said "Starting with a vector of FLT_MAX (an __m512, say D)"
This means you change "float D" into "__m512 D". Initially it holds FLT_MAX in every lane, but as the loop(s) progress it is updated to hold the minimum "sum of the squares", again in every lane. The sums no longer require a memory array (TempD is removed from memory); they can now reside in a register.
As you compute each next vector of "sum of the squares" (8 or 16 floats at a time), test it against the vector D to see whether any of its values are lower than the value stored in D. If any are, then: obtain the new minimum; update D (all 8/16 floats) to the new minimum; locate the position in the "sum of squares" vector of the first float holding the new minimum; and, instead of fetching a new value from V[] into T, remember the information necessary to construct the index into V[] (this is the inner loop index and the __mmask16 bitmask).
After the end of the inner loop, use the remembered loop index (of the vector holding the minimum value) together with the mask of positions in that "sum of squares" vector holding the minimum value to reconstruct the index into V[], then perform the fetch into T outside the inner loop.
It might help if you rename your variables to be representative of their purpose.
Consider changing "D" to "vectorOfMinimum", and naming the result of the sum of squares (formerly TempD) "vectorOfSumOfSquares".
Things like that. Although you type more letters in the variable names, this removes the need for most (or all) of the words in the comments that would otherwise belong on those statements.
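The scheme described above might look roughly like the following sketch. To keep it runnable on more machines it uses 8-lane AVX (__m256 and an integer mask from _mm256_movemask_ps) instead of 16-lane __m512 and __mmask16. The original loop is not shown in this thread, so the data layout is an assumption: x[] and y[] are hypothetical coordinate arrays, (qx, qy) is a hypothetical query point, and the "sum of squares" is the squared distance to it. The sketch also assumes n is a multiple of 8 and all inputs are finite.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>
#include <immintrin.h>

/* Lets GCC/Clang compile this file without -mavx; redundant when -mavx is given. */
#if defined(__GNUC__)
#define TARGET_AVX __attribute__((target("avx")))
#else
#define TARGET_AVX
#endif

/* Returns the index i that minimises (x[i]-qx)^2 + (y[i]-qy)^2. */
TARGET_AVX
size_t nearest_index(const float *x, const float *y, size_t n, float qx, float qy)
{
    assert(n > 0 && n % 8 == 0);
    __m256 vectorOfMinimum = _mm256_set1_ps(FLT_MAX);  /* running minimum in every lane */
    const __m256 vqx = _mm256_set1_ps(qx);
    const __m256 vqy = _mm256_set1_ps(qy);
    size_t bestChunk = 0;   /* loop index of the vector holding the minimum */
    int bestMask = 1;       /* lanes of that vector equal to the minimum */

    for (size_t i = 0; i < n; i += 8) {
        __m256 dx = _mm256_sub_ps(_mm256_loadu_ps(x + i), vqx);
        __m256 dy = _mm256_sub_ps(_mm256_loadu_ps(y + i), vqy);
        __m256 vectorOfSumOfSquares =
            _mm256_add_ps(_mm256_mul_ps(dx, dx), _mm256_mul_ps(dy, dy));

        /* Common case: no lane beats the running minimum, so move on. */
        __m256 lt = _mm256_cmp_ps(vectorOfSumOfSquares, vectorOfMinimum, _CMP_LT_OQ);
        if (_mm256_movemask_ps(lt) == 0)
            continue;

        /* Reduce to the smallest lane and leave it copied into all 8 lanes. */
        __m256 m = vectorOfSumOfSquares;
        m = _mm256_min_ps(m, _mm256_permute2f128_ps(m, m, 1)); /* swap 128-bit halves */
        m = _mm256_min_ps(m, _mm256_shuffle_ps(m, m, 0x4E));   /* swap pairs */
        m = _mm256_min_ps(m, _mm256_shuffle_ps(m, m, 0xB1));   /* swap neighbours */
        vectorOfMinimum = m;

        /* Remember where the minimum is instead of fetching data now. */
        bestChunk = i;
        bestMask = _mm256_movemask_ps(
            _mm256_cmp_ps(vectorOfSumOfSquares, vectorOfMinimum, _CMP_EQ_OQ));
    }

    /* After the loop: first set bit of the mask gives the lane. */
    size_t lane = 0;
    while (lane < 7 && ((bestMask >> lane) & 1) == 0)
        lane++;
    return bestChunk + lane;
}
```

The returned index plays the role of the remembered position, so the fetch into T happens once, after the loop, rather than every time a new minimum appears.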
Jim Dempsey
>>You mention cmpeq but I can't find something useful....Regarding the gmin to find the minimum
My response #21 clearly indicated that it targets MIC (__m512); however, this is actually the Knights Corner (KNC) subset of AVX-512.
Save this link: https://software.intel.com/sites/landingpage/IntrinsicsGuide/
Open the link and you will find the Intrinsics guide. If you check the technologies "
Jim Dempsey
RE: ...cmpeq...
-----
__mmask16 _mm512_cmpeq_ps_mask (__m512 a, __m512 b)
#include "zmmintrin.h"
Instruction: vcmpps k {k}, zmm, zmm, imm
CPUID Flags: AVX512F for AVX-512, KNCNI for KNC
Description
Compare packed single-precision (32-bit) floating-point elements in a and b for equality, and store the results in mask vector k.
Operation
FOR j := 0 to 15
    i := j*32
    k[j] := ( a[i+31:i] == b[i+31:i] ) ? 1 : 0
ENDFOR
k[MAX:16] := 0
Ok, thanks for the help.
I will try, but this is a little overhelming for me.
Thanks!
>>little overhelming for me...
Sounds like a nautical term (helm is where you control the boat/ship), but overhelming might be apropos in this case (over controlling). ;)
Jim Dempsey
Hehe.. :)
Missed a "w".
oh i don't know