What's the fastest way to check two unsigned char arrays of indeterminate size for equality in C++?
I'm using Visual Studio 2017 and Intel Compiler 2017.
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
Thanks. How am I supposed to know which header to include for these things?
Hi Commander Lake,
You could use _mm256_mpsadbw_epu8 to compute sums of absolute differences over eight quadruplets of bytes, then use _mm256_testz_si256 to check whether every element of the result is zero. If any element is nonzero, the arrays are not the same.
However, in my experience this kind of code depends almost entirely on memory access performance, so try the memory streaming operations for better performance (I did get some performance boost this way). Secondly, the Visual Studio 2017 compiler is very efficient at vectorizing this type of code, so check the disassembly of your non-vectorized code; it may already be vectorized for you.
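A plain baseline to compile and inspect in the disassembly might look like the sketch below. The function names are hypothetical (they are not from this thread), and whether the loop is actually vectorized depends on the compiler and its optimization settings.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: straightforward byte loop with an early exit.
// Compile with optimizations on and look at the generated code to see
// whether the compiler vectorized it.
bool arrays_equal_loop(const unsigned char* a, const unsigned char* b, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) {
            return false;  // first mismatch ends the comparison
        }
    }
    return true;
}

// Hypothetical helper: the same check through the C library, which is
// a useful reference point when timing hand-written variants.
bool arrays_equal_memcmp(const unsigned char* a, const unsigned char* b, std::size_t n) {
    // Guard n == 0 so that null pointers are never passed to memcmp.
    return n == 0 || std::memcmp(a, b, n) == 0;
}
```

Timing both against any intrinsic version on realistic array sizes is probably the only reliable way to tell which one wins on a given machine.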
You could also try the string comparison instructions listed in the intrinsics guide, but I do not know how they perform, and remember that they only have 128-bit variants.
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=3643,5519&cats=String%252520Compare.
- Anil
Edit: I realize you could just load 256-bit integers and compare them directly instead of using the sum-of-absolute-differences instructions.
The comparison and zero-checking instructions together take around 7-10 cycles, whereas the string compare instructions take 10-15 cycles even though they work on only 128 bits of data, as seen in the instruction tables in
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
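The approach from the edit above (load 256 bits from each array, compare, check for zero) could be sketched roughly as follows. This is a minimal sketch, not a tuned implementation: the function name is hypothetical, the XOR plus _mm256_testz_si256 pairing is one of several ways to do the compare, and the AVX2 path is only compiled when the compiler defines `__AVX2__` (for example with /arch:AVX2 or -mavx2). Otherwise the whole comparison falls through to memcmp. The intrinsics used here are declared in `<immintrin.h>`.

```cpp
#include <cstddef>
#include <cstring>
#if defined(__AVX2__)
#include <immintrin.h>  // declares the AVX/AVX2 intrinsics used below
#endif

// Hypothetical helper name; not taken from the thread.
bool arrays_equal_avx2(const unsigned char* a, const unsigned char* b, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX2__)
    // Compare 32 bytes per iteration using unaligned loads.
    for (; i + 32 <= n; i += 32) {
        const __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        const __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        // XOR is all zero exactly when the two 32-byte blocks are identical.
        const __m256i diff = _mm256_xor_si256(va, vb);
        // _mm256_testz_si256(x, x) returns 1 when x is all zero.
        if (!_mm256_testz_si256(diff, diff)) {
            return false;
        }
    }
#endif
    // Remaining tail (or everything, when AVX2 is not enabled).
    return i == n || std::memcmp(a + i, b + i, n - i) == 0;
}
```

A call such as `arrays_equal_avx2(p, q, len)` returns true only when the first `len` bytes match. Since the loop is likely limited by memory bandwidth for large arrays, as noted above, it is worth measuring against plain memcmp before adopting it.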