I understand that MD5 is no longer the strongest algorithm and has known attacks. However, I still want to use it, and I found the comment below in the documentation:
"This algorithm is considered weak due to known attacks on it. The functionality remains in the library, but the implementation will no longer be optimized and no security patches will be applied."
Does that mean the MD5 implementation is no longer optimized? Does it mean the optimizations have been removed?
I am using this version: l_ippcp_2019.0.117
I have benchmarked the Intel MD5 implementation against a non-Intel, unoptimized version, and they give the same performance.
Thank you so much for your kind reply and for taking the time. I hope you and your family are safe and sound.
I find that IPP Crypto MD5 is not giving me any performance boost. When I compare IPP Crypto MD5 against my regular, unoptimized MD5 C++ code, the unoptimized code slightly outperforms IPP Crypto MD5 by 20-30 nanoseconds. I was expecting a significant boost from IPP.
Some reference points :
1. I have only installed the IPPCP add-on and have not installed IPP. Will that make any difference?
2. I am using the g++ compiler on Fedora Linux.
3. Hardware: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz, AVX2 enabled.
4. I have tried both of the IPP versions below: compilers_and_libraries_2019.5.281 and compilers_and_libraries_2019.0.117.
5. Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
ippCP AVX2 (l9) 2019.0.0 (r0x438f167)
Features supported by CPU by Intel(R) Integrated Performance Primitives Cryptography
ippCPUID_MMX = Y Y Intel(R) Architecture MMX technology supported
ippCPUID_SSE = Y Y Intel(R) Streaming SIMD Extensions
ippCPUID_SSE2 = Y Y Intel(R) Streaming SIMD Extensions 2
ippCPUID_SSE3 = Y Y Intel(R) Streaming SIMD Extensions 3
ippCPUID_SSSE3 = Y Y Supplemental Streaming SIMD Extensions 3
ippCPUID_MOVBE = Y Y The processor supports MOVBE instruction
ippCPUID_SSE41 = Y Y Intel(R) Streaming SIMD Extensions 4.1
ippCPUID_SSE42 = Y Y Intel(R) Streaming SIMD Extensions 4.2
ippCPUID_AVX = Y Y Intel(R) Advanced Vector Extensions (Intel(R) AVX) instruction set
ippAVX_ENABLEDBYOS = Y Y The operating system supports Intel(R) AVX
ippCPUID_AES = Y Y Intel(R) AES instruction
ippCPUID_SHA = N N Intel(R) SHA new instructions
ippCPUID_CLMUL = Y Y PCLMULQDQ instruction
ippCPUID_RDRAND = Y Y Read Random Number instructions
ippCPUID_F16C = Y Y Float16 instructions
ippCPUID_AVX2 = Y Y Intel(R) Advanced Vector Extensions 2 instruction set
ippCPUID_AVX512F = N N Intel(R) Advanced Vector Extensions 512 Foundation instruction set
ippCPUID_AVX512CD = N N Intel(R) Advanced Vector Extensions 512 Conflict Detection instruction set
ippCPUID_AVX512ER = N N Intel(R) Advanced Vector Extensions 512 Exponential & Reciprocal instruction set
ippCPUID_ADCOX = Y Y ADCX and ADOX instructions
ippCPUID_RDSEED = Y Y The RDSEED instruction
ippCPUID_PREFETCHW = Y Y The PREFETCHW instruction
ippCPUID_KNC = N N Intel(R) Xeon Phi(TM) Coprocessor instruction set
Code I wrote to use Intel IPP MD5:
1. Using ippsMD5MessageDigest:
static Ipp8u digest[16]; // an MD5 digest is 16 bytes
ippsMD5MessageDigest((const Ipp8u *)data1, size, digest);
2. Using ippsHashMessage:
static Ipp8u digest2[16]; // an MD5 digest is 16 bytes
ippsHashMessage((const Ipp8u *)data1, size, digest2, IPP_ALG_HASH_MD5);
3. Using ippsMD5Update & ippsMD5Final:
However, there is no difference in performance between 1 and 2. Option 3 took 100 nanoseconds more than 1 and 2.
I assume the code does automatic CPU dispatching and that I don't have to explicitly initialize anything, since the library will initialize itself during the first call.
Compiling and linking: compiled with the dynamic libraries.
Linked with all architecture-specific libraries: -lippcp -lippcpe9 -lippcpk0 -lippcpl9 -lippcpm7 -lippcpn0 -lippcpn8 -lippcpy8
g++ -O3 compare_md5.cpp md5.cpp -fomit-frame-pointer -mavx2 -o compare_md5 -I /home/user9/intel/compilers_and_libraries_2019.5.281/linux/ippcp/include/ -L /home/user9/intel/compilers_and_libraries_2019.5.281/linux/ippcp/lib/intel64 -lippcp -lippcpe9 -lippcpk0 -lippcpl9 -lippcpm7 -lippcpn0 -lippcpn8 -lippcpy8 -pthread
Could you please advise if I am missing anything?
>> 1. I have only installed IPPCP addon and not installed IPP, will that make any difference?
<< No, it will not make any difference.
>> "an un-optimized md5 c++ code."
<< Is that open source code or in-house private code? If it is open source, could you share the link?
>> I will forward the questions to the IPP Crypto experts to look at.
I've run a standard IPP build and found the following:
ippsMD5MessageDigest - 4.32 cycles/byte
ippsHashMessage - 4.33 cycles/byte
ippsMD5MessageDigest_rmf - 4.38 cycles/byte
ippsMD5Update - 4.02 cycles/byte
ippsHashUpdate - 4.02 cycles/byte
The data above were obtained with a 1024-byte payload on a 2.6 GHz CPU, using the "l9" code.
I think the ippcpInit() call should come before the first IPP processing function.
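One way to keep any one-time initialization or dispatch cost out of a measurement is to make an untimed warm-up call before the timed loop. The following is only a minimal sketch using std::chrono: `average_ns_per_call` and `placeholder_hash` are hypothetical names, and `placeholder_hash` is a simple FNV-1a checksum standing in for whichever MD5 routine is being measured. It is not MD5 and not an IPP call.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Average wall-clock nanoseconds per call of `fn`, measured over `iterations`
// calls. One untimed warm-up call runs first so that any one-time setup
// (for example library initialization or CPU dispatch) is not counted.
template <typename Fn>
double average_ns_per_call(Fn&& fn, std::size_t iterations) {
    fn();  // warm-up call, deliberately excluded from the measurement
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Hypothetical stand-in for the routine under test (NOT MD5, NOT an IPP call):
// a 32-bit FNV-1a checksum, used only to give the harness real work to time.
std::uint32_t placeholder_hash(const unsigned char* data, std::size_t size) {
    std::uint32_t h = 2166136261u;  // FNV-1a 32-bit offset basis
    for (std::size_t i = 0; i < size; ++i) {
        h = (h ^ data[i]) * 16777619u;  // FNV-1a 32-bit prime
    }
    return h;
}
```

In use, the callable would wrap the real hashing call on a 300-byte buffer and store its result in a `volatile` variable so the compiler cannot remove the work. Averaging over many iterations (a million, say) reduces timer noise at the sub-microsecond scale.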
1. What is _rmf for?
2. I now call ippcpInit() in my code, but performance is unchanged. Is calling ippcpInit() required? I think I read that even if we don't call ippcpInit(), the first call to the API does the same thing that ippcpInit() does. Please correct me if I am wrong.
3. Considering 4.02 cycles/byte, how many nanoseconds should 300 bytes take?
As per your hardware at 2.6 GHz:
1 cycle = 2.6 nanoseconds
4.02 cycles/byte
1 byte = 10.452 nanoseconds
300 bytes = 3135.6 nanoseconds // I am getting 580 nanoseconds for 300 bytes using the Intel ippsHashMessage API.
4. Could you please review whether there is any mistake in the above calculation?
5. How can I find the cycles/byte on my system? Are these stats obtained using ps_ippcp? What command-line parameters should I use to get cycles/byte for these MD5 APIs?
It means that Intel will not deliver further optimizations to the code.
The patching cycle for it will be stopped.
So you will have to find vulnerabilities in the code and patch them yourself.