Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

AES-NI support for Westmere

Philip_Gladstone
Beginner
492 Views
Which processor type (as far as IPP is concerned) is the one that supports the AES-NI optimizations for AES? The code is very fast on Westmere, but is rather slower on older processors than the Crypto++ implementation. Thus I'd like to know which of the processor specific IPP libraries I should link to just get the AES-NI implementation, and then I'll use the Crypto++ implementation for the other platforms.

Thanks
0 Kudos
7 Replies
Tamer_Assad
Innovator
Hi Philip,

Please check these articles; I hope they are helpful.

1- Understanding CPU optimized code used in Intel IPP
URL: http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-understanding-cpu-optimized-code-used-in-intel-ipp/

2- Is there any function to detect processor type from Intel IPP?
URL: http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-is-there-any-function-to-detect-processor-type/

Regards,
Tamer
Philip_Gladstone
Beginner
Unfortunately, these articles do not describe how to map from ippGetCpuFeatures to the IPP suffixes. From reading another message in the forum, I am not the only person with this problem.
Let me be clear: I'm looking for the function that takes a set of CPU features and returns the two-character machine suffix used in IPP. Some of the suffixes are p8, px, t7, v8, and w7.

Thanks

Philip
Tamer_Assad
Innovator
Hi Philip,

Please check the IPP user guide (userguide_win_ia32.pdf), section "Processor Type and Features".
The "Processor Type" sub-section describes the ippGetCpuType() function, along with the associated table, "Table 5-3 Detecting processor type. Returned values and their meaning".
This function returns the processor type.

With some additional work, you can match the processor type returned by this function to the appropriate two-character machine suffix.

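ex:
//////////////////////////////////////////////////////////////////////////
switch (ippGetCpuType())
{
case ippCpuNehalem: // Intel Core i7 processor
    suffix = "p8"; // suffix is a const char *; p8 is the IA-32 library
    break;
default: // map the remaining processor types the same way
    break;
}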


Regards,
Tamer

Ying_S_Intel
Employee
Hi Philip,

In the recent versions, Intel IPP 6.1 updates 2 and 3, the p8 (for IA-32) and y8 (for Intel 64) libraries also include AES-NI optimizations for Westmere. If you run an application built with IPP 6.1 update 2 or higher on Westmere, the p8/y8 library will load the code optimized for Westmere (WSM).


These functions in the Intel IPP cryptography libraries are optimized for Westmere; they are available in IPP 6.1 updates 2 and 3:

ippsRijndael128{Encrypt|Decrypt}{ECB|CBC|CFB|OFB|CTR}
ippsRijndael128CCM{Encrypt|Decrypt}
ippsRijndael128GCMProcess{IV|AAD}

ippsRijndael192{Encrypt|Decrypt}{ECB|CBC|CFB|OFB|CTR}
ippsRijndael256{Encrypt|Decrypt}{ECB|CBC|CFB|OFB|CTR}

ippsDAARijndael128Update, ippsDAARijndael128Final
ippsDAARijndael192Update, ippsDAARijndael192Final
ippsDAARijndael256Update, ippsDAARijndael256Final

ippsXCBCRijndael128Update, ippsXCBCRijndael128Final

These functions in the Intel IPP data compression library are also optimized for Westmere; they are available in IPP 6.1 update 3:

ippsCRC32_8u
ippsCRC32_BZ2_8u

That means you may need to update to the latest IPP version to benefit from the IPP optimizations on Westmere. You can also run the cpuinfo sample from the ipp-samples\advanced-usage\cpuinfo folder (the samples are provided along with IPP 6.1 update 3) on Westmere to confirm that the p8 or y8 code is recommended in the output.

If you notice slower performance on older processors, please send us more details, and we will look into it.


Hope it helps.
Thanks,
Ying

matthieu_darbois
New Contributor III
Hi,
I posted a comment some time ago on this article: http://software.intel.com/en-us/articles/new-nehalem-support/
Since this thread is not that different, I'll take another shot at getting an answer.

It seems that Penryn (SSE4.1), Nehalem (SSE4.2), and now Westmere (AES-NI) share the same library suffix. If I'm not mistaken, there used to be one suffix per instruction set: px (IA-32), w7 (SSE2), t7 (SSE3), v8 (SSSE3)...
Why isn't that the case for these new instruction sets? Is the number of affected functions so small that separate libraries weren't considered worthwhile?

Regards,
Matthieu
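
For readers looking for the feature-to-suffix function requested earlier in the thread, the mapping listed in this post can be sketched as below. This is only an illustration under stated assumptions: the FEAT_* flags and ia32_suffix_for() are hypothetical names invented for the sketch, not IPP constants or IPP API, and the table covers only the IA-32 suffixes mentioned in this thread. It also assumes features are cumulative (a CPU with SSSE3 also has SSE3 and SSE2).

```c
/* Hypothetical feature flags for this sketch; these are NOT IPP constants. */
enum cpu_feature {
    FEAT_SSE2  = 1u << 0,
    FEAT_SSE3  = 1u << 1,
    FEAT_SSSE3 = 1u << 2,
    FEAT_SSE41 = 1u << 3,
    FEAT_SSE42 = 1u << 4,
    FEAT_AESNI = 1u << 5
};

/*
 * Return the IA-32 library suffix for a feature mask, using the mapping
 * quoted in this thread: px (generic), w7 (SSE2), t7 (SSE3), v8 (SSSE3),
 * p8 (SSE4.1 and later). The newest instruction set is checked first.
 */
const char *ia32_suffix_for(unsigned features)
{
    if (features & FEAT_SSE41) return "p8"; /* Penryn, Nehalem, Westmere */
    if (features & FEAT_SSSE3) return "v8";
    if (features & FEAT_SSE3)  return "t7";
    if (features & FEAT_SSE2)  return "w7";
    return "px";                            /* generic IA-32 code */
}
```

Note that FEAT_SSE42 and FEAT_AESNI never change the result: in this mapping, every CPU from SSE4.1 onward resolves to p8, which is exactly the sharing this post asks about.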

Vladimir_Dudnik
Employee
Hi Matthieu,

You are exactly right. The new AES instructions introduced in the new generation of Core i7 processors (code-named Westmere) mostly benefit some crypto algorithms and a small subset of data compression algorithms. We therefore decided not to increase the IPP package size by adding a new set of libraries when only a limited number of functions can take advantage of these instructions. Instead, we extended the IPP run-time dispatcher so that it checks for AES new instruction support and branches to specifically optimized code inside the library optimized for the Core i7 processor family.

Regards,
Vladimir
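
The kind of run-time check described above can be illustrated with a small sketch. It is not IPP's dispatcher: all function names here are hypothetical, the two backends are placeholders, and the detection path assumes GCC or Clang on x86 (the <cpuid.h> header and its __get_cpuid helper). The one hardware fact it relies on is that CPUID leaf 1 reports AES-NI support in ECX bit 25.

```c
#include <stdbool.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>          /* GCC/Clang helper for the CPUID instruction */
#define HAVE_CPUID_H 1
#endif

/* CPUID leaf 1 reports AES-NI support in ECX bit 25. */
#define CPUID_ECX_AESNI (1u << 25)

/* Pure helper: test the AES-NI bit in a CPUID.1:ECX value. */
bool ecx_has_aesni(unsigned ecx)
{
    return (ecx & CPUID_ECX_AESNI) != 0;
}

/* Query the running CPU; reports false where CPUID is unavailable. */
bool cpu_has_aesni(void)
{
#ifdef HAVE_CPUID_H
    unsigned eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx_has_aesni(ecx);
#endif
    return false;
}

/* Placeholder backends; a real program would call its AES code here. */
typedef const char *(*aes_backend_fn)(void);
static const char *backend_aesni(void)   { return "aes-ni path"; }
static const char *backend_generic(void) { return "generic path"; }

/* Pick a backend once at start-up, then call through the pointer. */
aes_backend_fn select_aes_backend(bool has_aesni)
{
    return has_aesni ? backend_aesni : backend_generic;
}
```

An application could call select_aes_backend(cpu_has_aesni()) once at start-up. For the original question, the same test could be used to route AES work to IPP on AES-NI hardware and to Crypto++ elsewhere.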
PaulF_IntelCorp
Employee
All:

I have updated this article:

Understanding CPU Dispatching in the Intel IPP Library

describing the dispatching feature of the library so that, I hope, it makes more sense. If there are still questions please leave a comment and we can expand this article further.

Paul
