Beginner
92 Views

SGX poor encryption/seal performance

Hi,

I have been developing a bit with the SDK you have released for Linux. First, thank you for this nice tool and for the rapid support you provide.

I am quite surprised by the encryption/seal performance. Using an Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz, whether I build with SGX_MODE=HW or SW, I get quite poor encryption/decryption and seal/unseal performance:

- 1024 calls to sgx_rijndael128GCM_encrypt + 1024 calls to sgx_rijndael128GCM_decrypt for an 8192-byte buffer -> 1s

- 1024 calls to sgx_seal_data + 1024 calls to sgx_unseal_data for an 8192-byte buffer -> 1s

This gives roughly 128 Mbit/s for encryption/decryption. With openssl I get around 28 Gbit/s for 8192-byte buffers with the command:

openssl speed -evp aes-128-gcm
...
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-gcm     437023.89k   994503.79k  1871737.26k  2845251.24k  3507497.64k

The performance loss is almost a factor of 200! Is there a way to get better encryption/decryption performance?
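For reference, the arithmetic behind these figures can be sketched in C. This is only a back-of-the-envelope check, not part of the SDK: the helper names are made up for illustration, and it assumes that each of the 2048 calls (1024 encrypt + 1024 decrypt) processes the full 8192-byte buffer.

```c
#include <stdint.h>

/* Bits per second processed when `calls` operations, each over `buf_bytes`
 * bytes, complete in `seconds`. (Hypothetical helper, for illustration.) */
double throughput_bits_per_sec(uint64_t calls, uint64_t buf_bytes, double seconds)
{
    return (double)(calls * buf_bytes) * 8.0 / seconds;
}

/* Converts an `openssl speed` figure, which is reported in 1000s of bytes
 * per second, to bits per second. (Hypothetical helper, for illustration.) */
double openssl_k_to_bits_per_sec(double thousands_of_bytes_per_sec)
{
    return thousands_of_bytes_per_sec * 1000.0 * 8.0;
}
```

With the numbers above, throughput_bits_per_sec(2048, 8192, 1.0) gives 134217728 bits/s (about 134 Mbit/s, or 128 Mibit/s), and openssl_k_to_bits_per_sec(3507497.64) gives about 28.06 Gbit/s, so the ratio is a little over 209.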

Thanks in advance,

Carlos

0 Kudos

Accepted Solutions
Highlighted
Employee
92 Views

I don't know whether you noticed this, but the 1.7 release recently posted allows you to download and build using the optimized IPP crypto library. If you use this version you should get similar results to what I provided before.

0 Kudos
18 Replies
Highlighted
92 Views

Are you using the prebuilt binaries downloaded from 01.org or did you build them from the open source github repo?

Can you copy+paste the main test loop you used?

Thanks

0 Kudos
Highlighted
Beginner
92 Views

Hi,

I have built them from the open source github repo.

This is the loop for AES-GCM:

      for (unsigned int i = 0 ; i < 1<<num; i++)
      {
        // Encrypt/Decrypt
        res = sgx_rijndael128GCM_encrypt(&aes_key, plaintext, plaintext_len, 
            (uint8_t*)encrypted+4/*skip bytes reserved for plaintext size*/, iv, 12, NULL, 0, &mac);
        if (res != SGX_SUCCESS) { printf("Encryption error!\n"); }
        res = sgx_rijndael128GCM_decrypt(&aes_key, (uint8_t*)encrypted+4, plaintext_len, plaintext, 
            iv, 12, NULL, 0, &mac);
        if (res != SGX_SUCCESS) { printf("Decryption error!\n"); }

        // Store mac after the gcm encryption
        memcpy(encrypted+4+aesgcm_len, &mac, sizeof(sgx_aes_gcm_128bit_tag_t));
      }

This is the loop for sealing/unsealing:

      for (unsigned int i = 0 ; i < 1<<num; i++)
      {
        res = sgx_seal_data(0, NULL, plaintext_len, plaintext, ciph_size, 
            (sgx_sealed_data_t *)sealed);
        if (res != SGX_SUCCESS) { printf("Seal error! %d\n", ciph_size); }
        res = sgx_unseal_data((sgx_sealed_data_t *)sealed, NULL, NULL, plaintext, &plain_size);
        if (res != SGX_SUCCESS) { printf("Unseal error!\n"); }
      }

 

And this is the whole function:

void speedtest(int num, int plaintext_len, int seal)
{
    sgx_status_t res;
    
    // Allocate space for the plaintext and set it to a non null value
    uint8_t *plaintext = (uint8_t*) malloc(plaintext_len);
    memset(plaintext, 1, plaintext_len);

    if (!seal)
    {
      // Allocate space for encryption
      size_t aesgcm_len = 4 /*plaintext_length*/ 
        + ceil( ( ( (double) plaintext_len ) / 16) ) * 16 /*block enc*/
        + 16/*gmac*/;
      uint8_t* encrypted = (uint8_t*) malloc(aesgcm_len);
      sgx_aes_gcm_128bit_tag_t mac;
      
      // Init encryption key and IV
      sgx_aes_gcm_128bit_key_t aes_key; 
      sgx_read_rand((unsigned char *) &aes_key, sizeof(sgx_aes_gcm_128bit_key_t));
      uint8_t iv[12];
      memset(iv, 0, 12);
      
      // Store plaintext_len as in int in the first 4 bytes of the encryption
      ((int*)encrypted)[0]=plaintext_len;

      // Do the test loop
      for (unsigned int i = 0 ; i < 1<<num; i++)
      {
        // Encrypt/Decrypt
        res = sgx_rijndael128GCM_encrypt(&aes_key, plaintext, plaintext_len, 
            (uint8_t*)encrypted+4/*skip bytes reserved for plaintext size*/, iv, 12, NULL, 0, &mac);
        if (res != SGX_SUCCESS) { printf("Encryption error!\n"); }
        res = sgx_rijndael128GCM_decrypt(&aes_key, (uint8_t*)encrypted+4, plaintext_len, plaintext, 
            iv, 12, NULL, 0, &mac);
        if (res != SGX_SUCCESS) { printf("Decryption error!\n"); }

        // Store mac after the gcm encryption
        memcpy(encrypted+4+aesgcm_len, &mac, sizeof(sgx_aes_gcm_128bit_tag_t));
      }

      free(encrypted);
    }
    else 
    {
      // Allocate space for sealing
      uint32_t ciph_size = sgx_calc_sealed_data_size(0, plaintext_len);
      uint8_t* sealed = (uint8_t*) malloc(ciph_size);
      uint32_t plain_size;

      // Do the test loop
      for (unsigned int i = 0 ; i < 1<<num; i++)
      {
        res = sgx_seal_data(0, NULL, plaintext_len, plaintext, ciph_size, 
            (sgx_sealed_data_t *)sealed);
        if (res != SGX_SUCCESS) { printf("Seal error! %d\n", ciph_size); }
        res = sgx_unseal_data((sgx_sealed_data_t *)sealed, NULL, NULL, plaintext, &plain_size);
        if (res != SGX_SUCCESS) { printf("Unseal error!\n"); }
      }

      free(sealed);
    }

    free(plaintext);
}

Thank you !

 

0 Kudos
Highlighted
92 Views

I believe that the memcpy() on line 39 of the snippet you pasted exceeds the bounds of the memory you malloc'ed.

If you are trying to measure if there's a performance difference on a specific piece of code when run inside an enclave, you can try replicating the exact same code inside the enclave and outside the enclave.

When you built the binaries from the open-source github repo, which crypto library did you use? Can you use that same crypto library outside the enclave too? What are your results if you use the prebuilt binaries from 01.org?

 

0 Kudos
Highlighted
Beginner
92 Views

Dear Francisco,

if you look at the allocation done on the whole function you'll see I took into account the space needed for the mac. In any case in this speed test the memcpy is not needed and I removed it just in case :)

I am unable to use the binaries. When I uninstall the driver/psw/sdk that I compiled from source and use the binary driver/psw/sdk installer, my apps fail with libprotobuf.so.8 not found, and if I create a fake link (to libprotobuf.so.9, which is what gets installed on Ubuntu) I get a symbol not found error. Of course all the needed dependencies are installed, as I needed them to compile the driver/psw/sdk from source a few weeks ago.

The problem probably comes from the fact I have a 16.04 Ubuntu but I am unable to change the OS (shared computer).

Also when I try to run the code outside the Enclave it won't work at all because it has tons of dependencies and linking issues ...

The crypto library used is sgx_tcrypto which is the default in the makefile. Can I change that ? with what ?

Can you run the code I gave above with your source/binary installation ? The required headers are :

#include <math.h>       
#include "sgx_trts.h"
#include "sgx_tseal.h"
#include "sgx_tcrypto.h"

The call should be speedtest(10 /* log of # of tests*/, 8192, 0) for rijndael and speedtest(10, 8192, 1) for seal. On my NUC I get 1s for each test.

Thx for your help !

Carlos

0 Kudos
Highlighted
Beginner
92 Views

Hi there,

is it possible for anyone to confirm/deny the very low encryption performance ? If the data processed is private it must be decrypted from hard drive (or network) ... but at such a decryption throughput big data processing is completely out of reach.

I am planning on releasing a note on the IACR (International Association for Cryptologic Research) eprints on this issue, but I would like to know if you guys disagree before...

Carlos

0 Kudos
Highlighted
92 Views

Your code has:

  uint8_t* encrypted = (uint8_t*) malloc(aesgcm_len);

Then

  memcpy(encrypted+4+aesgcm_len, &mac, sizeof(sgx_aes_gcm_128bit_tag_t));

Do you see what I mentioned earlier regarding the memcpy exceeding the bounds?

Also, why is there a memcpy() if you are freeing it right after?

I ran your speedtest() in the untrusted domain, linking to the exact same libsgx_tcrypto that is used in the trusted domain, and the results were about the same.

If you look at the source for the sgx functions you are calling,

https://github.com/01org/linux-sgx/blob/master/sdk/tlibcrypto/sgx_aes_gcm.cpp

you will notice that, for example sgx_rijndael128GCM_encrypt:

  getsize();
  malloc();
  aes_gcm_init_state();
  aes_gcm_start();
  aes_gcm_encrypt();
  aes_gcm_get_tag();
  memset();

Does the 'speedtest' of openssl do all those equivalent calls inside their timed loop or do they pre-allocate and/or initialize all the contexts and then only measure the encrypt() calls?

I would try to analyze what openssl speedtest does and try to replicate that functionality using the lower-level ippsAES_GCM* family of functions to get a better comparison.
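To illustrate the point about what sits inside the timed loop, here is a minimal sketch in plain C. It does not use the SGX SDK, IPP, or OpenSSL: toy_ctx, toy_init and toy_crypt are hypothetical stand-ins (a trivial XOR keystream, not AES-GCM and not secure), chosen only so that the two timing styles can be compared on any machine. One function pays for allocation, initialisation, and wiping on every call, as a one-shot API would; the other sets up the context once and times only the data processing.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for a cipher context. NOT AES-GCM, NOT secure. */
typedef struct {
    uint8_t keystream[256];
} toy_ctx;

/* Derives a fixed keystream from the key (stand-in for key expansion). */
void toy_init(toy_ctx *ctx, const uint8_t key[16])
{
    for (size_t i = 0; i < sizeof ctx->keystream; i++)
        ctx->keystream[i] = (uint8_t)(key[i % 16] + 31u * i);
}

/* XORs the input with the keystream; the same call encrypts and decrypts. */
void toy_crypt(const toy_ctx *ctx, const uint8_t *in, uint8_t *out, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ ctx->keystream[i % sizeof ctx->keystream];
}

/* One-shot style: allocate, initialise, process, wipe, and free per call. */
int oneshot_crypt(const uint8_t key[16], const uint8_t *in, uint8_t *out, size_t len)
{
    toy_ctx *ctx = malloc(sizeof *ctx);
    if (ctx == NULL)
        return -1;
    toy_init(ctx, key);
    toy_crypt(ctx, in, out, len);
    memset(ctx, 0, sizeof *ctx);
    free(ctx);
    return 0;
}

/* Times `iters` one-shot calls: setup cost is inside the timed region. */
double time_oneshot(const uint8_t key[16], const uint8_t *in, uint8_t *out,
                    size_t len, unsigned iters)
{
    clock_t start = clock();
    for (unsigned i = 0; i < iters; i++)
        if (oneshot_crypt(key, in, out, len) != 0)
            return -1.0;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Times `iters` calls on a context set up once, outside the timed region. */
double time_reused(const uint8_t key[16], const uint8_t *in, uint8_t *out,
                   size_t len, unsigned iters)
{
    toy_ctx ctx;
    toy_init(&ctx, key);
    clock_t start = clock();
    for (unsigned i = 0; i < iters; i++)
        toy_crypt(&ctx, in, out, len);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Both functions produce the same output bytes; only what is being timed differs. With a real cipher the gap depends on how expensive the key schedule and state allocation are relative to the buffer size, so the comparison is only fair when both benchmarks include, or both exclude, the setup work.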

0 Kudos
Highlighted
92 Views

Carlos A. wrote:

Hi there,

is it possible for anyone to confirm/deny the very low encryption performance ? If the data processed is private it must be decrypted from hard drive (or network) ... but at such a decryption throughput big data processing is completely out of reach.

I am planning on releasing a note on the IACR (International Association for Cryptologic Research) eprints on this issue, but I would like to know if you guys disagree before...

Carlos

Data sealing is not meant for encrypting/decrypting large amounts of data. It's meant to encrypt/decrypt just the secret or secrets needed by the enclave in order to save and restore its state in the event of a power transition, an application upgrade, or between application sessions. Sealing is not intended to be a general purpose bulk encryption routine. There are third parties who are developing trusted crypto libraries for Intel SGX in order to provide more complete crypto functionality than is provided in the Intel SGX SDK.

 

Performance in this case is most likely going to be related to how the user is handling memory. To protect SGX memory against physical and some misconfiguration attacks, it has integrity and replay protections. This carries some performance penalty, although it is minimized when the memory is cached. So it depends on how the user is constructing their encrypt and decrypt operations. For instance, copying memory inside the enclave to encrypt/decrypt it might not be necessary, as you could encrypt/decrypt from outside the enclave and deliver the results inside the enclave for a decrypt, and vice versa for an encrypt.

 

-Surenthar

- Surenthar Selvaraj
0 Kudos
Highlighted
Employee
92 Views

Hello Carlos,

I think you're running with an unoptimized version of the IPP crypto library. This version is written in plain C and is not taking advantage of any CPU feature. I ran your test program using the optimized IPP crypto library and I got a much better result.

To ensure that we're reporting the "same" results, could you post the code that calculates the result numbers?

I'll update the code and post the OpenSSL (outside the enclave) and AES-GCM (inside the enclave) results.

0 Kudos
Highlighted
Beginner
92 Views

Francisco C. (Intel) wrote:

Your code has:

  uint8_t* encrypted = (uint8_t*) malloc(aesgcm_len);

Then

  memcpy(encrypted+4+aesgcm_len, &mac, sizeof(sgx_aes_gcm_128bit_tag_t));

Do you see what I mentioned earlier regarding the memcpy exceeding the bounds?

Oh, indeed, you are right and I am wrong. It should be

encrypted + 4 + ceil( ( ( (double) plaintext_len ) / 16) ) * 16
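The corrected offset can be checked with a few lines of C. This is a sketch under the layout used in the speedtest function above (4-byte length prefix, ciphertext area rounded up to 16 bytes, 16-byte tag); the helper names are hypothetical, and integer arithmetic replaces the ceil() on doubles. Note that AES-GCM ciphertext is the same length as the plaintext, so the rounding only over-allocates slightly.

```c
#include <stddef.h>

#define LEN_PREFIX_BYTES 4   /* plaintext length stored at the start */
#define GCM_TAG_BYTES    16  /* size of a 128-bit GCM tag */

/* Rounds n up to the next multiple of 16 using integer arithmetic only. */
size_t round_up_16(size_t n)
{
    return (n + 15) / 16 * 16;
}

/* Total buffer size: length prefix + rounded ciphertext area + tag. */
size_t total_len(size_t plaintext_len)
{
    return LEN_PREFIX_BYTES + round_up_16(plaintext_len) + GCM_TAG_BYTES;
}

/* Offset at which the tag belongs: right after the ciphertext area. */
size_t tag_offset(size_t plaintext_len)
{
    return LEN_PREFIX_BYTES + round_up_16(plaintext_len);
}

/* Returns 1 if writing n bytes at `offset` stays inside a buffer of
 * buf_len bytes, 0 otherwise. Written to avoid size_t overflow. */
int write_fits(size_t offset, size_t n, size_t buf_len)
{
    return offset <= buf_len && n <= buf_len - offset;
}
```

For plaintext_len = 8192 the buffer is 8212 bytes and the tag goes at offset 8196, so the 16-byte write ends exactly at the end of the buffer. The original expression, 4 + aesgcm_len = 8216, already starts past the end, which write_fits() reports as 0.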

Francisco C. (Intel) wrote:

Also, why is there a memcpy() if you are freeing it right after?

It was to fully simulate my AES-GCM usage, but never mind, I removed this line.

Francisco C. (Intel) wrote:

I ran your speedtest() in the untrusted domain, linking to the exact same libsgx_tcrypto that is used in the trusted domain, and the results were about the same.

If you look at the source for the sgx functions you are calling,

https://github.com/01org/linux-sgx/blob/master/sdk/tlibcrypto/sgx_aes_gcm.cpp

you will notice that, for example sgx_rijndael128GCM_encrypt:

  getsize();
  malloc();
  aes_gcm_init_state();
  aes_gcm_start();
  aes_gcm_encrypt();
  aes_gcm_get_tag();
  memset();

Does the 'speedtest' of openssl do all those equivalent calls inside their timed loop or do they pre-allocate and/or initialize all the contexts and then only measure the encrypt() calls?

I would try to analyze what openssl speedtest does and try to replicate that functionality using the lower-level ippsAES_GCM* family of functions to get a better comparison.

Hmm Indeed I will have a look at that and give some feedback with a side-by-side test. Where can I find the documentation for those lower-level ipps functions ?

Thank you !

Carlos

0 Kudos
Highlighted
Beginner
92 Views

Surenthar Selvaraj. (Intel) wrote:

Quote:

Carlos A. wrote:

 

Hi there,

is it possible for anyone to confirm/deny the very low encryption performance ? If the data processed is private it must be decrypted from hard drive (or network) ... but at such a decryption throughput big data processing is completely out of reach.

I am planning on releasing a note on the IACR (International Association for Cryptologic Research) eprints on this issue, but I would like to know if you guys disagree before...

Carlos

 

 

Data sealing is not meant for encrypting/decrypting large amounts of data. It's meant to encrypt/decrypt just the secret or secrets needed by the enclave in order to save and restore its state in the event of a power transition, an application upgrade, or between application sessions. Sealing is not intended to be a general purpose bulk encryption routine. There are third parties who are developing trusted crypto libraries for Intel SGX in order to provide more complete crypto functionality than is provided in the Intel SGX SDK.

 

Performance in this case is most likely going to be related to how the user is handling memory. To protect SGX memory against physical and some misconfiguration attacks, it has integrity and replay protections. This carries some performance penalty, although it is minimized when the memory is cached. So it depends on how the user is constructing their encrypt and decrypt operations. For instance, copying memory inside the enclave to encrypt/decrypt it might not be necessary, as you could encrypt/decrypt from outside the enclave and deliver the results inside the enclave for a decrypt, and vice versa for an encrypt.

 

-Surenthar

Dear Surenthar,

thank you for your reply. I indeed supposed that sealing/unsealing was not the right tool, hence the test with rijndael128GCM. What I have seen is that, with a standard compilation of the SDK on Linux, using this SGX encryption function is extremely slow. I am not saying that it is not possible to build a faster encryption function, nor that the problem could not be solved by compilation options. However, with the SDK as is, it is quite hard to process large amounts of data.

I'll be glad to see third party faster implementations for AES that can be safely run in SGX ! Or to find compilation options that improve the performance !

best,

Carlos

0 Kudos
Highlighted
Beginner
92 Views

Juan D. (Intel) wrote:

Hello Carlos,

I think you're running with an unoptimized version of the IPP crypto library. This version is written in plain C and is not taking advantage of any CPU feature. I ran your test program using the optimized IPP crypto library and I got a much better result.

To ensure that we're reporting the "same" results, could you post the code that calculates the result numbers?

I'll update the code and post the OpenSSL (outside the enclave) and AES-GCM (inside the enclave) results.

Dear Juan,

that looks great ! Is it possible to include AES-NI ?

I attach my Enclave.cpp Enclave.edl and App.cpp files. Tell me if you need anything else !

best,

Carlos

Edit : could not upload my Enclave.edl so I paste here its contents

/*
 * Copyright (C) 2011-2016 Intel Corporation. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 *   * Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in
 *     the documentation and/or other materials provided with the
 *     distribution.
 *   * Neither the name of Intel Corporation nor the names of its
 *     contributors may be used to endorse or promote products derived
 *     from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 */

/* Enclave.edl - Top EDL file. */

enclave {
    
    include "user_types.h" /* buffer_t */
    include "sgx_tseal.h"

    /* Import ECALL/OCALL from sub-directory EDLs.
     *  [from]: specifies the location of EDL file. 
     *  [import]: specifies the functions to import, 
     *  [*]: implies to import all functions.
     */
    from "Edger8rSyntax/Types.edl" import *;
    from "Edger8rSyntax/Pointers.edl" import *;
    from "Edger8rSyntax/Arrays.edl" import *;
    from "Edger8rSyntax/Functions.edl" import *;

    from "TrustedLibrary/Libc.edl" import *;
    from "TrustedLibrary/Libcxx.edl" import ecall_exception, ecall_map;
    from "TrustedLibrary/Thread.edl" import *;

    trusted {
        public void speedtest(int num, int size, int seal);
    };

    /*
     * ocall_print_string - invokes OCALL to display string buffer inside the enclave.
     *  [in]: copy the string buffer to App outside.
     *  [string]: specifies 'str' is a NULL terminated buffer.
     */
    untrusted {
        void ocall_print_string([in, string] const char *str);
    };

};

    0 Kudos
    Highlighted
    Employee
    92 Views

    I commented out the original test calls. All SKL processors support AES-NI so the optimized library is taking advantage of that feature, so is OpenSSL.

    $ ./app 19 8192 speed_test
    Initializing enclave...Calling enclave functions...
    Testing encryption/decryption of 524288 elements of size 8192.
    Time spent: 3.570483 seconds.

    OpenSSL results on the same platform:

    $ openssl speed -evp aes-128-gcm
    Doing aes-128-gcm for 3s on 16 size blocks: 148933529 aes-128-gcm's in 3.00s
    Doing aes-128-gcm for 3s on 64 size blocks: 82045096 aes-128-gcm's in 3.01s
    Doing aes-128-gcm for 3s on 256 size blocks: 28569819 aes-128-gcm's in 3.01s
    Doing aes-128-gcm for 3s on 1024 size blocks: 7368415 aes-128-gcm's in 3.00s
    Doing aes-128-gcm for 3s on 8192 size blocks: 927991 aes-128-gcm's in 3.00s
    OpenSSL 1.0.1f 6 Jan 2014
    built on: Mon May  2 16:53:18 UTC 2016
    options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx) 
    compiler: cc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-gcm     794312.15k  1744480.45k  2429858.36k  2515085.65k  2534034.09k
    

    I believe the results are of the same order. Taking into account that OpenSSL is running outside EPC the small difference makes sense.
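The "same order" claim can be checked from the raw figures above. The sketch below is only illustrative: the helper names are made up, and it assumes that each of the 524288 iterations performs one encrypt and one decrypt over the 8192-byte buffer, as in the speedtest loop posted earlier in the thread.

```c
#include <stdint.h>

/* Reproduces an `openssl speed` table entry (1000s of bytes per second)
 * from the raw count: `ops` operations of `block_bytes` each in `seconds`.
 * (Hypothetical helper, for illustration.) */
double openssl_k_figure(uint64_t ops, uint64_t block_bytes, double seconds)
{
    return (double)(ops * block_bytes) / seconds / 1000.0;
}

/* Bytes per second for the enclave test, ASSUMING each iteration makes two
 * passes over the buffer (one encrypt, one decrypt). */
double enclave_bytes_per_sec(uint64_t iterations, uint64_t buf_bytes, double seconds)
{
    return (double)(2 * iterations * buf_bytes) / seconds;
}
```

openssl_k_figure(927991, 8192, 3.00) reproduces the 2534034.09k entry in the table, i.e. about 2.53 GB/s. Under the two-pass assumption, enclave_bytes_per_sec(524288, 8192, 3.570483) comes to roughly 2.41 GB/s, which is about 95% of the OpenSSL figure.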

    0 Kudos
    Highlighted
    Employee
    92 Views

    The IPP Crypto API is documented here

    The AES-GCM API, in particular, is here.

    0 Kudos
    Highlighted
    Beginner
    92 Views

    Juan D. (Intel) wrote:

    I commented out the original test calls. All SKL processors support AES-NI so the optimized library is taking advantage of that feature, so is OpenSSL.

    $ ./app 19 8192 speed_test
    Initializing enclave...Calling enclave functions...
    Testing encryption/decryption of 524288 elements of size 8192.
    Time spent: 3.570483 seconds.

    OpenSSL results on the same platform:

    $ openssl speed -evp aes-128-gcm
    Doing aes-128-gcm for 3s on 16 size blocks: 148933529 aes-128-gcm's in 3.00s
    Doing aes-128-gcm for 3s on 64 size blocks: 82045096 aes-128-gcm's in 3.01s
    Doing aes-128-gcm for 3s on 256 size blocks: 28569819 aes-128-gcm's in 3.01s
    Doing aes-128-gcm for 3s on 1024 size blocks: 7368415 aes-128-gcm's in 3.00s
    Doing aes-128-gcm for 3s on 8192 size blocks: 927991 aes-128-gcm's in 3.00s
    OpenSSL 1.0.1f 6 Jan 2014
    built on: Mon May  2 16:53:18 UTC 2016
    options:bn(64,64) rc4(16x,int) des(idx,cisc,16,int) aes(partial) blowfish(idx) 
    compiler: cc -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -DTERMIO -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
    aes-128-gcm     794312.15k  1744480.45k  2429858.36k  2515085.65k  2534034.09k
    

    I believe the results are of the same order. Taking into account that OpenSSL is running outside EPC the small difference makes sense.

    Nice ! Can you give me the code you used ? (I haven't understood what you mean by "I commented out the original test calls")

    What are the compilation options you used for your SDK ? Anything else I would need to be able to reproduce your results ?

    Thank you very much !

    Carlos

    0 Kudos
    Highlighted
    Employee
    92 Views

    I commented out the original calls from the SampleEnclave code, since they are not necessary.

      /* Utilize edger8r attributes */
      //edger8r_array_attributes();
      //edger8r_pointer_attributes();
      //edger8r_type_attributes();
      //edger8r_function_attributes();
      
      /* Utilize trusted libraries */
      //ecall_libc_functions();
      //ecall_libcxx_functions();
      //ecall_thread_functions();

    Other than that, the code is the same.

    I build the enclave using the default options.

    Unfortunately, you won't be able to reproduce these results yet. I'm using the optimized IPP crypto library, which isn't part of the open source release yet.

    0 Kudos
    Highlighted
    Employee
    92 Views

    There is an option, in case you're willing to do some extra work. :)

    You can download the IPP crypto library from here https://software.intel.com/intel-ipp and follow the instructions to build an optimized single processor-specific library that doesn't require the dispatcher.

    This link provides an overview of the dispatching mechanism:

    https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-understa...

    which contains a link to this other article. See the Single Processor Static Linkage section:

    https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-intel-ip...

    Note that I haven't tried this myself but it should work.

    0 Kudos
    Highlighted
    Employee
    93 Views

    I don't know whether you noticed this, but the 1.7 release recently posted allows you to download and build using the optimized IPP crypto library. If you use this version you should get similar results to what I provided before.

    0 Kudos
    Highlighted
    Beginner
    92 Views

    Dear Juan,

    thank you. I came back to the thread to tell you I just was notified of that :)

    It is great as I did not have the time until now to test the approach you proposed in the other post. Thank you again for your useful help :)

    best,

    Carlos

    0 Kudos