Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Intel C++ & exp() function on non-Intel processors

seby83
Beginner
Hi,
After reading some articles, and verifying it myself, I understand that the Intel compiler checks for an Intel processor in order to pick the most efficient algorithm. I would like to know how I can choose the exp() function from VC++ rather than the ICC one.
In a basic test, performance drops by at least 50% going from VC++ 6.0 SP4 to ICC.
Better still, though I don't expect an answer, would be a workaround that forces the detection procedure to accept any processor that supports SSE or SSE2.
Best regards,
SeBy
TimP
Honored Contributor III
You're mighty short on specifics. If you compile and link with -QxW, using ICL 8.1 or 9.0, you would normally get SSE or SSE2 exp(), with no switching on CPUID. If the code is vectorized, it will be an svml call, which uses parallel SSE or SSE2 instructions. Why not show a code example, including your compile command?
One of the claims made for non-Intel CPUs is good exp() speed without depending on SSE or vectorization. I didn't think VC6 had any SSE option for exp().
seby83
Beginner
Well,
I am actually testing on an Athlon XP (Barton), which has SSE but not SSE2, so my compilation flags are: /LD -O3 -G5 -Qip -QxK
Please see the C MEX code below (ignore the beginning of the code, which is the gateway between MATLAB and C).
#include <math.h>
#include "mex.h"

/* MEX gateway: returns exp() of every element of the input array. */
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    double *X, *Z;
    int i, N = 1, numdimsX;
    const int *dimsX;

    /* --- Input 1 --- */
    X = mxGetPr(prhs[0]);
    numdimsX = mxGetNumberOfDimensions(prhs[0]);
    dimsX = mxGetDimensions(prhs[0]);

    /* --- Output 1: same shape as the input --- */
    plhs[0] = mxCreateNumericArray(numdimsX, dimsX, mxDOUBLE_CLASS, mxREAL);
    Z = mxGetPr(plhs[0]);

    /* The total number of elements is the product of all dimensions. */
    for (i = 0; i < numdimsX; i++)
    {
        N *= dimsX[i];
    }

    /* Element-wise exponential. */
    for (i = 0; i < N; i++)
    {
        Z[i] = exp(X[i]);
    }
}
I tested with Intel C++ 7.0 and 9.0; both of them use the _exp.A procedure in libmmt.lib.
In MATLAB:
A = rand(1,3000000);
1) code compiled with MSVC
mex test.c
tic,test(A);,toc
elapsed_time =
0.5000
2) code compiled with ICC
mex -f mexopts_intelamd.bat test.c
tic,test(A);,toc
elapsed_time =
0.8300
Best regards,
SeBy
JenniferJ
Moderator
Could you report this to Premier Support, including the test case?
Thanks!
Jennifer
andrew_lowe
Beginner
Even though this is being referred to the Premier Support area, could the rest of us please be kept informed here of the outcome? After reading the posting on Slashdot,

http://yro.slashdot.org/article.pl?sid=05/07/12/1320202&tid=142&tid=118&tid=123

a few days ago, regarding non-optimal code paths being used by the compiler for non-Intel chips, and now this, I would like to know what Intel's stance is on the whole question of the Intel C++ compiler being used on AMD chips. If the view is that Intel gets one path and AMD another, then I think we will be reviewing our choice of compiler.

Regards,
Andrew Lowe
TimP
Honored Contributor III
Maybe I didn't explain in enough length that you would have different code paths for Intel and for AMD only if you asked for that specifically, by using options like /QaxW. If you want to run only on CPUs with SSE2, regardless of brand, you would use /QxW. If you want a single code path which will run on a machine which has SSE but not SSE2, you would use /QxK.
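The difference between selecting a path by CPU feature and selecting it by brand can be sketched in a few lines of C. This is only an illustration of the two policies discussed in this thread, not a description of what the compiler's runtime library actually does; struct cpu_info, enum exp_path and both select_ functions are hypothetical names, and a real program would fill in the structure from CPUID.

```c
#include <stddef.h>

/* Hypothetical summary of the host CPU. It is plain data here so that
   the selection logic is easy to follow. */
struct cpu_info {
    int is_intel;  /* nonzero if the vendor is Intel */
    int has_sse;   /* nonzero if SSE is supported */
    int has_sse2;  /* nonzero if SSE2 is supported */
};

/* The code paths a multi-path binary might contain. */
enum exp_path { PATH_GENERIC, PATH_SSE, PATH_SSE2 };

/* Feature-only selection: the brand is ignored, so any CPU that
   reports SSE2 (or SSE) gets the matching path. */
static enum exp_path select_by_feature(const struct cpu_info *cpu)
{
    if (cpu->has_sse2) return PATH_SSE2;
    if (cpu->has_sse)  return PATH_SSE;
    return PATH_GENERIC;
}

/* Vendor-gated selection: the optimized paths are considered only
   when the CPU is an Intel one; everything else takes the generic path. */
static enum exp_path select_by_vendor(const struct cpu_info *cpu)
{
    if (!cpu->is_intel) return PATH_GENERIC;
    return select_by_feature(cpu);
}
```

For a CPU described as non-Intel with SSE but no SSE2, like the Athlon XP mentioned earlier, select_by_feature returns PATH_SSE while select_by_vendor returns PATH_GENERIC.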