An interesting general question came up when I tried the new version of the Intel compiler on Windows:
I have implemented the same algorithm in Intel C++ and Intel Fortran. Both versions use Intel MKL. However, when I compared the running times, I was surprised to see that the C++ code is three times slower than the Fortran code.
Then I wrote another algorithm in C++ and Fortran, but this time Intel MKL is not necessarily used. I find the speed is similar.
So is it normal for the performance difference between C++ and Fortran to be so large when using Intel MKL? Any suggestions on how to configure Visual Studio 2012 to improve the C++ performance?
The performance of MKL routines doesn't depend on which compiler you use to build your application. We expect the same performance from the C/C++ and Fortran APIs for all MKL routines. If you see a difference, please give an example.
As for your implementation of the algorithm that, as you said, shows similar performance, we need some more specific details: which MKL routines do you use?
What is the problem size, the accuracy ... etc.?
You (the OP) may be jumping to conclusions.
The run time consists of two parts: (i) run time in the user's code and (ii) run time in MKL.
It is quite conceivable that the first part (run time in user's code) is different for the C++ and Fortran versions, depending on the quality of the code, the quality of the compiler and the compiler options chosen.
I do not believe that MKL ever tries to sniff out whether it was called from C++ or Fortran. Therefore, given calls from C++ or Fortran to MKL with identical arguments, the second part of the run time should be the same. In past versions, MKL did query CPUID and chose different code paths accordingly, but this does not apply to your question.
Without recording the two contributions to the run time separately, you cannot make valid statements about the speed of MKL.
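A minimal sketch of how the two contributions could be recorded separately, assuming wall-clock timing with `std::chrono` is good enough for your problem size. The names `Timings`, `timed`, `fill_inputs`, `library_dot` and `run_once` are hypothetical, and `library_dot` is only a stand-in for whatever MKL routine you actually call, so that the sketch compiles without MKL.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Accumulated wall-clock time for the two parts of the run.
struct Timings {
    double user_seconds = 0.0;     // time spent in your own code
    double library_seconds = 0.0;  // time spent inside the library call
};

// Run f() and add its elapsed wall-clock time to `bucket`.
template <class F>
void timed(double& bucket, F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    bucket += std::chrono::duration<double>(stop - start).count();
}

// Hypothetical user code: prepares the inputs.
void fill_inputs(std::vector<double>& x, std::vector<double>& y) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] = static_cast<double>(i);
        y[i] = 2.0;
    }
}

// Hypothetical stand-in for the MKL call; replace it with the real routine.
double library_dot(const std::vector<double>& x, const std::vector<double>& y) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i];
    return sum;
}

// One run of the algorithm, with each part timed into its own bucket.
double run_once(std::size_t n, Timings& t) {
    std::vector<double> x(n), y(n);
    double result = 0.0;
    timed(t.user_seconds, [&] { fill_inputs(x, y); });
    timed(t.library_seconds, [&] { result = library_dot(x, y); });
    return result;
}
```

Doing the same split in the Fortran version and comparing the two buckets side by side should show whether the factor of three comes from the library part or from the surrounding code.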
If you use the CBLAS interface or similar wrappers, but call the core MKL functions directly from Fortran, you can expect some cost in performance for the C version.
It seems more likely that your extra time is spent in your C++ code. In order for C++ to compete with Fortran, you must make your code valid for the ansi-alias option and take advantage of the restrict extension (alternatively, CEAN, the C Extensions for Array Notation), among other things. Intel compilers haven't been performing interprocedural optimizations across call by reference, although it seems that may be getting attention again.
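As an illustration of the restrict point, here is a minimal sketch of a loop whose pointer arguments are marked as non-overlapping. The function name `axpy` is hypothetical, and `__restrict` is a compiler extension rather than standard C++; the spelling below is the one I would expect the Intel, Microsoft, GCC and Clang compilers to accept, but check your compiler's documentation.

```cpp
#include <cstddef>

// y[i] += a * x[i] for i in [0, n).
// `__restrict` (a compiler extension, not standard C++) promises that x and y
// do not overlap, which is the guarantee Fortran gives for dummy arguments.
// With that promise the compiler is free to vectorize the loop without
// runtime overlap checks. Passing overlapping arrays is undefined behaviour.
void axpy(std::size_t n, double a,
          const double* __restrict x, double* __restrict y) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Whether this changes the generated code depends on the compiler and options, so it is worth comparing the vectorization report or the timing with and without the qualifier.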
>>>Then I wrote another algorithm in C++ and Fortran, but this time Intel MKL is not necessarily used. I find the speed is similar>>>
Both algorithms are compiled and executed as x86 machine code, so the speed should be similar, probably with a small deviation that could stem from calling overhead. Think of two equivalent statements, one in C++ and the other in Fortran, and compare their compiled machine code.