Are there any cases where the Intel compiler gives worse performance than other compilers, especially the Visual Studio compiler v120? When I compiled a sample matrix multiplication that uses inline assembly AVX code with Visual C++ 2013 Ultimate, the performance with the Intel compiler was worse than with the Visual Studio compiler v120. Can anyone help me understand the reasons for that?
My machine has an Intel Core i5.
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
I haven't seen a case like that, but there are usually small differences in the way the various compilers deal with SIMD intrinsics. For example, icl may change the generated code according to /arch, while gcc may be more influenced by unroll settings. So an actual example, including your preferred settings, would help.
As the performance libraries shipped with icl cover many of the important cases of matrix multiplication, it seems relatively unusual that intensive intrinsics coding would be justified.
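For what it's worth, a reproducer along the following lines would probably be enough to compare the compilers. This is only a sketch, not your code: the names `matmul_ref`, `matmul_simd` and `max_abs_diff` are made up for illustration, it uses AVX intrinsics rather than inline assembly, it assumes row-major square matrices whose size is a multiple of 8, and it falls back to scalar code when the compiler does not define `__AVX__` (for example when no AVX /arch setting is given).

```c
#include <assert.h>
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Reference kernel: C = A * B for row-major n x n float matrices. */
static void matmul_ref(const float *a, const float *b, float *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Kernel under test (hypothetical): computes 8 columns of C at a time.
 * Assumption: n is a multiple of 8. */
static void matmul_simd(const float *a, const float *b, float *c, size_t n)
{
    assert(n % 8 == 0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; j += 8) {
#ifdef __AVX__
            /* AVX path: broadcast A[i][k], multiply by 8 floats of row k of B. */
            __m256 acc = _mm256_setzero_ps();
            for (size_t k = 0; k < n; ++k) {
                __m256 av = _mm256_set1_ps(a[i * n + k]);
                __m256 bv = _mm256_loadu_ps(&b[k * n + j]);
                acc = _mm256_add_ps(acc, _mm256_mul_ps(av, bv));
            }
            _mm256_storeu_ps(&c[i * n + j], acc);
#else
            /* Scalar fallback with the same blocking, used when AVX is off. */
            float acc[8] = {0.0f};
            for (size_t k = 0; k < n; ++k)
                for (size_t l = 0; l < 8; ++l)
                    acc[l] += a[i * n + k] * b[k * n + j + l];
            for (size_t l = 0; l < 8; ++l)
                c[i * n + j + l] = acc[l];
#endif
        }
    }
}

/* Largest absolute element-wise difference between two arrays. */
static float max_abs_diff(const float *x, const float *y, size_t count)
{
    float worst = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        float d = x[i] - y[i];
        if (d < 0.0f)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Calling both kernels on the same inputs and checking `max_abs_diff` first confirms that the two builds compute the same thing. After that, timing `matmul_simd` under each compiler, with the exact options posted alongside, makes the comparison reproducible.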
Thanks for your reply.
I didn't use any libraries. I wrote my own code by hand using inline assembly, and I was surprised that the performance with the Visual Studio compiler is better than with the icl compiler!