Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

CVTPS2PD - Highly populated

srimks
New Contributor II

Hi.

I am not sure this post is in the right forum; it may belong in "Intel AVX & CPU Instructions" rather than "Intel C++ Compiler". Sorry if so.

I'm quite inexperienced when it comes to micro-optimization. I am experimenting with the optimizations in the Intel C++ compiler (v11.0).

When I look at the disassembly in VTune, the generated code seems inefficient to me. Two things seem odd:

- the code seems to be using double-precision values (cvtps2pd appears a lot), and cvtps2pd accounts for the largest percentage of CPU_CLK_UNHALTED_CORE compared with the other instructions (movss, mulsd, subsd, movaps, fstpd, leaq, etc.)
- the code never seems to operate on several components at once (movss, addsd, mulsd, etc.)

The source code is too long to reproduce here.

Questions:

(a) How can I get optimal code?

(b) How can I reduce the number of CVTPS2PD operations to a minimum?

~BR

1 Reply
TimP
Honored Contributor III

You are omitting relevant information. If you want to avoid promotion of float expressions to double, use the option -fp-model source or the default -fp-model fast, rather than one of the options that invoke the double evaluation of 20 years ago.
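As a rough check of which evaluation mode a build ended up with, the standard macro FLT_EVAL_METHOD from <cfloat> can be printed. This is only a sketch: the helper names describe_eval_method and report_eval_method are made up for illustration, and whether a particular compiler and option set keeps the macro in sync with its actual behaviour is an assumption to verify against the disassembly.

```cpp
#include <cfloat>
#include <cstdio>

// Hypothetical helper: maps the standard FLT_EVAL_METHOD values to text.
inline const char* describe_eval_method(int method) {
    switch (method) {
        case 0:  return "each type evaluated in its own precision";
        case 1:  return "float and double evaluated as double";
        case 2:  return "everything evaluated as long double";
        default: return "indeterminate or implementation-defined";
    }
}

// Hypothetical helper: prints what this translation unit was compiled with.
inline void report_eval_method() {
    std::printf("FLT_EVAL_METHOD = %d (%s)\n",
                static_cast<int>(FLT_EVAL_METHOD),
                describe_eval_method(static_cast<int>(FLT_EVAL_METHOD)));
}
```

A value of 0 would be consistent with float expressions staying in float; a value of 1 or 2 would be consistent with the promotion to wider types described above.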

If you have written mixed precision into your source and didn't mean to, you must fix your source. Likely possibilities, in the style of 20 years ago, are that you mixed double constants (no f suffix) into your float code, or that you used generic math function names (like sin(), cos(), exp(), log()) in your float code but didn't #include the corresponding type-generic header with the -std=c99 option.
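The two source-level causes can be illustrated with a small C++ sketch. The function names scale_mixed and scale_float are hypothetical, and the mapping to specific instructions is an assumption about typical SSE code generation, not something taken from the original code. In C++ the overloaded std::sin from <cmath> plays the role of the type-generic math functions; in C99 the usual header for this is <tgmath.h>, which is my assumption about what is meant here.

```cpp
#include <cmath>
#include <type_traits>

// Double constant (no f suffix): x is promoted to double, the multiply is
// done in double, and the result is converted back to float. Conversions
// of this kind are what typically show up as cvtss2sd/cvtps2pd.
inline float scale_mixed(float x) { return static_cast<float>(x * 0.5); }

// Float constant (f suffix): the whole expression stays in float.
inline float scale_float(float x) { return x * 0.5f; }

// The language rules behind the two functions above, checked at compile time.
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "float * double constant is evaluated as double");
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float * float constant stays float");

// With <cmath>, std::sin has a float overload, so a float argument does not
// force a call to the double version.
static_assert(std::is_same<decltype(std::sin(1.0f)), float>::value,
              "std::sin(float) returns float");
```

Both functions return the same value for inputs like 2.0f; the difference is only in the intermediate precision and the conversions the compiler has to emit.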

Many people used to write mixed-precision code for use with the 32-bit non-SSE option (currently -mia32), so as to prevent their code from gaining from vectorization or from optimization on a non-IA platform, or because it was reasonable for K&R C, which didn't support float constants or math functions.
