Does this affect performance, given that only part of the register is used while the rest is still being processed?
We are in the process of consolidating the 64-bit Programming forum into the Intel AVX and CPU Instructions forum and Intel C++ Compiler forum, so I am moving this thread to the C++ Compiler forum, as it is related to assembler programming.
Thanks for your patience.
Intel Software Network Support
For floating-point numbers, there are instructions that process only one value, e.g. mulss for single-precision multiplication and mulsd for double-precision multiplication. However, they have the same latency and throughput as multiplying the whole register.
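To make the difference concrete, here is a minimal sketch in C using the SSE intrinsics that typically map to mulss and mulps. The function names mul_scalar_lane and mul_all_lanes are made up for this illustration, and the exact instructions emitted depend on your compiler and flags.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ss, _mm_mul_ps */

/* Multiply only the lowest single-precision lane (typically mulss).
   The upper three lanes of the result are copied from the first operand
   and are simply ignored here. */
static float mul_scalar_lane(float x, float y)
{
    __m128 a = _mm_set_ss(x);   /* x in lane 0, upper lanes zeroed */
    __m128 b = _mm_set_ss(y);
    return _mm_cvtss_f32(_mm_mul_ss(a, b));
}

/* Multiply all four single-precision lanes at once (typically mulps). */
static void mul_all_lanes(const float x[4], const float y[4], float out[4])
{
    __m128 a = _mm_loadu_ps(x); /* unaligned loads, so any float[4] works */
    __m128 b = _mm_loadu_ps(y);
    _mm_storeu_ps(out, _mm_mul_ps(a, b));
}
```

Both functions issue a single multiply, so by the reasoning above the scalar version should not be faster per instruction; it just produces one useful result instead of four.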
For integers, no such instructions exist. Nevertheless, you can still use only the lower 64 bits and ignore the upper half. The performance will be the same as for processing the whole register.
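As a rough illustration of using only the lower half for integers, the following C sketch loads two 32-bit values into the lower 64 bits, runs a full-width packed add, and stores just the lower 64 bits back. The helper name add_two_int32 is hypothetical, and SSE2 is assumed to be available.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi32, ... */
#include <stdint.h>

/* Add two pairs of 32-bit integers that occupy only the lower 64 bits
   of an XMM register. The add itself still operates on all 128 bits. */
static void add_two_int32(const int32_t x[2], const int32_t y[2], int32_t out[2])
{
    /* Load 64 bits into the lower half; the upper half is zeroed. */
    __m128i a = _mm_loadl_epi64((const __m128i *)x);
    __m128i b = _mm_loadl_epi64((const __m128i *)y);

    /* Packed 32-bit add over all four lanes (paddd). */
    __m128i sum = _mm_add_epi32(a, b);

    /* Store only the lower 64 bits; the upper half is discarded. */
    _mm_storel_epi64((__m128i *)out, sum);
}
```

The upper two lanes are computed anyway (here they add zero to zero), which is why ignoring them saves nothing compared with a full four-lane add.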
P.S.: You can find the instruction latencies in Appendix C of the