I ran some performance tests on multiplication. There is no dedicated squaring operation, and multiplication does not seem to detect the squaring case (identical operands).
I would expect a dedicated squaring implementation to provide a significant performance benefit (at least, that is my experience with my own implementations of grade-school multiplication, Karatsuba, and FFT).
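To illustrate the grade-school case, here is a minimal Python sketch. It is not MKL or GMP code; the 32-bit limb size and all function names are assumptions made up for illustration. It counts limb multiplications for a generic product and for a squaring routine that computes each cross term only once and doubles it.

```python
BASE_BITS = 32  # limb width chosen for illustration only


def to_limbs(n):
    """Split a non-negative int into little-endian 32-bit limbs."""
    limbs = []
    while n:
        limbs.append(n & ((1 << BASE_BITS) - 1))
        n >>= BASE_BITS
    return limbs or [0]


def from_columns(cols):
    """Recombine little-endian column sums (possibly unnormalized) into an int."""
    return sum(c << (BASE_BITS * i) for i, c in enumerate(cols))


def schoolbook_mul(a, b):
    """Generic product: one limb multiplication per (i, j) pair."""
    cols = [0] * (len(a) + len(b))
    mults = 0
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            cols[i + j] += x * y
            mults += 1
    return from_columns(cols), mults


def schoolbook_sqr(a):
    """Squaring: diagonal terms once, each cross term once and doubled."""
    cols = [0] * (2 * len(a))
    mults = 0
    for i, x in enumerate(a):
        cols[2 * i] += x * x  # diagonal term a[i]^2
        mults += 1
        for j in range(i + 1, len(a)):
            cols[i + j] += 2 * (x * a[j])  # a[i]*a[j] == a[j]*a[i]
            mults += 1
    return from_columns(cols), mults


x = (1 << 128) - 12345  # a 4-limb operand
limbs = to_limbs(x)
prod, m_mul = schoolbook_mul(limbs, limbs)
sq, m_sqr = schoolbook_sqr(limbs)
assert prod == sq == x * x
print(m_mul, m_sqr)  # 16 10
```

For n limbs this is n^2 limb multiplications for the generic product versus n(n+1)/2 for squaring, so the saving approaches a factor of two as n grows.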
Questions:
- Is there a hidden feature in MKL/GMP for optimized squaring?
- Are there plans for future versions of MKL to improve squaring?