Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

transcendental speed

mrentropy1
Beginner
746 Views
This might not belong in the MKL forum, but I'm not sure where else to put it - sorry.

Does anybody know how the cost of transcendental function evaluation on Intel 64 compares with floating-point divide for 64-bit floats? I have a transformation I could write with sines and cosines, or I could do it differently and use a simple divide, but doing it that way requires a lot more work on my part. This is in a numerically intensive code, in what I think may be a significant bottleneck, so faster is better. The operations will run in parallel over a very large array.

Thanks,
Peter

P.S. Based on my own timing, I get a mult : divide : sqrt() : sin() relative timing ratio of
1 : 1.7 : 2.3 : 5.6
for 32-bit and
1 : 2.8 : 3.3 : 6.7
for 64-bit, on a Core2Duo, but I'm not sure if or when that translates into raw clock-cycle ratios, or how other factors might have affected my measurement. That's using Intel Fortran with no compiler flags.
3 Replies
TimP
Honored Contributor III
746 Views
ifort enables auto-vectorization by default, with calls to the SVML (short vector math library). If you have vectorizable loops several thousand elements long, the VML functions in MKL might do better. You could look up the published performance figures for VML. In any case, the numbers you quote look reasonable as a rough guide for scalar code.
mrentropy1
Beginner
746 Views
Great. Thanks very much!!!!
Shane_S_Intel
Employee
746 Views
The following page (http://www.intel.com/software/products/mkl/data/vml/functions/_performanceall.htm) gives the cycle counts (per element) for the Intel MKL vector math library functions. A quick review of it will likely give you the insight you need on the best way to code your algorithm. Regards, Shane