128-bit precision arithmetic (Floating point) and Real*16 with Fortran 9.1

postaquestion · ‎10-19-2006

Three questions:

1. How is 128-bit floating-point arithmetic, i.e. REAL*16 (KIND=16), implemented in version 9.1 of the Fortran compiler, and what are the performance issues?

2. Which Intel processor gives the best performance for REAL*16 calculations?

3. Can REAL*16 calculations be vectorized for better performance when solving linear and non-linear systems, e.g. with Newton-Raphson solvers?

Intel_C_Intel · ‎10-20-2006

Dear customer,

Operations on REAL*16 are done through library calls. These operations are not vectorized: a REAL*16 value is as wide as a 128-bit Streaming SIMD Extensions register, so the vector length would be one, and vectorizing with a vector length of one is really just code generation, not vectorization. Furthermore, the Streaming SIMD Extensions mainly deal with packed operands from 8 to 64 bits wide and do not provide sufficiently general support to implement REAL*16 this way.

Aart Bik

http://www.aartbik.com/

dwmccarn · ‎10-23-2006

OK - Now to answer the question:

What is the performance difference between REAL*8 and REAL*16?

What do you mean by library calls?

Steven_L_Intel1 · ‎10-23-2006

Since there are no processor instructions for 16-byte reals, each operation, such as add, multiply, or sqrt, is done by calling a software routine that implements the operation using integer instructions.
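As a minimal sketch of what this looks like from the source side, the program below declares a 16-byte real next to an 8-byte one and prints the precision of each. The kind names `dp` and `qp` and the program name are illustrative choices, not anything from the compiler documentation. It assumes the compiler offers a real kind with at least 33 decimal digits; if it does not, `selected_real_kind` returns a negative value and the declaration fails to compile. Nothing in the source marks the 16-byte arithmetic as special; the compiler is what turns those operations into calls to runtime routines.

```fortran
program quad_kind_demo
  implicit none
  ! Kinds chosen by required decimal digits and exponent range.
  ! qp is assumed to correspond to REAL*16 on this compiler.
  integer, parameter :: dp = selected_real_kind(15, 307)
  integer, parameter :: qp = selected_real_kind(33, 4931)
  real(dp) :: xd
  real(qp) :: xq

  xd = 1.0_dp / 3.0_dp
  xq = 1.0_qp / 3.0_qp

  ! precision() reports decimal digits; epsilon() the spacing near 1.0
  print *, 'double: digits =', precision(xd), ' epsilon =', epsilon(xd)
  print *, 'quad:   digits =', precision(xq), ' epsilon =', epsilon(xq)
  print *, '1/3 (double) =', xd
  print *, '1/3 (quad)   =', xq
end program quad_kind_demo
```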

The performance difference can vary, but expect an individual operation to be 7-10X slower, possibly worse. The whole application may not be that much slower. Obviously, then, you want to reserve 16-byte reals for the parts of your application that need the extra precision or range.
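If you want a rough number for your own machine, a small timing loop along these lines can help. This is only a sketch: the loop body, the iteration count `n`, and the kind names `dp` and `qp` are arbitrary choices for illustration, and the ratio it prints will depend on the compiler, the optimization options, and the mix of operations, so treat it as an indication rather than a benchmark.

```fortran
program quad_timing_sketch
  implicit none
  integer, parameter :: dp = selected_real_kind(15, 307)
  integer, parameter :: qp = selected_real_kind(33, 4931)  ! assumed to map to REAL*16
  integer, parameter :: n = 5000000                        ! arbitrary iteration count
  real(dp) :: sd, t0, t1, t2
  real(qp) :: sq
  integer :: i

  ! Same partial harmonic sum in 8-byte reals...
  call cpu_time(t0)
  sd = 0.0_dp
  do i = 1, n
     sd = sd + 1.0_dp / real(i, dp)
  end do
  call cpu_time(t1)

  ! ...and in 16-byte reals.
  sq = 0.0_qp
  do i = 1, n
     sq = sq + 1.0_qp / real(i, qp)
  end do
  call cpu_time(t2)

  ! Printing the sums keeps the compiler from discarding the loops.
  print *, 'double sum:', sd, ' seconds:', t1 - t0
  print *, 'quad sum:  ', sq, ' seconds:', t2 - t1
  if (t1 - t0 > 0.0_dp) print *, 'ratio quad/double:', (t2 - t1) / (t1 - t0)
end program quad_timing_sketch
```

Timing the real hot loops of the application in both precisions tells you more than a synthetic loop like this one.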