Three questions:
1. How is 128-bit precision arithmetic (Floating point) or Real*16 (Type=16) implemented in the 9.1 version of FORTRAN and what are the performance issues?
2. What is the best INTEL processor to use for best performance with REAL*16 calculations?
3. Can REAL*16 calculations be vectorized for enhanced performance in solving linear and non-linear systems? e.g newton-raphson solvers, etc.
連結已複製
Dear customer,
Operations on REAL*16 are done through library calls. These operations are not vectorized, since the data types are as wide as the data types of the Streaming SIMD Extensions and vectorizing with vector length equal to one is really just code generation, not vectorization. Furthermore, the Streaming SIMD Extensions mainly deal with packed operands varying from 8-bit to 64-bit only and do not provide sufficiently general support to implement REAL*16 this way.
Aart Bik
The performance difference can vary, but expect 7-10X slower for an individual operation, possibly worse. The whole application may not be that much slower. Obviously then you want to reserve 16-byte reals for the parts of your application that need the extra precision or range.
