Software Tuning, Performance Optimization & Platform Monitoring

## Test results for CRT-function 'sqrt' for different Floating Point Models
The Intel C++ compiler supports four Floating Point Models: Fast=2, Fast, Precise and Strict. The test results below show how performance is affected when the CRT function `sqrt` is called. All tests ran on a 32-bit Windows platform, the results are listed in ascending order of execution time, and "Last result" is the value computed for 134217728.000^0.5.

| Floating Point Model | Type | Time (ticks) | Last result |
|---|---|---|---|
| Fast=2 (`/fp:fast=2`) [Intel C++] | float | 203 | 11585.236 |
| Fast=2 (`/fp:fast=2`) [Intel C++] | double | 750 | 11585.237 |
| Fast (`/fp:fast`) | float | 219 | 11585.236 |
| Fast (`/fp:fast`) | double | 750 | 11585.237 |
| Precise (`/fp:precise`) | float | 422 | 11585.237 |
| Precise (`/fp:precise`) | double | 750 | 11585.237 |
| Strict (`/fp:strict`) | float | 875 | 11585.237 |
| Strict (`/fp:strict`) | double | 2969 | 11585.237 |

As you can see, Fast=2 is the fastest Floating Point Model and Strict is the slowest. For `float`, Strict is slower by 875 / 203 = ~4.3x; for `double`, it is slower by 2969 / 750 = ~4.0x. For `double`, the Fast=2, Fast and Precise models all take the same 750 ticks. Note also that both Fast models return a slightly different `float` result (11585.236 instead of 11585.237).
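For anyone who wants to try a similar comparison, here is a minimal sketch of a timing loop. It is not the test code used for the numbers above: the names `time_sqrt`, `SqrtTiming` and `slowdown` are hypothetical, the use of `std::chrono::steady_clock` (milliseconds) in place of "ticks" is an assumption, and so is the idea that the loop runs over the values 1 to 2^27 (suggested only by the last result being 134217728^0.5). Compile the same file once per floating point model (`/fp:fast=2`, `/fp:fast`, `/fp:precise`, `/fp:strict`) and compare the timings.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstdint>

// Result of one timing run: elapsed milliseconds and the last square root computed.
template <typename T>
struct SqrtTiming {
    double elapsed_ms;
    T last_result;
};

// Hypothetical harness (an assumption, not the original test):
// computes sqrt(1), sqrt(2), ..., sqrt(count) in type T and times the loop.
template <typename T>
SqrtTiming<T> time_sqrt(std::uint32_t count) {
    volatile T sink = 0;  // volatile store keeps the optimizer from removing the loop
    const auto start = std::chrono::steady_clock::now();
    for (std::uint32_t i = 1; i <= count; ++i) {
        sink = std::sqrt(static_cast<T>(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {elapsed.count(), sink};
}

// Slowdown factor between two timings, e.g. 875 / 203 for Strict vs Fast=2 (float).
inline double slowdown(double slow_ticks, double fast_ticks) {
    assert(fast_ticks > 0.0);
    return slow_ticks / fast_ticks;
}
```

Calling `time_sqrt<float>(1u << 27)` and `time_sqrt<double>(1u << 27)` from `main` and printing `elapsed_ms` and `last_result` gives one pair of numbers per build. Absolute timings will depend on the CPU, the compiler version and the instruction set the compiler targets, so only the ratios between models are worth comparing.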