I have a precision issue with the code below. When I do the calculation for the same input on my calculator, I get -13421772.8,
whereas the compiled code gives -13421773.0, and this is a considerable difference for us.
The variable in which I observed this is ‘tmp’.
Please help us resolve this.
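#include <smmintrin.h>  // SSE4.1 (_mm_insert_ps); also pulls in the earlier SSE headers

void convert(__m128 &vrz /*inout*/, int art)
{
    // Read the current rounding mode (not used further in this excerpt).
    unsigned int _rounding_mode;
    _rounding_mode = _MM_GET_ROUNDING_MODE();
    __m128 tmp, scale_vr;
    // scale = 2^(31 - art)
    const float scale = (float)((unsigned int)1<<(31-(art)));
    scale_vr = _mm_set1_ps(scale);
    tmp = _mm_mul_ps(vrz, scale_vr);
    // Convert tmp to int32 using the current rounding mode, then copy element 1 of the result into element 1 of vrz.
    vrz = _mm_insert_ps(vrz, _mm_castsi128_ps(_mm_cvtps_epi32(tmp)) , ((1)<<6) | ((1)<<4));
}

// Input used for the observation (from the calling code):
float a =( float) -0.8;
vrz = _mm_set1_ps(a);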
Eswar Reddy K
Sorry, display problem. Below are my compiler options:
BasicRuntimeChecks : Default
AdditionalOptions : /fp:precise
FlushDenormalResultsToZero : false
At issue here may be:
float a = (float)-0.8;
where the conversion of -0.8 to float does not use the rounding mode you expect (round down). As a quick test, compile as a Debug build. After the assignment to a, open a Memory window and examine "&a", viewed as unsigned 1-byte integers. You should see "205 204 76 191" (the little-endian bytes of 0xBF4CCCCD). Subtract 1 from the 205 to undo the round-up. Had this byte been zero, 0-1 would produce 255 with the borrow propagating to the next byte (i.e. subtract 1 from the next byte as well). There will be some cases where the exponent also needs to be adjusted, but that is not necessary for this experiment.
Once the value of a has been adjusted, continue and check the result.
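The memory-window experiment above can also be reproduced in code. This is only a minimal sketch: float_bits, float_from_bits and step_toward_zero are hypothetical helper names, not part of the posted code, and it assumes a 32-bit IEEE-754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit IEEE-754 float");

// Return the raw bit pattern of a float.
std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Rebuild a float from a raw bit pattern.
float float_from_bits(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Move one representable value toward zero by subtracting 1 from the bit
// pattern. Integer subtraction carries the borrow across bytes (and into the
// exponent field) on its own. NaN and infinity are not handled here.
float step_toward_zero(float f) {
    const std::uint32_t u = float_bits(f);
    assert((u & 0x7FFFFFFFu) != 0u);  // +0.0 and -0.0 have nothing closer to zero
    return float_from_bits(u - 1u);
}
```

With these helpers, float_bits((float)-0.8) is 0xBF4CCCCD, and (float)-0.8 multiplied by 16777216.0f (2^24) is exactly -13421773.0. After step_toward_zero the bit pattern is 0xBF4CCCCC and the same product is exactly -13421772.0. The factor 2^24 corresponds to art = 7, which is inferred from the reported value (0.8 x 2^24 = 13421772.8); the thread does not state art.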
For a formal fix, you will have to be careful as to how you preset your parameters that contain fractional values that cannot be precisely represented in binary. 0.1 is one such fraction as is 0.8.
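To make the effect of the constant's precision concrete, here is a rough scalar sketch that compares the scaled product when the constant is first rounded to float (as in the posted code) with the product when it stays in double. The names scale_for, scaled_via_float and scaled_via_double are hypothetical, and whether double is acceptable for the real code path is an assumption, not something the thread settles.

```cpp
#include <cassert>

// Scale factor as in the posted code: 2^(31 - art), built with an integer shift.
double scale_for(int art) {
    assert(art >= 0 && art <= 31);
    return static_cast<double>(1u << (31 - art));
}

// Constant rounded to float first: only 24 significant bits survive.
double scaled_via_float(double value, int art) {
    const float f = static_cast<float>(value);  // the rounding happens here
    return static_cast<double>(f) * scale_for(art);
}

// Constant kept in double: 53 significant bits survive.
double scaled_via_double(double value, int art) {
    return value * scale_for(art);
}
```

Assuming art = 7 (inferred from the reported magnitude, not stated in the thread), scaled_via_float(-0.8, 7) is exactly -13421773.0, while scaled_via_double(-0.8, 7) is about -13421772.8, which matches the calculator. Neither type holds -0.8 exactly; double only pushes the error far enough down that it does not show at this scale.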
The results look OK for test cases 5 & 6.
Can you please provide the compiler options, and any other options, used for test cases 5 & 6?
Eswar Reddy K
Actually, rounding is probably done by a micro-operation control signal (mulps decoded into the corresponding uop). It would be interesting to know what triggers the rounding mode in the SIMD FPU (perhaps some control bit that is set when mulps is decoded).
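Whatever happens at the uop level, from the software side the rounding mode for legacy SSE instructions is state held in the MXCSR register (the rounding-control field, bits 13 and 14), which is what _MM_GET_ROUNDING_MODE reads. The sketch below shows how that state changes a float-to-integer conversion. It uses the standard <cfenv> interface as a portable stand-in for _MM_SET_ROUNDING_MODE, on the assumption that the toolchain's fesetround updates the mode used by the conversion (as is expected on typical x86-64 toolchains); round_with_mode is a hypothetical helper.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Convert x to an integer under the given rounding mode, then restore the
// previously active mode.
long round_with_mode(double x, int mode) {
    const int saved = std::fegetround();
    const bool ok = (std::fesetround(mode) == 0);  // 0 means the mode was accepted
    assert(ok && "rounding mode not supported");
    (void)ok;
    // The volatile accesses keep the compiler from constant-folding the
    // conversion or moving it past the fesetround calls.
    volatile double in = x;
    volatile long out = std::lrint(in);  // lrint honours the current rounding mode
    std::fesetround(saved);
    return out;
}
```

For the calculator value -13421772.8, FE_TONEAREST and FE_DOWNWARD give -13421773, while FE_TOWARDZERO and FE_UPWARD give -13421772. In the posted code the product in tmp is already exactly -13421773.0 before the conversion, so there the rounding mode of _mm_cvtps_epi32 has nothing left to decide.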