If sin is autovectorized, there seems to be a sign-bit issue.
For example, in the second quadrant (from PI/2 to PI) the sign just flips.
Here is the code:
__declspec(align(16)) float input[12] = {0.f, 0.5f, 1.f, 1.5f, 2.f, 2.5f, 3.0f, 3.5f, 4.f, 4.5f, 5.f, 5.5f};
__declspec(align(16)) float output[12];
for (int i = 0; i < 12; i++)
{
    output[i] = sin(input[i]);
}
for (int i = 0; i < 12; i++)
{
    char txt[32];
    sprintf(txt, "%i %f %f\n", i, input[i], output[i]);
    OutputDebugString(txt);
}
this outputs:
0 0.000000 0.000000
1 0.500000 0.479426
2 1.000000 0.841471
3 1.500000 0.997495
4 2.000000 -0.909297
5 2.500000 -0.598472
6 3.000000 -0.141120
7 3.500000 -0.350783
8 4.000000 -0.756802
9 4.500000 -0.977530
10 5.000000 0.958924
11 5.500000 0.705540
cos seems OK on 32-bit but also has a sign problem on 64-bit.
Compiler XE 12.1.2.278 on Windows with VS2008.
The same error occurs when using double.
If I turn off autovectorization, everything is fine.
Since the performance difference between the vectorized and non-vectorized versions is huge, this is a big issue.
Evitan
Intel runs a long series of tests on math functions, but the tests and results aren't divulged. There are several classic open source tests, such as the TOMS Fortran ELEFUNT (which has been ported to C, including 128-bit long double) and CELEFUNT.
Recently, a lot of academic work has gone into the possibility of correctness criteria and IEEE standards for elementary math functions. I haven't seen anything remotely resembling a useful practical summary, but I'm no expert.
The questions of software vs. firmware implementation have by no means been laid to rest, e.g. with float elementary vector math functions in firmware on MIC prototypes.
Intel runs a long series of tests on math functions,...
...
I absolutely believe that.
Anyway, it is still an enigma what is going on with the sign of some results of the sin function when autovectorization is used.
Here is an example of a round-off problem when adding a small number. The expected result is 1.0, but due to round-off it is not equal to 1.0.
>> 'Single-Precision' Test-Case <<
...
RTfloat fSum = 0.0f;
for( i = 0; i < 1000000; i++ )
fSum += 0.000001f;
...
Results:
Accuracy _RTFPU_PC_24 - RTfloat - Result: 1.00903892517089840
Relative error: 0.895795524120330810 %
Accuracy _RTFPU_PC_53 - RTfloat - Result: 1.00903892517089840
Relative error: 0.895795488699064450 %
Accuracy _RTFPU_PC_64 - RTfloat - Result: 1.00903892517089840
Relative error: 0.895795488699064560 %
Note: It also demonstrates that in the case of a 'Single-Precision' type, the 53-bit and 64-bit precisions are not applicable.
>> 'Double-Precision' Test-Case <<
...
RTdouble dSum = 0.0L;
for( i = 0; i < 1000000; i++ )
dSum += 0.000001L;
...
Results:
Accuracy _RTFPU_PC_24 - RTdouble - Result: 1.00903892517089840
Relative error: 0.895795524120330810 %
Accuracy _RTFPU_PC_53 - RTdouble - Result: 1.00000000000791810
Relative error: 0.000000000791811061 %
Accuracy _RTFPU_PC_64 - RTdouble - Result: 1.00000000000791810
Relative error: 0.000000000791811061 %
Hi all,
Sorry it took a little time to get to this, but I had a couple of different developers on our math runtime team looking at this, and I wanted to make sure we understood the behavior before I replied.
Autovectorization of the sin() call here generates a call to a dedicated SIMD-friendly sin routine that is different from the scalar sin (you'll see it in the generated asm as __svml_sinf4). That SIMD-friendly version, by design, assumes the default round-to-nearest-even rounding mode, and its trigonometric reduction lands in the wrong quadrant for the reduced argument when run in a non-standard rounding mode. Older versions of this SIMD sin() used a different reduction algorithm, which is why we don't see this issue with 11.1. The same applies to the scalar sin call: it uses a different reduction algorithm. The algorithm was changed from 11.1 to 12.x for performance reasons.
The compiler defaults to /fp:fast, which (along with the absence of a user-inserted #pragma STDC FENV_ACCESS) implies that the floating-point environment has not been changed manually, so the compiler may generate calls to the optimized SIMD versions of these math functions. In a situation like this one, where the rounding mode is being changed, we recommend (and document in the User's Guide) using /fp:strict, which falls back to the scalar math function that handles different rounding modes without affecting results at the normally expected precision.
The XE 2011 SP1 documentation for /fp:fast does not state that there is a quadrant issue in sin/cos that can flip the sign of the result of some calculations. With the default being /fp:fast, the default has a defective sin/cos routine that must be addressed by the developers. The documentation states: "These optimizations increase speed, but may alter the accuracy of floating-point computations." Any user of your code would interpret an accuracy difference to mean that some minor loss of precision in the mantissa is to be expected, but NOT a change in the sign of the result. It is Intel's responsibility to produce reliable code, be it with some loss of precision in the mantissa.
In an earlier post I mentioned a software glitch (a programming error) that led to an airplane crash. In that case, the programming error caused the artificial horizon to flip over (a sign change), causing the autopilot to flip the control inputs, and it may subsequently have engaged overrides to the pilot's inputs. I do not know how else to stress that this is a serious problem that needs to be fixed.
Jim Dempsey
Accuracy difference, by interpretation of any user of your code, would be that some minor loss in precision of the mantissa is to be expected, but NOT of the sign of the result. It is Intel's responsibility to produce reliable code, be it with some loss in precision of the mantissa.
...
[SergeyK] I wonder if Intel's software developers have decided to change the laws of trigonometry?
...
Brandon,
It doesn't matter what Floating Point Model ('strict', 'fast' or 'precise') is selected for an application:
the sign must be right for a value returned from the 'sin' or 'cos' functions.
It doesn't matter if the 'sin' or 'cos' functions are implemented as scalar, non-scalar, SIMD-friendly, etc.
Guys, you're breaking the laws of trigonometry!
Best regards,
Sergey
When you compile with /fp:fast (the default), in order to get automatic invocation of the SVML libraries, you must not tinker with rounding modes. If you intend to change rounding modes, you should set /fp:strict, or at least set the rounding mode back to normal before entering the math library function.
You could test for yourself whether the time required to save the rounding mode, set round-to-nearest (and later restore your local rounding mode) is large, and whether it solves the associated numeric problems. The original P4 was notoriously slow at saving and restoring rounding modes, particularly in x87; later CPUs had to improve on that, but the compiler retains some choices which seem to have been made for the P4.
I can't tell whether all the comments made in this thread are related to the original question. Needless to say, if errors such as those reported here occurred with default optimizations and rounding modes left at default, it would be extremely important to test whether options such as /fp:source or /fp:strict, and/or /Qimf-arch-consistency:true, are required in each application.
I'm also having difficulty guessing whether the comments all refer to the same OS.
Sorry for the delay in my response here - I was having an involved discussion with Martyn Corden on my team and several of our math libraries developers on this topic, and I wanted to wait until that was settled before I responded.
To quickly summarize the result of that discussion: our math libraries team is looking into whether this algorithm can be changed so that it is not as significantly impacted by a rounding-mode change as it currently is, without dropping the performance of the algorithm at the same time. I'll keep this thread posted when there's an update on their progress.
...if we can change this algorithm to not be as significantly impacted by the rounding mode change as it currently is without dropping the performance of the algorithm at the same time. I'll keep this thread posted when there's an update on their progress.
Hi Brandon,
Are there any updates?
Best regards,
Sergey
It doesn't matter what Floating Point Model ('strict', 'fast' or 'precise') is selected for an application: the sign must be right for a value returned from the 'sin' or 'cos' functions.
Completely agree with you, Sergey. Some loss in the accuracy of the returned value is acceptable, but not a change in the sign.
Such behaviour can be catastrophic.
The Intel compiler -help option tells you explicitly that accuracy of math functions may depend on imf- option settings, although the application to trig functions is left to the imagination. It also tells you that if you want equally accurate treatment regardless of CPU type, you must set the imf-arch-consistency=true.
For example, for arguments of magnitude > 2^64 for the x87 firmware functions there is no effort to get
It is hard to imagine such a large argument being passed to the sin() function in a real-world application.
Also, division and rounding in the range-reduction algorithm can introduce some inaccuracies in the range-reduced argument.
It doesn't matter what Floating Point Model ('strict', 'fast' or 'precise') is selected for an application: the sign must be right for a value returned from the 'sin' or 'cos' functions.
Completely agree with you, Sergey. Some loss in the accuracy of the returned value is acceptable, but not a change in the sign. Such behaviour can be catastrophic.
In Post #8:
http://software.intel.com/en-us/forums/showpost.php?p=175161
Jim gave a real (unfortunately, tragic) example of what happens in such cases.
Jim gave a real (unfortunately, tragic) example of what happens in such cases
IIRC, I read about a Fortran-related error, probably caused by the programmer and wrongly interpreted by the compiler, which led to the catastrophic explosion of a rocket booster involved in a space mission.
http://www.sundoginteractive.com/sunblog/posts/top-ten-most-infamous-software-bugs-of-all-time/
It is hard to imagine such a large argument being passed to the sin() function in a real-world application. Also, division and rounding in the range-reduction algorithm can introduce some inaccuracies in the range-reduced argument.
You might find http://www.derekroconnor.net/Software/Ng--ArgReduction.pdf interesting.
It is hard to imagine such a large argument being passed to the sin() function in a real-world application. Also, division and rounding in the range-reduction algorithm can introduce some inaccuracies in the range-reduced argument.
You might find http://www.derekroconnor.net/Software/Ng--ArgReduction.pdf interesting.
Hi everybody,
Jeff, thank you. The article is good and very interesting. However, I won't be surprised if somebody defends a PhD on how to reduce 2^1024 radians to some correct value. We do programming, and we try to apply knowledge to real-life problems.
I completely agree with Iliya, and I can't imagine, for example, satellite-tracking hardware that would return look-at-angle values greater than 180 degrees (PI radians).
Best regards,
Sergey
Update to the issue.
This issue is fixed in the next version (13.0) of the Intel C++ compiler. The beta of 13.0 just ended; if you already registered for the beta, you can still download it.
For 12.1, please use one of these two workarounds:
1. Use the option TimP provided: /Qimf-arch-consistency=true
This is an easy workaround. Now I'm wondering why it's not set as the default.
2. Do not use setRoundingMode() to change the mode.
The fix is in 13.0 only; there are some challenges in porting the fix to 12.1 without affecting performance in general.
Thanks,
Jennifer
You might find http://www.derekroconnor.net/Software/Ng--ArgReduction.pdf interesting.
I read this document; it is very interesting and insightful.
I completely agree with Iliya, and I can't imagine, for example, satellite-tracking hardware that would return look-at-angle values greater than 180 degrees (PI radians).
The radian argument to a software/hardware sine implementation depends directly on the physical constraints of the process or device being measured. For example, consider a rotational modulator-filter which modulates the IR energy emitted by a fast-moving target (an airplane): in such a system, a value of a few hundred radians (a fast rotational speed) will be sufficient to spot a change in the target's position relative to the filter very early.
Quoting Jeff Arnold (Intel)
I've done a couple of simple verifications for a test case used in the article (sin( 10^22 * OneRadian )) and here is a screenshot:
Windows Calculator was used: the top result is without range reduction and the bottom result is with a "simulated" range reduction.
( top - without range reduction / bottom - with "simulated" range reduction )