Is there a way to force the compiler to use SSE instead of the x87 FPU for all floating-point operations? There seems to be an option for that in the Linux version of ICC, but I can't find it on Windows.
I am using ICC 10.0 on Windows Vista 64.
Thanks in advance.
6 Replies
Quoting - marcos
Is there a way to force using SSE instead of FPU
I am using ICC 10.0 on Windows Vista 64.
Quoting - tim18
The Windows x64 compiler doesn't like to use x87 code at all. As on Linux, the default is SSE2, but more strictly so: there are no loopholes, not even /Od or /Op. Don't try /Qlongdouble; let the long doubles be treated as double.
Sorry, I forgot to say that I am using the 32 bit compiler, not the 64 bit one.
Also, the problem I am trying to solve (precision issues with float operations) disappears whenever I increase the precision of floating-point operations (/fp:double, /fp:extended, ...), but then performance suffers. Normally, my command-line parameters are:
[cpp]/EHsc /Gd /GS /GR /Qprec /Ob2 /MD -G7 -O3 -QaxW -Qipo[/cpp]
/fp:double forces expressions to be evaluated in double precision. In the 32-bit compiler, this might be done with x87 code, which would be faster than SSE2 promotions and normally give the same result. You should find out where your application requires double and write it in, so you don't suffer the performance loss everywhere.
Up through ICL 10.1, /QaxW generates both an x87 and an SSE2 code path, with the choice made by CPU run-time recognition. AMD CPUs would get the x87 path, so they would often get the effect of /fp:double.
/fp:extended would require expressions to be evaluated by x87, with precision mode set to 64.
These /fp promotions normally would disable vectorization where they take effect.
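The advice above, to write double only where the application needs it, could look roughly like the sketch below. It assumes the precision loss comes from a long float accumulation, which the thread does not confirm; the function names are hypothetical and only the accumulator is promoted, so the rest of the code can stay in float.

```c
#include <assert.h>
#include <stddef.h>

/* Plain float accumulation: with single-precision evaluation every add
   rounds to 24 bits, so the error can build up over many terms. */
float sum_in_float(const float *x, size_t n)
{
    assert(x != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];
    return acc;
}

/* Same data, but only the accumulator is double. Inputs and the returned
   value stay float, so callers do not change. */
float sum_in_double(const float *x, size_t n)
{
    assert(x != NULL);
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];          /* each float is widened exactly to double */
    return (float)acc;        /* one rounding back to float at the end */
}
```

For one million copies of 0.1f, sum_in_double returns exactly 100000.0f, while sum_in_float usually drifts away from that value when the compiler really evaluates in single precision.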
Quoting - tim18
/fp:double forces expressions to be evaluated in double precision. In the 32-bit compiler, this might be done with x87 code, which would be faster than SSE2 promotions and normally give the same result. You should find out where your application requires double and write it in, so you don't suffer the performance loss everywhere.
Up through ICL 10.1, /QaxW generates both an x87 and an SSE2 code path, with the choice made by CPU run-time recognition. AMD CPUs would get the x87 path, so they would often get the effect of /fp:double.
/fp:extended would require expressions to be evaluated by x87, with precision mode set to 64.
These /fp promotions normally would disable vectorization where they take effect.
What I would like to do is remove any inconsistencies caused by mixing FPU and SSE code, by forcing the use of SSE. Something similar to what -mfpmath=sse does in gcc (someone told me that option also exists in ICC on Linux). Is there something like that on Win32?
Quoting - marcos
What I would like to do is remove any inconsistencies caused by mixing FPU and SSE code, by forcing the use of SSE. Something similar to what -mfpmath=sse does in gcc (someone told me that option also exists in ICC on Linux). Is there something like that on Win32?
The inconsistencies, if you call them that, which you get with /fp:double aren't different between SSE2 and x87.
You seem to be using earlier compilers, with options that specifically imply you want various mixtures of x87, so it's hard to know your goal.
The instructions you get with /Qprec-div- /Qprec-sqrt- (as implied by some of your quoted options) are SSE instructions, but they are inconsistent with the IEEE standard.
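To check which evaluation mode a particular build really uses, a small probe like the one below may help. This is only a sketch: it assumes the compiler provides the C99 FLT_EVAL_METHOD macro in <float.h> (older ICL versions may not, hence the fallback), and both function names are made up for illustration.

```c
#include <float.h>

#ifndef FLT_EVAL_METHOD
#define FLT_EVAL_METHOD (-1)  /* assumption: treat a missing macro as "unknown" */
#endif

/* Describe how the compiler says it evaluates floating-point expressions. */
const char *float_eval_description(void)
{
    switch (FLT_EVAL_METHOD) {
    case 0:  return "each operation is evaluated in its own type (typical for SSE code)";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "everything is evaluated as long double (typical for x87 code)";
    case -1: return "indeterminable";
    default: return "implementation-defined";
    }
}

/* Returns (big + small) - big. The volatile copies keep the compiler from
   folding the expression at compile time. With pure float evaluation and
   big = 1.0e8f, small = 1.0f, the sum rounds back to big and the result
   is 0; with wider intermediate precision the result is 1. */
float absorbed_increment(float big, float small)
{
    volatile float a = big;
    volatile float b = small;
    return (a + b) - a;
}
```

Calling absorbed_increment(1.0e8f, 1.0f) in builds made with different option sets shows quickly whether intermediate results are being kept in wider precision.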
Quoting - tim18
You should have seen enough hints by now. If you use the Intel 11.x compilers and don't set any options that imply x87, you get SSE everywhere it doesn't cost performance, and in some places where it does. I think the most likely place for 11.x to produce x87 code without your asking for it is complex arithmetic. If you used gcc with -ffast-math, you would get worse "inconsistencies." If you facilitate vectorization, you may even get SSE2 math functions where you would get more accurate x87 ones without vectorization. By the way, -mfpmath=sse doesn't determine which libraries you get with gcc either, on Windows or on Linux.
The inconsistencies, if you call them that, which you get with /fp:double aren't different between SSE2 and x87.
You seem to be using earlier compilers, with options that specifically imply you want various mixtures of x87, so it's hard to know your goal.
The instructions you get with /Qprec-div- /Qprec-sqrt- (as implied by some of your quoted options) are SSE instructions, but they are inconsistent with the IEEE standard.
I finally managed to try with ICC 11.0.72, and it fixed the problem.
Thanks.