Hello!
I am planning to change the architecture option from /arch:IA32 to /arch:SSE2 in order to enable vectorisation. I am aware that this comes at a numerical cost, i.e. the results will not be identical, because the x87 instructions hold intermediate results in 80-bit extended precision (a 64-bit significand).
I noticed that there is an option to specify the significand precision /Qpcn. If I specify /arch:SSE2 and /Qpc80, will this improve the accuracy of results, without disabling any optimisations?
Is there anything else I could set, so that the accuracy of results will remain high?
Kind regards,
Daniel.
1 Solution
/Qpc80 affects only x87 execution, where it extends the extra-precision benefit to double precision data types; it does nothing for SSE code.
For full accuracy of SSE code, I use /assume:protect_parens /Qprec-div /Qprec-sqrt. /Qftz- will also improve accuracy for tiny operands, and should have little performance impact on the Sandy Bridge CPU generation.
All of those options are included in /fp:source; all but /Qftz apply also to IA32 code but become more important with SSE.
If you have expression evaluations which depend on promotion to double precision, they should be written explicitly in the source code so as to control variations from one compiler and architecture to another.
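As one possible illustration of writing the promotion explicitly (a minimal sketch, not taken from the original code; the program name, array size and values are made up for the example), consider a sum of single precision values. With x87 code the running sum may be held in a wider register; with SSE code a single precision accumulator stays single. Declaring the accumulator as double and converting each term makes the intended precision visible in the source:

```fortran
program explicit_promotion
  implicit none
  ! Kind parameters for default real and double precision
  integer, parameter :: sp = kind(1.0), dp = kind(1.0d0)
  integer, parameter :: n = 1000000      ! hypothetical problem size
  real(sp), allocatable :: a(:)
  real(sp) :: sum_sp
  real(dp) :: sum_dp
  integer :: i

  allocate(a(n))
  a = 0.1_sp

  ! Single precision accumulator: the result depends on whether the
  ! compiler keeps intermediates in wider registers or vectorises.
  sum_sp = 0.0_sp
  do i = 1, n
     sum_sp = sum_sp + a(i)
  end do

  ! Double precision accumulator with explicit promotion of each term:
  ! the precision of the running sum no longer depends on the target.
  sum_dp = 0.0_dp
  do i = 1, n
     sum_dp = sum_dp + real(a(i), dp)
  end do

  print *, 'single accumulator: ', sum_sp
  print *, 'double accumulator: ', sum_dp
end program explicit_promotion
```

Comparing the two printed values under /arch:IA32 and /arch:SSE2 should show which parts of a code were relying on implicit extra precision.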
4 Replies
Thank you for your answer.
What do you mean by:
"expression evaluations which depend on promotion to double precision"?
Something like this?
    real(8) :: x
    real(4) :: y, z
Instead of
    x = y + z               ! This can produce different results when migrated.
should I have
    x = dble(y) + dble(z)   ! This is consistent all the time
?
You would require 3 or more operands in an expression (taking into account that optimization may extend across assignments) before promotion to double could make a difference. In cases such as
v = (w-x) + (y-z)
promotion, e.g.
v = (w-dble(x)) + (y-dble(z))
could avoid numerical problems. If your compiler ignores parentheses (as ifort does by default) you need to promote at least 3 of the 4 operands explicitly.
Usually, when you see code written carefully with parentheses, the author expects the specified order of operations to give good results without requiring extra precision, provided that the compiler heeds the parentheses. This is a sign that you should set /assume:protect_parens or some option which implies it, e.g. /fp:source or /standard-semantics.
The implication in my example is that the differences w-x and y-z are expected to be small, such that the expressed order of operation will be accurate, but other orders of evaluation are likely to be inaccurate.
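A small test program can make this concrete. This is only a sketch: the values are chosen for illustration (they are not from the thread), and the variable names v_paren, v_reassoc and v_promoted are made up. Under strict IEEE single precision evaluation, w + y cannot be represented and rounds back to 2**24, so the reassociated form loses the contribution of y, while the parenthesised form and the promoted form do not:

```fortran
program paren_order
  implicit none
  integer, parameter :: sp = kind(1.0), dp = kind(1.0d0)
  real(sp) :: w, x, y, z
  real(sp) :: v_paren, v_reassoc
  real(dp) :: v_promoted

  ! Illustrative values: w and x are large and close together,
  ! so w-x is small and exact in single precision.
  w = 16777216.0_sp   ! 2**24
  x = 16777215.0_sp   ! 2**24 - 1, exactly representable
  y = 1.0_sp
  z = 0.5_sp

  ! Order as written by the author: two small differences, then a sum.
  v_paren = (w - x) + (y - z)

  ! One order a compiler might choose if it ignores the parentheses.
  v_reassoc = ((w + y) - x) - z

  ! Same reassociated order, but with every operand promoted to double.
  v_promoted = ((real(w, dp) + real(y, dp)) - real(x, dp)) - real(z, dp)

  print *, 'parenthesised, single: ', v_paren
  print *, 'reassociated, single:  ', v_reassoc
  print *, 'reassociated, double:  ', v_promoted
end program paren_order
```

Note that an optimising compiler is free to rearrange the second expression as well unless parentheses are protected, so the printed values may depend on the options used; comparing a build with and without /assume:protect_parens is the point of the exercise.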
The rules of Fortran allow a compiler to make certain transformations in expression evaluation, such as
a*x -y*a => a*(x-y)
or
x/y/z => x/(y*z)
(but not in reverse), where promotion to double may make a significant difference. If you write code which is eligible for such optimization, you have no guarantee which way it will go.
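The first transformation can be tried out with a sketch like the following, where both forms are written out by hand in single precision and compared with a promoted version. The values of a, x and y are arbitrary choices for the example, picked so that x - y is tiny compared with x and y themselves:

```fortran
program factoring
  implicit none
  integer, parameter :: sp = kind(1.0), dp = kind(1.0d0)
  real(sp) :: a, x, y

  ! Illustrative values only: x and y differ in the last bit or so
  ! of single precision.
  a = 3.0_sp
  x = 1.0000001_sp
  y = 1.0_sp

  ! Form as written: each product is rounded before the subtraction.
  print *, 'a*x - y*a, single: ', a*x - y*a

  ! Factored form that the language rules allow a compiler to substitute.
  print *, 'a*(x - y), single: ', a*(x - y)

  ! With explicit promotion the products are held in double precision,
  ! so the choice between the two forms matters much less.
  print *, 'a*x - y*a, double: ', real(a, dp)*real(x, dp) - real(y, dp)*real(a, dp)
  print *, 'a*(x - y), double: ', real(a, dp)*(real(x, dp) - real(y, dp))
end program factoring
```

If the two single precision lines differ on your system while the two double precision lines agree, the expression is one where you cannot rely on the compiler choosing the form you wrote.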
The suspicion remains that the default treatment of parentheses by Intel compilers (like "traditional" K&R C) is a holdover from the x87 behavior (with /Qpc80 or equivalent). Current gcc/gfortran require a specific option to be set if one wishes to ignore the language rules about parentheses.
Unfortunately, when you set ifort /assume:protect_parens, you forbid useful "legal" transformations such as
x/2 => x*0.5
but that is off the topic you raised, as it doesn't change numerical results.
Thank you very much, Tim!
Excellent example.