Solved: Some initial issues, variable types and long doubles treatment

MPere15 · ‎05-28-2012

Hello,

Do long doubles need special treatment? Does anyone know a reason why they would stop working properly?

I've recently started to migrate my project to Parallel Studio.
So far the setup is OK, and I have sample projects up and running that use the Intel C++ compiler.
Now I have problems with the main project. It is a DirectX 3D application, and it uses a wide set of libraries.
The program loads and starts, and I can tell that it is running; however, it is not working as expected, because none of the objects are being positioned.

Since positions are stored in long double variable type, I was thinking that this could be related, as I see other posts about long doubles here in the forum.

It even fails when all the .cpp files are compiled with the Microsoft compiler (with Intel C++ still set as the project compiler, but used only for linking).

So, once again, do long doubles need special treatment?
Are there any other variable types that need special treatment (variable definitions, assignments, operations...)?

Thanks in advance

SergeyKostrov · ‎06-11-2012

Quoting mpl3d

...Regarding the FPU settings I do need to set it like this

#pragma fenv_access (on)
unsigned int cw = _control87(0, 0);
_clear87();
_control87(0x00000000, 0x00030000);
...

Yes, it will work. However, you have to be very careful when passing numeric constants to the '_control87' function instead of the macros
defined in the 'float.h' header file. It will work for Microsoft, Intel and MinGW C++ compilers, and it won't work for Borland and
Turbo C++ compilers. A better version looks like:

...
_control87( _CW_DEFAULT, _MCW_PC );
...

Using the _PC_64 macro is useless when sizeof( long double ) is equal to 8. It won't improve the precision of FP calculations and
is effectively the same as using the _PC_53 macro.
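
For reference, the call from the question can be written with the named macros. Going by the 'float.h' values quoted in this thread (_PC_64 is 0x00000000 and _MCW_PC is 0x00030000), it corresponds to _control87( _PC_64, _MCW_PC ). The sketch below is only an illustration: the helper name request_x87_precision_64 is hypothetical, and it assumes a 32-bit x86 Windows build with a Microsoft-compatible compiler. On any other target it does nothing and reports that.

```cpp
#include <float.h>   // _control87, _clear87, _PC_64, _MCW_PC on Microsoft-compatible compilers

// Hypothetical helper: asks the x87 FPU for 64-bit (extended) precision.
// Returns true if the request was issued, false if this build target is
// not covered by the sketch.
bool request_x87_precision_64()
{
#if defined(_MSC_VER) && defined(_M_IX86)
    _clear87();                     // clear pending floating-point status flags first
    _control87(_PC_64, _MCW_PC);    // same bits as _control87(0x00000000, 0x00030000)
    return true;
#else
    return false;                   // assumption: precision control is not handled here
#endif
}
```

Calling request_x87_precision_64() once at start-up, before any positions are computed, would mirror what the original snippet does. Checking the return value makes it obvious when the code was built for a target where the request is silently skipped.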

SergeyKostrov · ‎05-31-2012

Hi,

Quoting mpl3d

Do long doubles need a special treatment?..

Could you provide more technical details? A simple example of calculations that reproduces the problem
would be useful.

Thank you in advance.

Best regards,
Sergey

TimP · ‎06-01-2012

As the other post indicates, it seems the support for long double as distinct from double may have been removed from Parallel Studio, and possibly from the most recent versions of ICL. This might have been done to follow Microsoft's lead.

SergeyKostrov · ‎06-01-2012

Quoting TimP (Intel)

As the other post indicates, it seems the support for long double as distinct from double may have been removed from Parallel Studio...

I think that the problem is possibly related to the Precision Control of the Floating Point Unit ( FPU ). Here are
the results of a quick test I've done:

...
Sub-Test - long double
24-bit : [ 0.1 * 0.1 = 0.00999999977648258 ] <- 0.1*0.1!= 0.01
53-bit : [ 0.1 * 0.1 = 0.01000000000000000 ]
64-bit : [ 0.1 * 0.1 = 0.01000000000000000 ]
Default : [ 0.1 * 0.1 = 0.01000000000000000 ]
...

You can see that as soon as 24-bit precision is set by the '_control87' function, the accuracy of calculations drops.

A precision change could possibly be made by some library that the user 'mpl3d' uses.
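
The 24-bit line in the test output can be reproduced without touching the control word by rounding the product to float, which has a 24-bit significand. This is only an emulation of the effect of _PC_24 on a single multiplication, not the test that produced the output above, and the function names are illustrative. As background, a well-known source of such a change in DirectX programs is Direct3D 9 device creation, which switches the FPU to single precision unless the D3DCREATE_FPU_PRESERVE flag is passed; whether that is what happens in this project is not established in the thread.

```cpp
#include <cstdio>

// Product rounded to a 53-bit significand (ordinary double arithmetic).
double product_53bit(double a, double b)
{
    return a * b;
}

// Product rounded to a 24-bit significand, emulated by storing the
// result in a float. The volatile qualifier forces the store, so the
// narrowing is not optimized away.
double product_24bit(double a, double b)
{
    volatile float narrowed = static_cast<float>(a * b);
    return static_cast<double>(narrowed);
}

// Prints both results in the same layout as the test output in the post.
void print_products()
{
    std::printf("24-bit : [ 0.1 * 0.1 = %.17f ]\n", product_24bit(0.1, 0.1));
    std::printf("53-bit : [ 0.1 * 0.1 = %.17f ]\n", product_53bit(0.1, 0.1));
}
```

The narrowed product differs from 0.01 by roughly 2e-10, which is large enough to break an exact comparison or to show up as visible jitter in position data.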

TimP · ‎06-02-2012

Recent icc and icl used -pc80 to reset precision mode to 64-bit. That option worked together with /Qlong-double. Microsoft libraries never supported 64-bit precision mode.

SergeyKostrov · ‎06-04-2012

Quoting TimP (Intel)

Recent icc and icl used -pc80 to reset precision mode to 64-bit. That option worked together with /Qlong-double. Microsoft libraries never supported 64-bit precision mode.

Since 'sizeof( long double )' equals 8 with the Microsoft C/C++ compiler, how could they support 64-bit
precision? Even a ~20-year-old Turbo C++ compiler supports it, because in its case 'sizeof( long double )' equals 10.
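
Whether long double is a distinct, wider type can be checked directly on each compiler instead of being assumed. The sketch below reports the size and significand width through std::numeric_limits; the names FloatLayout, layout_of and long_double_is_wider are illustrative and do not come from the thread.

```cpp
#include <limits>

// Size and significand width of a floating-point type on this compiler.
struct FloatLayout {
    unsigned size_bytes;      // sizeof(T)
    int      mantissa_bits;   // std::numeric_limits<T>::digits
};

template <typename T>
FloatLayout layout_of()
{
    return FloatLayout{ static_cast<unsigned>(sizeof(T)),
                        std::numeric_limits<T>::digits };
}

// True when long double carries more significand bits than double,
// i.e. when it is a genuinely wider type on this compiler.
bool long_double_is_wider()
{
    return layout_of<long double>().mantissa_bits
         > layout_of<double>().mantissa_bits;
}
```

On a compiler where sizeof( long double ) is 8, as described above for the Microsoft compiler, long_double_is_wider() would be expected to return false, so storing positions in long double gains nothing over double there.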

MPere15 · ‎06-10-2012

Hello, many thanks for all your tips.
I'm afraid I have not been able to isolate the problem yet.

The project uses a lot of different stuff, and it seems that it is getting into some sort of deadlock when checking start-up conditions.

Regarding the FPU settings I do need to set it like this

#pragma fenv_access (on)
unsigned int cw = _control87(0, 0);
_clear87();
_control87(0x00000000, 0x00030000);

It makes a difference here with the Microsoft compiler. When it is not set, I encounter float jerkiness in some scenarios.

I thought that would give me 80-bit precision, but later I found that it could be giving only 53-bit.
It is true that the documentation is a little bit misleading (at least to me, especially regarding fenv_access). Here are some links:

http://msdn.microsoft.com/en-us/library/e9b52ceh(vs.71).aspx
http://msdn.microsoft.com/es-es/library/e7s85ffb(VS.90).aspx
http://msdn.microsoft.com/es-es/library/bfwa91s0(v=VS.90).aspx
http://msdn.microsoft.com/en-us/library/aa289157(VS.71).aspx

In any case, I would need to set this for the Intel compiler too. Should I understand from your comments that this is deprecated?
Many thanks.

SergeyKostrov · ‎06-11-2012

Quoting mpl3d

Hello, many thanks for all your tips.
I'm afraid I have not been able to isolate the problem yet.

The project uses a lot of different stuff, and it seems that it is getting into some sort of deadlock when checking start-up conditions.

Regarding the FPU settings I do need to set it like this

#pragma fenv_access (on)
unsigned int cw = _control87(0, 0);
_clear87();
_control87(0x00000000, 0x00030000);

It makes a difference here with the Microsoft compiler. When it is not set, I encounter float jerkiness in some scenarios...

Could you give an example? It is really impossible to guess what is going on in some of your scenarios
without a solid example. Also, what libraries do you use?

SergeyKostrov · ‎06-11-2012

Quoting mpl3d

...
_control87(0x00000000, 0x00030000);
...

Here is a piece of code from the 'float.h' header file:

[cpp]
...
#define _MCW_PC 0x00030000 /* Precision Control */
#define _PC_64 0x00000000 /* 64 bits */
#define _PC_53 0x00010000 /* 53 bits */
#define _PC_24 0x00020000 /* 24 bits */
...
...
#define DBL_DIG 15 /* # of decimal digits of precision */
#define DBL_EPSILON 2.2204460492503131e-016 /* smallest such that 1.0+DBL_EPSILON != 1.0 */
#define DBL_MANT_DIG 53 /* # of bits in mantissa */
#define DBL_MAX 1.7976931348623158e+308 /* max value */
#define DBL_MAX_10_EXP 308 /* max decimal exponent */
#define DBL_MAX_EXP 1024 /* max binary exponent */
#define DBL_MIN 2.2250738585072014e-308 /* min positive value */
#define DBL_MIN_10_EXP (-307) /* min decimal exponent */
#define DBL_MIN_EXP (-1021) /* min binary exponent */
#define _DBL_RADIX 2 /* exponent radix */
#define _DBL_ROUNDS 1 /* addition rounding: near */
...
#define LDBL_DIG DBL_DIG /* # of decimal digits of precision */
#define LDBL_EPSILON DBL_EPSILON /* smallest such that 1.0+LDBL_EPSILON != 1.0 */
#define LDBL_MANT_DIG DBL_MANT_DIG /* # of bits in mantissa */
#define LDBL_MAX DBL_MAX /* max value */
#define LDBL_MAX_10_EXP DBL_MAX_10_EXP /* max decimal exponent */
#define LDBL_MAX_EXP DBL_MAX_EXP /* max binary exponent */
#define LDBL_MIN DBL_MIN /* min positive value */
#define LDBL_MIN_10_EXP DBL_MIN_10_EXP /* min decimal exponent */
#define LDBL_MIN_EXP DBL_MIN_EXP /* min binary exponent */
#define _LDBL_RADIX DBL_RADIX /* exponent radix */
#define _LDBL_ROUNDS DBL_ROUNDS /* addition rounding: near */
...
[/cpp]

As you can see, all LDBL_* macros are based on DBL_* macros.

SergeyKostrov · ‎06-11-2012

Quoting mpl3d

...
_control87(0x00000000, 0x00030000);
...

For portable software I recommend isolating as many macros as possible from the 'float.h' header file.
Here is an example of how it could be done:

[cpp]
...
// For Microsoft ( Desktop & CE ) & Intel C++ compilers
#if ( defined ( _WIN32_MSC ) || defined ( _WIN32CE_MSC ) || defined ( _WIN32_ICC ) )
	#define _RTFPU_MCW_PC MCW_PC
	#define _RTFPU_CW_DEFAULT _CW_DEFAULT
	#define _RTFPU_CW_ALLBITSON 0xFFFFF
	#define _RTFPU_MCW_EM _MCW_EM
	...
	#define _RTFPU_PC_53 _PC_53
	#define _RTFPU_CW_PC53_RCNEAR ( _PC_53+_RC_NEAR+_EM_INVALID+_EM_ZERODIVIDE+_EM_OVERFLOW+... )
	...
	#define _RTFPU_EM_INEXACT _EM_INEXACT // 0x00000001
	#define _RTFPU_EM_ZERODIVIDE _EM_ZERODIVIDE // 0x00000008
#endif

// For MinGW C++ compiler
#if defined ( _WIN32_MGW )
	#define _RTFPU_MCW_PC _MCW_PC
	#define _RTFPU_CW_DEFAULT ( _PC_53+_RC_NEAR+_EM_INVALID+_EM_ZERODIVIDE+_EM_OVERFLOW+... )
	#define _RTFPU_CW_ALLBITSON 0xFFFFF
	#define _RTFPU_MCW_EM _MCW_EM
	...
	#define _RTFPU_PC_53 _PC_53
	#define _RTFPU_CW_PC53_RCNEAR ( _PC_53+_RC_NEAR+_EM_INVALID+_EM_ZERODIVIDE+_EM_OVERFLOW+... )
	...
	#define _RTFPU_EM_INEXACT _EM_INEXACT // 0x00000001
	#define _RTFPU_EM_ZERODIVIDE _EM_ZERODIVIDE // 0x00000008
#endif

// For Borland C++ compiler
#if defined ( _WIN32_BCC )
	#define _RTFPU_MCW_PC MCW_PC
	#define _RTFPU_CW_DEFAULT CW_DEFAULT
	#define _RTFPU_CW_ALLBITSON 0xFFFFF
	#define _RTFPU_MCW_EM MCW_EM
	...
	#define _RTFPU_PC_53 PC_53
	#define _RTFPU_CW_PC53_RCNEAR ( PC_53+ RC_NEAR+ EM_INVALID+ EM_ZERODIVIDE+ EM_OVERFLOW+... )
	...
	#define _RTFPU_EM_INEXACT EM_INEXACT // 0x00000020
	#define _RTFPU_EM_ZERODIVIDE EM_ZERODIVIDE // 0x00000004
#endif

// For Turbo C++ compiler
#if defined ( _COS16_TCC )
	#define _RTFPU_MCW_PC MCW_PC
	#define _RTFPU_CW_DEFAULT CW_DEFAULT
	#define _RTFPU_CW_ALLBITSON 0xFFFF
	#define _RTFPU_MCW_EM MCW_EM
	...
	#define _RTFPU_PC_53 PC_53
	#define _RTFPU_CW_PC53_RCNEAR ( PC_53+ RC_NEAR+ EM_INVALID+ EM_ZERODIVIDE+ EM_OVERFLOW+... )
	...
	#define _RTFPU_EM_INEXACT EM_INEXACT // 0x00000020
	#define _RTFPU_EM_ZERODIVIDE EM_ZERODIVIDE // 0x00000004
#endif
...
[/cpp]

Can you see that the values of some macros, like EM_INEXACT or EM_ZERODIVIDE, are different?
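
The point about differing values can be made concrete with a small lookup. This is a sketch only: the two numeric pairs are taken from the comments in the listing above, while the FpuFamily enum, the FpuExceptionMasks struct and the function names are hypothetical and not part of any compiler's headers.

```cpp
// Compiler families distinguished in the listing.
enum class FpuFamily { MicrosoftLike, BorlandLike };

// Exception-mask bit values as given in the comments of the listing.
struct FpuExceptionMasks {
    unsigned inexact;
    unsigned zero_divide;
};

constexpr FpuExceptionMasks masks_for(FpuFamily family)
{
    return family == FpuFamily::MicrosoftLike
        ? FpuExceptionMasks{ 0x00000001u, 0x00000008u }    // Microsoft, Intel, MinGW
        : FpuExceptionMasks{ 0x00000020u, 0x00000004u };   // Borland, Turbo C++
}

// True if a raw "inexact" mask constant written for one family has the
// same value in the other family.
constexpr bool inexact_mask_is_portable()
{
    return masks_for(FpuFamily::MicrosoftLike).inexact
        == masks_for(FpuFamily::BorlandLike).inexact;
}
```

Here inexact_mask_is_portable() evaluates to false, which is the reason for going through named macros: a hard-coded bit pattern that is right for one family selects a different bit, or nothing meaningful, in the other.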

SergeyKostrov · ‎06-11-2012

Quoting mpl3d

...
unsigned int cw = _control87(0, 0);
...

Regarding the default state of the FPU, here are examples of default states:

// Microsoft & Intel C++ compilers ( Visual Studio 20xx )
Default: 0x9001F
0.1 * 0.1 = 1.000000000000000e-002

// Microsoft C++ compiler ( Visual Studio 98 )
Default: 0x9001F
0.1 * 0.1 = 1.000000000000000e-002

// MinGW C++ compiler
Default: 0x9001F
0.1 * 0.1 = 1.000000000000000e-002

// Borland C++ compiler
Default: 0x1372
0.1 * 0.1 = 1.000000000000000e-02

// Turbo C++ compiler
Default: 0x1372
0.1 * 0.1 = 1.000000000000000e-02

Remember that a different FPU state could create inconsistencies, and this is what you have at the moment.
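
The default values listed above can be decoded to see which precision each one selects. The sketch below uses the _MCW_PC, _PC_64, _PC_53 and _PC_24 values from the 'float.h' excerpt in this thread for the Microsoft-style word. For the Borland and Turbo C++ value it assumes, without confirmation from the thread, that 0x1372 is a raw x87 hardware control word, where precision control occupies bits 8 and 9. Both function names are illustrative.

```cpp
// Both functions return the selected precision in significand bits,
// or 0 if the field holds an undefined setting.

// Microsoft-style abstract control word, as returned by _control87:
// the precision field is selected by _MCW_PC (0x00030000).
int precision_bits_msvc_style(unsigned cw)
{
    switch (cw & 0x00030000u) {
        case 0x00000000u: return 64;   // _PC_64
        case 0x00010000u: return 53;   // _PC_53
        case 0x00020000u: return 24;   // _PC_24
        default:          return 0;    // 0x00030000 is not a defined setting
    }
}

// Raw x87 hardware control word: precision control lives in bits 8-9.
// Assumption: the Borland and Turbo C++ defaults are raw hardware words.
int precision_bits_x87_raw(unsigned cw)
{
    switch ((cw >> 8) & 0x3u) {
        case 0x0u: return 24;
        case 0x2u: return 53;
        case 0x3u: return 64;
        default:   return 0;           // binary 01 is reserved
    }
}
```

With the values from the list, precision_bits_msvc_style(0x9001F) gives 53 and, under the stated assumption, precision_bits_x87_raw(0x1372) gives 64. Read that way, the two families start from different precision settings even though they print the same result for 0.1 * 0.1.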

MPere15 · ‎06-12-2012

@Sergey, many thanks for all your posts.

They certainly clarify my initial confusion with the Intel C++ compiler regarding float treatment.

I'm sorry that I've not been able to isolate the issue yet.

Full code is 150k lines, and start-up might take something like 5k-10k lines, and I have not had the time to debug these lines.

It seems that, at some point during start-up, a comparison involving floats does not succeed (although it does with the Microsoft compiler), and therefore the code flow never enters the branch that creates some crucial stuff. I'll find out which function or operation is the problem and let you know.

Many thanks.
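
A float comparison that passes with one compiler and fails with another is often an exact equality test that only held by accident of rounding. The failing comparison is not shown in the thread, so the sketch below is a generic illustration rather than a fix for this project; nearly_equal and its default tolerances are hypothetical choices.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true when a and b differ by no more than rel_tol
// times the larger magnitude, with abs_tol as a floor for values near zero.
bool nearly_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 1e-12)
{
    const double diff  = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}
```

For example, nearly_equal(0.1 * 0.1, 0.01) holds under both 53-bit and 64-bit precision. Under 24-bit precision the error of about 2e-10 exceeds these defaults, so the tolerances would have to be chosen to match the precision the code actually runs with.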

SergeyKostrov · ‎06-21-2012

Quoting mpl3d

...
I'm sorry that I've not been able to isolate the issue yet.

Full code is 150k lines, and start-up might take something like 5k-10k lines, and I have not had the time to debug these lines...

Hi,

Is there any progress with the investigation?

Best regards,
Sergey

MPere15 · ‎06-26-2012

Sergey,

Many thanks for your interest, I see that you stay tuned.
No further investigation yet (no time).
I'll post any conclusion and I'll let you know, I'm sorry that I'm too much overloaded.

Best regards,
Manuel

MPere15 · ‎07-17-2012

License expired.
I had the time to test today but the trial has expired.

Last thing I was able to find out is that the math functions that I use were returning 0.

These are in a static .lib library, and they pretty much wrap the native DX math functions that work with 3d vectors.

I was isolating the issue and playing with the float-related compiler options, and then it would no longer compile.

Oh well, I'll try to set it up on a different machine one of these days. Sorry for now.

SergeyKostrov · ‎07-17-2012

Quoting mpl3d

...Last thing I was able to find out is that the math functions that I use were returning 0.

These are in a static .lib library, and they pretty much wrap the native DX math functions that work with 3d vectors...

Could you provide more technical details ( headers, sources )?

MPere15 · ‎07-18-2012

@Sergey,

These are the function prototypes:

////////////////////////////////////////

bool dbMakeVector3 ( int iID );
bool dbDeleteVector3 ( int iID );
void dbSetVector3 ( int iID, float fX, float fY, float fZ );
float dbXVector3 ( int iID );
float dbYVector3 ( int iID );
float dbZVector3 ( int iID );
////////////////////////////////////////

And I'm testing like this:

////////////////////////////////////////
float test_valueX = 0.0f;
float test_valueY = 0.0f;
float test_valueZ = 0.0f;

//Use math functions
dbMakeVector3 ( 1 ); //Create vector 3
dbSetVector3 ( 1, 1.0f, 2.0f, 3.0f); //Set vector components

//Debug
test_valueX = dbXVector3 ( 1 ); //Recover component X value
test_valueY = dbYVector3 ( 1 ); //Recover component Y value
test_valueZ = dbZVector3 ( 1 ); //Recover component Z value
////////////////////////////////////////

I was going to pack a VS2008 sample project, but then I realized that DirectX SDK (August 2007) must be installed to be able to compile, and I don't want to bother you that far.

However, I've uploaded the library '3DMaths.lib' here, perhaps you would like to inspect it:

http://dc613.4shared.com/download/ihywqMym/3DMaths.lib

I should also mention that Intel Development Support is helping me extend the trial period, so hopefully I'll be able to get back to this issue soon.
Also, they pointed me to this link:

http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/

I look forward to finding out whether those static libraries can still be used with the Intel C++ compiler. I'm very interested in testing parallel performance on my application. This time I won't let the trial expire again, sorry for that.

Many thanks all along.
Best regards,
Manuel

SergeyKostrov · ‎07-18-2012

Hi Manuel,

Quoting mpl3d

...I was going to pack a VS2008 sample project, but then I realized that DirectX SDK (August 2007) must be installed to be able to compile, and I don't want to bother you that far...

I have DirectX SDK (June 2007) installed on one of my computers.

So, if you make a Visual Studio 2005 or 2008 test project I'll be able to look at it. Please don't over-complicate the test project,
because there is a two-month difference between the DirectX SDKs. I'll also take a look at the libraries you've attached.

Best regards,
Sergey

MPere15 · ‎07-21-2012

Hi Sergey,

Many thanks for your support.
I've posted a link to a sample project in a private post, since it is not my aim to distribute such libraries.
Please let me know if you downloaded it successfully.

Best regards,
Manuel

SergeyKostrov · ‎07-21-2012

Quoting mpl3d

...However, I've uploaded the library '3DMaths.lib' here, perhaps you would like to inspect it:

http://dc613.4shared.com/download/ihywqMym/3DMaths.lib

[SergeyK] Downloaded and I'm about to start investigation.

I should also mention that Intel Development Support is helping me extend the trial period, so hopefully I'll be able to get back to this issue soon.
Also, they pointed me to this link:

http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/

[SergeyK] I know that article and I would rate it as 'One Of The Best'.